Merging two large sorted files into one unique file


I have two very large files (billions of rows each), and the rows in each file are sorted and unique. I want an efficient way to merge these two files into one file whose rows are also sorted and unique. I thought about concatenating the two files and using the command

sort -u

but that doesn't seem ideal, because it wouldn't take advantage of the fact that both files are already sorted.

Anil_M On

First of all, this is a Linux-related question, so the correct forum is Stack Exchange.

Next, it depends on how you want rows sorted.

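  1. If you want the lines of file1 and file2 sorted together into a single ordering, then sort -u is the way to go. Because both inputs are already sorted, sort -m -u file1 file2 merges them without re-sorting.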

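  2. If you just want to combine the pre-sorted file1 and file2 one after the other, you can simply concatenate them, e.g. cat file1 file2 > file3. Note that the result is not sorted as a whole, and rows present in both files will appear twice.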

  3. You can implement a custom sort by looping through each line and applying any sorting algorithm. It would be similar to, and slower than, option (1), so why do it the hard way?
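For what it's worth, if you do go the hand-rolled route in option 3, the fact that both inputs are already sorted means a single streaming pass is enough and nothing has to be re-sorted. Here is a minimal sketch in Python, not a tuned implementation. The names merge_unique, read_lines and merge_files are made up for illustration, and it assumes both files are sorted in plain code-point order (what LC_ALL=C sort produces for ASCII or UTF-8 text) and are readable with the default text encoding.

```python
import heapq
from itertools import groupby


def merge_unique(lines_a, lines_b):
    """Yield the sorted union of two already-sorted iterables, without duplicates."""
    # heapq.merge consumes both inputs lazily, so memory use stays small.
    # groupby collapses each run of equal adjacent items into one.
    for line, _group in groupby(heapq.merge(lines_a, lines_b)):
        yield line


def read_lines(handle):
    """Yield lines without the trailing newline so comparisons ignore it."""
    for line in handle:
        yield line.rstrip("\n")


def merge_files(path_a, path_b, path_out):
    """Merge two sorted files into path_out (hypothetical helper, assumed paths)."""
    with open(path_a) as fa, open(path_b) as fb, open(path_out, "w") as out:
        for line in merge_unique(read_lines(fa), read_lines(fb)):
            out.write(line + "\n")
```

Calling merge_files("file1", "file2", "file3") reads each input once and never holds more than a line or two from each file in memory. If the files were sorted under a different locale, the comparison order here may not match and the output would not be correctly merged.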