I have two very large files (billions of rows each). The rows in each file are sorted and unique. I want an efficient way to merge these two files into one file whose rows are also sorted and unique. I thought about concatenating the two files and using the command
sort -u
but that doesn't seem very efficient, because it wouldn't take advantage of the fact that both files are already sorted.
First of all, this is a Linux-related question, hence the correct forum is Stack Exchange.
Next, it depends on how you want the rows sorted.
If you want the lines of file1 and file2 to be sorted in a combined way, then
sort -u
is the way to go.
If you just want to combine the already pre-sorted file1 and file2, you can simply concatenate them, e.g.
cat file1 file2 > file3
Note that the result is then file1 followed by file2, not one globally sorted sequence.
You can also implement a custom sort by looping through each line and employing any one of the sorting algorithms, although it will be similar to, and slower than, the first option (sort -u), so why do it the hard way?
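For reference, here is a rough sketch of what such a custom line-by-line approach could look like when both inputs are already sorted: a single merge pass instead of a full sort. It assumes the files were sorted in plain byte order (e.g. with LC_ALL=C) and that every line ends with a newline. The names merge_unique and merge_files are made up for illustration. On the command line, GNU sort's -m (merge) option combined with -u should do the same job without any custom code.

```python
import heapq
from typing import Iterable, Iterator


def merge_unique(a: Iterable[bytes], b: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the sorted union of two already-sorted inputs, skipping duplicates."""
    previous = None
    # heapq.merge reads both inputs lazily, so memory use stays constant
    for line in heapq.merge(a, b):
        if line != previous:
            yield line
            previous = line


def merge_files(path1: str, path2: str, out_path: str) -> None:
    # Binary mode: lines are compared byte by byte (assumes C-locale sort order)
    with open(path1, "rb") as f1, open(path2, "rb") as f2, \
            open(out_path, "wb") as out:
        out.writelines(merge_unique(f1, f2))
```

Calling merge_files("file1", "file2", "file3") would then write the merged, de-duplicated result to file3 while reading each input only once.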