Merging k sorted arrays - Priority Queue vs Traditional Merge-sort merge, when to use which?

1.6k Views Asked by At

Assuming we are given k sorted arrays (each of size n), in which case is using a priority heap better than a traditional merge (similar to the one used in merge-sort) and vice-versa?

Priority Queue Approach: In this approach, we have a min heap of size k (initially, the first element from each of the arrays is added to the heap). We now remove the min element (from one of the input arrays), put this in the final array and insert a new element from that same input array. This approach takes O(kn log k) time and O(kn) space. Note: It takes O(kn) space because that's the size of the final array and this dominates the size of the heap while calculating the asymptotic space complexity.

Traditional Merge: In this approach, we merge the first 2 arrays to get a sorted array of size 2n. We repeat this for all the input arrays and after the first pass, we obtain k/2 sorted arrays each of size 2n. We repeat this process until we get the final array. Each pass has a time complexity of O(kn) since one element will be added to the corresponding output array after each comparison. And we have log k passes. So, the total time complexity is O(kn log k). And since we can delete the input arrays after each pass, the space used at any point is O(kn).

As we can see, the asymptotic time and space complexities are exactly the same in both the approaches. So, when exactly do we prefer one over the other? I understand that for an external sort the Priority Queue approach is better because you only need O(k) in-memory space and you can read and write each element from and back to disk. But how do these approaches stack up against each other when we have enough memory?

1

There are 1 best solutions below

0
On

The total number of operations, compares + moves, is about the same either way. A k-way merge does more compares but fewer moves. My system has an 8 way cache (Intel 3770K 3.5 ghz), which in the case of a 4 way merge sort, allows for 4 lines of cache for the 4 input runs and 1 line of cache for the merged output run. In 64 bit mode, there are 16 registers that can be used for working variables, 8 of them used for pointers to the current and end position of each "run" (compiler optimization).

On my system, I compared a 4 way merge (no heap, ~3 compares per element moved) versus a 2 way merge (~1 compare per move, but twice as many passes), the 4 way has 1.5 times the number of compares, but 0.5 times the number of moves, so essentially the same number of operations, but the 4 way is about 15% faster due to cache issues.

I don't know if 16 registers is enough for a 6 way merge to be a tiny bit faster, and 16 register is not enough for an 8 way merge (some of the working variable would be memory / cache based). Trying to use a heap probably wouldn't help as the heap would be memory / cache based (not register based).

A k-way merge is mostly useful for external sorts, where compare time is ignored due to the much larger overhead of moves.