I saw in PostgreSQL that there are two separate algorithms for sorting, called external sort and external merge. I was under the impression that both are the same. As far as I know, external sorting is a family of sorting algorithms for data sets that are too large to be sorted entirely in memory (RAM). It has two phases: the first sorts small chunks of the data and stores them in temporary files, and the second merges all these sub-files to produce the final sorted data set.
I also know that the external merge sort algorithm is an example of an external sorting technique.
So in my case, aren't external sort and external merge pretty much the same? I would like to know the difference, and also when each of these algorithms is used (on what type of data).
PS: On the same type of data, external merge takes much longer than external sort.
I think you are confused.
"Merge sort" is definitely a type of sorting algorithm. Other sorting algorithms are quicksort, bubble sort, heapsort and so on.
However, in databases (and I would say in general in DAGs), "merge" refers to something slightly different. A merge takes two sorted datasets and combines them. It can do this by comparing elements one by one as it walks through the two in parallel. This is related to the merge sort algorithm, and such merges are what is happening under the hood. But the merge operator in this case is merging already sorted lists.
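To make the "walk through the two in parallel" idea concrete, here is a minimal sketch in Python. It only illustrates the general merge step on in-memory lists. It is not how Postgres implements it, and the function name `merge_sorted` is made up for this example.

```python
def merge_sorted(left, right):
    """Combine two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    # Walk both inputs in parallel, always taking the smaller head element.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One input is exhausted; the remainder of the other is already sorted.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


print(merge_sorted([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```

Note that the function never sorts anything itself: it relies on both inputs being sorted already, which is exactly the distinction between a merge and a full sort. Python's standard library offers the same operation for any number of sorted inputs as `heapq.merge`.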
I should also point out that MERGE is a statement in many databases that allows insertions and updates in a single statement. Postgres has traditionally used INSERT ... ON CONFLICT for this instead (a MERGE statement was only added in PostgreSQL 15), but it is another use of "merge" in this domain.