The code for `_siftup` at github - python/cpython/Lib/heapq.py ends with a call to `_siftdown`:
```python
def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)
```
It seems like the logic in `_siftup(...)` is enough to place `newitem` in the correct position while maintaining the heap invariant. Why is the call to `_siftdown()` required?
This is the consequence of a particular choice the authors made in the algorithm.
More common is a variant in which this final `_siftdown()` is not necessary, but then the loop must stop when `newitem < heap[childpos]`, after which `pos` is a valid spot for `newitem` and no more sifting is needed.

In this version, however, the loop continues until a leaf is reached, and `newitem` is placed at a leaf position. That may not be a valid spot for `newitem`, so the extra call is needed to move it back up to a valid spot.

In the comment block that precedes this function, the authors explain why they made this choice, which at first seems less efficient, but in practice turns out to result in fewer comparisons:
See also Wikipedia - bottom-up heapsort.
It is a pity that Wikipedia discusses this only in the context of heapsort, since the technique applies to heap operations in general, even when the heap is not being used for a heapsort.
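To make the comparison-count argument concrete, here is a rough experiment (all helper names are mine, not from `heapq`): it counts `__lt__` calls while popping every element of a random heap, once with the real `heapq` module (leaf-first strategy) and once with a hand-written early-stopping sift.

```python
import heapq
import random

class Counted:
    """Value wrapper that counts how often __lt__ is invoked."""
    count = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        Counted.count += 1
        return self.v < other.v

def siftup_early_stop(heap, pos):
    # Textbook sift: two comparisons per level, stop once newitem fits.
    endpos = len(heap)
    newitem = heap[pos]
    childpos = 2*pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        if not heap[childpos] < newitem:
            break
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    heap[pos] = newitem

def pops_with_heapq(values):
    # CPython's leaf-first strategy, via the real heapq module.
    heap = [Counted(v) for v in values]
    heapq.heapify(heap)
    Counted.count = 0
    out = [heapq.heappop(heap).v for _ in range(len(heap))]
    return out, Counted.count

def pops_with_early_stop(values):
    # Same sequence of pops, restoring the invariant with the early-stop sift.
    heap = [Counted(v) for v in values]
    heapq.heapify(heap)
    Counted.count = 0
    out = []
    while heap:
        out.append(heap[0].v)     # record the minimum...
        last = heap.pop()         # ...move the last leaf to the root...
        if heap:
            heap[0] = last
            siftup_early_stop(heap, 0)   # ...and sift it down
    return out, Counted.count

random.seed(7)
data = [random.random() for _ in range(1 << 12)]
sorted1, leaf_first_cmps = pops_with_heapq(data)
sorted2, early_stop_cmps = pops_with_early_stop(data)
print(leaf_first_cmps, early_stop_cmps)
```

On random data the leaf-first strategy should win: an item taken from a leaf usually belongs near the bottom anyway, so paying one comparison per level all the way down plus a short sift back up beats paying two comparisons per level for almost the same depth.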