Heap
A heap is a binary tree data structure. Because it is a tree, in comparison to a priority queue, insertion and deletion can be done in O(n log n) time, as opposed to O(n)/O(1) time for linked lists or arrays.
What Makes A Heap A Heap
What makes a heap a heap is that:
- it is a complete binary tree, meaning that it is always filled from top left towards the bottom right with no spaces in between, and
- the value of each node must be no greater than (or no less than) the values of their child nodes (also read as: the value of each child node must be less than or equal to [or greater than or equal to] the value of the parent node).
A heap is not necessarily sorted, and being sorted is not a requirement for a heap.
A min heap is where the top of the heap is the smallest item. A max heap is where the top of the heap is the biggest item.
Operation | Time Complexity |
---|---|
Peek (get value of top node) | O(1) |
Insertion and deletion | O(log n) |
Maximum/minimum (max/min heap) | O(1) |
Minimum/maximum (max/min heap) | O(log n) |
Insertion
When inserting into a heap, we always follow this algorithm (using a max heap in this case):
- Insert the new value as a node to the bottom-right of the heap.
- If the new value is greater than its parent, swap the parent node with the new node.
- Repeat step 2 until the new value is not greater than its parent, or until it is the new root node.
A min heap works the same way, except checking if the value is less than its parent.
Deletion (Pop)
When deleting the top node from a heap (the max/min value of a max/min heap), we always follow this algorithm (using a max heap in this case):
- Replace the top of the heap with the bottom right value in the heap, naming the new node at the top of the heap our "current node",
- Remove the last node of the heap.
- If our current node value is not greater than or equal to their child node values, swap the current node for the largest child node.
- Repeat step 3 until the current node value is greater than or equal to their child node values, or until there are no more child nodes.
A min heap works the same way, except checking if the current node value is less than or equal to the child node values and if not, swapping for the smaller of the two child nodes.
Determining If A Node Is A Leaf
a leaf node is a node with no children. If you know the heap size and the target node index, you can determine if it is a leaf with the formula (target node index) > (heap size) // 2
, with //
being floor division.
For example, if you have a heap with 7 elements and you want to determine if the node at index 5 is a leaf, you would calculate 5 > 7 // 2 = 5 > 3 = true
.
Python Implementation[1]
Heaps in Python are implemented as arrays which have methods applied to them from the built in heapq
library. This is by default a min heap; to do a max heap, you have to kludge[3].
import heapq
min_heap = []
# Transform the list `min_heap` into a heap in linear time
heapq.heapify(min_heap)
heapq.heappush(3)
heapq.heappush(10)
heapq.heappush(1)
heapq.heappush(9)
# `min_heap` is [1, 9, 3, 10]
value = heapq.heappop()
# value == 1
# `min_heap` is [3, 9, 10]
Max Heap
To implement a max heap with number values, the easiest way[3] is to invert every value on input and then again on output (e.g. a max heap of [4, 2, 1]
is actually represented in the Python heap as [-4, -2, -1]
). An easy way to do this is to Create a wrapper class to instantiate the array and heapify
it, and perform the inversion on all operations via abstraction methods.
References
- https://docs.python.org/3/library/heapq.html
- https://leetcode.com/explore/learn/card/heap/
- https://stackoverflow.com/a/2501527/14857724
Last modified: 202401040446