CS Roadmap (Bonus) — Heaps and Priority Queues: The Economics of Partial Order
- CS Roadmap (0) — Why CS Knowledge Matters More Than Ever in the AI Era
- CS Roadmap (1) — Arrays and Linked Lists: Reading the Terrain of Memory
- CS Roadmap (2) — Stack, Queue, Deque: Powerful Abstractions Born from Restriction
- CS Roadmap (3) — Hash Tables: Conditions and Limits of O(1) Lookup
- CS Roadmap (4) — Trees: Order, Balance, and Guaranteeing O(log n)
- CS Roadmap (5) — Graphs: The Network of Relationships, the Science of Paths
- CS Roadmap (6) — Memory Management: Stack & Heap, GC, and the Things That Eat Your Frames
- CS Roadmap (Bonus) — Heaps and Priority Queues: The Economics of Partial Order
- A priority queue is a problem where only the minimum (or maximum) needs to be fast — a full sort (O(n log n)) is overkill. Heaps maintain only a 'partial order' to achieve O(log n) push/pop
- A binary heap is the magic of representing a complete binary tree as an array — with zero pointers, you navigate the tree using only index arithmetic: parent = (i-1)/2, left = 2i+1
- Floyd's (1964) bottom-up build-heap runs in O(n). Counterintuitively, it's not O(n log n) — deeper levels have many nodes but only shallow sift-downs
- In practice, A* open-set decrease-key is handled via lazy deletion (ignoring stale entries when popped). Fibonacci heap's O(1) decrease-key actually loses to binary heap on grid A* because of the heavy constant factor
Introduction
This article is a bonus installment of the CS Roadmap series.
In Part 5 on Graphs, when we covered Dijkstra and A*, we left one sentence hanging: “Using a priority queue (binary heap), we get $O((V+E) \log V)$.” But what that “binary heap” is, why it’s binary, and how it achieves $O(\log n)$ — all of that was deferred. This bonus pays back that debt.
After Phase 1 (Data Structures and Memory) wrapped up in six installments, heaps and priority queues remained as “that one core piece we skipped.” They’re an application of trees and an arithmetic of arrays; they’re the engine of A* and the heart of event schedulers. A small topic, but one that shows up too often in game development to skip over.
What this article covers:
- Why partial order suffices instead of a full sort
- How a binary heap operates on top of an array
- The mathematical basis of sift-up / sift-down and a C# implementation
- Floyd’s (1964) $O(n)$ build-heap
- d-ary heap, pairing heap, Fibonacci heap — when to use which
- The decrease-key problem in A*’s open set and its practical solution
- Analysis of .NET’s PriorityQueue<TElement, TPriority>
Part 1: The Priority Queue Problem
“Fast minimum only” — A Full Sort Is Overkill
Consider this problem.
You have 1,000,000 integers. You want to extract only the 1,000 smallest in order.
Intuitively, the solution that comes to mind is sort everything, then take the first 1,000. Sorting the whole thing costs $O(n \log n)$, and the output takes $O(k)$. But think about it — that’s wasteful. We never use the order of the remaining 999,000 elements. What we actually need is just two operations: “extract minimum” and “insert.”
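That tradeoff can be made concrete. A short sketch in Python’s standard heapq module (the article’s own examples are C#, so treat this purely as illustration; n is scaled down from the text’s 1,000,000 for a quick run):

```python
import heapq
import random

random.seed(42)
n, k = 100_000, 1_000
data = [random.randrange(10 ** 9) for _ in range(n)]

# Full sort: O(n log n) — orders all n elements even though we use only k
by_sort = sorted(data)[:k]

# Heap: O(n) build + O(k log n) pops — the remaining n-k elements stay unordered
heap = list(data)
heapq.heapify(heap)            # bottom-up build-heap, O(n)
by_heap = [heapq.heappop(heap) for _ in range(k)]

assert by_heap == by_sort      # same answer, far less ordering work
```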
This is the priority queue problem. Like a regular queue (FIFO), it has a “push and pop” interface, but the pop order is not “first in” — it’s “highest priority first.”
Regular queue (FIFO): Priority queue:
[4, 7, 1, 9, 3] [4, 7, 1, 9, 3]
↓ dequeue ↓ extract-min
4 (insertion order) 1 (smallest value)
Where this problem shows up in games:
- A* / Dijkstra: “Pop the node with the smallest f-value”
- Event scheduler: “Pop the event with the earliest execution time”
- AI action selection: “Pop the action with the highest utility score”
- DSP / audio mixing: “Play only the top N most important sound channels”
Tradeoffs of Naive Implementations
To appreciate heaps, we need to first ask: “Why not use an array or a BST?”
| Structure | insert | extract-min | Notes |
|---|---|---|---|
| Unsorted Array | $O(1)$ | $O(n)$ | Append to the end; scan the whole thing on pop |
| Sorted Array | $O(n)$ | $O(1)$ | Binary search to find position, but shift cost is $O(n)$ |
| Sorted Linked List | $O(n)$ | $O(1)$ | Linear search to find the spot; no shifting, but insertion is still $O(n)$ |
| BST (balanced) | $O(\log n)$ | $O(\log n)$ | 4–5 pointers/node, cache misses |
| Binary Heap | $O(\log n)$ | $O(\log n)$ | Implemented as an array, zero pointers |
Looking at the table, a natural question arises: “If the heap has the same complexity as a BST, why is it better?” The answer lies in constants and cache — the same story from Part 1.
The Heap’s Core Idea: Partial Order Is Enough
The key insight in the priority queue problem is this.
We only extract one minimum at a time. We don’t care about the relative order of the remaining elements.
A full sort determines “$a < b$ or $a > b$” for every pair. But extracting the minimum only requires the far weaker condition: “the root is the overall minimum.” This “weakness” is what gives room for $O(\log n)$.
The heap’s promise (min-heap):
- The root is the overall minimum
- Each parent is less than or equal to both of its children (no order enforced between siblings)
This is called the heap property. Unlike the BST’s “left < parent < right,” a heap only enforces the parent-child relation, leaving sibling and subtree ordering arbitrary. This “looseness” is the secret behind fast insert/delete.
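That looseness is easy to see in running code (a Python heapq sketch for illustration; the article’s examples are C#): after heapify, the array as a whole is not sorted — only the root is guaranteed, and the rest of the order emerges one pop at a time.

```python
import heapq

data = [5, 3, 8, 1, 9, 2, 7, 4, 6]
heapq.heapify(data)          # rearranges the list in place into a min-heap

# The only promise: the root is the overall minimum
assert data[0] == 1

# Repeated pops recover the full order one element at a time
popped = [heapq.heappop(data) for _ in range(9)]
assert popped == [1, 2, 3, 4, 5, 6, 7, 8, 9]
```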
Part 2: The Structure of a Binary Heap
Complete Binary Trees
In Part 4 on Trees, we defined a complete binary tree: “a tree where every level except possibly the last is completely filled, and the last level is filled from the left.” A binary heap must be a complete binary tree.
1
/ \
3 2
/ \ / \
7 5 4 8
/ \
9 10
height = ⌊log₂(n)⌋ = ⌊log₂(9)⌋ = 3
the last level is filled from the left
Why complete? There are two reasons.
- Height is always $\lfloor \log_2 n \rfloor$ — ensures sift-up/down is $O(\log n)$
- It can be represented as an array — no empty slots, so the array packs tightly
The second reason is the magic of this data structure.
A Tree Represented as an Array — A Tree Without Pointers
If you store the nodes of a complete binary tree in level order (top-to-bottom, left-to-right) in an array, parent-child relations become index arithmetic.
Using 0-based indexing:
\[\text{parent}(i) = \left\lfloor \frac{i-1}{2} \right\rfloor, \quad \text{left}(i) = 2i+1, \quad \text{right}(i) = 2i+2\]

The 1-based form is prettier (Williams’s original paper used it):

\[\text{parent}(i) = \left\lfloor \frac{i}{2} \right\rfloor, \quad \text{left}(i) = 2i, \quad \text{right}(i) = 2i+1\]

Why is this so powerful?
- Contiguous memory — the entire tree is a single array. The prefetcher pulls in the next nodes ahead of time
- Zero pointers — one value per node. Memory density far exceeds a BST that uses 4–5 words per node
- Cache line efficiency — 16 int elements fit in a 64-byte cache line. The top few levels stay resident in L1
In Part 1 on arrays and linked lists, we said “arrays store data in contiguous memory and enjoy the maximum benefit of cache locality.” A heap is a data structure that transplants this principle into a tree.
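The index arithmetic can be exercised directly (a small Python sketch of the 0-based formulas above; the array is the level-order layout of the example tree from earlier):

```python
def parent(i): return (i - 1) // 2
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2

# Level-order array of the example tree — no pointers anywhere
heap = [1, 3, 2, 7, 5, 4, 8, 9, 10]

assert left(0) == 1 and right(0) == 2   # children of the root: 3 and 2
assert parent(8) == 3                   # 10 (index 8) hangs under 7 (index 3)

# The child formulas invert the parent formula
assert all(parent(c) == p
           for p in range(4)
           for c in (left(p), right(p)) if c < len(heap))

# And the heap property holds: every node >= its parent
assert all(heap[parent(i)] <= heap[i] for i in range(1, len(heap)))
```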
Wait, let’s clarify this
Q. The 1-based index has prettier formulas, so why does everyone use 0-based in practice?
Because arrays are 0-based in most languages. Leaving array[0] empty and starting at 1 wastes one cell and causes confusion. Most implementations (C++ std::priority_queue, .NET PriorityQueue) use 0-based. The formulas are just slightly less elegant — the behavior is the same.

Q. Does it have to be a complete binary tree? Why not build it freely like a BST?
Breaking completeness kills the array representation — you’d need to express “empty slots,” which means sentinels or pointers in the array, which is essentially retreating back to a BST. Also, if the tree leans to one side, the height becomes $O(n)$ and operations slow down. “Packed into an array with no gaps” is the heap’s identity.
Part 3: Core Operations — sift-up and sift-down
These are the two fundamental operations for preserving the heap property. Both insertion and deletion reduce to these.
sift-up (percolate up) — Insertion
The strategy when inserting a new element:
- Append to the end of the array (preserving completeness)
- Compare with the parent; if smaller, swap
- Repeat until the heap property is restored (or the root is reached)
Insert 1: append at end      1 < parent 7 → swap      1 < parent 3 → swap; 1 ≤ root’s 1 → stop

      [1]                      [1]                      [1]
     /   \                    /   \                    /   \
   [3]   [2]                [3]   [2]                [1]   [2]
   / \   / \                / \   / \                / \   / \
 [7][5] [4][8]            [1][5] [4][8]            [3][5] [4][8]
 / \                      / \                      / \
[9][1] ← new            [9][7]                   [9][7]
Let’s trace the same insertion on the underlying array. The new element lands at index 8 (the array had 8 elements, indices 0–7):
Initial: [1, 3, 2, 7, 5, 4, 8, 9]
Add 1: [1, 3, 2, 7, 5, 4, 8, 9, 1] ← index 8
sift-up:
i=8, parent(8)=3, heap[3]=7 > 1 → swap
[1, 3, 2, 1, 5, 4, 8, 9, 7]
i=3, parent(3)=1, heap[1]=3 > 1 → swap
[1, 1, 2, 3, 5, 4, 8, 9, 7]
i=1, parent(1)=0, heap[0]=1 ≤ 1 → stop
C# implementation:
void SiftUp(int i) {
while (i > 0) {
int parent = (i - 1) / 2;
if (heap[i].CompareTo(heap[parent]) >= 0)
break; // heap property satisfied
(heap[i], heap[parent]) = (heap[parent], heap[i]);
i = parent;
}
}
public void Push(T value) {
heap.Add(value);
SiftUp(heap.Count - 1);
}
Time complexity: In the worst case, we climb all the way to the root: $O(\log n)$. The tree’s height is fixed at $\lfloor \log_2 n \rfloor$.
sift-down (percolate down) — Deletion / Extract-Min
The operation that pops the root. A bit trickier.
- Save the root (minimum) as the return value
- Move the last array element to the root position
- Decrement the array length (discard the tail)
- From the root, compare with the smaller of the two children and swap if larger
- Repeat until the heap property is restored (or a leaf is reached)
Step 4 is the key — since the parent must be smaller than both children, we must swap with the “smaller child.” Swapping with the larger one would make that child exceed its sibling and break the property.
C# implementation:
void SiftDown(int i) {
int n = heap.Count;
while (true) {
int left = 2 * i + 1;
int right = 2 * i + 2;
int smallest = i;
if (left < n && heap[left].CompareTo(heap[smallest]) < 0)
smallest = left;
if (right < n && heap[right].CompareTo(heap[smallest]) < 0)
smallest = right;
if (smallest == i) break; // heap property satisfied
(heap[i], heap[smallest]) = (heap[smallest], heap[i]);
i = smallest;
}
}
public T Pop() {
if (heap.Count == 0) throw new InvalidOperationException("Heap is empty");
T min = heap[0];
int last = heap.Count - 1;
heap[0] = heap[last];
heap.RemoveAt(last);
if (heap.Count > 0) SiftDown(0);
return min;
}
public T Peek() {
if (heap.Count == 0) throw new InvalidOperationException("Heap is empty");
return heap[0]; // just look at the root, O(1)
}
Time complexity: Also $O(\log n)$. One detail worth noting: sift-down performs roughly twice as many comparisons as sift-up — at each level, you must compare both children before picking the smaller one. This subtlety affects “why Floyd’s build-heap is $O(n)$” later on.
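The imbalance can be measured with a comparison-counting key (a Python heapq sketch, not from the article; in practice the measured gap is even larger than 2×, because a random sift-up usually stops after one or two comparisons while every pop sifts a leaf-sized element down nearly the full height):

```python
import heapq
import random

class Counted:
    """Wraps an int and counts every < comparison made on it."""
    comparisons = 0
    def __init__(self, v): self.v = v
    def __lt__(self, other):
        Counted.comparisons += 1
        return self.v < other.v

random.seed(0)
values = [Counted(random.randrange(10 ** 6)) for _ in range(20_000)]

heap = []
for v in values:                 # n pushes: sift-up, often terminating early
    heapq.heappush(heap, v)
push_comps = Counted.comparisons

Counted.comparisons = 0
while heap:                      # n pops: sift-down pays ~2 comparisons per level
    heapq.heappop(heap)
pop_comps = Counted.comparisons

assert pop_comps > 2 * push_comps   # extract-min dominates the comparison budget
```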
Full min-heap class (for reference)
A generic min-heap combining the two operations above.
public class MinHeap<T> where T : IComparable<T> {
private readonly List<T> heap = new();
public int Count => heap.Count;
public void Push(T value) {
heap.Add(value);
SiftUp(heap.Count - 1);
}
public T Pop() {
if (heap.Count == 0) throw new InvalidOperationException();
T min = heap[0];
int last = heap.Count - 1;
heap[0] = heap[last];
heap.RemoveAt(last);
if (heap.Count > 0) SiftDown(0);
return min;
}
public T Peek() => heap.Count > 0 ? heap[0]
: throw new InvalidOperationException();
private void SiftUp(int i) {
while (i > 0) {
int p = (i - 1) / 2;
if (heap[i].CompareTo(heap[p]) >= 0) break;
(heap[i], heap[p]) = (heap[p], heap[i]);
i = p;
}
}
private void SiftDown(int i) {
int n = heap.Count;
while (true) {
int l = 2 * i + 1, r = 2 * i + 2, s = i;
if (l < n && heap[l].CompareTo(heap[s]) < 0) s = l;
if (r < n && heap[r].CompareTo(heap[s]) < 0) s = r;
if (s == i) break;
(heap[i], heap[s]) = (heap[s], heap[i]);
i = s;
}
}
}
Part 4: build-heap and heapsort
Floyd’s (1964) $O(n)$ build-heap
Consider the problem of taking an array of $n$ elements and turning it into a heap. The simplest approach is to Push elements one by one.
var heap = new MinHeap<int>();
foreach (int x in array) heap.Push(x); // O(n log n)
Each Push costs $O(\log n)$, so the total is $O(n \log n)$. Same as sorting.
But Robert Floyd’s surprising 1964 discovery: turning an already-populated array into a heap can be done in $O(n)$.
The algorithm is simple — apply sift-down to each non-leaf node from back to front, starting from the last non-leaf.
public static void BuildHeap<T>(T[] arr) where T : IComparable<T> {
// Last non-leaf sits at index n/2 - 1; SiftDown here is an array-taking variant of the method shown earlier
for (int i = arr.Length / 2 - 1; i >= 0; i--)
SiftDown(arr, i, arr.Length);
}
Why is it $O(n)$? The key is the asymmetry: “deeper nodes are numerous but their sift-down distance is short.”
At height $h$, there are at most $\lceil n / 2^{h+1} \rceil$ nodes, and each sift-down can descend at most $h$ levels. The total cost is:
\[T(n) = \sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h) = O\left(n \sum_{h=0}^{\infty} \frac{h}{2^h}\right)\]

And since the infinite series $\sum_{h=0}^{\infty} h / 2^h = 2$ converges to a constant:

\[T(n) = O(2n) = O(n)\]

Intuitive picture: half of the tree consists of leaves (no movement), 1/4 at height 1 (moves one step), 1/8 at height 2 (moves two steps)… The dominant contributors are the many leaves, and they do almost no work.
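The linear bound can be checked by counting swaps (a hand-rolled Python sketch of the bottom-up build so the swap count is visible; the theoretical bound — the sum of all subtree heights — is at most $n-1$):

```python
import random

def build_heap(a):
    """Bottom-up build-heap: sift-down every non-leaf, last to first. Returns swap count."""
    n, swaps = len(a), 0
    for i in range(n // 2 - 1, -1, -1):      # last non-leaf is n//2 - 1
        while True:                          # sift-down from i
            l, r, s = 2 * i + 1, 2 * i + 2, i
            if l < n and a[l] < a[s]: s = l
            if r < n and a[r] < a[s]: s = r
            if s == i: break
            a[i], a[s] = a[s], a[i]
            swaps += 1
            i = s
    return swaps

random.seed(1)
for n in (1_000, 10_000, 100_000):
    a = [random.random() for _ in range(n)]
    swaps = build_heap(a)
    assert swaps < n            # linear in n, nowhere near n log n
    assert all(a[(i - 1) // 2] <= a[i] for i in range(1, n))   # valid min-heap
```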
Wait, let’s clarify this
Q. Why do we build the heap with sift-down instead of sift-up?
Building the heap with sift-up gives $O(n \log n)$. Sift-up “climbs to the parent”, and nodes near the leaves — which are the most numerous — have to climb all the way up to the root, $\log n$ steps. Conversely, sift-down “descends to children”: nodes near the root are few but go $\log n$ down, while nodes near the leaves are many but hardly move. The product of node count and travel distance stays linear.
Q. Then should we always use build-heap instead of Push-ing $n$ times?

If the array is prepared in advance, yes. But with streaming input (elements arrive one at a time), you can’t use build-heap, so you have to accept the $O(n \log n)$ from Push.
Heapsort — Sorting with a Heap
Once you understand build-heap, sorting comes for free. Williams (1964) published this idea.
- build-heap turns the array into a max-heap — $O(n)$
- Swap the root (maximum) with the last array element
- Sift-down excluding the last position — $O(\log n)$
- Repeat, and the array is sorted in ascending order
public static void HeapSort<T>(T[] arr) where T : IComparable<T> {
int n = arr.Length;
// 1. Build a max-heap — O(n)
// (SiftDownMax = the earlier SiftDown with the comparison reversed, so the largest rises)
for (int i = n / 2 - 1; i >= 0; i--)
SiftDownMax(arr, i, n);
// 2. Swap root with end, then shrink — O(n log n)
for (int end = n - 1; end > 0; end--) {
(arr[0], arr[end]) = (arr[end], arr[0]);
SiftDownMax(arr, 0, end);
}
}
- Time complexity: best/average/worst all $O(n \log n)$
- Space complexity: $O(1)$ — in-place sort
- Stability: unstable — swaps scramble the order of equal keys
Why It Lost to Quicksort
Theoretically, heapsort’s worst case is better (quicksort’s worst is $O(n^2)$). Yet quicksort variants dominate in practice. Reasons:
- Cache locality is bad — sift-down jumps from i to 2i+1 or 2i+2, doubling the index. For large arrays, these accesses leave the cache line frequently.
- More comparisons and swaps — each sift-down step compares two children. Quicksort/mergesort move roughly once per comparison.
- Branch prediction friendliness is lower.
Choices by modern standard libraries:
| Language / Standard | Sort algorithm |
|---|---|
| C# Array.Sort | Introsort (quicksort + heapsort fallback) |
| C++ std::sort | Introsort |
| Python sorted | Timsort (merge + insertion, stable) |
| Java Arrays.sort (primitives) | Dual-pivot quicksort |
| Java Arrays.sort (objects) | Timsort |
| Rust slice::sort | Timsort variant (stable) |
Introsort (Musser 1997) runs as quicksort normally, but switches to heapsort when recursion depth exceeds $2\log n$. Heapsort plays the role of “safety net for when quicksort breaks down” — a role familiar to game studios when “guaranteed behavior in bad cases” is needed.
Part 5: Heap Variants — When Not Binary
If you know binary heaps, you’ve solved 95% of practical problems. Still, let’s look at a few variants for the remaining 5%.
d-ary Heap
A heap where each node has $d$ children. The parent/child index formulas become:
\[\text{parent}(i) = \left\lfloor \frac{i-1}{d} \right\rfloor, \quad \text{child}_k(i) = di + k + 1 \ (k = 0, 1, ..., d-1)\]

The tree height drops to $\log_d n$, but each sift-down must compare $d$ children.
| Operation | Binary ($d=2$) | 4-ary | 8-ary |
|---|---|---|---|
| sift-up | $\log_2 n$ | $\log_4 n$ | $\log_8 n$ |
| sift-down | $2 \log_2 n$ | $4 \log_4 n = 2 \log_2 n$ | $8 \log_8 n \approx 2.67 \log_2 n$ |
Interesting observation: theoretical comparison counts for $d=2$ and $d=4$ are essentially equal ($2\log_2 n$). Yet LaMarca & Ladner’s (1996) experiments show that 4-ary heaps are actually faster than binary heaps — the reason is cache.
- In a binary heap, a parent’s two children are at 2i+1, 2i+2 — adjacent
- But going from grandparent to grandchild requires two jumps
- A 4-ary heap’s four children are at 4i+1, 4i+2, 4i+3, 4i+4 — all within one cache line
When 4-ary or 8-ary beats binary in practice:
- When keys are small (all children fit in one cache line)
- When sift-downs vastly outnumber sift-ups (extract-min-heavy workloads)
- Example: large-scale Dijkstra/A*
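The d-ary index formulas generalize the binary ones directly (a Python sketch, not tied to any particular library):

```python
def parent(i, d):   return (i - 1) // d
def child(i, d, k): return d * i + k + 1          # k-th child, k = 0 .. d-1

# The child formulas invert the parent formula for any arity
for d in (2, 4, 8):
    for i in range(100):
        for k in range(d):
            assert parent(child(i, d, k), d) == i

# Cache view: a 4-ary node's children sit at 4i+1 .. 4i+4 — four adjacent slots,
# which for 4-byte keys fits comfortably inside one 64-byte cache line
assert [child(3, 4, k) for k in range(4)] == [13, 14, 15, 16]
```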
Pairing Heap and Fibonacci Heap
Heaps that offer $O(1)$ amortized decrease-key. They improve Dijkstra’s theoretical complexity to $O(E + V \log V)$ (vs $O((V+E) \log V)$ for binary heap).
| Heap | insert | extract-min | decrease-key | Structure |
|---|---|---|---|---|
| Binary Heap | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | array |
| 4-ary Heap | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | array |
| Pairing Heap | $O(1)$ | $O(\log n)$* | $O(\log n)$* | pointer tree |
| Fibonacci Heap | $O(1)$ | $O(\log n)$* | $O(1)$* | pointer tree |
*amortized
Fibonacci heap (Fredman & Tarjan 1987) is rarely used in practice despite its theoretical superiority:
- Large constants — even $O(1)$ amortized, the constant factor is large. Binary heap’s $O(\log n)$ beats it almost always except when “the graph is huge and decrease-keys are frequent”
- Complex implementation — many things to manage: parent pointers, child lists, mark bits, rank, cascading cuts
- Cache disaster — pointer-based structures wander the heap area, same issue as linked lists from Part 1
- decrease-key itself is rare — in A*, it can be avoided with lazy deletion
Pairing heap (Fredman, Sedgewick, Sleator, Tarjan 1986) is a simplified version of Fibonacci. Its theoretical guarantees are slightly weaker but its constants are smaller, so measured performance often beats Fibonacci. High-performance libraries like Boost and LEDA offer it as an option.
Wait, let’s clarify this
Q. Is Fibonacci heap always faster for Dijkstra?
No. Chen et al. (2007) and numerous empirical studies have repeatedly shown that on real datasets, binary heaps or 4-ary heaps outperform Fibonacci heaps. Only in extreme cases where $E \gg V$ (dense graphs near $O(V^2)$) does Fibonacci show meaningful benefits.
Q. So when is a Fibonacci heap actually used?
Almost never in practice. It mostly plays the role of an “existence proof” for theoretical analysis. You’ll see it in research papers, algorithm textbooks, and specific cases where amortized analysis is needed.
Part 6: Heaps in Game Development
A* open set — The decrease-key Problem
Let’s revisit the A* code from Part 5 on Graphs:
List<Vector2Int> AStar(Vector2Int start, Vector2Int goal) {
var openSet = new PriorityQueue<Vector2Int, float>();
var gScore = new Dictionary<Vector2Int, float>();
// ...
while (openSet.Count > 0) {
var current = openSet.Dequeue();
// ...
foreach (var next in GetNeighbors(current)) {
float tentativeG = gScore[current] + Cost(current, next);
if (tentativeG < gScore.GetValueOrDefault(next, float.MaxValue)) {
gScore[next] = tentativeG;
openSet.Enqueue(next, tentativeG + Heuristic(next, goal));
}
}
}
}
The problem: what if the same node next is already in the open set, but we want to add it with a smaller f-value?
The textbook answer is decrease-key — find the node in the open set and lower its priority. But this is $O(n)$ in a binary heap (you have to find the element). Tracking node positions in a separate Dictionary makes it $O(\log n)$.
The practical answer is simpler: lazy deletion.
// lazy deletion pattern
while (openSet.TryDequeue(out var current, out var f)) { // Dequeue() returns only the element; TryDequeue also yields the priority
    if (f > gScore[current] + Heuristic(current, goal)) continue; // stale entry — a better value already exists
if (current == goal) return ReconstructPath(...);
foreach (var next in GetNeighbors(current)) {
float tentativeG = gScore[current] + Cost(current, next);
if (tentativeG < gScore.GetValueOrDefault(next, float.MaxValue)) {
gScore[next] = tentativeG;
openSet.Enqueue(next, tentativeG + Heuristic(next, goal));
// leave the previous stale entry alone — check when popping
}
}
}
- Pushing the same node multiple times is fine
- On pop, compare the stored f-value against the current best (gScore plus heuristic) to detect entries that are no longer valid
- Simple implementation — a binary heap alone is enough
- Uses a bit more memory, but mostly negligible
Most practical libraries — Unity’s A* Pathfinding Project, Recast/Detour, etc. — use this pattern.
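The same lazy-deletion pattern works for plain Dijkstra too (a Python heapq sketch for illustration; the toy graph and names are made up):

```python
import heapq

def dijkstra(graph, start):
    """graph: {node: [(neighbor, weight), ...]}. Duplicate pushes, stale check on pop."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                        # stale entry — a better push superseded it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))   # push again instead of decrease-key
    return dist

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 6)], "C": [("D", 3)]}
assert dijkstra(g, "A") == {"A": 0, "B": 1, "C": 3, "D": 6}
```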
Event / Timer Scheduler
In a game server or editor tooling, when you need to process “execute this task at time $t$” at scale, a heap is a good fit.
// Simple timer queue — keeps (time, task) pairs ordered by time ascending
public class TimerQueue {
private readonly MinHeap<(float time, Action action)> heap = new();
// caveat: two entries with equal times fall through to comparing Actions (not IComparable) and throw — add a sequence-number tie-breaker in production
public void Schedule(float time, Action action) {
heap.Push((time, action)); // O(log n)
}
public void Tick(float now) {
while (heap.Count > 0 && heap.Peek().time <= now) {
var (_, action) = heap.Pop(); // the Pop is the O(log n) part of each dispatch
action();
}
}
}
This pattern shows up in game engine animation events, cooldowns, and buff/debuff expiration handling. A naive list-and-scan approach grows linearly with event count; a heap only pops “what’s actually expiring now.”
Note: heap.Peek() is also $O(1)$, so you can cheaply query “when is the next event?” Useful for configuring sleep timers.
AI Action Scores
In utility AI, when you want to consider only the top N highest-scoring actions, a fixed-size-$k$ min-heap does the job.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// Keep the top k actions
public class TopKActions {
private readonly MinHeap<(float score, Action action)> heap = new();
private readonly int k;
public TopKActions(int k) { this.k = k; }
public void Consider(float score, Action action) {
if (heap.Count < k) {
heap.Push((score, action));
} else if (score > heap.Peek().score) {
heap.Pop(); // remove the lowest-scoring
heap.Push((score, action));
}
}
public IEnumerable<Action> GetTopK() {
    // the MinHeap<T> above exposes no enumerator, so drain it (destructive but simple)
    var items = new List<(float score, Action action)>();
    while (heap.Count > 0) items.Add(heap.Pop());
    return items.OrderByDescending(x => x.score).Select(x => x.action);
}
}
The point is using a min-heap. For a size-$k$ top-k problem, the min-heap’s root is “the lowest among the current k.” When a new candidate beats it, swap. This technique is broadly used for general top-k problems.
Unity NativeContainer Perspective
Unfortunately, Unity doesn’t ship a built-in NativeHeap. You have to implement one yourself using the patterns from JobSystem Part 4: NativeContainer Deep Dive.
Key considerations:
- Use NativeList<T> as the underlying storage (array-based, pairs well with a heap)
- The [NativeContainer] attribute and AtomicSafetyHandle integration — tie into the Safety System
- Burst compatibility: consider passing a struct IComparer<T> instead of generic IComparable
- Parallel insert is hard — sift-up races toward the root, so locking is needed. A common pattern is to collect into a queue, then pour in single-threaded
In practice, official/community examples like Unity’s BurstAStarGridExample, or public implementations like Gilzu’s NativeMinHeap, are faster to reach for.
Part 7: Analyzing .NET’s PriorityQueue<TElement, TPriority>
From .NET 6, System.Collections.Generic.PriorityQueue<TElement, TPriority> became the official standard. Internally, it’s a 4-ary min-heap — chosen over binary for the reasons we just discussed.
var pq = new PriorityQueue<string, int>();
pq.Enqueue("task-a", 3);
pq.Enqueue("task-b", 1); // lower priority number = earlier pop
pq.Enqueue("task-c", 2);
while (pq.Count > 0) {
string task = pq.Dequeue();
// pops in order: task-b, task-c, task-a
}
Key features of the official implementation:
- TElement and TPriority are separate — the stored value and the sort key are decoupled (better ergonomics)
- 4-ary heap — confirmed by const int Arity = 4 in the dotnet/runtime source
- EnqueueDequeue — optimized push-then-pop (useful for top-k patterns)
- UnorderedItems — enumerates the heap in storage order, not sorted order
- No Dictionary<TElement, ...> — does not provide decrease-key. Use lazy deletion
Caveats:
- Not thread-safe — external locking is required for multi-threaded use
- Duplicate keys allowed — multiple Enqueues with the same priority yield non-deterministic order (unstable)
- If you need stable ordering, make the priority a tuple: (originalPriority, tieBreaker)
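The tie-breaker trick looks like this (a Python sketch; heapq has the same unstable-duplicates behavior, and its documentation recommends the same counter-based fix):

```python
import heapq
from itertools import count

tie = count()                   # monotonically increasing sequence number
heap = []
for task in ["task-a", "task-b", "task-c"]:
    heapq.heappush(heap, (1, next(tie), task))   # same priority 1 for all three

# The counter breaks ties, so equal priorities pop in insertion order (FIFO)
order = [heapq.heappop(heap)[2] for _ in range(3)]
assert order == ["task-a", "task-b", "task-c"]
```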
For a simple A* open set, PriorityQueue<Vector2Int, float> is a one-liner. Reasons left to roll your own: (1) custom comparer, (2) special allocation patterns, (3) Burst / Job System integration.
Closing Thoughts
Core Summary
- A priority queue is the partial problem “only min/max matters” — a full sort is overkill; partial order yields $O(\log n)$
- A binary heap is a complete binary tree mapped to an array — zero pointers, navigate parents/children with index arithmetic
- sift-up and sift-down are everything — push, pop, every variant operation is a combination of these two
- Floyd’s $O(n)$ build-heap — follows from the asymmetry of “many deep nodes, shallow moves”
- 4-ary heaps beat binary heaps in practice — the layout advantage of clustering children in a cache line
- Fibonacci heaps are theoretical — beaten by binary/4-ary in measurements, only implementation complexity is high
- A* decrease-key is sidestepped by lazy deletion — check staleness on pop, allow duplicates on push
- .NET PriorityQueue<TElement, TPriority> is a 4-ary heap — not many reasons to roll your own
What Layer of Insight Is This
The proposition from Part 1 — “cache can matter more than complexity” — resurfaces here. Fibonacci heap’s theoretical $O(1)$ decrease-key loses to binary heap in practice because we underestimated the constants between theory and reality.
And heaps are an example of “a weaker structure than BST being stronger for specific problems.” You pay only for partial order, and in return you get free array mapping. In data structure design, “how little can we promise” actually expands your design freedom — heap over BST, top-k over full sort. Many CS ideas flow in this direction.
With this bonus, the remaining debts around Phase 1 (Data Structures and Memory) are cleared. From here, we return to the original plan: Phase 2 — OS and Concurrency. Processes, threads, locks, and the many disasters that appear on multi-core systems.
References
Primary sources and key papers
- Williams, J.W.J., “Algorithm 232: Heapsort”, Communications of the ACM 7(6), pp. 347–348 (1964) — the original binary heap paper
- Floyd, R.W., “Algorithm 245: Treesort 3”, Communications of the ACM 7(12), p. 701 (1964) — $O(n)$ build-heap
- Johnson, D.B., “Priority queues with update and finding minimum spanning trees”, Information Processing Letters 4(3), pp. 53–57 (1975) — d-ary heap
- Fredman, M.L., Sedgewick, R., Sleator, D.D., Tarjan, R.E., “The pairing heap: A new form of self-adjusting heap”, Algorithmica 1(1), pp. 111–129 (1986)
- Fredman, M.L. & Tarjan, R.E., “Fibonacci heaps and their uses in improved network optimization algorithms”, Journal of the ACM 34(3), pp. 596–615 (1987)
- LaMarca, A. & Ladner, R.E., “The influence of caches on the performance of heaps”, ACM Journal of Experimental Algorithmics 1, Article 4 (1996) — empirical evidence that 4-ary beats binary
- Dutton, R.D., “Weak-heap sort”, BIT Numerical Mathematics 33(3), pp. 372–381 (1993)
- Brodal, G.S., “Worst-case efficient priority queues”, Proceedings of SODA (1996)
- Chen, M., Chowdhury, R.A., Ramachandran, V., Roche, D.L., Tong, L., “Priority Queues and Dijkstra’s Algorithm”, UT Austin Technical Report TR-07-54 (2007) — empirical Fibonacci vs binary comparison
Textbooks
- Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C., Introduction to Algorithms (CLRS), 4th Edition, MIT Press — Chapter 6 (Heapsort), Chapter 19 (Fibonacci Heaps)
- Sedgewick, R. & Wayne, K., Algorithms, 4th Edition, Addison-Wesley — Chapter 2.4 Priority Queues
- Knuth, D., The Art of Computer Programming Vol. 3: Sorting and Searching, 2nd Edition, Addison-Wesley — Section 5.2.3 Sorting by Selection (heapsort)
- Weiss, M.A., Data Structures and Algorithm Analysis in C++, 4th Edition, Pearson — Chapter 6 Priority Queues
Implementation references
- .NET PriorityQueue<TElement, TPriority> — dotnet/runtime source: official 4-ary min-heap implementation
- C++ std::priority_queue — libstdc++, libc++: binary heap based
- Java java.util.PriorityQueue — OpenJDK: binary heap based
- Python heapq — CPython source: binary min-heap, priority encoded via tuples
- Boost Heap — boost.org/libs/heap: offers binary, d-ary, pairing, Fibonacci, skew heap
Game development context
- Millington, I. & Funge, J., Artificial Intelligence for Games, 3rd Edition, CRC Press — A*, open set management
- Recast & Detour — github.com/recastnavigation: heap usage in a production NavMesh library
- A* Pathfinding Project (Unity) — arongranberg.com/astar: lazy-deletion-based open set
