Jakub Arnold's Blog


Binary Heap (Priority Queue) in JavaScript

A binary heap is a simple data structure most often used for implementing priority queues. In a more general sense, a heap is a tree-based data structure which satisfies the heap property, and a binary heap is a heap which uses a binary tree to store its data. Any arbitrary binary tree which satisfies the following two properties can be considered a binary heap:

  1. heap property: If P is a parent node of N, then P.key <= N.key, meaning the parent always has a lower key than its children. This gives us a MIN heap, where the root node has the minimum key of the whole heap. If we want a MAX heap, we simply flip the property to P.key >= N.key. Everything that is true for a MIN heap is true for a MAX heap, so from now on, we’ll consider a MIN heap only.
  2. All levels of the tree are completely filled, except for the last one, which is filled from the left.

Here’s an example of a valid heap. Note that the last level is missing one element on the right. It’s also important to distinguish that a binary heap is not a search tree. While a binary search tree satisfies the property that the left child has a lower value than the parent and right child has a higher value than the parent, a binary heap has no such property. This also means that a binary heap can not be searched.

Here is another example, but this time of a tree which doesn’t follow the heap property, and as such is not a binary heap.

And here’s an example of a tree which doesn’t satisfy the same property, as the last layer is not filled from the left.

The first property (the heap property) gives us the ability to access the MIN element in constant time, because it must be in the root of the tree. If the MIN was somewhere down the tree, its parent would need to have a larger key value, which would break the heap property. If it had a smaller value, the MIN element wouldn’t be a true minimum, which is a contradiction with our choice of MIN as the minimum element of the whole heap.

The second property isn’t so obvious, but it allows us to store the tree not as a network (graph) of nodes with edges, but as an array of numbers, where the edges can be calculated implicitly. The second property allows us to think about the binary tree as if it was a complete binary tree. There can’t be any holes (missing elements) in the middle of the tree, only at the very right edge in the last layer. To figure out how to store the tree in an array we can first ignore the missing elements in the last layer and think about the tree as if it was complete. Here’s how such tree might look:

If we write it out layer by layer in an array, we simply get:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

Since JavaScript arrays are 0 indexed, we can do a little trick and add a blank element to the beginning of the array, which will make every number be equal to its index. I’m using null to make it clear that the first element is not really part of the data structure and only acts to fill in space. In reality, we could use something like Uint32Array and leave the first index set to 0.

[null, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];

If you look closely at the binary tree, you can see that the left child of each node has double the value of its parent, and the right child has double the value but plus one. Looking at 7 for example, the left child is 2*7 = 14 and the right child is 2*7 + 1 = 15. This property is only true when we have a complete binary tree, which is exactly why the binary heap requires all but the last layers to be full.

Now looking back at the array, we can also see something interesting. Because we know that the root is at index 1, its left child must be at index 2*1 = 2 and right child at index 2*1 + 1 = 3. This will also be true for any other element, seeing that the array copies the structure of the tree. If the left child has double the value in the tree, and the array has values mapped to their index, then the left child in the array (having double the value) will be at double the index.

We can also get rid of the initial blank element in the array by simply shifting all results off by 1 to the right, giving us 2*i + 1 and 2*i + 2. Let’s try this:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

The left child of 1 is 2, which we can get to by calculating 2*i + 1, where i = 0 (because 1 is at 0th index). We would get 2*0 + 1 = 1, which is the index of its left child, the number 2. Going further down, getting the right child of 2 (which is a 5) we take the index of 2 (which is 1) and plug it in the 2*i + 2 formula, giving us 2*1 + 2 = 4, which is the index of the value 5. Let’s try getting the right child of 5. We take the index, which is 4 and do 2*4 + 2 = 10, giving us the index of the right child, the number 11.

It is important to note that all the calculations are done on indexes, not the actual values. We only used numbers from 1 to 15 to make it easy to see the pattern for calculating left/right children. We could just as well do the same on a completely different binary heap, and it would work, because there isn’t any point in the computation where the value is being used, only the index.

We can now also see, that this way of calculating children will work even if we leave out a few nodes in the last layer, considering it is still filled from the left. Going the other way, it’s not hard to see that if we left out one node in the middle of the tree, suddenly all of this would stop working and we would no longer have a simple formula for calculating children.

Lastly, we need a way to navigate back up the heap, going from children to the parent. This is easy to figure out, because if left = 2*i + 1, then i = (left - 1) / 2 when going from the left child, and if right = 2*i + 2, then i = (right - 2) / 2. This gives us

$$\text{parent} = \frac{\text{left} - 1}{2} \text{or} \frac{\text{right} - 2}{2}$$

Let’s do a little renaming first, since we’re going from an index of the child, let’s call that i, giving us a more general equation

$$\text{parent} = \frac{i - 1}{2} \text{or} \frac{i - 2}{2}$$

Now considering the expression \(\lfloor \frac{i - 1}{2} \rfloor\), equivalent to Math.floor((i - 1) / 2). If we had our original 2*i and 2*i + 1, then Math.floor(i / 2) would definitely work, because we’re basically getting rid of the +1 in the second case. In the more complicated 2*i + 1 and 2*i + 2 we can think of the i - 1 as moving to the simpler case, and then using the same floor function.

Let’s create an Array for the following heap:

We simply write down the numbers going layer by layer, left to right, getting:

// indexes: 0  1  2  3  4  5 
var heap = [1, 3, 5, 4, 9, 6];

function left(i) { return 2*i + 1; }
function right(i) { return 2*i + 2; }
function parent(i) { return Math.floor((i - 1) / 2); }

console.log(heap[left(0)]);
// 3
console.log(heap[right(0)]);
// 5

// We can even use .indexOf() to make this a bit clearer

console.log(heap[left(heap.indexOf(3))]);
// 4
console.log(heap[right(heap.indexOf(3))]);
// 9
console.log(heap[left(heap.indexOf(5))]);
// 6

After this, it should be crystal clear that we can think in terms of a tree, but do the actual operations on an Array representing the same thing.

Heap operations

Now that we understand how to store the heap, we can take a look at the operations the heap supports, their time complexity, and how to implement them. First, here’s an overview of the supported operations:

Earlier in this article, we implemented the heap as an Array of Numbers. But in real life, we will almost always want to store actual objects in the heap, and order them by a given key. This can be done in multiple ways. We can either store an Array of objects that look something like { key: X, value: the_actual_object }, or we can use a mapping function that the heap uses to map each element to its key. For example function(user) { return user.id; } could be used to map users to their respective user.id. I’m going to assume the first example, since it makes the implementation a bit shorter, but both ways should be equally easy to implement.

Min()

Returning the MIN element is an easy operation thanks to the heap property. We already know it is present in the root, which means we simply return the element at index 0 in our array. The time complexity of this operations is \(O(1)\).

function min(heap) {
    // Note that we're expecting the `{ key: X, value: the_actual_object }` format,
    // which is why we're returning `.value` here.
    return heap[0].value;
}

Insert(key, value)

Adding an element is rather simple. First we add it to the end of our Array, which is semantically equivalent to adding a new leaf to the last layer of the tree (as far left as possible). This however breaks the heap property, saying that the key of each node must be lower than the keys of its children. We can easily fix this by bubbling the newly added node up, checking if its key is lower than its parent, and swapping them if it is.

function insert(heap, key, value) {
    var node = { key: key, value: value };

    heap.push(node);
    var index = heap.length - 1;

    while (index > 0) {
        var parentIndex = parent(index); // here comes the `Math.floor((i - 1) / 2)`

        if (heap[index].key < heap[parentIndex].key) {
            var tmp = heap[index];
            heap[index] = heap[parentIndex];
            heap[parentIndex] = tmp;

            index = parentIndex;
        } else {
            // We can stop up-propagating since the rest of the tree already
            // obeys the heap property and the upper nodes would have even lower keys
            // than our direct ancestor.
            break;
        }
    }
}

var heap = [];

insert(heap, 2, "b");
insert(heap, 3, "c");
insert(heap, 1, "a");

console.log(heap);
// 0: {key: 1, value: "a"}
// 1: {key: 3, value: "c"}
// 2: {key: 2, value: "b"}

We can see that 1 is in the root as expected, 3 is as the left child, and 2 is as the right child. This is because after the first two inserts, right when we added 1 to the leaf the heap looked like this (the nodes are labeled as key[value] for simplicity):

But the insert call runs the up-propagation, it swaps 2 and 1, creating the final shape of the tree.

Adding 0 to the heap would cause yet another round of up-propagation.

var heap = [];

insert(heap, 2, "b");
insert(heap, 3, "c");
insert(heap, 1, "a");
insert(heap, 0, "~");

console.log(heap);
// 0: {key: 0, value: "~"}
// 1: {key: 1, value: "a"}
// 2: {key: 2, value: "b"}
// 3: {key: 3, value: "c"}

Initially before the up-propagation the heap would look like this, breaking the heap property:

but after up-propagating the 0[~] element up we get a proper MIN-heap:

Because of the second property of a binary heap, we can think of a complete binary tree of the same depth as an upper bound on the shape of the heap. From this, we can easily derive that the depth of a complete binary tree is \(\log n\). The Insert operation only traverses the tree once, going from a leaf to the root, which is \(\log n\) layers. The whole Insert operation is thus also \(O(\log n)\).

ExtractMin()

Removing the MIN element is easier than it might seem at first. We swap the root node with the last element on the last level, remove the last element from the Array altogether (this removes the MIN element from the heap), and then propagate the new root down to fix the heap property. We have to be a bit careful here though. While propagating up only required to compare with the parent, when propagating down we have to check against both the children, and swap with the smaller one. Why? A simple example will explain:

Let’s say a > b and a > c, which means a needs to be propagated down. If we picked one of the children at random (or always the left one for example), we could break the heap property. Here’s how the tree would look after swapping a with b.

We might have fixed the relationship between a and b, but if we also had b > c originally, we would still have a tree that does not satisfy the heap property. We can fix this easily by looking at both the children, comparing them against each other, and swapping with the smaller one. That way, we would’ve swapped a with c, making c the new MIN root. This would be fine, because c < b.

To clarify this better, let’s look at an example of doing the whole ExtractMin() operation on a small heap.

First ExtractMin() swaps the root with the last element in the last layer.

Then it removes the 1 from the heap altogether, but still having the heap property broken.

Swapping down with the smaller of the two children at the second layer. Note that this is legal, because 2 < 3.

And continuing further down, until there are no more swaps needed, or until it reachest the last layer.

Knowing how the operation works, we can implement it in a fairly straightforward way. This example also includes all of the necessary above code to make it easier to test out in the console and understand the heap as a whole.

function left(i) { return 2*i + 1; }
function right(i) { return 2*i + 2; }
function parent(i) { return Math.floor((i - 1) / 2); }

function min(heap) {
    // Note that we're expecting the `{ key: X, value: the_actual_object }` format,
    // which is why we're returning `.value` here.
    return heap[0].value;
}

function insert(heap, key, value) {
    var node = { key: key, value: value };

    heap.push(node);
    var index = heap.length - 1;

    while (index > 0) {
        var parentIndex = parent(index); // here comes the `Math.floor((i - 1) / 2)`

        if (heap[index].key < heap[parentIndex].key) {
            var tmp = heap[index];
            heap[index] = heap[parentIndex];
            heap[parentIndex] = tmp;

            index = parentIndex;
        } else {
            // We can stop up-propagating since the rest of the tree already
            // obeys the heap property and the upper nodes would have even lower keys
            // than our direct ancestor.
            break;
        }
    }
}

function extractMin(heap) {
    var result = min(heap);

    // If there is only one element in the heap, that being the minimum, we can just clear it.
    if (heap.length == 1) {
        heap.splice(0, 1);
        return result;
    }
    
    // We copy the last element to the root and remove the last element.
    // There is no need to do an actual swap as we showed in the examples above,
    // since the last element is going to get removed immediately afterwards.
    heap[0] = heap[heap.length - 1];
    heap.splice(heap.length - 1, 1);

    bubbleDown(heap, 0);

    return result;
}

function bubbleDown(heap, index) {
    var leftIndex = left(index);
    var rightIndex = right(index);

    var smallest = index;

    if (leftIndex < heap.length && heap[leftIndex].key < heap[smallest].key) {
        smallest = leftIndex;
    }
    if (rightIndex < heap.length && heap[rightIndex].key < heap[smallest].key) {
        smallest = rightIndex;
    }

    if (index != smallest) {
        var tmp = heap[index];
        heap[index] = heap[smallest];
        heap[smallest] = tmp;

        bubbleDown(heap, smallest);
    }
}

var heap = [];

insert(heap, 2, "b");
insert(heap, 3, "c");
insert(heap, 1, "a");
insert(heap, 0, "~");

console.log(extractMin(heap));
// "~"
console.log(heap);
// 0: {key: 1, value: "a"}
// 1: {key: 3, value: "c"}
// 2: {key: 2, value: "b"}

console.log(extractMin(heap));
// "a"
console.log(heap);
// 0: {key: 2, value: "b"}
// 1: {key: 3, value: "c"}

console.log(extractMin(heap));
// "b"
console.log(heap);
// 0: {key: 3, value: "c"}

console.log(extractMin(heap));
// "c"
console.log(heap);
// []

Same as with the Insert case, ExtractMin also traverses the tree at most once, going from the root to a leaf, which means the time complexity is also \(O(\log n)\).

Conclusion

In the beginning we mentioned that a binary heap is also often used as a priority queue. A good example here might be a task scheduler which always takes the task with the highest priority and runs it. Adding new tasks to the priority queue and extracting the highest priority one would both have \(O(\log n)\) time complexity, which makes it very fast even as the number of tasks grows larger. Another thing to note, the Array based implementation has all the benefits of a tree data structure without having a bunch of objects floating around on the memory heap.

Lastly, the binary heap is not the only heap there is, even though it’s probably the most common one. Two other examples are Binomial and Fibonacci heaps, which are very different from the binary heap, and provide some very interesting time complexities on their operations. For example, inserting an element into a Fibonacci heap is \(O(1)\). There are also two operations which we didn’t cover in this article, Decrease which changes the key of an element in the heap, and Merge which takes two heaps and merges them together into one. The reason we didn’t cover them is that Decrease requires some additional handling to be useful, and Merge in and of itself isn’t so common. But those two are also an area where a Fibonacci heap provides constant time complexity, while the binomial heap is only \(O(\log n)\) for Decrease and \(O(n)\) for Merge.

References

Related
Algorithms