Priority Queues

CS240E: Data Structures and Data Management (Enriched), 2019 Winter, David Duan

Contents:
- Abstract Data Types: ADT; Stack ADT; Queue ADT; Priority Queue; Using a Priority Queue to Sort
- Realizations of Priority Queues: Unsorted Arrays / Unsorted Linked Lists; Sorted Arrays / Sorted Linked Lists
- Heaps: Representation; Operations on Heaps (Insertion, Delete Max); Priority Queue Realization Using Heaps
- Sorting Using Heaps: HeapSort; Building Heaps by Bubble-up; Building Heaps by Bubble-down; HeapSort Implementation
- Other Heap Operations: GetMax; ChangePriority; Delete
- Heap Merge/Join: Worst-Case Heap Joining; Expected Merge (Pseudocode, Graphical Illustration, Expected Runtime Analysis)
- Binomial Heap: Binomial Tree; Binomial Heap; Binomial Heap Operations; Amortized Analysis (via Potential Function); Improvement on Worst-Case Performance; Magic Trick
Abstract Data Type (ADT): A description of information and a collection of operations on that information.
The information is accessed only through the operations.
We can have various realizations of an ADT, which specify how the information is stored (the data structure) and which algorithms implement the operations.
Stack: An ADT consisting of a collection of items with operations:

- push: insert an item
- pop: remove (and typically return) the most recently inserted item
- possibly extra operations: size, isEmpty, top

Items are removed in Last-In First-Out (LIFO) order.
Applications: procedure calls, web browser back button.
Realizations: arrays or linked lists.
Queue: An ADT consisting of a collection of items with operations:

- enqueue: insert an item
- dequeue: remove (and typically return) the least recently inserted item
- possibly extra operations: size, isEmpty, front

Items are removed in First-In First-Out (FIFO) order.
Items enter the queue at the rear and are removed from the front.
Applications: CPU scheduling, disk scheduling.
Realizations: (circular) arrays, linked lists.
PQ (Priority Queue): An ADT consisting of a collection of items (each having a priority or key) with operations:

- insert: insert an item tagged with a priority
- deleteMax (or extractMax, getMax): remove the item of highest priority

The above definition is for a maximum-oriented priority queue. A minimum-oriented priority queue is defined in the natural way, by replacing the operation deleteMax with deleteMin.
Applications: to-do lists, simulation systems, sorting.
```
PQ-Sort(A[0..n-1])

1. initialize PQ to an empty priority queue
2. for k <- 0 to n-1 do
3.     PQ.insert(A[k])
4. for k <- n-1 downto 0 do
5.     A[k] <- PQ.deleteMax()
```
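PQ-Sort can be tried directly with Python's built-in `heapq` module. A small caveat for this sketch: `heapq` is a *min*-oriented PQ, so this version repeatedly extracts the minimum and fills the output from the front, rather than extracting maxima from the back; the operation count (n inserts plus n deletions) is the same.

```python
import heapq

def pq_sort(A):
    """Sort A ascending: insert everything into a PQ, then extract repeatedly."""
    pq = []
    for x in A:                   # n insert operations
        heapq.heappush(pq, x)
    # n deleteMin operations; a max-oriented PQ would fill A from the back
    return [heapq.heappop(pq) for _ in range(len(pq))]
```

The total cost is n times the insert cost plus n times the delete cost, which is exactly the PQ-Sort bound discussed in these notes.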
Unsorted arrays / unsorted linked lists: O(1) insert and O(n) deleteMax.

Sorted arrays / sorted linked lists: O(n) insert and O(1) deleteMax.

A heap is a (binary) tree-based data structure satisfying two properties:

- Structural property: all levels of the tree are completely filled, except possibly the last one, where the nodes are left-justified. (A binary tree with this property is called complete.)
- Heap-order property: for any node v, the key of the parent of v is larger than or equal to the key of v. (This describes a max-heap; a min-heap reverses the inequality.)
Remarks
We usually do not store heaps as linked binary trees but as arrays (in level order).
We should hide implementation details using helper functions such as root(), parent(i), last(n), hasLeftChild(i), etc.
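With the array-based representation (root at index 0), these helpers reduce to simple index arithmetic. A minimal sketch; `has_left_child` takes the heap size `n` as an extra argument since plain Python functions have no access to it otherwise:

```python
def root():
    """Index of the root."""
    return 0

def parent(i):
    """Index of the parent of node i (only meaningful for i > 0)."""
    return (i - 1) // 2

def left_child(i):
    """Index of the left child of node i."""
    return 2 * i + 1

def right_child(i):
    """Index of the right child of node i."""
    return 2 * i + 2

def last(n):
    """Index of the last node of an n-item heap."""
    return n - 1

def has_left_child(i, n):
    """True iff node i has a left child in an n-item heap."""
    return left_child(i) < n
```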
Place the new key at the first free leaf, then perform a fix-up:

```
fix-up(A, k)
k: an index corresponding to a node of the heap

1. while parent(k) exists and A[parent(k)] < A[k] do
2.     swap A[k] and A[parent(k)]
3.     k <- parent(k)
```

The new item bubbles up until it reaches its correct place in the heap. Insertion has runtime O(log n).
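A direct Python sketch of fix-up on the array representation (the parent-index formula `(k - 1) // 2` assumes the root is stored at index 0):

```python
def fix_up(A, k):
    """Bubble A[k] up until the max-heap order is restored on the path to the root."""
    while k > 0 and A[(k - 1) // 2] < A[k]:        # parent exists and is smaller
        A[(k - 1) // 2], A[k] = A[k], A[(k - 1) // 2]  # swap with parent
        k = (k - 1) // 2                            # continue from the parent
```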
We replace the root, which is the max item of the heap, by the last leaf and remove the last leaf, then perform a fix-down:

```
fix-down(A, n, k)
A: an array that stores a heap of size n
k: an index corresponding to a node of the heap

1. while k is not a leaf do
2.     // Find the child with the larger key
3.     j <- left child of k
4.     if (j is not last(n) and A[j+1] > A[j])
5.         j <- j + 1
6.     if A[k] >= A[j] break
7.     swap A[j] and A[k]
8.     k <- j
```

The new root bubbles down until it reaches its correct place in the heap. DeleteMax has runtime O(log n).
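The matching Python sketch of fix-down (again assuming the root at index 0, children at `2k+1` and `2k+2`):

```python
def fix_down(A, n, k):
    """Bubble A[k] down inside the size-n heap A[0..n-1]."""
    while 2 * k + 1 < n:                  # while k has at least a left child
        j = 2 * k + 1                     # left child
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1                        # right child exists and is larger
        if A[k] >= A[j]:
            break                         # heap order already holds here
        A[j], A[k] = A[k], A[j]
        k = j                             # continue from the larger child
```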
We can store the items of a PQ in an array and keep track of its size:

```
insert(x)

1. increase size
2. l <- last(size)
3. A[l] <- x
4. fix-up(A, l)
```
```python
def last(n):
    """Return the index of the last item of an n-item heap."""
    return n - 1

def insert(x):
    """Insert item x into the heap.

    Assumes a global array A with spare capacity and a global counter size.
    """
    global size
    size += 1             # increase size
    idx = last(size)      # index of the first free leaf, which is size - 1
    A[idx] = x            # place x at this position
    fix_up(A, idx)        # bubble it up to its correct position
```
```
deleteMax()

1. l <- last(size)
2. swap A[root()] and A[l]
3. decrease size
4. fix-down(A, size, root())
5. return A[l]
```
```python
def root():
    """Return the index of the root of a heap."""
    return 0

def deleteMax():
    """Remove and return the max of the heap (same globals as insert)."""
    global size
    idx = last(size)                        # position of the last leaf
    A[root()], A[idx] = A[idx], A[root()]   # swap last leaf with root
    size -= 1                               # shrink the heap
    fix_down(A, size, root())               # bubble the new root down
    return A[idx]                           # the old root, i.e. the max
```
Recall that any PQ can be used to sort in time O(n · insert + n · deleteMax). Using the binary-heap implementation of PQs, we obtain:
```
PQ-SortWithHeaps(A)

1. initialize H to an empty heap
2. for k <- 0 to n-1 do
3.     H.insert(A[k])
4. for k <- n-1 downto 0 do
5.     A[k] <- H.deleteMax()
```
Since both operations run in O(log n) time for heaps, PQ-Sort using heaps takes O(n log n) time.
We can improve this with two simple tricks (build the heap in O(n) time, and sort in place inside the input array), which yield HeapSort.

Problem: Given n items, build a heap containing all of them.

Solution 1: Start with an empty heap and insert the items one at a time. This corresponds to doing n fix-ups, so the worst case is O(n log n).
Problem: Given n items, build a heap containing all of them.

Solution 2: Use fix-downs instead:
```
heapify(A)
A: an array

1. n <- A.size()
2. for i <- parent(last(n)) downto 0 do
3.     fix-down(A, n, i)
```
A careful analysis yields a worst-case complexity of Θ(n), which means a heap can be built in linear time.
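The heapify pseudocode translates almost line by line; a self-contained sketch (fix_down repeated here so the snippet runs on its own):

```python
def fix_down(A, n, k):
    """Bubble A[k] down inside the size-n heap A[0..n-1]."""
    while 2 * k + 1 < n:
        j = 2 * k + 1
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1
        if A[k] >= A[j]:
            break
        A[j], A[k] = A[k], A[j]
        k = j

def heapify(A):
    """Build a max-heap in place by fixing down every non-leaf, bottom-up."""
    n = len(A)
    for i in range((n - 2) // 2, -1, -1):   # parent(last(n)) downto 0
        fix_down(A, n, i)
```

Note that leaves need no fix-down, which is why the loop starts at `parent(last(n))`; this is also where the linear-time bound comes from, since most nodes sit near the bottom and have small fix-down cost.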
Idea: PQ-Sort with heaps.
Improvement: use the same input array A for storing the heap.
```
HeapSort(A, n)

1. // heapify
2. n <- A.size()
3. for i <- parent(last(n)) downto 0 do
4.     fix-down(A, n, i)
5. // repeatedly find maximum
6. while n > 1
7.     // do deleteMax
8.     swap items at A[root()] and A[last(n)]
9.     decrease n
10.    fix-down(A, n, root())
```
The for-loop takes O(n) time and the while-loop takes O(n log n) time.
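The whole algorithm can be sketched in Python as one in-place function; the nested `fix_down` mirrors the earlier pseudocode:

```python
def heapsort(A):
    """In-place HeapSort: heapify, then repeatedly move the max to the end."""
    def fix_down(n, k):
        while 2 * k + 1 < n:
            j = 2 * k + 1
            if j + 1 < n and A[j + 1] > A[j]:
                j += 1
            if A[k] >= A[j]:
                break
            A[j], A[k] = A[k], A[j]
            k = j

    n = len(A)
    for i in range((n - 2) // 2, -1, -1):   # heapify in O(n)
        fix_down(n, i)
    while n > 1:                            # repeated deleteMax
        A[0], A[n - 1] = A[n - 1], A[0]     # move current max just past the heap
        n -= 1
        fix_down(n, 0)
```

Because each extracted maximum is parked in the slot the heap just vacated, no extra array is needed.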
```
changePriority(item, newKey)
- Change the priority of <item> to <newKey>. O(log n) time.
- Helper fix-up(item): while the item's priority is greater than its parent's, swap them.
  In the worst case we need log(n) comparisons, so the runtime is O(log n).
- Helper fix-down(item): while the item's priority is less than one of its children's,
  swap with the larger child. For the same reason, the runtime is O(log n).

1. key(item) <- newKey // Set item's key to newKey
2. fix-up(item)        // Bubble the item up/down until it is at the
3. fix-down(item)      // ... correct place (at most one of the two moves it)
```
```
delete(item)
- Delete the item from an addressable max-heap. O(log n) time.
- As a remark, `infinity` here represents an arbitrarily large value.
- Helper: changePriority - see above, O(log n)
- Helper: deleteMax - see above, O(log n)

1. changePriority(item, infinity) // Set item's key to a huge value.
                                  // ... changePriority bubbles the item up to
                                  // ... its correct place, which, given that its
                                  // ... new priority is infinity, must be the
                                  // ... root of the heap.
2. deleteMax()                    // Thus, deleteMax() removes it.
```
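Assuming an item can be addressed by its array index (a real addressable heap keeps a map from items to their current indices), the two operations above can be sketched in Python, with `math.inf` playing the role of `infinity`:

```python
import math

def fix_up(A, k):
    while k > 0 and A[(k - 1) // 2] < A[k]:
        A[(k - 1) // 2], A[k] = A[k], A[(k - 1) // 2]
        k = (k - 1) // 2

def fix_down(A, n, k):
    while 2 * k + 1 < n:
        j = 2 * k + 1
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1
        if A[k] >= A[j]:
            break
        A[j], A[k] = A[k], A[j]
        k = j

def change_priority(A, i, new_key):
    """Overwrite A[i]; at most one of fix_up / fix_down then does real work."""
    A[i] = new_key
    fix_up(A, i)
    fix_down(A, len(A), i)

def delete(A, i):
    """Delete A[i]: raise it to +infinity so it bubbles to the root, then deleteMax."""
    change_priority(A, i, math.inf)   # item is now at the root
    A[0], A[-1] = A[-1], A[0]         # swap root with last leaf
    A.pop()                           # remove the infinite key
    if A:
        fix_down(A, len(A), 0)        # restore heap order at the root
```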
Problem join(H1, H2): create a heap that stores the items of both $H_1$ (with $n_1$ items) and $H_2$ (with $n_2$ items).

The task is easy for sorted arrays/lists: $O(n_1 + n_2)$ (Khan Academy: Linear Time Merging). It is also easy for binary heaps: concatenate the arrays and heapify, again in $O(n_1 + n_2)$. The interesting question is whether we can do better than linear time.

Solution. We present three ideas:

- An algorithm for the tree-based implementation with polylogarithmic worst-case time.
- A randomized algorithm (which destroys the structural property of heaps) with expected runtime $O(\log n_1 + \log n_2)$.
- An algorithm using higher-degree heaps (binomial heaps) with amortized runtime $O(\log n)$.
Given $H_1$ with $n_1$ items and $H_2$ with $n_2$ items, we want to create a heap combining both heaps.

Case I. Both heaps are full and they have the same height: create a new root with key $\infty$ whose two subtrees are $H_1$ and $H_2$, then call deleteMax(). Since deleteMax() also takes care of the structural property, we are done. The runtime is dominated by deleteMax(), which takes $O(\log n)$ time.
Case II. Both heaps are full and $H_2$ has a smaller height:

Phase I: Merge.
Let $h_2$ represent $H_2$'s height. Since $h_2 < h_1$, we can find a subheap of $H_1$ with the same height as $H_2$ (the omitted diagram marks several such subheaps). Since the heights match, we use Case I's strategy to merge $H_2$ into such a subheap; this introduces an extra node $x$ (the $\infty$-keyed dummy root from Case I) at the top of the merged subheap.

Phase II: Adjust.
To get rid of node $x$, consider $x$, its parent in $H_1$, and the roots of the merged subheaps. We have the following observations:

- $H_1$ and $H_2$ are valid heaps by construction, hence both satisfy the heap-order property.
- By the heap-order property of $H_1$, every old ancestor of the merge position was already at least as large as the keys below it, so the untouched subheaps (and their roots) will not violate the heap property.
- The only problem is that we do not know the relationship between $x$'s key and the keys of its new ancestors.

To fix the problem:

- For all ancestors $a$ of $x$, call fix-down(a). Each such call performs comparisons along the path, and there are $O(\log n_1)$ ancestors.
- Call deleteMax(): after those fix-downs, $x$ will be at the root position, so calling deleteMax() effectively removes $x$.

Overall, the runtime is dominated by those fix-downs.
Case III. $H_1$ is full but $H_2$ is not. This is handled similarly; we omit the details of this case.

Case IV. Both are not full.
We split $H_2$ into subheaps that are full:
```
              A
            /   \
          B       K
         / \     / \
        C   D   k   K
       / \ / \ / \ / \
       C C E F K K K K
      / \ / \ /
      C C C C G
```

Each repeated letter marks a full subheap: for instance, every subtree labelled K or C (and the subtree rooted at k) is a full heap, and a node such as F by itself is a heap, because we have nothing to group it with. The labels A -> B -> D -> E -> F trace the path along which the heap is decomposed. For each full subheap, merge it into $H_1$ using Case III. A heap decomposes into $O(\log n)$ full subheaps in this way, so we perform $O(\log n)$ such merges.

Conclusion. Overall, the runtime is dominated by Case IV.
```
heapMerge(r1, r2)
r1, r2: roots of two heaps, possibly NIL
return: root of merged heap

1. if r1 is NIL return r2
2. if r2 is NIL return r1
3. if key(r2) > key(r1)
4.     randomly pick one child d of r2
5.     replace subheap at d by heapMerge(r1, d)
6.     return r2
7. else
8.     randomly pick one child c of r1
9.     replace subheap at c by heapMerge(c, r2)
10.    return r1
```
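A runnable sketch of the randomized meld, assuming a plain binary `Node` class (our name) with `left`/`right` pointers. Instead of the two mirrored branches of the pseudocode, it swaps the arguments so that `r1` always holds the larger root; the behavior is the same:

```python
import random

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def heap_merge(r1, r2):
    """Randomized meld: the larger root stays on top; recurse into one of
    its two children chosen uniformly at random."""
    if r1 is None:
        return r2
    if r2 is None:
        return r1
    if r2.key > r1.key:
        r1, r2 = r2, r1              # make r1 the larger root
    if random.random() < 0.5:        # uniformly random child of the winner
        r1.left = heap_merge(r1.left, r2)
    else:
        r1.right = heap_merge(r1.right, r2)
    return r1
```

Whatever random choices are made, the result contains all nodes of both inputs and satisfies the heap-order property; only the shape (the structural property) is sacrificed.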
Suppose we want to merge two heaps (the diagrams for this walkthrough are omitted).

- The root of the left heap is greater than the root of the right heap, so the left root stays on top (see line 7).
- We randomly pick one child of the left root and recurse on it (see line 8).
- Now the right root is the larger of the two (see line 3).
- We randomly pick one child of the right root and recurse (see line 4).
- We keep doing this until one of the two arguments is NIL, at which point we simply attach the other subheap (lines 1-2).
- Returning up the call stack, each level re-attaches its (possibly modified) randomly chosen child, and after the outermost call returns, we are done!
Recall that the expected runtime of a randomized algorithm can be expressed as

$$T^{\exp}(I) = E[T(I)] = \sum_R \Pr[R] \cdot T(I, R), \qquad T^{\exp}(n) = \max_{|I| = n} T^{\exp}(I).$$

We want to design the randomization such that all instances of size $n$ have equally good performance, because the expected runtime is taken as the maximum over every instance of that size.
Before we analyze the expected runtime of heap merge, let us first consider a simpler problem:

Problem. Given a binary tree, you do a random walk down the tree (at each node, descending to a uniformly random child) until you reach NIL. What is the expected runtime? We make no assumption on the tree, i.e., it may be very unbalanced.

Lemma. The expected runtime (i.e., on a tree of $n$ nodes) is $O(\log n)$.
Proof. Assume that the time to go from a node to a randomly chosen child is at most $c$. We claim that the expected runtime $T(n) \le c \log_2(n+1)$, and prove it by induction on $n$.

Base: $n = 0$. The tree is NIL, the walk takes no time, and $c \log_2(0+1) = 0$.

Step: $n \ge 1$. Let $n_L$, $n_R$ represent the number of nodes in the left and right subtrees, respectively. Then $n_L + n_R = n - 1$. The total expected runtime can be expressed as

$$T(n) \le c + \tfrac{1}{2}\,T(n_L) + \tfrac{1}{2}\,T(n_R) \le c + \tfrac{1}{2}\,c\log_2(n_L+1) + \tfrac{1}{2}\,c\log_2(n_R+1).$$

Recall that a concave function $f$ has the property that $\tfrac{1}{2}f(x) + \tfrac{1}{2}f(y) \le f\!\left(\tfrac{x+y}{2}\right)$. Thus, since $\log_2$ is concave,

$$T(n) \le c + c\log_2\!\left(\frac{(n_L+1) + (n_R+1)}{2}\right) = c + c\log_2\!\left(\frac{n+1}{2}\right) = c\log_2(n+1). \qquad \square$$
Back to the heap merge analysis: we can view heapMerge as going down in both heaps randomly until one of them hits a NIL child. The walk in each heap is exactly the random walk above, so the expected runtime is $O(\log n_1 + \log n_2) = O(\log n)$.
Recall that merge is easy for linked lists ($O(1)$ for concatenation). To achieve our goal of $O(1)$ worst-case merging, we can store a list of trees, each of which satisfies the heap-order property.
A binomial tree is defined recursively as follows:

- A binomial tree of order 0 is a single node.
- A binomial tree of order $k$ has a root whose children are the roots of binomial trees of orders $k-1, k-2, \dots, 1, 0$ (in this order).

Because of its unique structure, a binomial tree of order $k$ can be constructed from two trees of order $k-1$ trivially, by attaching one of them as the left-most child of the root of the other tree. This feature is central to the merge operation of a binomial heap, which is its major advantage over conventional heaps.
Remark. The name comes from the shape: a binomial tree of order $k$ has $\binom{k}{d}$ nodes at depth $d$.

Claim. A binomial tree of order $k$ has $2^k$ nodes and height $k$.

Proof. We prove this by induction. The base case $k = 0$ is trivial. Suppose the claim holds for $k-1$. The binomial tree $B_k$ contains two copies of $B_{k-1}$, so it has $2 \cdot 2^{k-1} = 2^k$ nodes. Because the max depth of a node in $B_k$ is one greater than the max depth in $B_{k-1}$, by the inductive hypothesis the maximum depth is $(k-1)+1 = k$. $\square$
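The recursive construction is easy to make concrete. A sketch (the names `BNode`, `link`, and `binomial_tree` are ours; keys are arbitrary here since we only care about shape):

```python
class BNode:
    """Node of a binomial tree; children are kept in a list, left-most first."""
    def __init__(self, key):
        self.key = key
        self.children = []

def link(t1, t2):
    """Combine two order-(k-1) trees into one of order k: the smaller root
    becomes the left-most child of the larger root."""
    if t1.key < t2.key:
        t1, t2 = t2, t1
    t1.children.insert(0, t2)
    return t1

def binomial_tree(k):
    """Build an order-k binomial tree out of two order-(k-1) trees."""
    if k == 0:
        return BNode(0)
    return link(binomial_tree(k - 1), binomial_tree(k - 1))

def size(t):
    return 1 + sum(size(c) for c in t.children)

def height(t):
    return 1 + max((height(c) for c in t.children), default=-1)
```

Running `size` and `height` on the result checks the claim above: an order-$k$ tree has $2^k$ nodes, height $k$, and a root of degree $k$.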
A binomial heap is implemented as a set (or list, as presented in our class) of binomial trees that satisfy the binomial heap properties:

- Each binomial tree in the heap obeys the (max-)heap-order property.
- There is at most one binomial tree of each order.

The first property ensures that the root of each binomial tree contains the maximum in that tree, so the overall maximum of the heap sits at one of the roots. The second property implies that a binomial heap with $n$ nodes consists of at most $\lfloor \log_2 n \rfloor + 1$ binomial trees. In fact, the orders of these trees are uniquely determined by the number of nodes $n$: the binomial trees correspond to the 1-bits in the binary representation of $n$. For example (an illustrative value), a heap with $13 = (1101)_2$ nodes consists of three binomial trees, of orders 0, 2, and 3.
Let $R$ be the list of trees and $d_{\max}$ be the maximum degree of the root of any tree in $R$.

- Merge: $O(1)$, as we are basically concatenating two lists.
- Insert: $O(1)$ for the same reason; in fact, insertion can be seen as merging with a single-node heap.
- DeleteMax: performed in three steps (locate, remove, clean up).
  - Locate the max: search among all roots; there are $|R|$ of them, so $O(|R|)$ for searching.
  - Remove the max and merge its subtrees into the heap: if the removed root has $d$ children, we need to perform $d$ insertions, hence $O(d_{\max})$ time for merging.
  - Cleanup: merge trees together so that after this step, no two trees in the list have the same root degree.
```
binomialHeapCleanup(R, n)
R: stack of trees (of a binomial heap of size n)

1. C <- array of length (log n + 1), initialized to NIL
2. while R is non-empty:
3.     T <- R.pop()               // extract a tree to work with
4.     d_T <- degree of root of T // how many children the root of T has
5.     if C[d_T] is NIL           // no other tree of the same degree seen yet
6.         C[d_T] <- T            // put a pointer to T in C[d_T]
7.     else                       // we have already seen a tree of this degree
8.         T' <- C[d_T]; C[d_T] <- NIL  // extract the previously seen tree
9.         if key(root(T)) > key(root(T'))
10.            make root(T') a child of root(T)
11.            R.push(T)
12.        else
13.            make root(T) a child of root(T')
14.            R.push(T')
15. for i <- 0 to log n:          // finally, collect the remaining trees
16.     if C[i] != NIL
17.         R.push(C[i])
```
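A Python sketch of the cleanup, assuming trees store their children in a list (so the root degree is just `len(children)`); `R` is a Python list used as a stack and `n` is the total number of items. The `Tree` class name is ours:

```python
class Tree:
    def __init__(self, key, children=None):
        self.key = key
        self.children = children or []

def cleanup(R, n):
    """Link equal-degree trees until every root degree in R is distinct."""
    C = [None] * (n.bit_length() + 1)   # one slot per possible root degree
    while R:
        t = R.pop()
        d = len(t.children)             # degree of t's root
        if C[d] is None:
            C[d] = t                    # first tree of this degree
        else:
            other, C[d] = C[d], None    # second one: link the two trees
            if t.key < other.key:
                t, other = other, t     # larger root wins
            t.children.append(other)    # smaller root becomes a child
            R.append(t)                 # re-examine the combined tree
    for t in C:                         # collect the surviving trees
        if t is not None:
            R.append(t)
```

After the call, `R` holds at most one tree per degree, so its length is logarithmic in `n` (for trees built by these operations).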
Lemma. If all trees were built with these operations, then $d_{\max} \in O(\log n)$.

Proof. A root gains a child only when two trees with equal root degrees are linked, so a tree whose root has degree $d$ contains exactly $2^d$ nodes. Hence $2^{d_{\max}} \le n$, which implies $d_{\max} \le \log_2 n$. $\square$

Define the potential function $\Phi_t = c \cdot (\text{number of trees in the list at time } t)$, where $c$ is a constant bigger than all constants in all other terms. Recall that the amortized runtime of step $t$ is defined to be $T_t^{\text{amort}} = T_t^{\text{actual}} + \Phi_t - \Phi_{t-1}$.
To show merge(R1, R2) has amortized time $O(1)$: the actual work is $O(1)$ (concatenating two lists), and the number of trees in the result equals the total number of trees before, so $\Phi$ is unchanged; hence the amortized time is $O(1)$. The same argument gives insert amortized $O(1)$: actual time $O(1)$ and $\Delta\Phi = c$.

Time to analyze deleteMax():

- Actual time: $O(|R| + d_{\max})$ to locate the max and re-insert its subtrees, plus the cleanup, which performs one link per pair of equal-degree trees.
- Potential difference: $c(|R'| - |R|)$, the number of trees after minus the number of trees before. Each link performed by the cleanup decreases the number of trees by one, and afterwards $|R'| \le \log_2 n + 1$.

Putting them together: the $O(|R|)$ portion of the actual cost is paid for by the drop in potential, so the amortized cost of deleteMax() is $O(d_{\max} + \log n) = O(\log n)$, where the $\log n$ bounds the maximum degree of any tree that can occur in a binomial heap with $n$ items.
So far we have achieved $O(\log n)$ amortized runtime for deleteMax(). If we are willing to trade some of the $O(1)$ runtime of merge/insert for deleteMax(), we can achieve $O(\log n)$ worst-case runtime for all three operations.

By calling clean-up after every operation (insert/merge), we can keep $|R| \in O(\log n)$ at all times. Recall that clean-up then takes $O(\log n)$ time. Enforcing this invariant makes insert/merge/deleteMax all achieve $O(\log n)$ worst-case runtime.
We can use a binary tree to store a tree $T$ with arbitrary degrees!
The new binary tree $T'$ has the same nodes as $T$. For a node $v$ in $T$, its left child in $T'$ is $v$'s left-most child in $T$, and its right child in $T'$ is $v$'s next sibling in $T$.
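This is the left-child right-sibling encoding. A sketch of the conversion (class names `MultiNode` and `BinNode` are ours):

```python
class MultiNode:
    """Node of an arbitrary-degree tree; children listed left to right."""
    def __init__(self, key, children=None):
        self.key = key
        self.children = children or []

class BinNode:
    """Node of the encoding binary tree."""
    def __init__(self, key):
        self.key = key
        self.left = None    # left-most child in the original tree
        self.right = None   # next sibling in the original tree

def to_binary(t):
    """Left-child right-sibling encoding of an arbitrary-degree tree."""
    if t is None:
        return None
    b = BinNode(t.key)
    prev = None
    for child in reversed(t.children):  # chain siblings right to left
        cb = to_binary(child)
        cb.right = prev                 # cb's right pointer = next sibling
        prev = cb
    b.left = prev                       # left pointer = left-most child
    return b
```

Every node still has at most two pointers, yet a root of any degree (such as a binomial tree root) is represented faithfully: its children form a chain hanging off the left pointer.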