6.2 Heaps Sorting
Heaps are based on the notion of a
complete tree,
for which we gave an informal definition earlier.
Formally:
A binary tree is
completely full
if it is of
height,
h, and has 2
h+1-1 nodes.
A binary tree of height,
h, is
complete iff
- it is empty or
- its left subtree is complete of height h-1 and its
right subtree is completely full of height h-2 or
- its left subtree is completely full of height h-1 and its
right subtree is complete of height h-1.
A complete tree is filled from the left:
- all the leaves are on
- the same level or
- two adjacent ones and
- all nodes at the lowest level are as far to the left
as possible.
Heaps
A binary tree has the
heap property iff
- it is empty or
- the key in the root is larger than that in either
child and both subtrees have the heap property.
A heap can be used as a priority
queue: the highest priority item is at the root and is trivially
extracted.
But if the root is deleted, we are left with two sub-trees
and we must
efficiently
re-create a single tree with the heap property.
The value of the heap structure is that we can both extract
the highest priority item and insert a new one in O(logn)
time.
How do we do this?
Let's start with this heap.
A deletion will remove the T
at the root.
|
|
To work out how we're going to maintain the heap property,
use the fact that a complete tree is filled from the left.
So that the position which must become empty is the one
occupied by the M.
Put it in the vacant root position.
|
|
This has violated the condition that the
root must be greater than each of its
children.
So interchange the M with the larger
of its children.
|
|
The left subtree has now lost the
heap property.
So again interchange the M with the larger
of its children.
|
|
This tree is now a heap again, so we're finished.
We need to make at most
h interchanges of a
root of a subtree with one of its children to
fully restore the heap property.
Thus deletion from a heap is
O(h)
or
O(logn).
Addition to a heap
To add an item to a heap, we follow the reverse
procedure.
Place it in the next leaf position and
move it up.
Again, we require O(h)
or O(logn)
exchanges.
|
|
Storage of complete trees
The properties of a complete tree lead to a very efficient
storage mechanism using
n sequential locations in an
array.
|
If we number the nodes from 1 at the root and
place:
- the left child of node k at position 2k
- the right child of node k at position 2k+1
Then the 'fill from the left' nature of the complete
tree ensures that the heap can be stored in
consecutive locations in an array.
|
Viewed as an array, we can see that the nth
node is always in index position n.
|
|
The code for extracting the highest priority item from a
heap is, naturally, recursive. Once we've extracted the
root (highest priority) item and swapped the last item
into its place, we simply call
MoveDown recursively
until we get to the bottom of the tree.
Click
here to load heap_delete.c
Note the macros
LEFT
and
RIGHT which simply encode
the relation between the index of a node and its left
and right children.
Similarly the
EMPTY macro encodes
the rule for determining whether a sub-tree is empty or not.
Inserting into a heap follows a similar strategy,
except that we use a
MoveUp function to
move the newly added item to its correct place.
(For the
MoveUp function,
a further macro which defines the
PARENT of a node would normally be added.)
Heaps provide us with a method of sorting,
known as heapsort.
However, we will examine and analyse the simplest
method of sorting first.
Animation
In the animation, note that both the array representation (used in
the implementation of the algorithm) and the (logical) tree representation
are shown.
This is to demonstrate how the tree is restructured to make a heap again
after every insertion or deletion.
- Complete Tree
- A balanced tree in which the distance from the root to any leaf
is either h or h-1.
- 7.2 Heap Sort
We noted earlier, when discussing
heaps,
that, as well as their use in priority queues,
they provide a means of sorting:
- construct a heap,
- add each item to it (maintaining the heap property!),
- when all items have been added,
remove them one by one (restoring the heap property as
each one is removed).
Addition and deletion are both O(logn)
operations. We need to perform n
additions and deletions, leading to an
O(nlogn) algorithm.
We will look at another efficient sorting
algorithm,
Quicksort,
and then compare it with Heap sort.
Animation
The following animation uses a slight modification of the above approach
to sort directly using a heap.
You will note that it places all the items into the array
first, then takes items at the bottom of the heap and restores
the heap property, rather than restoring the heap property as
each item is entered as the algorithm above suggests.
(This approach is described more fully in Cormen et al.)
Note that the animation shows the data
- stored in an array (as it is in the implementation of the
algorithm) and also
- in the tree form - so that the heap structure can be clearly
seen.
Both representations are, of course, equivalent.
|
No comments:
Post a Comment