Test 2 Review Exercises

- Describe the O(NlogN) heap construction algorithm and justify its complexity.
The heap is constructed by simply inserting the items into the heap one at a time, where the heap insertion algorithm for a min heap is as follows:

- Add the new element as the last leaf.
- If it is smaller than its parent, exchange it with its parent. Repeat until it is larger than its parent or it reaches the root.

Each insertion bubbles the new element up along at most one leaf-to-root path, which has length O(logN), so N insertions take O(NlogN) in total.
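The insertion step above can be sketched as follows (a minimal array-based min heap in Python; the function names are illustrative, not from the exercises):

```python
def heap_insert(heap, item):
    """Append item as the last leaf, then bubble it up toward the root."""
    heap.append(item)                          # add as the last leaf
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] < heap[parent]:             # smaller than parent: swap up
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:                                  # heap order restored
            break

# Building a heap by repeated insertion: N insertions, each O(log N).
heap = []
for x in [5, 3, 8, 1, 9, 2]:
    heap_insert(heap, x)
print(heap[0])  # root holds the minimum: 1
```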

- Describe the O(N) heap construction algorithm and justify its complexity.
The heap is constructed as follows:

- Create a complete tree containing all the elements in arbitrary order.
- Starting at the leaves, heapify subtrees from the bottom up:
- Each leaf is already its own heap.
- Move to the trees rooted one level above the leaves. For each of these trees both the left and right subtrees are already heaps, so each tree can be made into a heap by bubbling down the root (exchanging it with its smallest child until it is smaller than both children).
- Repeat, moving up one level each time, until you arrive at the root of the entire tree.

- Put items into complete tree -- O(N)
- Heapify leaves -- no work.
- Heapify next level -- N/4 nodes, each of which could travel at most 1 level = N/4.
- Heapify next level -- N/8 nodes, each of which could travel at most 2 levels = 2*N/8 = N/4.
- Heapify next level -- N/16 nodes, each of which could travel at most 3 levels = 3*N/16 = 3N/16.

And so on: level i from the bottom has N/2^i nodes, each of which could travel at most i-1 levels, so the total is N * (Sum for i=1..logN of (i-1)/2^i) = O(N), since the sum converges to 1.
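The bottom-up construction above can be sketched as follows (Python, illustrative names; `sift_down` is the "bubble down" step from the answer):

```python
def sift_down(heap, i, n):
    """Bubble heap[i] down until it is no larger than either child."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:                      # heap order restored
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def build_heap(items):
    """Heapify in place, bottom up, starting at the last internal node."""
    heap = list(items)                         # complete tree, arbitrary order
    n = len(heap)
    for i in range(n // 2 - 1, -1, -1):        # leaves (i >= n//2) need no work
        sift_down(heap, i, n)
    return heap

h = build_heap([5, 3, 8, 1, 9, 2])
print(h[0])  # root holds the minimum: 1
```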

- Describe the heapsort algorithm and give and justify its complexity.
The heapsort algorithm is as follows:

- Build a min heap with the items to be sorted (O(N))
- Repeatedly remove the root of the heap and reheapify by replacing the root with the last leaf and bubbling it down (exchanging with the smallest child until it's smaller than both children). Time to remove the first half of the nodes this way is between N/2*(log(N/2)) and N/2*(logN), which is O(NlogN). Time to remove the second half is less than this (the tree is smaller), so the time to remove all nodes is O(NlogN).
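The two phases above can be sketched as follows (Python, illustrative names; this is a min-heap variant that returns a new sorted list rather than the usual in-place max-heap formulation):

```python
def sift_down(heap, i, n):
    """Bubble heap[i] down until it is no larger than either child."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def heapsort(items):
    """Sort ascending: O(N) bottom-up build, then N removals of O(log N) each."""
    heap = list(items)
    n = len(heap)
    for i in range(n // 2 - 1, -1, -1):        # phase 1: build the min heap
        sift_down(heap, i, n)
    out = []
    while n > 0:                               # phase 2: repeatedly remove root
        out.append(heap[0])                    # root is the current minimum
        heap[0] = heap[n - 1]                  # replace root with last leaf
        n -= 1
        sift_down(heap, 0, n)                  # bubble it down
    return out

print(heapsort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```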

- Give the search time for each structure below. Justify your answers,
and explain any assumptions you are making.
- heap
O(N). An arbitrary element could be in either subtree of a heap, so you can't do better than traversing the tree and looking at every item.

- hash table/chaining
To search among N elements stored in a hash table of size b, assuming that items are evenly distributed and that a linked list is kept at each location, the time required is O(N/b). This is because on average N/b items will be stored at each location, and a linear search of those items will be required. If the table size is constant this is O(N); if the table size is kN for some constant k > 0, this is O(1).
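The chaining scheme above can be sketched as follows (Python; the class and method names are illustrative). Each bucket holds a chain, and a search does a linear scan of one chain, which has expected length N/b when keys spread evenly:

```python
class ChainedHashTable:
    def __init__(self, b):
        self.buckets = [[] for _ in range(b)]  # one chain per table location

    def _index(self, key):
        return hash(key) % len(self.buckets)   # which chain the key hashes to

    def insert(self, key):
        self.buckets[self._index(key)].append(key)

    def search(self, key):
        # Linear scan of a single chain: expected O(N/b) comparisons.
        return key in self.buckets[self._index(key)]

table = ChainedHashTable(8)
for k in ["apple", "pear", "plum"]:
    table.insert(k)
print(table.search("pear"))   # True
print(table.search("fig"))    # False
```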

- hash table/open addressing
For a good hash function and table size >= 3N/2, open addressing gives O(1) lookup time. The analysis is beyond the scope of this class.

- binary trie (argue that search is O(1))
Assume that the trie contains a branch for every bit, not just enough to distinguish from other entries. Then search depends on the length of the bitstring being searched for and is independent of what else is stored in the trie. For example, finding the string 1011001 would always require following 7 branches, regardless of the number of elements in the trie. This is clearly O(1).

- binary trie (argue that search is O(log N))
Suppose you want to store N items in a binary trie. Then each must have a unique binary representation, which requires at least log N bits. Since search depends on the length of the bitstring, this is clearly O(log N).
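Both trie arguments rest on the same fact: a search follows one branch per bit, independent of how many other strings are stored. A minimal sketch (Python, illustrative names; bitstrings are represented as strings of '0'/'1'):

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # maps '0' or '1' to a child TrieNode
        self.is_end = False     # marks that a stored bitstring ends here

def trie_insert(root, bits):
    node = root
    for b in bits:              # one branch per bit
        node = node.children.setdefault(b, TrieNode())
    node.is_end = True

def trie_search(root, bits):
    # Follows exactly len(bits) branches; cost depends only on the
    # length of the query bitstring, not on how many items are stored.
    node = root
    for b in bits:
        if b not in node.children:
            return False
        node = node.children[b]
    return node.is_end

root = TrieNode()
trie_insert(root, "1011001")
print(trie_search(root, "1011001"))  # True: 7 branches followed
print(trie_search(root, "1011"))     # False: prefix, not a stored string
```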

- heap