CPSC170 Lab 11

Binary Search Trees

Yesterday in class, we discussed a new type of data structure called a "Tree." Trees give us a hierarchical mechanism for storing data. Today in lab, we will implement a specific class of Tree, referred to as Binary Search Trees.

Creating Binary Trees

The file binary_search_tree.py contains a very sparse outline for a creating a binary tree. Recall that each node in a binary tree contains at most two children: a left child, and a right child.

What separates binary search trees from normal binary trees are how elements are stored within the tree. Normally, there are not any restrictions on where data can be stored within the tree. In a binary search tree, we will impose some ordering among the elements:

The above tree is a binary search tree. Consider the root of the tree "5." All descendants in the left side of the tree are valued less than "5." All descendants in the right side of the tree are valued greater than "5." We refer to the tree rooted at the left child as the Left Sub-tree. The Right Sub-tree is the tree rooted at the right child.

You should write your insert method such that it enforces the above conditions. You should first create a tree node for the new data. Then you should find the location to put the new tree node. If the root is None, then the new node becomes the root. Otherwise, compare the data you are adding to the root. value of the new data is less than that in the root, the value belongs in the left sub-tree. If the left reference is None, then this node becomes the new left child. Otherwise, get the node at the left child, and attempt the same process again.

Use a similar process if the new data is greater than that in the root.

Before you continue, you should test your insertion code thoroughly. How can you verify your insertion code works?

Tree Structures

Right now, it is pretty difficult to get a visual representation of a tree. To remedy this situation, you can draw the trees you generate by hand. The file test_binary_tree.py contains three test trees. For each of the test trees, draw out (BY HAND! ON ACTUAL PAPER!) what the resulting trees would look like. Hand your hand drawn trees in at the end lab.

Your hand drawings will be useful for testing your code. Add some print statements to your insertion code, and try to build the trees contained in the above code. Does it seem like the trees being generated match your drawings? We will produce a more visual representation of the tree later, but for now you should be able to create some argument about the correctness of your insertion code.

Tree Traversals

You can easily test your code above by hand, but it is easier if you can produce some string representation of the tree. One mechanism for generating these strings use techniques known as tree traversals. As we discussed in class yesterday, there are two different approaches to traversing a tree: Breadth First and Depth First.

print_depth_first contains the code we discussed at the end of class yesterday. It uses a stack (and the stack "interface") to keep track of what elements it should be printing next. This method prints out a representation of the tree by fully exploring one path to a leaf in the tree, and back-tracking to explore all other paths to the leafs of the tree.

Breadth first traversals are very similar to the depth first code listed above, but the results are quite different. Breadth first traversals require using a Queue, which you wrote last week in lab. It starts by adding the root to the queue. You continually dequeue a node out of the queue, and print its data. Then, add both children (in some arbitrary but consistent order) to the queue. This continues until you have visited every node in the tree.

Draw a sample binary tree, using letters as the values stored within the tree. What string should the above algorithm produce? After you have written out this test case, go ahead and code this algorithm in print_breadth_first. This method should print a representation of the tree using the above algorithm. Does your algorithm meet your expectations?

Classes of Depth First Traversals

There are actually three types of depth first traversals. They are preorder, inorder, and postorder. All three of these depth first techniques can be written using the implicit system stack, by using recursion. As you can see in binary_search_tree.py, each of the depth first techniques requires the root of the tree that is being traversed. In each method, you will recursively call the traversal technique on each sub-tree. The only difference between the three are where you actually print the data.

Preorder traversals print data when you first visit a node. This is accompished by printing the data in the current root before you recursively call preorder on either sub-tree. Inorder prints all of the elements in the left sub-tree before printing the data in the current root. Postorder prints all of the elements in both sub-trees before printing the data in the current root. Finish the code in print_in_order, print_pre_order, and print_post_order.

Test the above methods. Which traversal technique makes most sense for the test case that you generated before writing breadth first traversal? Which makes the least sense? Can you easily infer the structure of the tree given the output and the traversal technique used?

Searching

You are now able to create binary search trees. Now, how do we figure out if an element is contained within the tree? Consider the tree above. Given the root of the tree, we know that all values in the left sub-tree are valued less than the root, and all values in the right sub-tree are valued greater than the root. We can use that information to determine where to continue searching for the requested value.

For example, If we were looking for the value "3", we know that it MUST be a descendant of the root through the left child. We can continue searching from there, using the node valued "2" as the new root.

The algorithm for searching in a binary search tree is as follows: Let D be the value we are searching the tree for. If the value in the node you are currently exploring is D, then return true, as you have successfully found the element. Otherwise, if D is less than the current node, then move to the left child, and repeat the process. If D is greater than the current node, then move to the right child and repeat the process. If you ever reach a node that is None, then D is not in the tree and you return False.

Write this algorithm up in the method called search. Will this algorithm always produce valid results? Does it work on the Trees you generated in the first part of this lab? Make sure you test this algorithm before moving on to the next part.

What is the Big-O complexity of the above search technique? Does this algorithm remind you of another algorithm we have used before? Write up your analysis in a file called search_complexity.txt, and include it in your submitted zip files.

Submission

Submit a zip file of your code on the course Inquire site that uses your last names as the file name. Don't forget to turn in your tree drawings!