Lecture 18 - Quick Sort


As usual, create a directory to hold today's activities:

$ mkdir ~/cs170/labs/lab18
$ cd ~/cs170/labs/lab18

Quiz!

This quiz is not on Inquire, but will be submitted to cseval. In a file called quiz.py in your lab18 directory. You will submit this along side the lab activities for today. I want you to write the following recursive functions.

  1. sum_n(n) - Takes an integer n as a parameter, and returns the sum of the first n integers. This sum includes the value n. Do not use the closed form expression for this. Write the recursive function.
      >>> sum_n(4)
      10
    
  2. is_palindrome(a_string) - Takes a string a_string as a parameter, and returns True if a_string is a palindrome. Recall that a palindrome is a string that is the same forwards and backwards. You should compute this recursively by checking the first and last characters of the string, and checking the string without the first and last characters.
      >>> is_palindrome("bob")
      True
      >>> is_palindrome("hello")
      False
      >>> is_palindrome("racecar")
      True
      >>> is_palindrome("racecars")
      False
      >>> is_palindrome("b")
      True
      >>> is_palindrome("")
      True
    


Lab Activity 1
Quick Sort

Quick sort earned its name by being an incredibly quick sorting algorithm. The idea is roughly the same as merge sort: Split the list into two halves and recursively call the same procedure. The difference is when the actual work of sorting is taking place.

In merge sort, the work is being done AFTER the list has been split up into its smaller pieces. In quick sort, the work is being done BEFORE the list is split up. The process of "sorting" is actually creating the two "halves" of the list. By finding the partition elements correct location in the list, you just have to worry about sorting the list before the partition element, and the list after the partition element.

Details

In the file sorting.py file, create two functions: quick_sort(a_list, start, end) and partition(a_list, start, end). Both of these functions take a list of comparable elements and two integers in the range [0, len(a_list)]. Neither of these functions should return a list, they should simply modify the order of the elements in place. Partition, however, should return the index of where it placed the partition element.

partition uses the last element of the list as the partition element. It iterates through the rest of the list, performing a swap operation when it finds an item that is not in the correct half of the list. We will arbitrarily say that all of the elements equal to the partition element will go into the half of the list less than the partition element.

quick_sort is very simple: It calls partition, then recursively calls quick_sort on the left and right halves of the list.

Once you have the function written, write your estimation of the big-oh class in a comment of the file near the function. How does this compare to Merge Sort? Are either the best we can do?

Example

>>> my_list = [5, 4, 3, 2, 1]
>>> partition_return = partition(my_list, 0, len(my_list))
0
>>> print(my_list)
[1, 4, 3, 2, 5]
>>> quick_sort_return = quick_sort(my_list, 0, len(my_list))
>>> print(my_list)
[1, 2, 3, 4, 5]
>>> print(quick_sort_return)
None

Hint

  • Your base case for quick sort is when the length of the list is one. A list of length one is already in sorted order.

  • Notice that the end index for my examples above are non-inclusive: You specify the index that is one past the end of the sub list. I do this because it allows for easy specification of the range of your for loop for partition. However, the book doesn't do it this way. You may choose which one makes more sense for you.

  • Remember, the pseudo code in the text book is based off a 1's based index scheme for lists. You need to modify it slightly to work with a 0's based index.

 

Challenge

The version of quick sort above is pretty efficient in run time, but might be a little difficult conceptually. An easier to conceptualize version would be to actually create two different lists in partition. Then, your functions only ever have to take a single list.

Write functions quick_sort_copy(a_list) and partition_copy(a_list). These functions take a list of comparable elements. quick_sort_copy returns a copy of a_list is sorted order. partition returns a 3-tuple: the partition element, the list of elements less than (or equal to) the partition element, and the list of elements greater than the partition element.


Lab Assignment 18
Plotting

Now you have a version of quick sort, and a version of merge sort written. You should also have estimations of the big-oh classes of both of them. So, let's try to verify your estimation by timing each of these algorithms!

Details

Create a file called timed_rec_sorts.py, which is going to time the recursive sorts you have now written. This will be very similar to the timings we performed in lecture 15. You should time both Merge and Quick sorts on random lists in the range(100, 1000, 100), and plot the results. Does their growth seem to match your big-oh estimates?

Example

Note, this is just an example of what your plot code should display. These are not the real plots for quick or merge sort!

Hint

This should mostly be pattern matching with the previously written timing code. There is one difference though: You need to make a copy of the list BEFORE you call these sorting algorithms, since they sort in place.

Make sure you are timing the worst case for these algorithms. It seems counter-intuitive, but the worst case for both is an already sorted list.

 

Challenge

Tim Sort is the algorithm that Python actually uses to sort when you call the .sort method of lists. As you can tell from the wikipedia page, it is quite complex, even though its big-oh class is still the same as merge sort.

The key component of Tim Sort is that instead of splitting the list in half until the lists are of size 1, it stops at lists that are larger. Insertion sort is performed on these lists, and the merge procedure executes essentially the same way.

Write a function called tim_sort(a_list), which follows this simplified version of the tim sort algorithm. You should perform insertion sort when the list sizes are 10 elements or fewer. Plot this sorting algorithm as well. Does it perform better than merge sort? Does it match the .sort method of lists?


Submission

When you have finished, create a tar file of your lab18 directory. To create a tar file, execute the following commands:

cd ~/cs170/labs
tar czvf lab18.tgz lab18/

To submit your activity, go to cseval.roanoke.edu. You should see an available assignment called Lab Assignment 18. Make sure you include a header listing the authors of the file.



Last modified: Wed Feb 26 13:33:56 EST 2014