CPSC170 Lab 5

Sorting

Insertion Sort

The file list_util.py contains the sorting function that we created in class. The function has been renamed to selection_sort because it uses the selection sort algorithm. An algorithm is a sequence of computational steps for completing a task. In this case the task is to order all of the elements in an array. There are many different algorithms for performing this same task; each has its own advantages and disadvantages over other algorithms. For this section of the lab you will write a function that sorts a list using a different algorithm, insertion sort.

The insertion sort algorithm stores elements that have been inserted on the left side of the list and elements that have not been sorted yet on the right side of the list. Each iteration of the algorithm an element from the right, unsorted, side of the list is inserted (hence the name insertion sort) into the left side of the list such that the left side of the list is still sorted. When the algorithm begins, the unsorted side of the array is all of the elements in the input array, and the sorted side of the array contains nothing. The algorithm is complete when the unsorted side of the array is empty and the sorted side of the array contains all of the elements.

Before writing any code, sort a short list of integers on paper. Write five integers, with no duplicates, in non-sorted order. Draw an arrow or line to delineate where the sorted side of the list ends and where the unsorted side begins. Perform the first iteration of the algorithm by inserting the left-most element of the unsorted side of the list into sorted the sorted side. This is easy because the sorted list does not contain any element. The result of the first iteration is that the separator moves to the right by one element. The next iteration is also easy, the left-most element of the unsorted side of the list can only go into two locations when it is inserted into the sorted side of the list. It can either go before or after the only element in the sorted side of the list. Perform the insertion operation three more times, so that the list is sorted, before proceeding.

Some other things that you should consider before proceeding. How will you represent where the sorted elements in the list end, and where the unsorted elements begin? Would the program be any easier to write if you first create a private helper fucntion to do some of the work? Be sure to test your code. Create a program that imports selection_sort from list_util.py, creates an un-sorted but non-random list of integers, and prints the list before and after sorting.

Timing Sorting Algorithms

Can you tell which sorting algorithm, selection or insertion, is faster by looking at the code? Create a program to test your intuition. The python module timeit can be used to time the execution of code. Read over the documentation for the timeit module. In particular you will need the Timer initialization and the timeit method. Also look over the file timeit_test.py for an example of how to use the timeit module.

In order to time the sorting functions, create a new file called sort_timer.py that imports list_util.py and contains a class called SortTimer. The class should have a list instance variable that contains a thousand integers. Initialize the list by using list(range(1000)). Add a method to the class that shuffles the list using the shuffle function of the random module. Also add two method that sort the list, one with selection_sort and the other with insertion_sort. Finally add a method that uses a Timer to time the two sorting functions on shuffled lists of 1000 integers.

Run the program several times and notice that the elapsed time is different every time. Do you know why this is? To compensate for this variation, the test must be run enough times for the variation to become small in comparison to the result. Find a value for the number of executions of timeit such that the faster algorithm's execution time is at least a millisecond. Is the difference in the algorithms what you expected? Can you explain the difference?

Plotting

Last week, you generated plots using LibreOffice. This week, you will automatically generate plots using Turtle Graphics. Your program will time selection and insertion sort for list sizes ranging from 100 to 1,000, in increments of 50. You should run each timing a few times, and average them. You should then create a turtle window, an plot the times you generated.

There are some complexities to consider when plotting your data. The first issue is that turtle places the origin in the center of the window. For your plots, you want the origin to be in the lower left-hand corner of the window. The easy solution is to manually shift your data points so that what you draw treats the lower left-hand corner as (0,0). Another solution is to change the world coordinates for turtle, so that it assumes (0,0) is in the lower left-hand corner of the window. You can do this through turtles setworldcoordinates method.

Another complexity is scale. It is very likely that the default scale in turtle, based on integer values, is too small to actually see the graph you are drawing. Changing the x-axis scale is simple, since you are in full control of that. You can essentially hard code that scale in your program. However, the y-axis scale will depend on the actual timings generated. To solve this problem, you should find the largest time you need to plot, define that to be the full height of the window, and proportionally scale the rest of your data based on that assumption.

Submission

Submit a zip file of your code on the course Inquire site that uses your last names as the file name.