The file list_util.py contains the sorting function
that we created in class. The function has been renamed to
selection_sort
because it uses the selection sort algorithm. An
algorithm is a sequence of computational steps for completing a task.
In this case the task is to order all of the elements in an array.
There are many different algorithms for performing this same task;
each has its own advantages and disadvantages over other
algorithms. For this section of the lab you will write a function that
sorts a list using a different algorithm, insertion sort.
The insertion sort algorithm stores elements that have been inserted on the left side of the list and elements that have not been sorted yet on the right side of the list. Each iteration of the algorithm an element from the right, unsorted, side of the list is inserted (hence the name insertion sort) into the left side of the list such that the left side of the list is still sorted. When the algorithm begins, the unsorted side of the array is all of the elements in the input array, and the sorted side of the array contains nothing. The algorithm is complete when the unsorted side of the array is empty and the sorted side of the array contains all of the elements.
Before writing any code, sort a short list of integers on paper. Write five integers, with no duplicates, in non-sorted order. Draw an arrow or line to delineate where the sorted side of the list ends and where the unsorted side begins. Perform the first iteration of the algorithm by inserting the left-most element of the unsorted side of the list into sorted the sorted side. This is easy because the sorted list does not contain any element. The result of the first iteration is that the separator moves to the right by one element. The next iteration is also easy, the left-most element of the unsorted side of the list can only go into two locations when it is inserted into the sorted side of the list. It can either go before or after the only element in the sorted side of the list. Perform the insertion operation three more times, so that the list is sorted, before proceeding.
Some other things that you should consider before proceeding. How will
you represent where the sorted elements in the list end, and where the
unsorted elements begin? Would the program be any easier to write if
you first create a private helper fucntion to do some of the work? Be
sure to test your code. Create a program that imports selection_sort
from list_util.py, creates an un-sorted but non-random list of
integers, and prints the list before and after sorting.
Can you tell which sorting algorithm, selection or insertion, is
faster by looking at the code? Create a program to test your
intuition. The python module
timeit can be used
to time the execution of code. Read over the documentation for the
timeit module. In particular you will need the Timer
initialization
and the timeit
method. Also look over the file
timeit_test.py for an example of how to use the
timeit module.
In order to time the sorting functions, create a new file called
sort_timer.py that imports list_util.py and contains a class called
SortTimer
. The class should have a list instance variable that
contains a thousand integers. Initialize the list by using
list(range(1000))
. Add a method to the class that shuffles the
list using the shuffle
function of the random module. Also add two
method that sort the list, one with selection_sort
and the other
with insertion_sort
. Finally add a method that uses a Timer
to
time the two sorting functions on shuffled lists of 1000 integers.
Run the program several times and notice that the elapsed time is
different every time. Do you know why this is? To compensate for this
variation, the test must be run enough times for the variation to
become small in comparison to the result. Find a value for the number
of executions of timeit
such that the faster algorithm's execution
time is at least a millisecond. Is the difference in the algorithms
what you expected? Can you explain the difference?
Last week, you generated plots using LibreOffice. This week, you will automatically generate plots using Turtle Graphics. Your program will time selection and insertion sort for list sizes ranging from 100 to 1,000, in increments of 50. You should run each timing a few times, and average them. You should then create a turtle window, an plot the times you generated.
There are some complexities to consider when plotting your data. The first issue is that turtle places the origin in the center of the window. For your plots, you want the origin to be in the lower left-hand corner of the window. The easy solution is to manually shift your data points so that what you draw treats the lower left-hand corner as (0,0). Another solution is to change the world coordinates for turtle, so that it assumes (0,0) is in the lower left-hand corner of the window. You can do this through turtles setworldcoordinates method.
Another complexity is scale. It is very likely that the default scale in turtle, based on integer values, is too small to actually see the graph you are drawing. Changing the x-axis scale is simple, since you are in full control of that. You can essentially hard code that scale in your program. However, the y-axis scale will depend on the actual timings generated. To solve this problem, you should find the largest time you need to plot, define that to be the full height of the window, and proportionally scale the rest of your data based on that assumption.
Submit a zip file of your code on the course Inquire site that uses your last names as the file name.