CPSC170
Fundamentals of Computer Science II

pre-lab video

Lab 18: Iterators

Given a container, a basic operation is to go through all the contents of the container. For example, a function that prints all the contents of a container would require going through and printing each element. So, how do we go through (iterate) each element of a a container? For a vector, this is easy, each element in the vector has a position, so, a for-loop would do. How about a list, how would we go through each element of a list? We can use functions (methods) front and back to access the first and last elements of the list, respectively. But these functions don't help us access all the other elements in between. So, how do we access all the elements in a container?

Iterators provide a universal method of accessing the individual elements of a container. They are designed so that iterating through a container doesn't depend on the container. So, essentially the same code can print a vector, list, set, map, and other containers in the STL.

The design of iterators is influenced by pointers. In a way, an iterator is a fancy "pointer". A pointer is to an array what an iterator is to a container. Consider the following code for printing the contents of an array.


const int size = 5;
int array[5] = { 0, 1, 2, 3, 4 };
int* start = array;      // a pointer to the first element of the array.
int* end = array + size; // a pointer to the end of the array. Note that end doesn't
                         // point to an element in the array. It points to the first byte after
                         // array.

for (int* i = start; i != end; i++) { // starts at the first element,
    cout << *i << endl;               // stops after the last element in the array.
}

I'm sure you can all tell what the above code does without needing to bring out the big guns (hand-tracing). But I'll just say what it for completeness sake. The program prints out the contents of an array. We know the name of an array is the address of its first element i.e. a pointer to the first element. A pointer is simply an address. When we add the number one to a pointer of a variable of type T, we get the address i + sizeof(T). This is the address of the first byte that comes after the variable pointed to by i. Finally, when printing the element of the array we use the dereference operator * on i. This is because i is a pointer, but *i is the value pointed to by i.

If you understand the above code then you already have the big-idea behind iterators. Let us now see the equivalent of the above code using vectors and iterator.


vector<int> vec { 0, 1, 2, 3, 4 };
vector<int>::iterator start = std::begin(vec); // an iterator to the first element of the vector.
vector<int>::iterator end = std::end(vec);     // an iterator the end of the array. Note that end doesn't
                                               // "point" to an element in the array.
                                               // It "points" to the end of the container.

for (vector<int>::iterator i = start; i != end; i++) { // starts at the first element,
   cout << *i << endl;                                 // stops after the last element in the vector.
}

The main change from the first program is the substitution of int* (an integer pointer) for vec<int>::iterator (an iterator to a vector of integers). The class vector<int>::iterator is an inner class i.e. a class within a class. It is a class defined in the class vector<int>. To access the iterator class defined within vector<int> we use the scope operator ::. This is why the type is vector<int>::iterator.

With the above code you can print all the elements in a list or set (and other containers we didn't cover) by substituting either list or set for any occurrence of vector.

Another important change from the first program is how we get a "pointer" to the first and last elements of the container. For an array, we had to compute it ourselves, but for a container, we just ask the container. We use the begin and end functions in std, they return an iterator to the beginning and ending of the container, respectively.

Besides those changes, the rest of the program remains essentially the same. We can advance the iterator using the ++ operator, we can access the element "pointed" to by the iterator using the dereference operator *. We will only work with two classes of iterators: bidirectional and random-access iterators. Bidirectional iterators all us to move the iterator forward and backwards with the operators ++ and --, respectively. Random-access iterators allows us to perform integer arithmetic. So, if i is a random-access iterator, i+n and i-n are valid expressions. They behave just as they would on array i.e. i+n is an iterator to the nth element that comes after i, i-n is an iterator to the nth element that comes before i. It also supports the operator []. For example, if i is random-access iterator and j is an integer, then i[j] is equivalent to *(i+j).

Exercise

Write a function that takes a list of integers (by reference? by constant reference? by value?) and prints it in reverse order.

Verify that a vector's iterator is a random-access iterator. Do this by doing integer arithmetic on the iterator and using the [] operator. Verify that this operator don't work on a list's iterator.

Overwriting Contents of a Container

In the same way we can use a pointer to a variable to change the value it points to, we can use an iterator to change the value it "points" to. The following code squares the numbers stored in a list of integers.


list<int> my_list{ 0, 1, 2, 3 };

for (list<int>::iterator i = std::begin(my_list); i != std::end(my_list); i++) {
   *i *= *i;     // equivalent to (*i) = (*i) * (*i);                           
}
// my_list now contains 0, 1, 4, 9.

A lot of functions in the standard C++ library use iterators, so it's very important to understand them. Iterators are very useful for writing generic code i.e. code that runs on a wide range of types. For example, we could write a function that can print any of the container classes. Let us now view a few functions in the standard library that use iterators.

Suppose I wanted to insert the number 0 in the middle of the list { 0, 1, 2, 3 }. I could do so using the list class's insert member function as follows:


list<int> my_list{ 0, 1, 2, 3 };
list<int>::iterator itr = std::begin(my_list); // itr points to 0.

itr++;                                         // itr points to 1.
itr++;                                         // itr points to 2.
my_list.insert(itr, 0);                        // 0 is inserted before of 2.

We can also use insert to add a range of elements into a list. This version takes three iterators, the first iterator "points" where the elements will be inserted. The second and third iterators "point" to the beginning and ending (respectively) of the elements to be added. Note that the whatever is pointed to by end isn't added. Here's and example.


list<int> my_list{ 0, 1, 2, 3 };
vector<int> my_vec{ -4, -3, -2, -1 };

list<int>::iterator itr = std::begin(my_list);       // itr points to 0.
vector<int>::iterator itr_start= std::begin(my_vec); // points to 0, the start my_vec.
vector<int>::iterator itr_end = std::end(my_vec);    // points to the end of my_vec

my_list.insert(itr, itr_start, itr_end);  // inserts the list [itr_start, itr_end) before the value
                                          // pointed to by itr.

for (list<int>::iterator i = std::begin(my_list); i != std::end(my_list); i++) {
	cout << *i << endl;
}
// my_list now contains {-4, -3, -2, -1, 0, 1, 2, 3}

Exercise

Create a function called size it should take a list of strings and return the size of the of the list. Don't use the size function, use only iterators.

Overload the size function by creating a version that takes a vector of integers. (To overload a function is to create a function of the same name but different signature). Is there any significant difference between the two functions? If there is, send me and explain it to me for extra credit.

Using only iterators, create a function called reverse. It should take a list of integers and return a copy of the list, but in reversed order. Don't use push_back or push_front, use insert.

Using only iterators, create a function called remove_duplicates. This function should take a list of strings by reference and return nothing. When the function returns, the list given to the function should have no duplicates. For example, given the list {"hungry", "hungry" "hippos"}, when the function returns this list should be {"hungry" "hippos"}.