Given a container, a basic operation is to go through all the contents of the container.
For example, a function that prints all the contents of a container would require going
through and printing each element. So, how do we go through (iterate) each element of a
a container? For a vector, this is easy, each element in the vector has a position, so,
a for-loop would do. How about a list, how would we go through each element of a list?
We can use functions (methods) front
and back
to access the
first and last elements of the list, respectively. But these functions don't help us
access all the other elements in between. So, how do we access all the elements in a
container?
Iterators provide a universal method of accessing the individual elements of a container. They are designed so that iterating through a container doesn't depend on the container. So, essentially the same code can print a vector, list, set, map, and other containers in the STL.
The design of iterators is influenced by pointers. In a way, an iterator is a fancy "pointer". A pointer is to an array what an iterator is to a container. Consider the following code for printing the contents of an array.
const int size = 5;
int array[5] = { 0, 1, 2, 3, 4 };
int* start = array; // a pointer to the first element of the array.
int* end = array + size; // a pointer to the end of the array. Note that end doesn't
// point to an element in the array. It points to the first byte after
// array.
for (int* i = start; i != end; i++) { // starts at the first element,
cout << *i << endl; // stops after the last element in the array.
}
I'm sure you can all tell what the above code does without needing to bring out the big guns (hand-tracing).
But I'll just say what it for completeness sake. The program prints out the contents of an array.
We know the name of an array is the address of its first element i.e. a pointer to the first element.
A pointer is simply an address. When we add the number one to a pointer of a variable of
type T
, we get the address i + sizeof(T)
. This is the address of the first
byte that comes after the variable pointed to by i
. Finally, when printing the element of
the array we use the dereference operator *
on i
. This is because i
is a pointer, but *i
is the value pointed to by i
.
If you understand the above code then you already have the big-idea behind iterators. Let us now see the equivalent of the above code using vectors and iterator.
vector<int> vec { 0, 1, 2, 3, 4 };
vector<int>::iterator start = std::begin(vec); // an iterator to the first element of the vector.
vector<int>::iterator end = std::end(vec); // an iterator the end of the array. Note that end doesn't
// "point" to an element in the array.
// It "points" to the end of the container.
for (vector<int>::iterator i = start; i != end; i++) { // starts at the first element,
cout << *i << endl; // stops after the last element in the vector.
}
The main change from the first program is the substitution of int*
(an integer pointer) for
vec<int>::iterator
(an iterator to a vector of integers). The class vector<int>::iterator
is an inner class i.e. a class within a class. It is a class defined in the class vector<int>
.
To access the iterator class defined within vector<int>
we use the scope operator ::
. This is why the
type is vector<int>::iterator
.
With the above code you can print all the elements in a list
or set
(and other containers we didn't cover) by substituting either list
or set
for any occurrence of vector
.
Another important change from the first program is how we get a "pointer" to the first and last elements of the container.
For an array, we had to compute it ourselves, but for a container, we just ask the container. We use the begin
and end
functions in std
, they return an iterator to the beginning and ending of the container, respectively.
Besides those changes, the rest of the program remains essentially the same. We can advance the iterator using the
++
operator, we can access the element "pointed" to by the iterator using the dereference operator
*
. We will only work with two classes of iterators: bidirectional and random-access iterators.
Bidirectional iterators all us to move the iterator forward and backwards with the operators ++
and --
, respectively. Random-access iterators allows us to perform integer arithmetic. So,
if i
is a random-access iterator, i+n
and i-n
are valid expressions.
They behave just as they would on array i.e. i+n
is an iterator to the n
th element
that comes after i
, i-n
is an iterator to the n
th element
that comes before i
. It also supports the operator []
. For example, if i
is random-access iterator and j
is an integer, then i[j]
is equivalent to *(i+j)
.
Exercise
Write a function that takes a list
of integers (by reference? by constant reference? by value?)
and prints it in reverse order.
Verify that a vector's iterator is a random-access iterator. Do this by doing integer arithmetic on
the iterator and using the []
operator. Verify that this operator don't work
on a list's iterator.
Overwriting Contents of a Container
In the same way we can use a pointer to a variable to change the value it points to, we can use an iterator to change the value it "points" to. The following code squares the numbers stored in a list of integers.
list<int> my_list{ 0, 1, 2, 3 };
for (list<int>::iterator i = std::begin(my_list); i != std::end(my_list); i++) {
*i *= *i; // equivalent to (*i) = (*i) * (*i);
}
// my_list now contains 0, 1, 4, 9.
A lot of functions in the standard C++ library use iterators, so it's very important to understand them. Iterators are very useful for writing generic code i.e. code that runs on a wide range of types. For example, we could write a function that can print any of the container classes. Let us now view a few functions in the standard library that use iterators.
Suppose I wanted to insert the number 0 in the middle of the list { 0, 1, 2, 3 }
.
I could do so using the list class's insert
member function as follows:
list<int> my_list{ 0, 1, 2, 3 };
list<int>::iterator itr = std::begin(my_list); // itr points to 0.
itr++; // itr points to 1.
itr++; // itr points to 2.
my_list.insert(itr, 0); // 0 is inserted before of 2.
We can also use insert
to add a range of elements into a list. This version takes
three iterators, the first iterator "points" where the elements will be inserted. The second and third
iterators "point" to the beginning and ending (respectively) of the elements to be added.
Note that the whatever is pointed to by end isn't added. Here's
and example.
list<int> my_list{ 0, 1, 2, 3 };
vector<int> my_vec{ -4, -3, -2, -1 };
list<int>::iterator itr = std::begin(my_list); // itr points to 0.
vector<int>::iterator itr_start= std::begin(my_vec); // points to 0, the start my_vec.
vector<int>::iterator itr_end = std::end(my_vec); // points to the end of my_vec
my_list.insert(itr, itr_start, itr_end); // inserts the list [itr_start, itr_end) before the value
// pointed to by itr.
for (list<int>::iterator i = std::begin(my_list); i != std::end(my_list); i++) {
cout << *i << endl;
}
// my_list now contains {-4, -3, -2, -1, 0, 1, 2, 3}
Exercise
Create a function called size
it should take a list of strings and return the size of
the of the list. Don't use the size function, use only iterators.
Overload the size function by creating a version that takes a vector of integers. (To overload a function is to create a function of the same name but different signature). Is there any significant difference between the two functions? If there is, send me and explain it to me for extra credit.
Using only iterators, create a function called reverse
. It should take a list of integers and
return a copy of the list, but in reversed order. Don't use push_back
or push_front
,
use insert
.
Using only iterators, create a function called remove_duplicates
. This function should take
a list of strings by reference and return nothing. When the function returns, the list given to the function should
have no duplicates. For example, given the list {"hungry", "hungry" "hippos"}
, when the function returns
this list should be {"hungry" "hippos"}
.