We have seen that when we declare an array, the name of the array can be used as the address of the first element of the array. Thus,
int List[10];
allocates ten words of memory for the program (4 bytes for each word on our lab computers), one for each of the ten elements of the array. No extra word is set aside to hold the address of the array. Instead, in most expressions the compiler converts the name List to the address of the first of the ten words. In that sense List behaves like a pointer, i.e., a variable that itself isn't a data object, but points to a data object.
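To make the relationship between an array name and the address of its first element concrete, here is a minimal sketch. The two function names are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: reports how many bytes the array itself occupies.
std::size_t bytesInList() {
    int List[10];
    return sizeof(List);      // ten ints, with no extra word for an address
}

// Hypothetical helper: checks that the array name, used in an expression,
// yields the address of element 0.
bool nameIsAddressOfFirst() {
    int List[10] = {};
    int * first = List;       // List is converted to &List[0] here
    return first == &List[0];
}
```

On a machine where an int is 4 bytes, bytesInList() returns 40, which is exactly the space for the ten elements.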
C++ provides a mechanism to explicitly declare variables that are pointers to data objects. The asterisk is the syntax used for such declarations. For example,
int * numPtr;
Fraction * fractPtr;
declare two pointers: numPtr is a variable capable of pointing to an integer, and fractPtr is a variable capable of pointing to a Fraction object.
Note that with the above two declarations, the two pointer variables are not pointing to any valid integer or Fraction object. C++ provides a mechanism for a pointer variable to point to an existing data object using the ampersand syntax. For example,
int aNum;
Fraction aFract;
numPtr = &aNum;
fractPtr = &aFract;
assigns to numPtr the address of the variable aNum, and to fractPtr the address of the variable aFract.
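The declarations and assignments above can be put together in a short sketch. The Fraction type below is only a stand-in, since the real class definition is not shown here, and the function name is hypothetical. Initializing the pointers to nullptr is an added precaution, not something the text requires. It marks them as pointing to nothing until an address is assigned.

```cpp
// Stand-in for the Fraction class; its members here are assumptions.
struct Fraction {
    int numerator;
    int denominator;
};

// Hypothetical helper: returns true if each pointer holds the address
// of the variable it was assigned from.
bool pointersHoldAddresses() {
    int aNum = 0;
    Fraction aFract = {1, 2};

    int * numPtr = nullptr;         // points to nothing yet
    Fraction * fractPtr = nullptr;  // points to nothing yet

    numPtr = &aNum;                 // now holds the address of aNum
    fractPtr = &aFract;             // now holds the address of aFract

    return numPtr == &aNum && fractPtr == &aFract;
}
```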
It is sometimes useful to see the actual address of a variable in a program. If asked to print a pointer variable, the cout object of the iostream library will print the contents of the pointer variable, i.e., the address it holds. Addresses are often very big numbers, so most implementations print them in hexadecimal. Integers, by contrast, are printed in decimal unless told otherwise. The manipulator hex can be used to ask cout to print any integers in hexadecimal, until the manipulator dec is used. For example,
cout << hex << numPtr << " " << dec << aNum << endl;
will print the value of numPtr (typically in hexadecimal, whichever manipulator is in effect), and the value of aNum in decimal.
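As a small sketch of how hex and dec affect output, the hypothetical helper inHexThenDec below formats the same integer in both bases, and showAddress prints an address followed by a value. Both names are made up for this illustration. The exact text printed for an address depends on the compiler and the machine.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes an integer in hexadecimal, then in decimal.
std::string inHexThenDec(int aNum) {
    std::ostringstream out;
    out << std::hex << aNum << " " << std::dec << aNum;
    return out.str();
}

// Hypothetical helper: prints the address held by numPtr, then aNum.
// The format of the address is chosen by the implementation.
void showAddress(const int * numPtr, int aNum) {
    std::cout << numPtr << " " << std::dec << aNum << std::endl;
}
```

For example, inHexThenDec(255) produces the text "ff 255".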
The asterisk can also be used to dereference a pointer. Thus,
*numPtr = 7;
will have the same effect as
aNum = 7;
It is necessary to use parentheses to make clear what the asterisk applies to, particularly when calling member functions for objects of a user-defined class type. For example,
aNum = (*fractPtr).getNumerator();
Without the parentheses, the member-access dot binds more tightly than the asterisk, so the asterisk would be applied to fractPtr.getNumerator() rather than to fractPtr. The statement would then fail to compile, because fractPtr is a pointer and not an object.
To avoid the need for such parentheses, C++ provides an alternative syntax for accessing the members of an object through a pointer. This is the arrow syntax (typed as a hyphen followed by the greater-than symbol). For example,
aNum = fractPtr->getNumerator();
has the same effect as the statement above that uses the asterisk syntax.
When accessing the members of an object through a pointer, it is best to use the arrow syntax.
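The following sketch shows dereferencing with the asterisk next to the arrow syntax. The Fraction class here is a minimal stand-in written for this example. Only getNumerator is named in the text, and the other members and both function names are assumptions.

```cpp
// Minimal stand-in for the Fraction class (an assumption, not the real one).
class Fraction {
public:
    Fraction(int num, int den) : numerator(num), denominator(den) {}
    int getNumerator() const { return numerator; }
    int getDenominator() const { return denominator; }
private:
    int numerator;
    int denominator;
};

// Hypothetical helper: assigns to an int through a pointer to it.
int setThroughPointer() {
    int aNum = 0;
    int * numPtr = &aNum;
    *numPtr = 7;                                // same effect as aNum = 7;
    return aNum;
}

// Hypothetical helper: both syntaxes call the same member function.
bool bothSyntaxesAgree() {
    Fraction aFract(3, 4);
    Fraction * fractPtr = &aFract;
    int viaStar = (*fractPtr).getNumerator();   // parentheses required
    int viaArrow = fractPtr->getNumerator();    // equivalent and easier to read
    return viaStar == 3 && viaArrow == 3;
}
```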
The most common reason pointers are used in a C++ program is to be able to allocate space dynamically. Often, when one writes a program, it is not clear how many data items are to be processed, and there may not be any good upper bound on the size of the data. For example, when writing a program to sort grades for students, the number of students is not always known when writing the program. Any chosen upper bound may prove inadequate, and then one has to change the bound and recompile. Instead, it would be better to write a program that uses as much space as necessary for a given run. This is called dynamic memory allocation, i.e., getting memory allocated at run-time, rather than at compile time. Such dynamically allocated memory is given to the program by the system from heap memory rather than stack memory.
The operator new of C++ is the mechanism to get memory allocated at run-time. For example,
int * numPtr;
numPtr = new int;
will reserve one word for the pointer variable numPtr on the stack. The compiler works out how much space this variable needs at compile time. Then, when the program is executed, the second statement will allocate one word of memory on the heap, and the new operator will return the address of this word of memory. This address is stored in numPtr.
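A minimal sketch of allocating a single integer on the heap and using it through the pointer follows. The function name is hypothetical. The delete statement at the end hands the memory back to the system once the function is done with it.

```cpp
// Hypothetical helper: stores a value in heap memory and reads it back.
int useHeapInt() {
    int * numPtr = new int;   // one int allocated on the heap at run-time
    *numPtr = 42;             // store a value in the heap memory
    int copy = *numPtr;       // read it back through the pointer
    delete numPtr;            // release the heap memory
    return copy;
}
```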
Arrays can similarly be allocated space at run-time. For example, suppose the input stream contains the number of student grades followed by the grades themselves. Then the following function will read in the number of grades first, then get space allocated for the grades, and then read in the grades.
void readGrades ()
{
    int numGrades;
    int * grades;
    cin >> numGrades;
    grades = new int[numGrades]; // allocate space for the grades.
    for (int index = 0; index < numGrades; index++) {
        cin >> grades[index];
    }
}
Note that when the above function ends, space allocated for the variable numGrades and the pointer variable grades will be deallocated (given back to the system), but the space allocated dynamically for the array will remain allocated to the program. Nonetheless, the address of these words of memory is lost (when grades was deallocated), and so we cannot access that memory in the program anymore. Moreover, that memory will not be reallocated by the system to this program, or to any other program on the computer, until this program finishes. The above is an example of a memory leak, i.e., memory that is allocated to a program but inaccessible to the program.
To correctly use dynamic memory without causing memory leaks, once a program is done using any dynamic memory it got allocated, the program must explicitly deallocate that memory using the delete operator. For example,
delete numPtr;
delete [] grades;
Note the use of the square brackets when deleting memory that is an array. Note also that the operator delete is given the pointer to the memory locations that need to be deallocated.
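As a sketch of how a function that reads grades might avoid the leak, the hypothetical function sumGrades below reads a count followed by that many grades, adds them up, and deletes the array before returning. It takes the input stream as a parameter rather than reading from cin directly. That is a choice made for this example so that it can be tried on text held in a string. The helper sumGradesFromText is likewise an assumption.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant of a grade reader: returns the sum of the grades
// and releases the dynamically allocated array before returning.
int sumGrades(std::istream & in) {
    int numGrades = 0;
    in >> numGrades;
    if (!in || numGrades <= 0) {
        return 0;                          // nothing to read
    }

    int * grades = new int[numGrades];     // allocate space for the grades
    int total = 0;
    for (int index = 0; index < numGrades; index++) {
        grades[index] = 0;                 // default if a read fails
        in >> grades[index];
        total += grades[index];
    }

    delete [] grades;                      // square brackets: this is an array
    return total;
}

// Hypothetical helper: runs sumGrades on text instead of keyboard input.
int sumGradesFromText(const std::string & text) {
    std::istringstream in(text);
    return sumGrades(in);
}
```

For example, sumGradesFromText("3 90 80 70") returns 240, and no heap memory remains allocated after the call.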