CPSC/MATH 402 - Computer Exercises

Log into the Linux system to do this exercise. This handout and a program are on-line -- go to http://cs.roanoke.edu/~cpsc/CPSC402A and follow the link.

Investigating Floating Point Representation and Arithmetic

Investigate the floating point representation on the Linux system by doing the following:

The program floatRep.cc is a C++ program that determines approximations of unit roundoff (machine epsilon) for type float and type double using two different techniques. One is the technique described in Computer Problem 1.3 on page 44 of the text; the other is the alternate characterization of machine epsilon described on page 20 (as the smallest positive number epsilon such that fl(1 + epsilon) > 1). The program FloatRep.java is a Java version of the same thing.

Do the following:
1. Save the programs to your directory (you can do this with the File/Save As feature in the browser or copy and paste).
2. Compile and run the C++ version. In case you forgot, the commands are:
```
        g++ -o epsilon floatRep.cc
```
  to compile and put the executable in a file named epsilon. Just type ./epsilon to run (you need the ./ to refer to your current directory). Alternately, use the command
```
       g++ floatRep.cc
```
  to compile then ./a.out to run.
3. Study the output. Compare the different answers for the different approximation techniques. How do they compare to theory (for IEEE floating point representation)? NOTE: If you want a printed copy of the output, redirect to a file then print the file. The following commands do that if the name of the executable is epsilon and the name of the file is eps.out:
```
          ./epsilon > eps.out

          nenscript -2rG -p- eps.out | lpr -h
```
4. Now compile and run the Java version. The commands to do this are as follows:
```
          javac FloatRep.java      -- compiles to bytecode
          java FloatRep            -- interprets/runs the bytecode
```
5. Compare these results to those of the C++ program.
6. Add to one of the programs above (you choose) code to determine the smallest power of two that is stored in both double and single precision (UFL). Print both the decimal form of the power and the exponent.
7. Now add code to find the overflow level. (See Computer Problem 1.2 on page 44)
The program tenth.cc illustrates the hazards of testing for equality of two floating point numbers. Save it to your directory and read the documentation and the code to see what it does.
1. Compile and run the program with the following input: (a) n = 8; (b) n = 4096; (c) n = 10. What happened and why was the behavior different for different input?
2. Change the loop control to total > epsilon (epsilon is already computed -- it is the approximation to machine epsilon). Rerun the program for the input above plus other values. Does it work correctly?
Do computer problem #1.6 on page 45.
The program poly.cc is essentially Computer Problem 1.14 on page 47 (it just prints values rather than plotting them). Save it to your directory.
1. Read the problem and the program then compile and run the program. Study the output. Notice the difference in answers for the two different calculations. Also notice the computed value of (x - 1)^6 for x = 1.
2. The later problem is related to what you were supposed to understand from doing #1.6. Modify the program using what you learned from #1.6 so the answer for x = 1 is correct.
3. The expanded form of the polynomial did not fare so well in the above calculations. In general, a polynomial should not be evaluated in expanded form but not all polynomials have nice factored forms. The recommended method of evaluation (in the absence of a nice factorization) is called Horner's method (which your book tells you about on page 316). Horner's method uses a nested evaluation. See the general formula on page 316. For this particular polynomial, Horner's method is
```
        1 + x(-6 + x(15 + x(-20 + x(15+ x(-6 + x))))
```
  Add code to the program to evaluate the polynomial this way (assign to poly3) and print out the result (as a fourth column).

HOMEWORK DUE Thursday, January 31, 2002:

Answer questions (a) and (c) in Computer Problem 1.3, page 44.
Determine from the results of your calculation of UFL whether or not gradual underflow is used on this system? Give reasons. (HINT: IEEE Double precision floating point representation is used, so you should be able to compute theoretically what the smallest number would be without subnormals and what the smallest positive number would be without them.)
What did you get for OFL?
Why did the input 10 cause problems in the tenth.cc program but 8 and 4096 worked as expected?
Answer question (a) in Computer Problem 1.6.
What did you use as an example in Computer Problem 1.6(b) to illustrate the difference between the two methods? How different were the results (compute the largest absolute error and the largest relative error you got)?
In Computer Problem #1.14, discuss the behavior for the three different methods for computing (x - 1)^6 for x near 1. Explain the behavior of the three methods. Compare the behavior of the methods for x really close to 1 versus a bit further away. What is the condition number of the function and does it help understand what is going on?

Also hand in a copy of your program that computed UFL and OFL.