CPSC 220 Fall 2002
Program 4: Chasing Links
Overview
In this part of the project you will modify your ProcessQueries
class so that instead of reading a list of URLs from a file,
it reads a single URL from the command line and follows links from
that page to find other pages to search. You did much of the work
for this program in the lab about finding links in HTML files.
Program Structure
Write a class called URLQueue with the following public methods:
- URLQueue(URL url) -- Takes a URL and creates a queue of the
URLs found by a breadth-first search of the graph formed by
the links in the given page and in the pages it links to. Note that
initially the queue holds only the given URL; it is filled incrementally as
each URL in it is dequeued.
- URL dequeue() -- Returns the next URL in the queue and adds
each link in the page it returns to the queue.
- boolean isEmpty() -- Returns true if the queue is empty.
Note that no public enqueue method is needed.
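The three methods above might fit together as in the following sketch. It is only one possible shape, not the required solution. The two-argument constructor and the linkFinder parameter are assumptions made so the example runs without a network; your URLQueue(URL url) would supply the link-finding code from the lab instead. Skipping URLs that were already queued is also an assumption, not something this handout requires.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

class URLQueue {
    private final Queue<URL> pending = new ArrayDeque<>();
    // Keyed by the URL's string form, because URL.equals() may do DNS lookups.
    private final Set<String> seen = new HashSet<>();
    // Hypothetical hook standing in for the link-finding code from the lab.
    private final Function<URL, List<URL>> linkFinder;

    // Assumption: the assigned URLQueue(URL url) would delegate here
    // with your own link finder.
    public URLQueue(URL url, Function<URL, List<URL>> linkFinder) {
        this.linkFinder = linkFinder;
        enqueue(url); // initially the queue holds only the given URL
    }

    // Returns the next URL and queues every link found on that page.
    public URL dequeue() {
        URL next = pending.remove(); // throws NoSuchElementException if empty
        for (URL link : linkFinder.apply(next)) {
            enqueue(link);
        }
        return next;
    }

    public boolean isEmpty() {
        return pending.isEmpty();
    }

    // Private, so callers cannot add URLs directly.
    // Ignoring URLs that were queued before is an assumption of this sketch.
    private void enqueue(URL url) {
        if (seen.add(url.toExternalForm())) {
            pending.add(url);
        }
    }
}

class URLQueueDemo {
    // Builds a URL, converting the checked exception to an unchecked one.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Walks a tiny in-memory link graph: a -> b, c;  b -> c, d;  c -> a.
    static List<String> crawlOrder() {
        Map<String, List<URL>> links = new HashMap<>();
        links.put("http://example.com/a",
                List.of(url("http://example.com/b"), url("http://example.com/c")));
        links.put("http://example.com/b",
                List.of(url("http://example.com/c"), url("http://example.com/d")));
        links.put("http://example.com/c",
                List.of(url("http://example.com/a")));

        URLQueue queue = new URLQueue(url("http://example.com/a"),
                page -> links.getOrDefault(page.toExternalForm(), List.of()));

        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.dequeue().getPath());
        }
        return order;
    }
}
```

In this sketch URLQueueDemo.crawlOrder() visits the pages level by level, so a comes first, then its links b and c, then d. The link from c back to a is ignored because a was already queued.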
Modify your ProcessQueries class so that it takes a string representing
a URL (along with the ignore file) from the command line, creates
a URLQueue, and uses the URLs taken off the queue to search for the queries.
As a third command line argument, take the maximum number of URLs to
search -- otherwise the breadth-first search may never terminate!
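The loop that consumes the queue could enforce the limit roughly as sketched below. The names UrlSource, CrawlLoop, visit, parseMaxUrls, and endless are all hypothetical and exist only for illustration; UrlSource stands in for the two URLQueue methods the loop needs, and the real ProcessQueries would search each dequeued page for the queries rather than just collecting it.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two URLQueue methods the loop uses.
interface UrlSource {
    URL dequeue();
    boolean isEmpty();
}

class CrawlLoop {
    // Dequeues at most maxUrls pages. The real program would search
    // each page for the queries where this sketch records the URL.
    static List<URL> visit(UrlSource queue, int maxUrls) {
        List<URL> visited = new ArrayList<>();
        while (!queue.isEmpty() && visited.size() < maxUrls) {
            visited.add(queue.dequeue());
        }
        return visited;
    }

    // Parses the third command-line argument and rejects values below 1.
    // Integer.parseInt throws NumberFormatException on non-numeric input.
    static int parseMaxUrls(String arg) {
        int max = Integer.parseInt(arg);
        if (max < 1) {
            throw new IllegalArgumentException(
                    "maximum URL count must be positive: " + arg);
        }
        return max;
    }

    // A source that never empties, imitating pages that keep linking
    // to new pages. Only the limit stops a loop over it.
    static UrlSource endless() {
        return new UrlSource() {
            private int n = 0;

            public boolean isEmpty() {
                return false;
            }

            public URL dequeue() {
                try {
                    return URI.create("http://example.com/page" + (n++)).toURL();
                } catch (MalformedURLException e) {
                    throw new IllegalStateException(e);
                }
            }
        };
    }
}
```

With this shape, a call such as CrawlLoop.visit(queue, CrawlLoop.parseMaxUrls(args[2])) stops after the requested number of pages even when the queue never runs dry. Reading the limit from args[2] assumes the URL and the ignore file are the first two arguments.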
What to Turn In
Turn in a hard copy of each of your new classes and of any previous classes that
you modified (including ProcessQueries).
Also tar your directory and e-mail it to bloss@roanoke.edu.
The subject should be "cpsc220 prog4".