CPSC 220 Fall 2003
Program 6: Assorted Improvements

In this final part of the project you will incorporate a number of improvements into your search engine, including the following:
  1. Make the items in the returned list of URLs actual links.

  2. Count the number of references to each page and report this number along with the relevance (but continue to rank by relevance).

  3. Search for links in both HREF and SRC attributes.

  4. Fix the problem that occurs when a URL with an empty file component is used as a base URL, so that relative links resolve correctly against such a base. Also treat http://x and http://x/ as the same URL.

  5. Provide a button on the results page that allows the user to sort the current results by relevance or by number of links.

  6. Avoid re-processing web pages whenever possible as follows:

  7. There are many other possible improvements, any of which you can undertake for extra credit. The amount of additional credit will depend on the difficulty and impact of the feature. Be sure to clearly document any such improvements that you undertake.
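
As an illustration of items 3 and 4, here is a minimal sketch, not a required design. The class name LinkTools and its method names are made up for this example, and it assumes a simple regular expression is good enough for finding attributes (a real HTML parser would be more robust). It collects the values of both HREF and SRC attributes, resolves them against a base URL, and gives an empty path the value "/" so that http://x and http://x/ come out as the same string.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not part of the assignment.
class LinkTools {
    // Matches href="..." or src="..." with single or double quotes, ignoring case.
    private static final Pattern LINK_ATTR = Pattern.compile(
            "\\b(?:href|src)\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Returns a canonical form of the URL in which an empty path becomes "/".
    static String normalize(String url) {
        URI u = URI.create(url.trim()).normalize();
        if (!u.isOpaque() && u.getAuthority() != null && u.getPath().isEmpty()) {
            try {
                u = new URI(u.getScheme(), u.getAuthority(), "/",
                            u.getQuery(), u.getFragment());
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return u.toString();
    }

    // Extracts HREF and SRC targets from the page text and resolves them
    // against the (normalized) base URL. Malformed links are skipped.
    static List<String> extractLinks(String html, String baseUrl) {
        URI base = URI.create(normalize(baseUrl));
        List<String> links = new ArrayList<>();
        Matcher m = LINK_ATTR.matcher(html);
        while (m.find()) {
            String raw = m.group(2).trim();
            if (raw.isEmpty()) {
                continue;
            }
            try {
                links.add(normalize(base.resolve(raw).toString()));
            } catch (IllegalArgumentException e) {
                // The attribute value is not a valid URI; ignore it.
            }
        }
        return links;
    }
}
```

Normalizing the base before resolving matters: with a base such as http://x, whose path is empty, the relative link is otherwise joined without a separating slash.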

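Items 1, 2, and 5 can be sketched in a similar way. The following is only one possible arrangement; the class ResultRanking, the nested Hit class, and the sort key "links" are names invented for this example, and it assumes that the number of links in item 5 is the reference count from item 2. It counts how often each URL is referenced, sorts a copy of the results by relevance or by reference count, and formats one result as an HTML link.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative, not part of the assignment.
class ResultRanking {
    // One search result: its URL, its relevance score, and its reference count.
    static final class Hit {
        final String url;
        final double relevance;
        final int references;

        Hit(String url, double relevance, int references) {
            this.url = url;
            this.relevance = relevance;
            this.references = references;
        }
    }

    // Counts how many times each (already normalized) URL appears as a link target.
    static Map<String, Integer> countReferences(List<String> linkTargets) {
        Map<String, Integer> counts = new HashMap<>();
        for (String target : linkTargets) {
            counts.merge(target, 1, Integer::sum);
        }
        return counts;
    }

    // Returns a sorted copy, highest first; sortBy "links" uses the reference
    // count, and anything else falls back to relevance.
    static List<Hit> sorted(List<Hit> hits, String sortBy) {
        Comparator<Hit> order = "links".equals(sortBy)
                ? Comparator.comparingInt((Hit h) -> h.references).reversed()
                : Comparator.comparingDouble((Hit h) -> h.relevance).reversed();
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(order);
        return copy;
    }

    // Escapes the characters that are special in HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Formats one result as a list item whose URL is an actual link.
    static String toHtmlItem(Hit h) {
        String u = escape(h.url);
        return "<li><a href=\"" + u + "\">" + u + "</a> (relevance "
                + h.relevance + ", references " + h.references + ")</li>";
    }
}
```

Because the sort works on the results already in hand, the button on the results page only needs to pass the chosen sort key back to the servlet; no page has to be fetched again.
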
What to Turn In

Turn in a hard copy of any new classes and of any previous classes that you modified. You do not need to e-mail your code to me, as I will view it as a servlet.