Harvest Web Indexing

The Harvest Indexer offers a distributed solution to the problems of indexing data made available on the web. Each web server runs a local Gatherer that feeds into a central Broker, which avoids many of the problems of web crawling. The Harvest Indexer can fetch and index data made available by HTTP, Gopher, FTP, or NNTP. It has summarisers capable of indexing data in a wide variety of file formats.
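The Gatherer and Broker split can be illustrated with a small sketch. This is not Harvest's own code or API: the class names, the record layout, and the example hosts are all hypothetical, and the keyword extraction is a deliberately naive stand-in for real summarisers. It only aims to show the idea that each server summarises its own documents locally and a central component merges those summaries into one searchable index.

```python
from collections import defaultdict


class Gatherer:
    """Hypothetical stand-in for a Gatherer running beside one server."""

    def __init__(self, host, documents):
        self.host = host
        self.documents = documents  # maps path -> text content

    def summaries(self):
        # Emit one compact record per document instead of shipping full content.
        for path, text in self.documents.items():
            yield {
                "url": f"http://{self.host}{path}",
                "keywords": sorted(set(text.lower().split())),
            }


class Broker:
    """Hypothetical stand-in for a Broker that merges summaries into one index."""

    def __init__(self):
        self.index = defaultdict(set)  # keyword -> set of URLs

    def collect(self, gatherer):
        # Pull the summaries from one gatherer and fold them into the index.
        for record in gatherer.summaries():
            for word in record["keywords"]:
                self.index[word].add(record["url"])

    def search(self, word):
        # Look up a single keyword, case-insensitively.
        return sorted(self.index.get(word.lower(), set()))


if __name__ == "__main__":
    a = Gatherer("a.example.org", {"/index.html": "Harvest indexing notes"})
    b = Gatherer("b.example.org", {"/faq.txt": "Indexing FAQ"})
    broker = Broker()
    broker.collect(a)
    broker.collect(b)
    print(broker.search("indexing"))
```

In this sketch the central index never reads the documents themselves, only the summaries each server produces, which is the property that lets the distributed design sidestep repeated remote crawling.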

Harvest-NG is a reimplementation of the Harvest Gatherer in Perl. Details are available from its documentation site.

The Harvest Indexer is based upon software originally developed by the IRTF-RD as part of the ARPA-funded Harvest Project.