
May 10, 2005

GPO, LOC to Utilize Web Harvesting

"Government Printing Office officials, who have a significant role in preserving government information, want to capture fugitive publications, which are documents that federal agencies have published on the Web but for which no copy or record exists in GPO's database.

"To recover such documents for preservation, GPO officials are interested in new software technologies such as Web harvesting, and they are reviewing proposals from companies that make such software.

"Web harvesting, sometimes called crawling or spidering, is more than searching for and discovering information. Harvesting techniques are used for downloading code, images, documents and any files essential to reproduce a Web site after it has been taken down."

Aliya Sternstein. Fugitive Documents Elude Preservationists. FCW. May 9, 2005.
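
As a rough illustration of the harvesting step described in the quoted passage (collecting the code, images and documents a page depends on so the site can be reproduced later), the following is a minimal Python sketch. It is not GPO's or any vendor's method. The names `ResourceCollector` and `resources_to_harvest`, the `LINK_ATTRS` table and the `agency.example` host are all hypothetical, and the sketch only lists what a harvester might queue for download; it does not fetch anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag -> attribute pairs that commonly point at files a page needs.
# Illustrative only; a real harvester would cover many more cases.
LINK_ATTRS = {"a": "href", "link": "href", "img": "src", "script": "src"}


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of linked pages, images, scripts and stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        wanted = LINK_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page's own URL.
                url = urljoin(self.base_url, value)
                if url not in self.found:
                    self.found.append(url)


def resources_to_harvest(html, base_url):
    """Return the same-host URLs a harvester might queue for download."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return [u for u in collector.found if urlparse(u).netloc == host]
```

Restricting the result to the starting host is one simple way to keep a harvest scoped to a single agency site; a production crawler would also need to honor robots rules, throttle its requests and record capture dates.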

See also:
Susan M. Menke. GPO and its Collection of Last Resort. GCN.com. April 20, 2004.

SNTReport.com™ The Online Journal for Social Software, Digital Collaboration & Information Policy. A Seso Group™ Venture.

Posted by Carol Schwartz at May 10, 2005 08:32 AM | Topic(s): Digitization, E-Government, Libraries & Information Centers, Search & Data Mining