BackRub is a "web
crawler" which is designed to traverse the web.
Currently we are developing techniques to improve
web search engines. We will make various services available as soon as possible.
Sorry, many services are unavailable due to a local network failure beyond our control.
We are working to fix the problem and hope to be back up soon. 12/4/97
We have a demo that searches the titles of over 16 million URLs:
BackRub title search demo
BackRub search with comparison (type in top box, ignore cgi-bin error)
New systems will be coming soon.
Some documentation from a talk about the system is here.
BackRub is a research project of the Digital
Library Project in the Computer
Science Department at Stanford University.
Some Rough Statistics (from August 29th, 1996)
Total indexable HTML URLs: 75.2306 Million
Total content downloaded: 207.022 gigabytes
Total indexable HTML pages downloaded: 30.6255 Million
Total indexable HTML pages which have not been attempted yet: 30.6822 Million
Total robots.txt excluded: 0.224249 Million
Total socket or connection errors: 1.31841 Million
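
A few derived figures can be read off these numbers. The sketch below is only illustrative arithmetic on the statistics listed above. It assumes that a gigabyte means 10**9 bytes and that all of the downloaded content belongs to the downloaded HTML pages, neither of which is stated here, so the average page size is a rough estimate at best. The variable and function names are made up for this example.

```python
# Figures copied from the statistics above (counts in millions of URLs).
total_urls = 75.2306
downloaded = 30.6255
not_attempted = 30.6822
robots_excluded = 0.224249
connection_errors = 1.31841
content_gb = 207.022  # assumption: 1 gigabyte = 10**9 bytes


def share(part, whole=total_urls):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumption: every downloaded byte belongs to a downloaded HTML page.
avg_bytes_per_page = content_gb * 1e9 / (downloaded * 1e6)

# URLs covered by the four listed categories, and the remainder that
# the statistics do not break down further.
accounted = downloaded + not_attempted + robots_excluded + connection_errors
unaccounted = total_urls - accounted

print(f"Downloaded:            {share(downloaded):.1f}% of known URLs")
print(f"Not yet attempted:     {share(not_attempted):.1f}% of known URLs")
print(f"Excluded (robots.txt): {share(robots_excluded):.2f}% of known URLs")
print(f"Connection errors:     {share(connection_errors):.2f}% of known URLs")
print(f"Average page size:     {avg_bytes_per_page / 1000:.2f} KB (estimate)")
print(f"Not in any category:   {unaccounted:.2f} Million URLs")
```

The four listed categories do not add up to the total URL count. The statistics above do not say how the remaining URLs are classified.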
BackRub is written in Java and Python and runs on several Sun Ultras
and Intel Pentiums running Linux. The primary database is kept on a Sun
Ultra II with 28GB of disk. Scott Hassan and Alan Steremberg
have provided a great deal of very talented implementation help. Sergey Brin has also been very involved and deserves many thanks.
Before emailing, please read the FAQ. Thanks.
-Larry Page firstname.lastname@example.org