Alexa Crawl
 
The world's largest crawl and a massive archive are available

Imagine the entire contents of the world wide web... on disk.

The Alexa crawl lets you tap into the world's largest crawl index.

Massive Archive
Spanning seven years, filling over 500 Terabytes of online storage and expanding at a rate of 30 Terabytes per month, the Alexa archive represents the largest collection of Web information in the world today.

Largest Bimonthly Crawl
Compare Alexa with the largest search indexes and you'll see that Alexa is the largest: over 3.5 billion unique URLs and 3 billion unique pages, all updated every 60 days. All this can be yours.

Powerful Tools
To explore a body of information ten times the size of the Library of Congress, Alexa has developed a proprietary operating system and a powerful set of data mining tools that leverage excess processing capacity on hundreds of parallel computers.

Specialized Collections
Specialized collections of web data may be developed on request and, on a subscription basis, updated up to several times per day. A collection can serve as a one-off research resource or as a continuously updated collection for archivists and search engines.


Access Alexa's massive crawl of the web in one of the following ways:

Free
Alexa, in partnership with the Internet Archive, offers free access to an archive of Alexa's crawl going back to 1996, via the Internet Archive Wayback Machine. This unique service, the first of its kind, provides public access to over 10 billion archived web pages.

Special Collection - Hosted
Specialized archive collections can be made at a reasonable cost. Working with you, Alexa generates and maintains a custom index of web content, available through a web interface. This service is perfect for archivists or historians who would like to create a special collection of web documents available via the web. Example: September 11th Archive, commissioned by the Library of Congress.

Special Collection - Portable
When having a copy of the crawl at your location is the only option, Portable is for you. Alexa generates a special collection of archived documents, places it on disk and ships it to your location. Collections may be as small as a few hundred web pages or as large as several billion, depending on your needs.

Entire Web - Hosted
Alexa's entire crawl of the web can be made available to you on a subscription basis, with access to Alexa's specialized set of data mining tools. This product provides the maximum performance, access and update frequency.

Entire Web - Portable
For organizations capable of hosting or mining an entire crawl index that exceeds 60 Terabytes in size, Alexa can ship the contents of the crawl to your location. Current customers include the Internet Archive and the Library of Alexandria in Egypt.

Frequently Asked Questions

Q: How large is the crawl?
A: Very, very large. The crawl is over 60 Terabytes in size, spanning over 3.5 billion unique URLs. This is larger than Google's index, and approximately 4 times larger than AltaVista's published size.
 
Q: How often is the crawl updated?
A: The web-wide crawl takes approximately 2 months to complete. Special collections may be created on request and updated as often as needed.
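
For a sense of scale, the figures quoted above can be turned into rough averages. The sketch below is only back-of-the-envelope arithmetic, not an Alexa tool: the function names are hypothetical, a Terabyte is assumed to be 10^12 bytes, the crawl is assumed to run evenly over 60 days, and the inputs (60 Terabytes, 3.5 billion URLs) are the stated lower bounds.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_bytes_per_url(crawl_bytes: float, unique_urls: float) -> float:
    """Average stored bytes per unique URL (hypothetical helper)."""
    return crawl_bytes / unique_urls


def urls_per_second(unique_urls: float, crawl_days: float) -> float:
    """Average crawl rate, assuming an even pace over the crawl period."""
    return unique_urls / (crawl_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    crawl_bytes = 60e12   # "over 60 Terabytes", assuming 1 TB = 10**12 bytes
    unique_urls = 3.5e9   # "over 3.5 billion unique URLs"
    crawl_days = 60       # "approximately 2 months"

    print(f"{avg_bytes_per_url(crawl_bytes, unique_urls) / 1000:.1f} KB per URL")
    print(f"{urls_per_second(unique_urls, crawl_days):.0f} URLs per second")
```

Under these assumptions the crawl averages roughly 17 KB per URL and several hundred URLs fetched per second; the real figures depend on how the crawl is stored and scheduled, which the page does not describe.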
 

Contact:
Alexa Business Development
bizdev@alexa.com
fax: (415) 561-6795

Alexa Internet
www.alexa.com
Building 37
Presidio of San Francisco
PO Box 29141
San Francisco, CA
94129-0141


© 1996-2003, Alexa Internet, Inc.

An Amazon.com Company