So, just what makes AltaVista so fast?
The part of AltaVista that actually does the searching and finds the answers to queries runs on the most powerful computers that Digital Equipment Corporation makes: 16 AlphaServer 8400 5/440s (and still growing). Each has 8 GB of memory, about as much as 500 brand-new personal computers with 16 MB each would have combined. The RAID disk is actually a set of hard disks that work together, so if one of them fails, another can step right up and take over without the index crashing.
So, if AltaVista is that big, how come it's so fast when I submit a search query?
Think of it like this: even though the whole index of the Web is more than 200 GB, most queries are for a very small group of words, which take up less than 10 percent of everything AltaVista keeps stored. When you submit a query, you're really just accessing a small portion of AltaVista's index, so the question-and-answer part goes quickly: less than a second in most cases. Oh, and we use 64-bit addressing. That refers to the way the Alpha computers address memory: they can reach far more of it directly than a 32-bit machine can, which lets them work through very large amounts of data very efficiently.
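The figures quoted in these answers can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation in Python. The variable names are ours, it assumes 1 GB = 1024 MB, it treats "more than 200 GB" and "less than 10 percent" as round bounds, and it says nothing about how the index is actually laid out across the machines.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: 1 GB = 1024 MB. All names are illustrative only.
MB_PER_GB = 1024

servers = 16                # AlphaServer 8400 5/440 machines
memory_per_server_gb = 8    # memory in each server
pc_count = 500              # personal computers in the comparison
pc_memory_mb = 16           # memory in each personal computer
index_size_gb = 200         # "more than 200 GB", taken as a lower bound
hot_percent = 10            # "less than 10 percent", taken as an upper bound

# Combined memory of the 500 personal computers, in GB.
pcs_combined_gb = pc_count * pc_memory_mb / MB_PER_GB

# Memory across all 16 servers, in GB.
total_server_memory_gb = servers * memory_per_server_gb

# Upper bound on the frequently queried part of a 200 GB index, in GB.
hot_index_gb = index_size_gb * hot_percent / 100

print(f"500 PCs combined:       {pcs_combined_gb:.2f} GB")
print(f"Memory on all servers:  {total_server_memory_gb} GB")
print(f"Frequently queried part: under {hot_index_gb:.0f} GB")
```

So 500 such PCs come to a little under 8 GB, which is why one server's memory is "about as much", and the frequently queried portion of a 200 GB index works out to under 20 GB.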
So, why can I access AltaVista faster than other search services?
It's the network, of course. AltaVista has 100-megabit-per-second access to the Internet through Digital Equipment Corporation's Palo Alto gateway, the best-connected corporate Internet gateway in the world.
Where does the index come from? How do you decide what information gets indexed?
The AltaVista index is created by our Web spider, called Scooter (isn't he cute?), which roams the Web collecting Web pages, approximately six million per day. Scooter then takes the pages back to AltaVista and gives them to our NI2 indexing software, which indexes each word from every page. The index saves each instance of each word, including instances with different capitalization, along with the URL of the page on which it appears and some information about its location in that document. The index software also indexes words with non-Latin characters using English-equivalent letters. These details are what allow you to search for individual words, for phrases in which word order is essential, and for words or phrases with specific capitalization or accents.
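To make the idea concrete, here is a minimal sketch in Python of an index that stores, for each word, the URL, the word's position on the page, and the exact form in which it was written. It is not the NI2 software, whose internals are not described here. The class `TinyIndex`, the helper `fold`, the `exact` flag, and the example URLs are all hypothetical, and the accent folding relies on Unicode decomposition as a stand-in for "English-equivalent letters".

```python
import unicodedata
from collections import defaultdict


def fold(word):
    """Lowercase a word and strip its accents (assumed folding rule)."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


class TinyIndex:
    """Toy positional index; illustrative only, not AltaVista's design."""

    def __init__(self):
        # folded word -> list of (url, position, form as written on the page)
        self.postings = defaultdict(list)

    def add_page(self, url, text):
        for position, raw in enumerate(text.split()):
            word = raw.strip(".,;:!?\"'()")
            if word:
                self.postings[fold(word)].append((url, position, word))

    def search_word(self, word, exact=False):
        """URLs containing the word; exact=True also matches case and accents."""
        hits = self.postings.get(fold(word), [])
        return {url for url, _, form in hits if not exact or form == word}

    def search_phrase(self, phrase):
        """URLs where the words appear next to each other, in this order."""
        words = [fold(w) for w in phrase.split()]
        if not words:
            return set()
        places = [
            {(url, pos) for url, pos, _ in self.postings.get(w, [])}
            for w in words
        ]
        return {
            url
            for url, pos in places[0]
            if all((url, pos + i) in places[i] for i in range(1, len(words)))
        }


index = TinyIndex()
index.add_page("http://a.example/", "The Digital spider roams the Web.")
index.add_page("http://b.example/", "A digital café on the web roams")

print(index.search_word("digital"))              # both pages
print(index.search_word("Digital", exact=True))  # only a.example
print(index.search_word("cafe"))                 # only b.example
print(index.search_phrase("roams the web"))      # only a.example
```

Storing the position is what makes the phrase search possible: a page matches only if each word of the phrase sits exactly one position after the previous one. Keeping the original form beside the folded key is what lets the same index answer both loose searches and searches that insist on particular capitalization or accents.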