Intranet Journal Earthweb

Why is searching the Web like rolling dice? Feature
Improving Information Retrieval with Human Indexing

Special to Intranet Design
By Kevin Broccoli, Broccoli Information Management


As company intranets grow in content, it becomes increasingly difficult to find the exact information that an intranet user may be looking for. Companies have traditionally used search engines to locate information on their intranets. However, many are finding that search engines (even the newer, so-called "intelligent" ones) are just not enough.

For example, perhaps you are looking up information on a particular subject. You type the word into the search engine interface and click "GO!" Within seconds you have a list of retrieved documents. But there are 87 of them! And there is little indication as to which document might be the one that you need. You have a choice of clicking on some of the entries in the hope that the needed information will be within the first few documents, or spending literally hours combing through each one of them.

Why is it so difficult to find what you need?

Intelligent - Not!

One reason is the way search engines operate. Generally, a search engine looks for every occurrence of the word typed into the search interface, then lists each and every document containing that word. However, some of those documents may mention the topic only in passing and offer no information of real value.
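To make this concrete, here is a minimal Python sketch of that kind of literal matching. The document names and texts are invented for illustration, and real search engines are considerably more elaborate; the point is only that a passing mention counts as a hit just as much as a document devoted to the topic.

```python
import re

# Hypothetical intranet documents (names and text invented for illustration).
DOCUMENTS = {
    "claims-handbook.txt": "How to process a theft claim. Theft claims require a police report.",
    "cafeteria-menu.txt": "Reminder: report any theft of lunches from the shared fridge.",
    "holiday-schedule.txt": "The office is closed on public holidays.",
}


def full_text_search(word, documents):
    """Return the name of every document that contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [name for name, text in documents.items() if pattern.search(text)]


hits = full_text_search("theft", DOCUMENTS)
print(hits)
```

Both the claims handbook and the cafeteria notice come back, with nothing to tell the user which one is actually about theft claims.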

Also, you may be searching for more specific information on the topic, but not be sure how to narrow your search. Or perhaps the documents use certain words or phrases, and although you are typing in synonymous terms, they are not the exact terms needed. Or perhaps a word is simply misspelled.

A search engine, like other computer automata, can't allow for such errors.

For example, if you work in an insurance company, you may be looking for information regarding "theft." Some of the documents use this precise word, so the search engine grabs those pages. But it does not retrieve any of the pages using the term "robbery" or "thievery." You may not even understand why the search engine retrieved certain documents. In many instances, only the title of the document is listed, which doesn't tell you much.
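The "theft" scenario can be sketched in a few lines of Python. The page names and texts below are hypothetical, and the matching is deliberately simplistic; it is meant only to show how pages that use a synonym are silently left out.

```python
# Hypothetical page texts (invented for illustration).
PAGES = {
    "policy-coverage.txt": "This policy covers theft of insured property.",
    "claims-faq.txt": "After a robbery, contact your agent within 48 hours.",
    "fraud-unit.txt": "Thievery by employees is investigated by the fraud unit.",
}


def literal_search(term, pages):
    """Return pages whose text contains `term` exactly (case-insensitive)."""
    term = term.lower()
    return [name for name, text in pages.items() if term in text.lower()]


found = literal_search("theft", PAGES)
missed = [name for name in PAGES if name not in found]
print("retrieved:", found)
print("missed:", missed)
```

Only the first page is retrieved. The other two are about the same subject, but nothing in the search engine knows that.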

One way of improving the relevance of search results is to look for keywords that can be inserted as "metadata" within the pages of the intranet. This is one of the promises of Extensible Markup Language (XML), which (among other things) lets authors tag pages precisely so users can more easily find them.
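As a rough illustration of keyword metadata, the Python sketch below reads the terms an author has placed in a page's keywords tag. It uses an ordinary HTML `<meta name="keywords">` tag as a stand-in (an assumption on our part; an XML vocabulary would serve the same purpose), and the sample page is invented.

```python
from html.parser import HTMLParser


class KeywordParser(HTMLParser):
    """Collect the terms listed in a page's <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords += [k.strip().lower() for k in content.split(",") if k.strip()]


# Hypothetical page: the body says "robbery", but the author also tagged it "theft".
PAGE = """<html><head>
<title>Claims FAQ</title>
<meta name="keywords" content="theft, robbery, claims">
</head><body>After a robbery, contact your agent.</body></html>"""

parser = KeywordParser()
parser.feed(PAGE)
print(parser.keywords)
print("theft" in parser.keywords)
```

A search that consults these keywords would find the page under "theft" even though the word never appears in the body text, provided the author took the trouble to add the tag.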

But metadata is no panacea. For one thing, the user may still be unsure how to narrow a search, resulting in an overabundance of irrelevant hits. Moreover, word processing tools like Microsoft Word have long given authors the ability to add metadata to documents. Yet how many times have you filled in those Summary Info fields? Any information retrieval scheme that relies on people to categorize their ideas will at best be limited, and at worst may interfere with the creation of intellectual capital.

Indexes and Outlines

What, then, is the solution to the above-mentioned problems? Simply put, an index with main headings and subcategories. Users are instantly aided in narrowing down their search by choosing from the available subcategories. The interface of an index is also quite familiar to everyone, having been employed in the back of reference and trade books practically since the beginning of print.
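One simple way to picture such an index is as a two-level lookup table, sketched here in Python. The headings, subcategories, and page names are all hypothetical; in practice a human indexer would decide what goes where.

```python
# Hypothetical hand-built index: main heading -> subcategory -> pages.
INDEX = {
    "theft": {
        "auto": ["auto-policy.html", "claims-faq.html"],
        "employee": ["fraud-unit.html"],
        "reporting a claim": ["claims-handbook.html"],
    },
    "vacation": {
        "requesting leave": ["hr-forms.html"],
    },
}


def subcategories(heading, index):
    """List the subcategories a user can choose from under a main heading."""
    return sorted(index[heading])


def pages_for(heading, subcategory, index):
    """Return the pages the indexer filed under heading > subcategory."""
    return index[heading][subcategory]


print(subcategories("theft", INDEX))
print(pages_for("theft", "employee", INDEX))
```

Instead of a flat list of every page that mentions theft, the user sees three subcategories and picks the one that matches the question at hand.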

Even in online help for software programs, there is always a table of contents and an index. Software producers would never consider having ONLY a search engine available to find information within their help files. Instead, they know that users need to find relevant information right away, and that they will need better navigation cues than full-text searching can provide.
