(Last update: 07 Jan '03)

INEX: Initiative for the Evaluation of XML Retrieval


April 2002 - December 2002

Overview

The widespread use of XML in digital libraries, product catalogues, scientific data repositories and across the Web has prompted the development of appropriate methods for searching and browsing XML documents. As part of a large-scale effort to improve the efficiency of research in information retrieval and digital libraries, the INEX initiative, organised by the DELOS Network of Excellence for Digital Libraries, is an international, coordinated effort to promote evaluation procedures for content-based XML retrieval. The project gives participants an opportunity to evaluate their retrieval methods using uniform scoring procedures, and provides a forum in which participating organizations can compare their results.

The aim of this initiative is to provide the means, in the form of a large testbed (test collection) and appropriate scoring methods, for evaluating the retrieval of XML documents. The test collection will consist of XML documents, tasks/queries, and relevance judgments given by (potential) users with these tasks. Participating organizations contribute to the construction of this test collection in a collaborative effort to derive the tasks/queries and relevance judgments for a large collection of XML documents. The test collection will also provide participants with a means for future comparative and quantitative experiments. Due to copyright restrictions, only participating organizations will have access to the constructed test collection.

Based on the constructed test collection and uniform scoring procedures, the retrieval methods of the participating organizations will be evaluated and compared against each other. Participants will present their approaches and final results at the final workshop in December. All results will be published in the workshop proceedings and on the Web.

Documents

The initiative is supported by the IEEE Computer Society. The document set for the test collection is made up of scientific articles from journals and proceedings of the IEEE Computer Society, covering a range of topics in the field of computer science. The collection contains approximately 10-15 thousand articles from over 20 different journals/proceedings published over the seven-year period 1995-2001.

Topics/Queries

The topics/queries are created by the participating groups and can be of two types: content-only or content-and-structure. They may be broad or narrow and should form a representative range of real user needs over the XML collection. Each group creates a set of candidate topics/queries, from which a final set of approximately 50 topics is selected for the test collection.
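To illustrate the distinction, a content-only topic is a plain statement of an information need, while a content-and-structure topic additionally constrains which XML elements should be returned. The examples below are hypothetical and do not reproduce the official INEX topic format; the second is written in an XPath-like notation for illustration only:

```
Content-only:           XML retrieval evaluation test collections
Content-and-structure:  //article[contains(., 'XML retrieval')]
                          //sec[contains(., 'evaluation')]
```

The content-and-structure query asks for sections about evaluation within articles about XML retrieval, rather than whole documents.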

Tasks

The task to be performed with the data and the final topics/queries is the ad-hoc retrieval of XML documents. The answer to a query is a ranked list of XML elements, the top 100 of which are submitted as the retrieval result. The retrieval results of the participating groups are then pooled and returned to the topic authors for relevance assessment.
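The pooling step can be sketched as follows. This is an illustrative Python sketch under assumed data shapes (each run represented as a ranked list of (document, element-path) pairs), not the actual INEX submission format or pooling software:

```python
def pool_runs(runs, depth=100):
    """Merge the top `depth` elements of every run into a single
    deduplicated pool for relevance assessment.

    Each run is a ranked list of (document, element_path) pairs;
    the names and shapes here are illustrative assumptions.
    """
    pool = []
    seen = set()
    for run in runs:
        for element in run[:depth]:
            if element not in seen:
                seen.add(element)
                pool.append(element)
    return pool

# Two hypothetical runs from different participating groups:
run_a = [("article1.xml", "/article/sec[1]"), ("article2.xml", "/article")]
run_b = [("article2.xml", "/article"), ("article3.xml", "/article/sec[2]")]

print(pool_runs([run_a, run_b]))
```

Pooling means assessors judge only the union of the top-ranked elements across all submitted runs, which keeps the assessment effort manageable for a large collection.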

Relevance assessments

Relevance judgments are of critical importance to a test collection. Relevance assessments will be provided by the participating groups and should be made by the person who originally created the topic. Each assessor will judge approximately 3-4 topics: either the topics that they originally created or, if these were removed from the final set of topics, topics close to their original queries.

Evaluation

Evaluation of the retrieval effectiveness of the participants' retrieval engines will be based on the constructed test collection and uniform scoring techniques, including recall/precision measures that take into account the structural nature of XML documents. Other measures, which credit "near misses" (cases where the retrieved element lies near one that has been assessed as relevant), will also be used.
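For reference, the classical flat-document definitions that the structural measures generalize are the following; how INEX adapts them to overlapping and nested XML elements is determined by the initiative's scoring procedures, not shown here:

```latex
\mathrm{precision} = \frac{|\,\mathrm{relevant} \cap \mathrm{retrieved}\,|}{|\,\mathrm{retrieved}\,|},
\qquad
\mathrm{recall} = \frac{|\,\mathrm{relevant} \cap \mathrm{retrieved}\,|}{|\,\mathrm{relevant}\,|}
```

In the XML setting the unit of retrieval is an element rather than a whole document, which is why measures crediting near misses are needed alongside these standard definitions.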

The results will be returned to all participants. Participating organizations will present their approaches and compare their results at the workshop in December. All results will be published in the proceedings of the workshop and on the Web.