Multimedia Information Retrieval Reducing information OveRload

Like its real-life equivalent, our miRRor shows you similar objects

Index: research statement, introduction, research topics, approach, publications, other information.


In this PhD project, we study multimedia query processing, and in particular its implications for database design. We assume a modern extensible database system such as Illustra or Monet. By extending the database, new representations of the multimedia data can be used, and advanced search techniques can be incorporated in the database architecture.

From a user perspective, the main unsolved problem is how to make use of these different representations and techniques to fulfill an information need. We propose that a multimedia query processor must provide an iterative query process using relevance feedback. Also, the query processor must identify which of the available representations are most promising for answering the query. In addition, it should combine evidence from different sources.

Recently, we have started to design and implement a prototype database system that can provide this functionality to the user. In particular, we focus on information retrieval using Bayesian reasoning over a concept space of automatically generated clusters.


Information superhighways are hyped by the media. However, these worldwide computer networks are really nothing more than data highways. Television and radio channels blast news reports, documentaries, and talk shows through the air. Thousands of magazines and papers are printed all over the world. The amount of data around us is so huge that it has become impossible to deal with in an efficient manner. People call this the problem of information overload.

This research program focuses on the application of database technology to multimedia data. Ideally, hundreds of television and radio broadcasts would be covered by a database application. This database system can notify its clients whenever interesting data becomes available.


Most information systems that claim to be multimedia databases are no more than huge collections of data such as video, audio, text and images. The only query facility provided uses manually added textual descriptions of these multimedia data.

We believe that true multimedia database systems provide content-based retrieval facilities. Multimedia objects in such a database are first-class citizens. It should be possible to formulate queries referring to several types of data simultaneously. For example, a user may be interested in web pages about inference networks containing a photograph of Bayes.

Full-text retrieval systems have been developed since the early sixties. Many ideas from this field can be applied to multimedia data relatively easily. Text retrieval systems were the first information systems to deal with approximate queries, using similarity between documents to drive the retrieval process. Unfortunately, full-text retrieval systems are never integrated into database systems. Therefore, one of the purposes of this project is to integrate information retrieval techniques into the traditional database environment.

Querying a multimedia database requires new querying strategies. Multimedia queries are hard to formulate explicitly. For example, try to explain what music you like. Often, it is far easier to show an example document and have the system identify similar documents. This querying strategy is better known as query-by-example. The concept of relevance feedback plays an important role here: the user judges the returned documents, and the system uses these judgements to refine the query. A somewhat related strategy for finding interesting documents in a multimedia database is the navigational querying paradigm. Similar documents are grouped together, and by wandering through the search space you can find answers to your queries. One of the research questions to be addressed in this project is whether such querying paradigms are really beneficial to the users.
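To make query-by-example with relevance feedback more concrete, here is a minimal sketch in Python. It is not taken from the Mirror prototype. It assumes that every document has already been reduced to a small numeric feature vector, it ranks by cosine similarity, and it refines the query with a Rocchio-style update, a classic technique from text retrieval that we swap in here purely for illustration. The document names, the feature values, and the weights alpha, beta and gamma are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_example(example, collection):
    """Return document ids ordered from most to least similar to the example."""
    scored = [(cosine(example, vec), doc_id) for doc_id, vec in collection.items()]
    # Sort by descending similarity; break ties on the document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def refine_query(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style update: move the query toward the relevant examples
    and away from the non-relevant ones (weights are assumed values)."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    pos = centroid(relevant)
    neg = centroid(non_relevant)
    # Negative feature weights are clipped to zero.
    return [max(0.0, alpha * q + beta * p - gamma * n)
            for q, p, n in zip(query, pos, neg)]


# Hypothetical collection: three made-up audio features per document.
collection = {
    "jazz_a": [0.9, 0.1, 0.0],
    "jazz_b": [0.8, 0.3, 0.1],
    "rock_a": [0.1, 0.9, 0.2],
    "rock_b": [0.4, 0.8, 0.1],
}

example = [0.7, 0.5, 0.0]  # the document the user shows to the system
first = rank_by_example(example, collection)

# The user marks jazz_b as relevant and rock_b as not relevant.
refined = refine_query(example, [collection["jazz_b"]], [collection["rock_b"]])
second = rank_by_example(refined, collection)
```

In this toy collection, one round of feedback moves jazz_a ahead of rock_b in the ranking. Whether such an iterative process really helps users of a multimedia database is exactly the open question raised above.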


This project is still in progress. Several steps have already been taken. This section is an attempt to reflect the progress of the research project. However, some of the focus has developed in a slightly different direction over the last three years. The most recent information can be found in the summary of what should be my thesis after another year of writing, research, and development. The core ideas about multimedia retrieval are still reflected in the other subsections of this section.


VLDB99 Demo Webpages

We gave a demo of the Mirror DBMS (using an image retrieval application) at VLDB '99. A tour of this demo is now available (to be included on the ACM SIGMOD CD-ROM).



We chose to integrate IR and database technology to address the problems introduced above. The research topics concerning the black box in the middle of the Mirror architecture form the common thread running through my PhD work. One of the hypotheses I try to test in my work is that the techniques from the next subsection can provide the functionality to fill the black box.


Bayesian inference networks

The information retrieval system INQUERY uses Bayesian inference networks to find documents fulfilling an information need. This approach seems very suitable for describing retrieval processes in multimedia databases. However, the INQUERY system is a dedicated system. We investigate whether the inference process can be expressed as database queries.
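As a rough illustration of what expressing the inference process as database queries could look like, the sketch below stores per-document beliefs in a relational table and combines them with a single SQL query, using SQLite from the Python standard library. It is an assumption-laden toy, not the Mirror DBMS and not INQUERY's implementation. The table layout, the concept names (one textual, one visual), the belief values and the default belief of 0.4 for missing evidence are all hypothetical. The combination rule is a plain average of the beliefs in the query concepts, in the spirit of the sum-style operators used in inference network retrieval models.

```python
import sqlite3

DEFAULT_BELIEF = 0.4  # assumed belief when a document has no evidence for a concept


def build_database():
    """Create an in-memory database with hand-picked, hypothetical beliefs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE belief (doc TEXT, term TEXT, p REAL)")
    conn.executemany(
        "INSERT INTO belief VALUES (?, ?, ?)",
        [
            ("page1", "inference", 0.8),    # evidence from the text
            ("page1", "bayes_photo", 0.7),  # evidence from an image
            ("page2", "inference", 0.9),
            ("page3", "bayes_photo", 0.6),
        ],
    )
    return conn


def rank_documents(conn, query_terms):
    """Rank documents by the mean belief over the query concepts, computed in SQL."""
    conn.execute("DROP TABLE IF EXISTS query_terms")
    conn.execute("CREATE TEMP TABLE query_terms (term TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO query_terms VALUES (?)",
        [(term,) for term in query_terms],
    )
    # Pair every document with every query concept, fall back to the default
    # belief where no row exists, then average per document.
    return conn.execute(
        """
        SELECT d.doc, AVG(COALESCE(b.p, ?)) AS score
        FROM (SELECT DISTINCT doc FROM belief) AS d
        CROSS JOIN query_terms AS q
        LEFT JOIN belief AS b ON b.doc = d.doc AND b.term = q.term
        GROUP BY d.doc
        ORDER BY score DESC, d.doc
        """,
        (DEFAULT_BELIEF,),
    ).fetchall()


conn = build_database()
ranking = rank_documents(conn, ["inference", "bayes_photo"])
```

With these made-up beliefs, page1 ranks first because it has evidence from both the text and the image representation. A real system would derive the beliefs from the data itself and would need further operators than this single average.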


Student projects

Several groups of students work on aspects of the MIRROR research topics. We provide an overview of the student activities on a separate web page. Projects include research into advanced indexing techniques, audio retrieval and the design of television for the future.



Grant award

We are proud to mention that we participate in the Informix Engines for Innovation Research Grant Program. We investigate possible advantages of extensible database technology for our projects and try to identify requirements for further refinement.
Last updated: $Id: mmdb.html,v 1.26 1999/08/27 20:58:05 arjen Exp $