World events according to Wikipedia statistics

Mathias Schindler

In December 2007, Wikimedia released hourly request statistics for its articles, thanks to the work of Domas. Before that, we ran a few experiments to get similar data by counting every n-th request (de: every 500th, en: every 3000th). The figures currently published at dammit.lt/wikistats are an invaluable source for research. The talk will describe the current sources of request numbers and methods for making use of the data.
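
As a rough illustration of the earlier sampling approach: if only every n-th request is counted, the total can be estimated by multiplying the sampled count by n. The following is a minimal Python sketch under that assumption. The function name and the example hit counts are hypothetical; only the sampling rates (500 for de, 3000 for en) come from the text above.

```python
# Sampling rates mentioned in the text: every 500th request (de),
# every 3000th request (en).
SAMPLING_RATE = {"de": 500, "en": 3000}


def estimate_requests(project, sampled_hits):
    """Scale a sampled hit count up to an estimated total request count."""
    return sampled_hits * SAMPLING_RATE[project]


# Hypothetical example: an article that shows up 12 times in the sample.
print(estimate_requests("de", 12))  # 6000
print(estimate_requests("en", 12))  # 36000
```

Note that such an estimate is coarse for rarely requested articles: with a rate of 3000, a page requested a few hundred times may not appear in the sample at all.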

While Wikipedia is a project to create a multilingual encyclopedia, its rather unusual editorial process gives it great flexibility in dealing with breaking news and global events. Almost immediately after events have hit the international media (and sometimes even before that), a group of authors updates the articles and keeps track of further developments. At the same time, the moment people hear the news, they start to access Wikipedia articles in search of background information and a precise summary of what has happened. We can monitor both: all edits are stored in the version history in MediaWiki, and all HTTP requests for pages are counted by our Squid servers. Assuming there is a connection between global or large regional events and the number of requests for specific Wikipedia pages, we can start to ask and answer questions about this connection. And we can invert the relationship: can we get a better understanding of what is happening in the world right now just by looking at our request numbers?
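
One simple way to approach the inverted question is to flag hours in which an article's request count jumps far above its recent baseline. The sketch below is only an illustration of that idea, not the method used in the talk: the function name, the 24-hour window, the factor of 5, and the sample counts are all assumptions made for the example.

```python
from statistics import median


def find_spikes(hourly_counts, window=24, factor=5.0):
    """Return indices of hours whose request count exceeds `factor`
    times the median of the preceding `window` hours."""
    spikes = []
    for i in range(window, len(hourly_counts)):
        baseline = median(hourly_counts[i - window:i])
        if baseline > 0 and hourly_counts[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical hourly request counts for one article:
# a quiet day followed by a sudden burst of interest.
counts = [100] * 24 + [110, 950, 1200, 130]
print(find_spikes(counts))  # [25, 26]
```

The median is used as the baseline so that a burst does not immediately inflate the reference level for the following hours. Real request data also has daily and weekly cycles, which a fixed threshold like this does not account for.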