CLEANEVAL is a shared task and competitive evaluation on cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus in linguistic and language technology research and development.

New as at 14 June 2007: COMPETITION OPEN until 13 July; instructions for getting data are available.

Prizes! A prize of GBP 250.00 will be awarded to the best student entrant for each task (Chinese and English). Anyone registered as a student at a university (at undergraduate, Masters, or PhD level) counts as a student. While we shall not ask you to provide proof of student status at the time of entering, we shall verify the student status of the winners before awarding prizes.

CLEANEVAL is an activity of ACL-SIGWAC, the Association for Computational Linguistics (ACL) Special Interest Group on Web as Corpus.
