2005 Machine Translation Evaluation


Task Name

2005 Machine Translation (MT)

Evaluation Series (web-site)

Machine Translation

Date of Latest Evaluation

Evaluation Period: May 9-13, 2005
Evaluation Workshop: June 2005

NIST Point-of-Contact

E-mail for more information

Sponsoring Program

DARPA Translingual Information Detection Extraction Summarization (TIDES) program

Related Tasks

Task Description

The objective of the MT evaluation series is to develop technologies that convert free text from a variety of languages into English.

Data Domains/Sources

The MT-05 evaluation covered two source languages (Arabic and Chinese) and one target language (English).

Both the Arabic and Chinese MT-05 evaluation test sets included 100 newswire documents.

Evaluation Metric

Translations were scored automatically using the BLEU metric as originally defined by IBM and described in Papineni, Roukos, Ward, and Zhu (2001), "BLEU: a Method for Automatic Evaluation of Machine Translation" (IBM research report, keyword = RC22176).

The BLEU metric scores translation quality on a scale of 0 to 1, with 1 being the best.
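For illustration, here is a minimal sentence-level sketch of the BLEU computation (modified n-gram precision up to 4-grams, clipped against the references, combined by a geometric mean and a brevity penalty). This is a simplified teaching example, not NIST's official mteval scoring script, which operates on whole document sets:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: candidate is a token list, references a list of token lists."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            return 0.0  # geometric mean is zero if any precision is zero
        precisions.append(clipped / total)
    # Brevity penalty against the closest reference length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A candidate identical to a reference scores 1.0, while partial overlap yields a fractional score, matching the 0-to-1 scale described above.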

Best 2005 Evaluation Score

Arabic-to-English: BLEU score = .5137
Chinese-to-English: BLEU score = .3531

Best Score Ever

The MT-05 scores listed above are the highest achieved in the evaluation series to date.

Evaluation Workshop Presentations/Papers

The results are published on the NIST MT web-site.



Speech Group

Created: 17-Dec-2004
Last updated: 03-Aug-2005