"Translation Memory" - Global, Free Repository?

Posted 25 Mar 2002 at 09:23 UTC by davidw

Upon reading an article in The Economist regarding machine translation, I began to see that there might be some potential for open collaboration in the field known as "Translation Memory", or, in other words, the use of already translated phrases to translate new, slightly different phrases.
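To make the idea a bit more concrete, here is a minimal Python sketch of a translation memory lookup. It is only an illustration under assumptions: the `MEMORY` contents, the `suggest` function, and the 0.7 threshold are all made up for this example, and the standard library's `difflib` stands in for whatever matching a real tool would use.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source phrase -> stored translation.
# The entries are invented for illustration only.
MEMORY = {
    "Open the file": "Apri il file",
    "Save the file": "Salva il file",
    "Close the window": "Chiudi la finestra",
}


def similarity(a, b):
    """Return a 0.0-1.0 similarity ratio between two phrases, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest(phrase, memory=MEMORY, threshold=0.7):
    """Return (source, translation, score) for the closest stored phrase.

    Returns None when nothing in the memory scores at or above the
    threshold (an assumed cut-off, not a recommended value).
    """
    best = None
    for source, translation in memory.items():
        score = similarity(phrase, source)
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best
```

A translator working on "Open the files" would be offered the stored translation of "Open the file" as a starting point, while a phrase with nothing similar in the memory yields no suggestion at all.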

I haven't done professional translations for a couple of years, and I never did anything that expensive or high-level, so I don't know what the state of the art is, even in the free software world. It appears as though there are some efforts to pool resources, such as the KBabel features listed here:

I wonder, though, if there might be even more benefits to be had by pooling resources even further, maybe putting the phrases in the 'compendium' in the public domain, so that licensing wouldn't be a problem. Perhaps it could also be combined with a tag or trust metric system, in order to let the translator choose who to trust or not trust as a provider of previously translated phrases to use when munging documentation.
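As a rough sketch of how trust might be combined with a shared compendium, the Python below filters stored translations by who contributed them. Everything here is hypothetical: the `Entry` record, the contributor names, the flat per-contributor scores in `TRUST`, and the 0.5 cut-off are assumptions for illustration, and a flat score table is a much simpler stand-in than a real trust metric.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One compendium record (hypothetical layout)."""
    source: str
    translation: str
    contributor: str


# Hypothetical trust scores assigned by the translator, from 0.0 to 1.0.
TRUST = {"alice": 0.9, "bob": 0.4}

# Invented sample data: two contributors translated the same phrase.
ENTRIES = [
    Entry("Save the file", "Salva il file", "alice"),
    Entry("Save the file", "Salvare il archivio", "bob"),
]


def trusted_translations(source, entries, trust, min_trust=0.5):
    """Return translations of `source`, most trusted contributor first.

    Contributors missing from `trust` count as 0.0, and entries whose
    contributor scores below `min_trust` are dropped.
    """
    hits = [
        e for e in entries
        if e.source == source and trust.get(e.contributor, 0.0) >= min_trust
    ]
    hits.sort(key=lambda e: trust.get(e.contributor, 0.0), reverse=True)
    return [e.translation for e in hits]
```

With the default cut-off only the translation from the more trusted contributor is offered; lowering `min_trust` to 0.0 returns both, ordered by trust, so the translator still decides whom to rely on.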

Prior to writing this article, I did a bit of reading on the web, and found that there are tools out there... ForeignDesk is an open source (albeit for Windows) translation tool. I have no idea about the quality, but it looks interesting. There is also a DTD, which the aforementioned program supports, for exchanging translation memory data. Of course, as mentioned above, KBabel looks like it has a lot of features, too...

So - what do you think? Interesting idea? Too much of a niche? Am I missing something obvious because of my lack of knowledge in this field? I hope that this article has provided some food for thought - I know that the idea in the original article in The Economist certainly piqued my interest.

Irony, posted 25 Mar 2002 at 09:53 UTC by chalst » (Master)

I think this is an excellent idea, and if you can get the trust metric idea to work it might be pathbreaking.

In another vein, the Economist article reminded me of something I find ironic. There was a big growth in funding in the 1960s due to excitement about MT (machine translation), and one of the ideas that pushed MT was the Vietnam War: deciphering intercepted Viet Cong messages was obviously of some interest to the military, and there were many people at the time bullish about the prospect of taking a language analysis (roughly, dictionary plus grammar) of a foreign language and mapping it onto English in some automatable way.

The principal beneficiaries of the wave of funding that came in were Chomsky and his followers, since their approach to language emphasises the syntax of language above its phonology, semantics and pragmatics, and the success of the above approach to MT depends upon translation being reducible to a problem of syntax. Nowadays most people think this is too simplistic, but Chomsky et al. are now 'in charge' of American linguistics, due to funding they received because of America's war in Vietnam. In view of Chomsky's political views, I find that rather ironic.

Shared translations for Open Source, posted 3 Apr 2002 at 02:57 UTC by superant » (Journeyer)

I think this would be a great help for Open Source projects. There are thousands of projects that need help with translation of menus, error messages and documentation. Common repositories would be a great help for the overworked programmers.

FreeCATS, posted 23 Jan 2003 at 14:41 UTC by chalst » (Master)

Have a look at FreeCATS...
