News Item: Machine Translation and Translation Memories
(Category: Issues in Linguistics)
Posted by admin
Friday 01 February 2008 - 09:58:51

Machine Translation and Translation Memories
 Machine Translation in its present form and implementation is not a blessing, but a curse. It benefits only those who take part in designing and producing products heavily sponsored by naive financiers.
MT is nothing but the use of paired sentences, one in the source language (L1) and the other in the target language (L2). The pairs are linked by numeric identifiers, and a match is assessed by a statistical analysis of characters, that is, a morphological analysis of the text. This works on any text; the subject matter is irrelevant.
 Now this solution in its basic form has been around for a long time. It is called EDI, or Electronic Data Interchange. The main difference is that in EDI you get an exact translation of what you have on both sides, with no messing around with texts that you do not understand, because everybody sees the text in his or her native or original language. In MT, however, you tend to have an unqualified translator, a simple matchmaker of instances from a bilingual paired corpus, whose job, when the match is not 100%, is to check whether modifying the new segment yields a perfect match. Since the corpus may be homogeneous, you seem to be working efficiently, producing more „translation” than otherwise.
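 To make the match-making concrete, here is a minimal sketch of it in Python. It is not how any particular MT or translation memory product works: the memory contents, the placeholder L2 strings and the function name best_match are made up for illustration, and difflib's character-level similarity ratio merely stands in for whatever statistical analysis a real tool applies.

```python
from difflib import SequenceMatcher

# Hypothetical memory: each stored L1 segment is paired with its L2 segment.
# The L2 side is a placeholder; only the L1 side takes part in the matching.
MEMORY = [
    ("Add 120 ml of the solution.", "<L2 segment 1>"),
    ("Store the bottle in a cool place.", "<L2 segment 2>"),
]


def best_match(new_segment, memory):
    """Return (score, l1, l2) for the stored pair most similar to new_segment."""
    scored = [
        (SequenceMatcher(None, new_segment, l1).ratio(), l1, l2)
        for l1, l2 in memory
    ]
    # Highest character-level similarity wins; subject matter plays no role.
    return max(scored, key=lambda item: item[0])


score, l1, l2 = best_match("Add 210 ml of the solution.", MEMORY)
print(f"{score:.0%} match with: {l1!r} -> {l2}")
```

 The new segment scores as a near-perfect match because only two characters have swapped places, and that is all the comparison can see.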
 But that is not true. And it is not just efficiency that is at risk, but quality too. Serious problems may occur despite the continuous development of the software to get rid of the snags. Blunders do happen. In the case of a nearly 100% match where one sentence has 120 ml and the other 210µ, you will notice that the numbers are different and change them accordingly, but you may overlook the unit. Well, you may not, but somebody did; it is a true story. Another true story illustrates the point, but you need a little background first. Working with MT software, you do not see the original context of a segment, so you may not remember the layout of the book and forget whether a line comes from a table or from running text. The context is therefore slightly different and can be deceptive, as in the illustration below. If you read Hungarian, treat it as an exercise and try to answer the question. If you do not, just read on.
[Illustrations: Scanned at 06-12-2007 19-50k.jpg, apli_kk.jpg, aplk.jpg]
 In the context of the book (on how to conduct job interviews and hire staff), the Hungarian word alkalmazás may be the first rendering of application to come to mind. Yet alkalmazás has at least two different senses, employment and application, and application in turn has two different Hungarian equivalents, alkalmazás and felvételi kérelem. The latter is the document sent in to apply for a job, and it is what should have been written there.
 And that meaning was missed when the page was translated on a word-by-word (100%) match in the above example. Do you want more? You can have more, but not for free, I am afraid. Therefore Six Sigma quality control will never be attainable in such an environment, and the rest of QA is a matter of faith, not reality.
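 The 120 ml versus 210µ blunder mentioned earlier is about the only kind of slip that a mechanical check could flag. The sketch below is an illustration under assumptions, not a feature of any real tool: the two segments are invented around the figures quoted above, the function names are hypothetical, and the word that directly follows a number is crudely taken to be its unit. A missed meaning, as in the alkalmazás example, is entirely beyond a check of this sort.

```python
import re

# A number, optional whitespace, then a run of letters taken to be the unit.
# Crude assumption: whatever word follows the number is treated as its unit.
MEASURE = re.compile(r"(\d+(?:[.,]\d+)?)\s*([^\W\d_]*)")


def measurements(segment):
    """Return the (number, unit) pairs of a segment in order of appearance."""
    return MEASURE.findall(segment)


def measurement_changes(stored_segment, new_segment):
    """Compare measurements position by position and list what differs."""
    changes = []
    pairs = zip(measurements(stored_segment), measurements(new_segment))
    for (old_number, old_unit), (new_number, new_unit) in pairs:
        if old_number != new_number:
            changes.append(("number", old_number, new_number))
        if old_unit != new_unit:
            changes.append(("unit", old_unit, new_unit))
    return changes


# Invented segments built around the figures from the true story.
changes = measurement_changes("Add 120 ml to the sample.", "Add 210µ to the sample.")
for kind, old, new in changes:
    print(f"{kind} changed: {old} -> {new}")
```

 Both the number and the unit are reported here, so the unit change is at least put in front of the person who would otherwise correct only the number.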
 In reality, translation is not and should not be centred on finding the right word or matching texts on the basis of morphological analysis. Those working with AI already realise that, but they have a problem: they have not yet identified the „terminal symbols”, nor a couple of other items that are needed for a top-down approach to MT. But it is in the air. And they know that there is a tremendous amount to be gained or lost once and for all.
 What translation is about should be clear: orientation in reality. In one chunk of reality (C1) objects have names in L1, in another chunk of reality (C2) in L2. If you need to translate anything about C1 from L1 into L2, you need to find the relevant expressions for C1 in L2. It is as simple as that. If L2 has no words for the details of C1, you need to create words in compliance with the word formation rules of L2, and you should seek agreement about the solutions or equations between two speakers, one who speaks L1 and L2 and one who speaks L2 and L1, who not only speak the two languages but are also knowledgeable about C1 and C2. Without that knowledge they will produce crap some of the time and on some of the assignments.
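 As a rough sketch of the idea, the lookup can be keyed on the chunk of reality as well as on the word. Everything here is an assumption made for illustration: the chunk labels "hiring" and "software", the glossary structure and the function name translate_term. Only the two Hungarian equivalents of application come from the example above.

```python
# Hypothetical glossary: (L1 term, chunk of reality) -> agreed L2 term.
GLOSSARY = {
    ("application", "hiring"): "felvételi kérelem",
    ("application", "software"): "alkalmazás",
}


def translate_term(term, chunk, glossary=GLOSSARY):
    """Return the agreed L2 term for this chunk of reality.

    None means no term has been agreed for this chunk yet, so one has to be
    found or created and then agreed on by speakers who know the chunk.
    """
    return glossary.get((term, chunk))


print(translate_term("application", "hiring"))
print(translate_term("application", "chemistry"))
```

 The same L1 word gives a different L2 term depending on the chunk, and an unknown chunk returns nothing rather than a guess.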
 Therefore you need to have a proper definition of the word translation itself.


This news item is from www.firkasz.com
( http://www.firkasz.com/news.php?extend.52 )