Automatic translation and its limitations - Pt. 1

November 24th, 2009

What is automatic translation?

Automatic translation, also called machine translation, is the use of technology to translate a text from one natural language into another without human intervention. Since the advent of digital computers, the term usually refers to software tools that carry out the translation. The translation process is inherently complex and not easily amenable to automation. Translation is not mere word-for-word substitution: the translator must interpret and analyze all the words in the text and know how each word or sentence influences the others.

One of the earliest attempts at machine translation was the Georgetown experiment of 1954, which involved the automatic translation of more than 60 Russian sentences into English. Since then, a great deal of funding has gone into machine translation research, with limited success.

The translation process essentially involves decoding the meaning of the source text and encoding it in the target language. This apparently simple procedure involves complex cognitive operations. Decoding the meaning requires interpreting and analyzing all the features of the source text, so the translator must know the grammar, syntax, idioms and semantics of the source language as well as the culture of its speakers. The latter is particularly difficult to incorporate into any automatic process. The translator needs the same in-depth knowledge of the target language to encode the meaning in it. A machine translation system therefore has to “understand” the source language in the same way as a human translator and “create” new text in the target language just as a human being would.

There are two main approaches to machine translation: rule-based and statistical. Rule-based machine translation involves building many linguistic rules and using bilingual dictionaries for each language pair.
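To make the role of dictionaries and linguistic rules concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the five-word Spanish-to-English lexicon, the part-of-speech tags and the single reordering rule are all hypothetical, and no real system is this simple. It contrasts plain word-for-word substitution with a lookup that also applies one rule.

```python
# Toy bilingual dictionary (Spanish -> English) with part-of-speech tags.
# The entries and tags are illustrative assumptions, not a real lexicon.
LEXICON = {
    "la": ("the", "DET"),
    "casa": ("house", "NOUN"),
    "blanca": ("white", "ADJ"),
    "es": ("is", "VERB"),
    "grande": ("big", "ADJ"),
}


def lookup(sentence):
    """Return (translation, tag) pairs; unknown words pass through as UNK."""
    return [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]


def word_for_word(sentence):
    """Substitute each word with its dictionary entry, keeping source order."""
    return " ".join(target for target, _tag in lookup(sentence))


def rule_based(sentence):
    """Dictionary lookup plus one reordering rule: NOUN ADJ -> ADJ NOUN."""
    tagged = lookup(sentence)
    output = []
    i = 0
    while i < len(tagged):
        is_noun_adj = (
            i + 1 < len(tagged)
            and tagged[i][1] == "NOUN"
            and tagged[i + 1][1] == "ADJ"
        )
        if is_noun_adj:
            # English places the adjective before the noun, so swap the pair.
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(word_for_word("la casa blanca"))
print(rule_based("la casa blanca"))
```

For "la casa blanca", the word-for-word version keeps the Spanish word order and gives "the house white", while the version with the reordering rule gives "the white house". A usable rule-based system would need a great many such rules and far larger dictionaries for every language pair.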
The rule-based approach draws on several concepts, such as transfer-based, dictionary-based and interlingual machine translation. Transfer-based machine translation can be further divided into batch and interactive systems. As the names suggest, a batch system outputs the entire text in one go, while an interactive system waits for a human to choose the best word. Rule-based translation software parses the source text and creates an intermediate representation of it using a large set of rules together with morphological, semantic and syntactic information. This representation is then transferred into the target language using that language's grammatical structure. The user can refine the process by adding appropriate terminology.

Statistical machine translation generates translations using statistical techniques. Billions of words of text in the source and target languages, along with examples of human translations, are fed into a computer. A translation model is then built using statistical learning techniques, and the quality of its translations improves as it is trained on more text.

Both approaches have their own advantages and disadvantages. While the rule-based approach provides consistent and predictable translation quality, the statistical method allows comparatively rapid and cost-effective development. A third approach, hybrid machine translation, exploits the strengths of both rule-based and statistical methodologies.

Even so, developing a translation system using any of the three approaches is very time-consuming and expensive. The translation software also needs to be upgraded with constant human intervention to improve its quality. This is a continuous process and can take a long time.

The next part of this article discusses the limitations of machine translation.