Wednesday, July 13, 2011

My experiences with Google Translate

Machine translation is a subfield of computational linguistics. It studies how to use computer software to translate text or speech from one natural language to another. Any language that people ordinarily use for communication (spoken, signed or written) is called a natural language.

At its most basic, machine translation performs simple substitution of words. But this alone is not enough to produce a good translation, because of the complexities of natural language. Statistical Machine Translation (SMT) is one popular paradigm of machine translation, although many others are being researched. Word-based SMT has to assume that each word in one language covers the same concept or meaning in other languages. In practice this is not really true.
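To make the idea of simple word substitution concrete, here is a minimal Python sketch. It is only an illustration, not how Google Translate or any real SMT system works. The English-to-French lexicon and the function name `substitute_words` are my own assumptions for this example.

```python
# Toy English-to-French lexicon. Hypothetical data for illustration only.
TOY_LEXICON = {"the": "la", "red": "rouge", "car": "voiture"}


def substitute_words(sentence, lexicon):
    """Replace each word with its single lexicon entry, keeping the source word order."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in words)


print(substitute_words("the red car", TOY_LEXICON))  # la rouge voiture
```

Every word is translated correctly on its own, yet a French speaker would say "la voiture rouge". Substitution keeps the English word order, which is one reason it cannot produce a good translation by itself.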

Another problem is that, because of morphology (word formation) and Out Of Vocabulary (OOV) issues, simple word-based translation cannot translate well. For example, a word in one language may correspond to any number of words in another language (flower - பூ, மலர், புஷ்பம் & புய்ப்பம்), and sometimes to none at all (the Japanese word ‘Kyoikumama’ is an example: “a mother who relentlessly pushes her children toward academic achievement”).
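The one-to-many and no-equivalent cases can be sketched in a few lines of Python. Again, this is a rough illustration under my own assumptions: the lexicon holds only the "flower" candidates from the example above, and the helper names `candidate_translations` and `translate_word_by_word` are hypothetical.

```python
# Toy lexicon: one source word maps to several target candidates.
# Hypothetical data, reusing the "flower" example from the text.
TOY_LEXICON = {
    "flower": ["பூ", "மலர்", "புஷ்பம்"],
}


def candidate_translations(word, lexicon):
    """Return every candidate for a word, or an empty list if it is out of vocabulary."""
    return lexicon.get(word.lower(), [])


def translate_word_by_word(sentence, lexicon):
    """Translate word by word; return the output text and the list of OOV words."""
    output, oov_words = [], []
    for word in sentence.split():
        candidates = candidate_translations(word, lexicon)
        if candidates:
            # Arbitrary choice: without context there is no basis for picking one.
            output.append(candidates[0])
        else:
            # No entry at all: the word is copied through untranslated.
            output.append(word)
            oov_words.append(word)
    return " ".join(output), oov_words


text, missing = translate_word_by_word("flower kyoikumama", TOY_LEXICON)
print(text)
print(missing)
```

Here "flower" has three candidates and the sketch simply takes the first, while "kyoikumama" has no entry and is left as it is. A real system has to choose among candidates using context, and has to do something sensible with words it has never seen.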

A few days ago, I tried to understand how Google’s translation (not transliteration) service works. It is a free service that provides instant translations. It translates natural languages by looking for patterns in millions of documents and making intelligent (?!) guesses at the best translation.

I learnt from that site that the Google translation service uses the SMT method, and I came across a few interesting things while trying it. I have shown them here to illustrate the shortcomings of SMT. I also firmly believe that they are perfect examples of the issues in natural language translation described here.

The first picture shows how the word ‘sentence’ is translated (Example 1). I don’t want to explain it, as I assume the picture itself says a lot. Out of curiosity, I made another attempt and had a similar experience (Example 2). Although the translation is wrong according to linguistics, in my opinion it’s not wrong (just kidding!).
Note: Google has given a clear disclaimer that the service may not be perfect and that it needs more human intervention to improve the quality of translation.




Example 1





Example 2