Dominican University College

Blog - Blogue

  • Google Translate: myths and reality

    Thursday, April 01, 2021
    Iva Apostolova


    Academics, particularly those in the Humanities, are only too fond of sarcasm and derision when it comes to the common use of language. The fight over linguistic territory, which almost always involves a claim to a certain (cultural) identity, is as old as human civilization. Nothing new there. Nothing new in the intellectual elite taking pride in swatting flies with a sledgehammer or going all around Robin Hood’s barn, either. But what makes our times particularly interesting when it comes to language skills is the existence of online translation engines such as Google Translate.

    Translation devices are nothing new, of course. Pocket-size conversational guides have been around for as long as tourism has. But the first proper machine translation engines date back to the Cold War era and, as with everything associated with the period, the race for total world domination birthed some pretty cool inventions. In 1954, an IBM 701 computer translated 60 Russian sentences into English. (Well, if one wants to be pedantic about facts, the very first machine-like translation was documented in 1933 by the Soviet scientist Peter Troyanskii, who used cards with sentences in four languages, a typewriter, and a film camera.)*

    Ever since, philosophers, AI enthusiasts, cognitive scientists, neurobiologists, to name but a few groups with vested interests, have tried to perfect machine learning, a big part of which is language comprehension. And when you think about it, isn’t the need to be understood the most human of all human needs?

    Why are we, then, still such snobs when it comes to digital translation engines such as Google Translate? As with any other multi-faceted question, it is important to separate myth from reality before looking into possible answers. Google Translate, launched in 2006 with the ability to translate from and into nine languages, is the most popular online translation service, although not the only one (Yandex Translate, Linguee, and Bing Translator are worthy mentions). Today, it boasts over 500 million daily users and offers translations in 109 languages. Pretty impressive, huh?! But how does it work exactly? Here is where the story gets complicated but pretty amazing, too.

    The very first attempts at machine translation used a combination of rule-based machine translation and direct translation, both of which work by isolating and matching up individual phrases and words. These two methods have obvious shortcomings, which have resulted in hundreds of jokes of the “Don’t stand there and be hungry. Come on in and get fed up!” type, meant to demonstrate the inadequacy of direct translation. But then came statistical machine translation (SMT), which is what Google Translate used back in 2006. Instead of applying hand-written rules, the machine would quickly find examples of the most frequent uses of a given term within a certain sentential structure. The engine would also translate the sentence or phrase from the original language into English as an intermediary step, before translating it into the target language**. The original SMT examples were drawn from thousands of transcripts of UN Security Council and European Parliament meetings. All the fascinating details about the various models of SMT that finessed the lexical order of words, the grammatical exceptions, etc., aside, the goal was to convey the general intent of the sentence.
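
    For the curious, the two ideas in the paragraph above (picking the most frequently observed translation, and passing through English on the way to the target language) can be sketched in a few lines of Python. This is a toy illustration only, not how Google Translate was actually built: the word counts, the table names FR_EN and EN_ES, and the helper functions are all invented for the example, and a real SMT system would also model word order and context.

```python
from collections import Counter

# Hypothetical alignment counts, invented for illustration. A real system
# would estimate them from millions of human-translated sentence pairs.
FR_EN = {
    "mon": Counter({"my": 9}),
    "avocat": Counter({"lawyer": 7, "avocado": 3}),
    "parle": Counter({"speaks": 9, "talks": 4}),
}
EN_ES = {
    "my": Counter({"mi": 9}),
    "lawyer": Counter({"abogado": 8}),
    "avocado": Counter({"aguacate": 5}),
    "speaks": Counter({"habla": 9}),
    "talks": Counter({"habla": 3, "charla": 2}),
}


def most_likely(table, word):
    """Return the most frequently seen translation, or the word itself if unseen."""
    counts = table.get(word)
    return counts.most_common(1)[0][0] if counts else word


def translate(table, words):
    """Translate word by word, always choosing the most frequent option."""
    return [most_likely(table, w) for w in words]


def pivot_translate(words):
    """French to Spanish, using English as the intermediary step."""
    english = translate(FR_EN, words)
    return translate(EN_ES, english)


source = ["mon", "avocat", "parle"]
print(translate(FR_EN, source))  # ['my', 'lawyer', 'speaks']
print(pivot_translate(source))   # ['mi', 'abogado', 'habla']
```

    Note how "avocat" comes out as "lawyer" simply because that pairing was counted more often than "avocado". The same statistics that get this sentence right would get a sentence about guacamole wrong, which is exactly the kind of slip the jokes feed on.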

    But in 2016 Google Translate introduced a game-changing approach, based on deep machine learning and known as neural machine translation (NMT). NMT systems vary quite a bit depending on what languages we are dealing with. As the blog writer Ilya Pestov points out, “Neural translation contains 50% fewer word order mistakes, 17% fewer lexical mistakes, and 19% fewer grammar mistakes. The neural networks even learned to harmonize gender and case in different languages. And no one taught them to do so.”***

    The point is that neural machine translation, inspired by deep learning, has even deeper layers yet to be discovered. So, to paraphrase my original question, why are we so threatened by translation engines that we feel the need to mock them at every occasion we get? On the surface, it looks like this is yet another manifestation of the deeply embedded human fear of the intelligent machine. But a scratch below the surface provides a slightly different outlook. When you think about it, translation engines, available to anyone who has internet access and, most importantly, free of charge, actually level the linguistic playing field and allow users from all walks of life to, well, sound educated, really.

    Rest assured, no translation engine can or should substitute for learning a language from the bottom up (e.g., through real-time human interaction in a class setting). At the same time, relying exclusively on online translation engines without any prior knowledge or, at least, a basic understanding of a given linguistic structure will most likely yield results that are funny at best and disastrous at worst. But for the intelligent user who is looking for a quick grammar check of a formal-style sentence, or who wants to know possible alternative uses of a phrase, Google Translate can be a real pal, si vous voyez ce que je veux dire:)

    * Ilya Pestov (March 12, 2018). A history of machine translation from the Cold War to deep learning.


    *** Ilya Pestov (March 12, 2018). A history of machine translation from the Cold War to deep learning.