Lentis/Google Translate

From Wikibooks, open books for an open world

Google Translate is a machine translation service for written text, speech, and text in images[1]. Launched in 2006, the service is available on the web and as an app, including offline functionality. As of 2018, it supported 103 languages and translated over 100 billion words each day, with more than 92% of translations coming from outside the US[2]. It also offers features such as text pronunciation, single-word dictionary entries, and a "phrasebook" for saving translations for later use.

How does Google Translate work?

Key Algorithm

Google Neural Machine Translation

In 2016, Google introduced its Neural Machine Translation (NMT) system, which is built on a neural network, a framework that lets machine-learning algorithms process complex data[3]. With this design, the network first encodes the original sentence as a list of vectors, with each vector representing the meaning of a word. Once the whole sentence has been read, the decoder generates the new sentence one word at a time, focusing at each step on a weighted distribution over the most relevant encoded vectors[4]. The benefit is that each vector takes its surrounding vectors into account as context, which helps produce the correct translation. Unlike Phrase-Based Machine Translation, which translates smaller individual blocks, this algorithm treats the entire input sentence as a single unit for translation.
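The "weighted distribution over the relevant encoded vectors" can be illustrated with a minimal sketch. It assumes simple dot-product scoring and made-up two-dimensional vectors; the production system uses learned scoring functions and far larger vectors, so the names and numbers here are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoded_vectors):
    """Return (weights, context) for a single decoding step.

    Assumption: relevance is scored with a plain dot product.
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, vec))
        for vec in encoded_vectors
    ]
    weights = softmax(scores)
    # The context is the weighted average of the encoded word vectors.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, encoded_vectors))
        for i in range(len(decoder_state))
    ]
    return weights, context

# Hypothetical encodings of a three-word source sentence.
encoded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoded)
print(weights)  # the first and third words receive the most weight
print(context)
```

The decoder would repeat this step for every output word, each time with a different decoder state and therefore a different set of weights.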

Translation Quality

Google's Neural Machine Translation is currently available only for select languages; however, many others are in development. With this increase in quality, the new implementation produces translations that are very close to those of a human[5].

Translate Community

As a corrective measure, Google Translate employs a "Translate Community." Users can sign up as speakers of a given language, check provided translations for accuracy, and supply translations of words and phrases Google is unsure about. The community also has an incentive system of badges and awards. Additionally, even those who do not sign up can click on a Google Translate translation and offer an alternative, which Google uses to improve its translation quality[6].

How is Google Translate used?

Ad hoc communication

2018 World Cup

Russian and Argentinean fans cheer together during the 2018 World Cup.

Google Translate saw a large spike in usage during the 2018 World Cup in Russia. Within the country, total usage rose by 30%, and queries containing the phrase "world cup" rose by 200%[7]. Even translations of the word "beer" increased by 65%.

Legal cases

In 2017, a court in the United Kingdom had to rely on Google Translate to inform a defendant that proceedings had to be delayed because an interpreter could not be found[8]. Although the service was adequate for this simple message, the delay shows that the court was not ready to rely on it to communicate during the actual trial.

In the 2017 case United States v. Cruz-Zamora[9], a police officer used Google Translate to communicate with a driver during a traffic stop, which led to the discovery of illicit substances and the arrest of Mr. Cruz-Zamora. At issue was whether Mr. Cruz-Zamora was able to give informed consent to a search of his car based on a question asked through Google Translate. The officer's question, “Can I search the car?”, was translated as “¿Puedo buscar el auto?” While literally correct, this phrase becomes “Can I find the car?” when translated back into English. The defendant first replied “I do not understand” and, when the question was repeated, “Yes, yes. Go ahead.” The court ruled that an officer could not use Google Translate to converse reliably with someone, and therefore could not use it to obtain consent. This case established a precedent that Google Translate is insufficient in a US court of law.
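The back-translation problem in this case can be expressed as a simple round-trip check. The sketch below is only illustrative: the two lookup tables are hypothetical stand-ins for a translation service and contain nothing but the phrases quoted above.

```python
# Hypothetical phrase tables holding only the sentences quoted in the case;
# a real check would call a translation service instead.
FORWARD = {"Can I search the car?": "¿Puedo buscar el auto?"}
BACKWARD = {"¿Puedo buscar el auto?": "Can I find the car?"}

def round_trip(sentence):
    """Translate a sentence and then translate the result back."""
    translated = FORWARD[sentence]
    return translated, BACKWARD[translated]

def survives_round_trip(sentence):
    """Report whether the back-translation matches the original wording."""
    _, back = round_trip(sentence)
    return back == sentence

question = "Can I search the car?"
print(round_trip(question))
print(survives_round_trip(question))
```

A failed round trip does not prove the forward translation is wrong, but it signals that the meaning may have shifted and that a human interpreter should confirm it.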

Translation between cultures

Because the human translations used for training often adapt material to make it relevant to foreign-language audiences, Google Translate can learn unexpected "translations." For example, it has translated the Russian for "Ivan the Terrible" as "Abraham Lincoln" in English[10]. This phenomenon may be due to translators using these names as examples of strong central leaders that their audiences will recognize.

Effect on human judgement

Google Translate, while very intelligent, is not a substitute for human judgement. Consider the Portuguese idiom "tirar onda," which means "to kid" or "to joke." Google Translate translates it word for word as "to take a wave." A human translator would recognize that this phrase is not meant to be translated literally. Users who are unfamiliar with the source or target language cannot apply their own judgement, so relying on machine translation can lead them into confusion and error.

Google Translate can also capture the underlying meaning behind an expression, even when it should not. For instance, the expression “I’m a flat-earther” translates to “I’m a crazy person” when translating from English to French. The system learned that when people use the term “flat-earther,” they usually mean an individual with unusual beliefs. However, with the resurgence of actual flat-earth believers, the system is not prepared for the literal meaning of the term as the name of that group.

Bilingual road sign in Wales

One example of misguided trust in translation does not even involve the service. In 2008, a bilingual road sign in Wales displayed the Welsh for "I am not in the office at the moment. Send any work to be translated" as the translation of the sign's English message[11]. Ironically, in this particular case, the road planners could have used Google Translate to uncover the error.

What does Google Translate say about us?

Cultural bias

Google Translate is based on human-created translations, which often replace the original cultural concepts with ones familiar to the target audience. This presents the opportunity for human biases to appear. For example, when "Barcelona, Catalonia" is typed in Catalan, the Spanish translation is "Barcelona, Spain"[12]. This translation reflects nationalist sentiment in Spanish culture regarding Catalonian independence.

Google Translate learns largely from United Nations and European Parliament transcripts, as well as popular novels, such as Harry Potter, that have been translated into many languages. Furthermore, text translated between two non-English languages is always translated first into English and then from English into the target language[13]. For these reasons, the service has a Eurocentric, and specifically Anglocentric, bias. Translating between two similar languages, such as Italian and Spanish, can introduce mistakes caused by English’s distance from each language. Languages of countries with smaller roles in the UN or EU have less source material and therefore produce less accurate translations.
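A toy sketch can show how routing through English loses information that two similar languages share. The dictionaries below are hypothetical and hold only a few pronouns; they are not drawn from Google's data. Italian and Spanish both distinguish singular from plural "you," while English does not.

```python
# Hypothetical toy dictionaries, for illustration only.
IT_TO_EN = {"tu": "you", "voi": "you"}       # singular and plural collapse
EN_TO_ES = {"you": "tú"}                     # English cannot tell them apart
IT_TO_ES = {"tu": "tú", "voi": "vosotros"}   # a direct mapping keeps both

def translate(word, table):
    """Look a word up in a table, leaving unknown words unchanged."""
    return table.get(word, word)

def pivot_translate(word, source_to_english, english_to_target):
    """Translate through English as an intermediate language."""
    english = translate(word, source_to_english)
    return translate(english, english_to_target)

for word in ("tu", "voi"):
    direct = translate(word, IT_TO_ES)
    pivoted = pivot_translate(word, IT_TO_EN, EN_TO_ES)
    print(word, "direct:", direct, "via English:", pivoted)
```

In this sketch the plural "voi" comes out as the singular "tú" after the detour through English, even though a direct Italian-to-Spanish mapping would preserve the distinction.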

Google Translate also chooses which languages to add based on how much translated text exists in each language on the internet[14]. This means that languages of groups underrepresented on the internet, especially those with less internet access, are unlikely to be offered anytime soon. This creates a feedback loop: groups with little text in their language on the internet cannot use Google Translate and therefore have limited access to the internet, which in turn makes it more difficult to put texts in their language online. Increasing accessibility for smaller language groups and speakers of dying languages is a challenge Google Translate faces moving forward.

Gender bias

Google’s artificial intelligence has been found to reflect human stereotypes with respect to gender. This is seen in many translations, particularly from gender-neutral to gendered words. The system learns from the source documents in its databases and produces results based on the knowledge it gains[15]. As a result, it follows both current and past patterns found in written English. For example, according to 2016 figures from Data USA, 77.5% of computer scientists were recorded as men while 89.3% of registered nurses were women[16]. Consequently, when translating from English to Italian, it is common to see the masculine form of the title "programmer" in place of the feminine form. Statistically speaking, the system produces a reasonable outcome given the data that it reflects.
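The statistical effect can be sketched with a simple frequency count. This is a deliberately simplified, hypothetical model: the counts are invented for illustration, and a neural system does not literally count words, but it shows how always choosing the most common form turns a majority in the data into the only answer.

```python
from collections import Counter

# Hypothetical counts of how often each Italian form appears as the
# translation of "programmer" in some training text (made-up numbers).
observed = Counter({"programmatore": 78, "programmatrice": 22})

def most_frequent_translation(counts):
    """Always pick the form seen most often in the training data."""
    return counts.most_common(1)[0][0]

def share(counts, form):
    """Fraction of the training examples that use the given form."""
    return counts[form] / sum(counts.values())

print(most_frequent_translation(observed))
print(share(observed, "programmatrice"))
```

Although the feminine form makes up a meaningful share of the invented data, a rule that always picks the most frequent form never outputs it.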

Conclusion

Technology such as Google Translate can lower barriers to internationalization. However, as evinced by the gender bias, Anglicizing effects, and accessibility issues of Google Translate, it is important to question how the technology we use affects our interactions across cultures and what standards or cultural inequities it may unknowingly be imposing.

Artificial intelligence has limitations, but it is rapidly closing the gap. Additionally, as seen in the Welsh road sign case, humans are not themselves infallible. This raises two questions: to what standard do we hold AI, and is it a fair standard? As seen in the Cruz-Zamora case, AI’s imperfections are not currently acceptable, at least in a court of law. As the day approaches when that is no longer the case, how we handle the transition will depend largely on the forethought we practice, or fail to practice, today.

References

  1. https://translate.google.com/intl/en/about/
  2. https://www.languageoasis.com/blog/interesting-facts-about-google-translate-you-must-know/
  3. https://deepai.org/machine-learning-glossary-and-terms/neural-network
  4. https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html
  5. https://1.bp.blogspot.com/-jOLa-LdidQU/V-qV2oJn1aI/AAAAAAAABPg/-6OhKKPhxT89Vs9HhyKMEnyG_0ncWGjJQCLcB/s1600/image00.png
  6. https://translate.google.com/community
  7. https://www.theguardian.com/football/2018/jul/11/google-translate-world-cup-hero-fans-language-barriers
  8. https://www.businessinsider.com/teesside-magistrates-court-forced-to-rely-on-google-translate-because-it-had-no-interpreter-2017-8
  9. https://ecf.ksd.uscourts.gov/cgi-bin/show_public_doc?2017cr40100-24
  10. https://web.archive.org/web/20070912175216/http://google.blognewschannel.com/archives/2007/09/10/google-translates-ivan-the-terrible-as-abraham-lincoln/
  11. http://news.bbc.co.uk/2/hi/7702913.stm
  12. https://www.vilaweb.cat/noticia/4177847/20140308/google-translate-converts-barcelona-catalunya-into-barcelona-espana.html
  13. https://ai.googleblog.com/2016/11/zero-shot-translation-with-googles.html
  14. https://productforums.google.com/forum/#!topic/gmail/5Tq3xp8KlKE
  15. https://www.fastcompany.com/3010223/google-translates-gender-problem-and-bing-translates-and-systrans
  16. https://datausa.io/profile/soc/151131/?compare=291141