Archive for May 21st, 2011

Google Translate

The following lines analyse one of the most successful translators of this century: Google Translate. It is a free online statistical machine translation service owned by Google Inc. that instantly translates between many languages (57), such as Polish, German, Dutch and Spanish. However, some languages are translated better than others: some are fully supported by Google Translate, while others are labelled by the company as “alpha languages”, meaning that their translations are of lower quality.

It is possible to translate long texts, but the system limits the number of paragraphs. Nevertheless, users who want to translate a whole website can use Google Chrome, a fast, free browser that automatically translates websites into many languages. Google Translate also offers other tools, such as Google Translated Search (the information you are looking for may not be available in your own language, so the system finds the best results and translates them into your language) and the iPhone version, which allows voice input.

The aim of this enterprise is “to make information universally accessible, regardless of the language in which it is written”. That is why it has kept improving since it started. Nowadays, many things can be done that could not be done at the beginning. For example, in the first version, text could only be translated from English into some other languages; now it can also be translated the other way round. Moreover, romanization is available for languages such as Chinese or Greek and, since the latest version launched in January 2011, it is also possible to see several possible translations for a specific word. Users themselves help the translator improve: they can raise the quality of translations by suggesting improvements or by uploading their translation memories into Google Translate’s Translator Toolkit. Furthermore, the service itself sometimes asks the user for alternative translations of technical terms.

But how does this translator work? As mentioned above, Google Translate is a statistical machine translation (SMT) system, an approach completely different from traditional rule-based translation. Rule-based machine translation was used some years ago and applied the grammar rules of the languages being translated. However, linguists knew that not all languages share the same rules (e.g. the word order of some languages is subject-verb-object, while in others it is verb-subject-object), which is why the translations were not very good.
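To make the rule-based idea concrete, here is a minimal sketch in Python. It is only a toy: the lexicon belongs to an invented target language (the words are made up), and the function name and the single word-order rule are assumptions for illustration, not how any real system is built.

```python
# Toy rule-based translation: a hand-written lexicon plus one word-order rule.
# The target-language words below are invented; they belong to no real language.
LEXICON = {"the dog": "ka-runo", "sees": "mira", "the cat": "ka-felo"}


def translate_rule_based(clause, target_order="VSO"):
    """Translate a clause given as {'S': subject, 'V': verb, 'O': object}.

    Step 1: look up each constituent in the hand-written lexicon.
    Step 2: emit the constituents in the target language's word order.
    """
    translated = {role: LEXICON[phrase] for role, phrase in clause.items()}
    return " ".join(translated[role] for role in target_order)


clause = {"S": "the dog", "V": "sees", "O": "the cat"}
print(translate_rule_based(clause))         # verb-subject-object order
print(translate_rule_based(clause, "SVO"))  # subject-verb-object order
```

Every word and every ordering rule has to be written by hand, which hints at why this approach struggles once real languages with many exceptions are involved.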

Then statistical machine translation appeared, in which the computer looks for patterns in millions of documents. These documents have already been translated by human beings, and thanks to them the computer can work out, more or less, what the translation should be. However, the translations are not always perfect, and their quality depends mainly on the number of documents the computer can analyse to find patterns. That is why Google Translate translates German, for example, better than Basque: it has more German documents than Basque ones. Franz Josef Och, who heads machine translation at Google, is in favour of statistical machine translation. The documents available to the system are taken from United Nations documents.
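The idea of learning from already translated documents can be sketched in a few lines of Python. The example below is a simplified version of a classic word-alignment method (IBM Model 1, trained with expectation-maximisation) on a three-sentence English-German corpus. The corpus, the function names and the number of iterations are assumptions chosen for illustration; this is not a description of Google's actual system.

```python
from collections import defaultdict

# Tiny "already translated" corpus: (English sentence, German sentence).
PAIRS = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]


def train_ibm1(pairs, iterations=20):
    """Estimate t[(f, e)], the probability that source word e translates as f."""
    # Start with the same (unnormalised) score for every co-occurring word pair.
    t = {(f, e): 1.0 for src, tgt in pairs for f in tgt for e in src}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: turn the expected counts back into probabilities.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def best_translation(t, word):
    """Return the target word with the highest probability for `word`."""
    candidates = {f: p for (f, e), p in t.items() if e == word}
    return max(candidates, key=candidates.get)


table = train_ibm1(PAIRS)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(table, word))
```

Nobody tells the program that "house" means "Haus"; it works this out because "das" also appears with "book", so "Haus" is the better explanation. With only three sentences the estimates are fragile, which illustrates why a language with fewer translated documents gets worse translations.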

Finally, this way of translating texts has advantages. For instance, the quality is better than in rule-based translation, the translations sound more natural, and resources are used more efficiently. However, there are also disadvantages and problems with sentence alignment, different word orders, compound words, idioms and morphology.

Do not hesitate to watch the following video, which explains how SMT works. If you are interested in knowing more about the problems Google Translate has, you can read the portfolio I wrote commenting on the main ones here: http://wiki.littera.deusto.es/en/index.php/User:1adcaden/trans0910/Portfolio



