{"id":1238,"date":"2018-10-30T14:59:55","date_gmt":"2018-10-30T14:59:55","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/30\/model-paves-way-for-faster-more-efficient-translations-of-more-languages\/"},"modified":"2018-10-30T14:59:55","modified_gmt":"2018-10-30T14:59:55","slug":"model-paves-way-for-faster-more-efficient-translations-of-more-languages","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/30\/model-paves-way-for-faster-more-efficient-translations-of-more-languages\/","title":{"rendered":"Model paves way for faster, more efficient translations of more languages"},"content":{"rendered":"<p>Author: Rob Matheson | MIT News Office<\/p>\n<div>\n<p>MIT researchers have developed a novel \u201cunsupervised\u201d language translation model \u2014\u00a0meaning it runs without the need for human annotations and guidance \u2014 that could lead to faster, more efficient computer-based translations of far more languages.<\/p>\n<p>Translation systems from Google, Facebook, and Amazon require training models to look for patterns in millions of documents \u2014\u00a0such as legal and political documents, or news articles \u2014\u00a0that have been translated into various languages by humans. Given new words in one language, they can then find the matching words and phrases in the other language.<\/p>\n<p>But this translational data is time consuming and difficult to gather, and simply may not exist for many of the 7,000 languages spoken worldwide. 
Recently, researchers have been developing \u201cmonolingual\u201d models that translate between texts in two languages, but without direct translational information between the two.<\/p>\n<p>In a paper being presented this week at the Conference on Empirical Methods in Natural Language Processing, researchers from MIT\u2019s Computer Science and Artificial Intelligence Laboratory (CSAIL) describe a model that runs faster and more efficiently than these monolingual models.<\/p>\n<p>The model leverages a statistical metric, called Gromov-Wasserstein distance, that essentially measures distances between points in one computational space and matches them to similarly distanced points in another space. The researchers apply that technique to \u201cword embeddings\u201d of two languages, which are words represented as vectors \u2014 basically, arrays of numbers \u2014\u00a0with words of similar meanings clustered closer together. In doing so, the model quickly aligns the words, or vectors, in both embeddings that are most closely correlated by relative distances, meaning they\u2019re likely to be direct translations.<\/p>\n<p>In experiments, the researchers\u2019 model performed as accurately as state-of-the-art monolingual models \u2014\u00a0and sometimes more accurately \u2014\u00a0but much more quickly and using only a fraction of the computational power.<\/p>\n<p>\u201cThe model sees the words in the two languages as sets of vectors, and maps [those vectors] from one set to the other by essentially preserving relationships,\u201d says the paper\u2019s co-author Tommi Jaakkola, a CSAIL researcher and the Thomas Siebel Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society. 
\u201cThe approach could help translate low-resource languages or dialects, so long as they come with enough monolingual content.\u201d<\/p>\n<p>The model represents a step toward one of the major goals of machine translation, which is fully unsupervised word alignment, says first author David Alvarez-Melis, a CSAIL PhD student: \u201cIf you don\u2019t have any data that matches two languages \u2026 you can map two languages and, using these distance measurements, align them.\u201d<\/p>\n<p><strong>Relationships matter most<\/strong><\/p>\n<p>Aligning word embeddings for unsupervised machine translation isn\u2019t a new concept. Recent work trains neural networks to directly match vectors in the word embeddings, or matrices, of two languages. But these methods require a lot of tweaking during training to get the alignments exactly right, which is inefficient and time-consuming.<\/p>\n<p>Measuring and matching vectors based on relational distances, on the other hand, is a far more efficient method that doesn\u2019t require much fine-tuning. No matter where word vectors fall in a given matrix, the relationships between the words, meaning their distances, remain the same. For instance, the vector for \u201cfather\u201d may fall in completely different areas of the two matrices. But vectors for \u201cfather\u201d and \u201cmother\u201d will most likely always be close together.<\/p>\n<p>\u201cThose distances are invariant,\u201d Alvarez-Melis says. \u201cBy looking at distance, and not the absolute positions of vectors, then you can skip the alignment and go directly to matching the correspondences between vectors.\u201d<\/p>\n<p>That\u2019s where Gromov-Wasserstein comes in handy. The technique has been used in computer science for tasks such as aligning image pixels in graphic design. 
But the metric seemed \u201ctailor made\u201d for word alignment, Alvarez-Melis says: \u201cIf there are points, or words, that are close together in one space, Gromov-Wasserstein is automatically going to try to find the corresponding cluster of points in the other space.\u201d<\/p>\n<p>For training and testing, the researchers used a dataset of publicly available word embeddings, called FASTTEXT, with 110 language pairs. In these embeddings, and others, words that frequently appear in similar contexts have closely matching vectors. \u201cMother\u201d and \u201cfather\u201d will usually be close together but both farther away from, say, \u201chouse.\u201d<\/p>\n<p><strong>Providing a \u201csoft translation\u201d<\/strong><\/p>\n<p>The model notes vectors that are closely related yet different from the others, and assigns a probability that similarly distanced vectors in the other embedding will correspond. It\u2019s kind of like a \u201csoft translation,\u201d Alvarez-Melis says, \u201cbecause instead of just returning a single word translation, it tells you \u2018this vector, or word, has a strong correspondence with this word, or words, in the other language.\u2019\u201d<\/p>\n<p>An example would be the months of the year, which appear close together in many languages. The model will see a cluster of 12 vectors in one embedding and a remarkably similar cluster in the other embedding. \u201cThe model doesn\u2019t know these are months,\u201d Alvarez-Melis says. \u201cIt just knows there is a cluster of 12 points that aligns with a cluster of 12 points in the other language, but they\u2019re different to the rest of the words, so they probably go together well. 
By finding these correspondences for each word, it then aligns the whole space simultaneously.\u201d<\/p>\n<p>The researchers hope the work serves as a \u201cfeasibility check,\u201d Jaakkola says, for applying the Gromov-Wasserstein method to machine-translation systems so that they run faster and more efficiently and gain access to many more languages.<\/p>\n<p>Additionally, a possible perk of the model is that it automatically produces a value that can be interpreted as quantifying, on a numerical scale, the similarity between languages. This may be useful for linguistics studies, the researchers say. The model calculates how distant all vectors are from one another in two embeddings, which depends on sentence structure and other factors. If the vectors are all very close, the languages score closer to 0, and the farther apart the vectors are, the higher the score. Similar Romance languages such as French and Italian, for instance, score close to 1, while classic Chinese scores between 6 and 9 with other major languages.<\/p>\n<p>\u201cThis gives you a nice, simple number for how similar languages are \u2026 and can be used to draw insights about the relationships between languages,\u201d Alvarez-Melis says.<\/p>\n<\/div>\n<p><a href=\"http:\/\/news.mit.edu\/2018\/unsupervised-model-faster-computer-translations-languages-1030\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Rob Matheson | MIT News Office MIT researchers have developed a novel \u201cunsupervised\u201d language translation model \u2014\u00a0meaning it runs without the need for human [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/30\/model-paves-way-for-faster-more-efficient-translations-of-more-languages\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":465,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1238"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1238"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1238\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/461"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1238"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1238"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1238"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}