{"id":3108,"date":"2020-02-07T16:20:01","date_gmt":"2020-02-07T16:20:01","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2020\/02\/07\/hey-alexa-sorry-i-fooled-you\/"},"modified":"2020-02-07T16:20:01","modified_gmt":"2020-02-07T16:20:01","slug":"hey-alexa-sorry-i-fooled-you","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2020\/02\/07\/hey-alexa-sorry-i-fooled-you\/","title":{"rendered":"Hey Alexa! Sorry I fooled you &#8230;"},"content":{"rendered":"<p>Author: Rachel Gordon | MIT CSAIL<\/p>\n<div>\n<p>A human can likely tell the difference between a turtle and a rifle. Two years ago, Google&rsquo;s AI wasn&rsquo;t so <a href=\"https:\/\/www.theverge.com\/2017\/11\/2\/16597276\/google-ai-image-attacks-adversarial-turtle-rifle-3d-printed\">sure<\/a>. For quite some time, a subset of computer science research has been dedicated to better understanding how machine-learning models handle these &ldquo;adversarial&rdquo; attacks, which are inputs deliberately created to trick or fool machine-learning algorithms.&nbsp;<\/p>\n<p>While much of this work has focused on <a href=\"https:\/\/www.washingtonpost.com\/technology\/2019\/09\/04\/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft\/\">speech<\/a> and <a href=\"https:\/\/www.theguardian.com\/technology\/2020\/jan\/13\/what-are-deepfakes-and-how-can-you-spot-them\">images<\/a>, recently, a team from MIT&rsquo;s <a href=\"http:\/\/csail.mit.edu\/\">Computer Science and Artificial Intelligence Laboratory<\/a> (CSAIL) tested the boundaries of text. They came up with &ldquo;TextFooler,&rdquo; a general framework that can successfully attack natural language processing (NLP) systems &mdash; the types of systems that let us interact with our Siri and Alexa voice assistants &mdash; and &ldquo;fool&rdquo; them into making the wrong predictions.&nbsp;<\/p>\n<p>One could imagine using TextFooler for many applications related to internet safety, such as email spam filtering, hate speech flagging, or &ldquo;sensitive&rdquo; political speech text detection &mdash; which are all based on text classification models.&nbsp;<\/p>\n<p>&ldquo;If those tools are vulnerable to purposeful adversarial attacking, then the consequences may be disastrous,&rdquo; says Di Jin, MIT PhD student and lead author on a new paper about TextFooler. &ldquo;These tools need to have effective defense approaches to protect themselves, and in order to make such a safe defense system, we need to first examine the adversarial methods.&rdquo;&nbsp;<\/p>\n<p>TextFooler works in two parts: altering a given text, and then using that text to test two different language tasks to see if the system can successfully trick machine-learning models.&nbsp;&nbsp;<\/p>\n<p>The system first identifies the most important words that will influence the target model&rsquo;s prediction, and then selects the synonyms that fit contextually. 
Then, the framework is applied to two different tasks: text classification and entailment (the relationship between text fragments in a sentence), with the goal of changing the classification or invalidating the entailment judgment of the original models.

In one example, TextFooler's input and output were:

"The characters, cast in impossibly contrived situations, are totally estranged from reality."

"The characters, cast in impossibly engineered circumstances, are fully estranged from reality."

In this case, an NLP model classifies the original input correctly but gets the modified input wrong.

In total, TextFooler successfully attacked three target models, including "BERT," the popular open-source NLP model. By changing only 10 percent of the words in a given text, it drove the target models' accuracy from over 90 percent to under 20 percent. The team evaluated success on three criteria: whether the attack changed the model's prediction for classification or entailment; whether the altered text preserved the meaning of the original example for a human reader; and whether the text looked natural enough.

The researchers note that attacking existing models is not the end goal; they hope that this work will help more abstract models generalize to new, unseen data.

"The system can be used or extended to attack any classification-based NLP models to test their robustness," says Jin. "On the other hand, the generated adversaries can be used to improve the robustness and generalization of deep-learning models via adversarial training, which is a critical direction of this work."

Jin wrote the paper alongside MIT Professor Peter Szolovits, Zhijing Jin of the University of Hong Kong, and Joey Tianyi Zhou of A*STAR, Singapore. They will present the paper at the AAAI Conference on Artificial Intelligence in New York.
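Jin's point about adversarial training can be sketched briefly as well: adversarial rewrites, labeled the same as the sentences they were derived from, are added to the training data and the model is retrained on the combined set. The tiny dataset and scikit-learn pipeline below are illustrative stand-ins, assumed for this sketch; they are not the models or data used in the paper.

```python
# A minimal sketch of adversarial training as data augmentation: retrain a
# text classifier on its original data plus adversarial rewrites that keep
# the original labels. Toy data and a simple scikit-learn pipeline are used
# purely for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Original (toy) training data.
train_texts = [
    "The characters, cast in impossibly contrived situations, are totally estranged from reality.",
    "A warm, funny, and engaging film.",
    "Flat dialogue and a predictable plot.",
    "An inventive and moving story.",
]
train_labels = ["negative", "positive", "negative", "positive"]

# Adversarial rewrites (e.g., produced by a TextFooler-style attack),
# paired with the labels of the sentences they were derived from.
adversarial_texts = [
    "The characters, cast in impossibly engineered circumstances, are fully estranged from reality.",
    "Flat exchanges and a foreseeable storyline.",
]
adversarial_labels = ["negative", "negative"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Adversarial training step: fit on the union of clean and adversarial
# examples so the model's prediction stays stable under such rewrites.
model.fit(train_texts + adversarial_texts, train_labels + adversarial_labels)

print(model.predict([
    "The characters, cast in impossibly engineered circumstances, are fully estranged from reality."
]))
```

This is the simplest form of the idea; in practice new adversarial examples would be regenerated against the updated model and the process repeated.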
Two years ago, Google&rsquo;s AI wasn&rsquo;t [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2020\/02\/07\/hey-alexa-sorry-i-fooled-you\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":468,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3108"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=3108"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3108\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/474"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=3108"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=3108"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=3108"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}