{"id":2315,"date":"2019-07-01T04:00:00","date_gmt":"2019-07-01T04:00:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/01\/teaching-artificial-intelligence-to-create-visuals-with-more-common-sense\/"},"modified":"2019-07-01T04:00:00","modified_gmt":"2019-07-01T04:00:00","slug":"teaching-artificial-intelligence-to-create-visuals-with-more-common-sense","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/01\/teaching-artificial-intelligence-to-create-visuals-with-more-common-sense\/","title":{"rendered":"Teaching artificial intelligence to create visuals with more common sense"},"content":{"rendered":"<p>Author: Adam Conner-Simons | MIT CSAIL<\/p>\n<div>\n<p>Today\u2019s smartphones often use artificial intelligence (AI) to help make the photos we take crisper and clearer. But what if these AI tools could be used to create entire scenes from scratch?<\/p>\n<p>A team from MIT and IBM has now done exactly that with \u201cGANpaint Studio,\u201d a system that can automatically generate realistic photographic images and edit objects inside them. 
In addition to helping artists and designers make quick adjustments to visuals, the researchers say the work may help computer scientists identify \u201cfake\u201d images.<\/p>\n<p>David Bau, a PhD student at MIT\u2019s Computer Science and Artificial Intelligence Lab (CSAIL), describes the project as one of the first times computer scientists have been able to actually \u201cpaint with the neurons\u201d of a neural network \u2014 specifically, a popular type of network called a generative adversarial network (GAN).<\/p>\n<p>Available online as <a href=\"http:\/\/169.63.46.227\/demo\/ganpaint.html\">an interactive demo<\/a>, GANpaint Studio allows a user to upload an image of their choosing and modify multiple aspects of its appearance, from changing the size of objects to adding completely new items like trees and buildings.<\/p>\n<p><strong>Boon for designers<\/strong><\/p>\n<p>Spearheaded by MIT professor Antonio Torralba as part of the <a href=\"https:\/\/mitibmwatsonailab.mit.edu\/\">MIT-IBM Watson AI Lab<\/a> he directs, the project has vast potential applications. Designers and artists could use it to make quicker tweaks to their visuals. Adapting the system to video clips would enable computer-graphics editors to quickly compose specific arrangements of objects needed for a particular shot. (Imagine, for example, if a director filmed a full scene with actors but forgot to include an object in the background that\u2019s important to the plot.)<\/p>\n<p>GANpaint Studio could also be used to improve and debug other GANs that are being developed, by analyzing them for \u201cartifact\u201d units that need to be removed. 
In a world where opaque AI tools have made image manipulation easier than ever, it could help researchers better understand neural networks and their underlying structures.<\/p>\n<p>\u201cRight now, machine learning systems are these black boxes that we don\u2019t always know how to improve, kind of like those old TV sets that you have to fix by hitting them on the side,\u201d says Bau, lead author on a related paper about the system with a team overseen by Torralba. \u201cThis research suggests that, while it might be scary to open up the TV and take a look at all the wires, there\u2019s going to be a lot of meaningful information in there.\u201d<\/p>\n<p>One unexpected discovery is that the system actually seems to have learned some simple rules about the relationships between objects. It somehow knows not to put something somewhere it doesn\u2019t belong, like a window in the sky, and it also creates different visuals in different contexts. For example, if there are two different buildings in an image and the system is asked to add doors to both, it doesn\u2019t simply add identical doors \u2014 they may ultimately look quite different from each other.\u00a0<\/p>\n<p>\u201cAll drawing apps will follow user instructions, but ours might decide not to draw anything if the user commands to put an object in an impossible location,\u201d says Torralba. \u201cIt\u2019s a drawing tool with a strong personality, and it opens a window that allows us to understand how GANs learn to represent the visual world.\u201d<\/p>\n<p>GANs are sets of neural networks developed to compete against each other. In this case, one network is a generator focused on creating realistic images, and the second is a discriminator whose goal is to not be fooled by the generator. 
Every time the discriminator \u2018catches\u2019 the generator producing an unconvincing image, that feedback is passed back to the generator, allowing it to continuously improve.<\/p>\n<p>\u201cIt\u2019s truly mind-blowing to see how this work enables us to directly see that GANs actually learn something that\u2019s beginning to look a bit like common sense,\u201d says Jaakko Lehtinen, an associate professor at Finland\u2019s <a href=\"http:\/\/www.aalto.fi\/en\">Aalto University<\/a> who was not involved in the project. \u201cI see this ability as a crucial steppingstone to having autonomous systems that can actually function in the human world, which is infinite, complex and ever-changing.\u201d<\/p>\n<p><strong>Stamping out unwanted \u201cfake\u201d images<\/strong><\/p>\n<p>The team\u2019s goal has been to give people more control over GANs. But they recognize that with increased power comes the potential for abuse, like using such technologies to doctor photos. Co-author Jun-Yan Zhu believes that better understanding GANs \u2014 and the kinds of mistakes they make \u2014 will help researchers better stamp out fakery.<\/p>\n<p>\u201cYou need to know your opponent before you can defend against it,\u201d says Zhu, a postdoc at CSAIL. \u201cThis understanding may potentially help us detect fake images more easily.\u201d<\/p>\n<p>To develop the system, the team first identified units inside the GAN that correlate with particular types of objects, like trees. It then tested these units individually to see whether removing them would cause certain objects to disappear or appear. Importantly, they also identified the units that cause visual errors (artifacts) and worked to remove them to increase the overall quality of the image.<\/p>\n<p>\u201cWhenever GANs generate terribly unrealistic images, the cause of these mistakes has previously been a mystery,\u201d says co-author Hendrik Strobelt, a research scientist at IBM. 
\u201cWe found that these mistakes are triggered by specific sets of neurons that we can silence to improve the quality of the image.\u201d<\/p>\n<p>Bau, Strobelt, Torralba and Zhu co-wrote the paper with former CSAIL PhD student Bolei Zhou, postdoctoral associate Jonas Wulff, and undergraduate student William Peebles. They will present it next month at the SIGGRAPH conference in Los Angeles. \u201cThis system opens a door into a better understanding of GAN models, and that\u2019s going to help us do whatever kind of research we need to do with GANs,\u201d says Lehtinen.<\/p>\n<\/div>\n<p><a href=\"http:\/\/news.mit.edu\/2019\/teaching-artificial-intelligence-to-create-more-common-sense-visuals-0701\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adam Conner-Simons | MIT CSAIL Today\u2019s smartphones often use artificial intelligence (AI) to help make the photos we take crisper and clearer. But what [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/01\/teaching-artificial-intelligence-to-create-visuals-with-more-common-sense\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":473,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2315"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2315"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2315\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/466"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2315"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2315"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2315"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}