{"id":1907,"date":"2019-03-21T15:48:52","date_gmt":"2019-03-21T15:48:52","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/03\/21\/kicking-neural-network-automation-into-high-gear\/"},"modified":"2019-03-21T15:48:52","modified_gmt":"2019-03-21T15:48:52","slug":"kicking-neural-network-automation-into-high-gear","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/03\/21\/kicking-neural-network-automation-into-high-gear\/","title":{"rendered":"Kicking neural network automation into high gear"},"content":{"rendered":"<p>Author: Rob Matheson | MIT News Office<\/p>\n<div>\n<p>A new area in artificial intelligence involves using algorithms to automatically design machine-learning systems known as neural networks, which are more accurate and efficient than those developed by human engineers. But this so-called neural architecture search (NAS) technique is computationally expensive.<\/p>\n<p>A state-of-the-art NAS algorithm recently developed by Google to run on a squad of graphics processing units (GPUs) took 48,000 GPU hours to produce a single convolutional neural network, which is used for image classification and detection tasks. Google has the wherewithal to run hundreds of GPUs and other specialized hardware in parallel, but that\u2019s out of reach for many others.<\/p>\n<p>In a paper being presented at the International Conference on Learning Representations in May, MIT researchers describe an NAS algorithm that can directly learn specialized convolutional neural networks (CNNs) for target hardware platforms \u2014 when run on a massive image dataset \u2014\u00a0in only 200 GPU hours, which could enable far broader use of these types of algorithms.<\/p>\n<p>Resource-strapped researchers and companies could benefit from the time- and cost-saving algorithm, the researchers say. 
The broad goal is \u201cto democratize AI,\u201d says co-author Song Han, an assistant professor of electrical engineering and computer science and a researcher in the Microsystems Technology Laboratories at MIT. \u201cWe want to enable both AI experts and nonexperts to efficiently design neural network architectures with a push-button solution that runs fast on a specific hardware.\u201d<\/p>\n<p>Han adds that such NAS algorithms will never replace human engineers. \u201cThe aim is to offload the repetitive and tedious work that comes with designing and refining neural network architectures,\u201d says Han, who is joined on the paper by two researchers in his group, Han Cai and Ligeng Zhu.<\/p>\n<p><strong>\u201cPath-level\u201d binarization and pruning<\/strong><\/p>\n<p>In their work, the researchers developed ways to delete unnecessary neural network design components, to cut computing times and use only a fraction of hardware memory to run an NAS algorithm. An additional innovation ensures each outputted CNN runs more efficiently on specific hardware platforms \u2014 CPUs, GPUs, and mobile devices \u2014 than those designed by traditional approaches. In tests, the researchers\u2019 CNNs were 1.8 times faster, when measured on a mobile phone, than traditional gold-standard models with similar accuracy.<\/p>\n<p>A CNN\u2019s architecture consists of layers of computation with adjustable parameters, called \u201cfilters,\u201d and the possible connections between those filters. Filters process image pixels in grids of squares \u2014 such as 3&#215;3, 5&#215;5, or 7&#215;7 \u2014 with each filter covering one square. The filters essentially move across the image and combine all the colors of their covered grid of pixels into a single pixel. Different layers may have different-sized filters, and connect to share data in different ways. 
The output is a condensed image \u2014 from the combined information from all the filters \u2014 that can be more easily analyzed by a computer.<\/p>\n<p>Because the number of possible architectures to choose from \u2014 called the \u201csearch space\u201d \u2014 is so large, applying NAS to create a neural network on massive image datasets is computationally prohibitive. Engineers typically run NAS on smaller proxy datasets and transfer their learned CNN architectures to the target task. This generalization method reduces the model\u2019s accuracy, however. Moreover, the same outputted architecture is also applied to all hardware platforms, which leads to efficiency issues.<\/p>\n<p>The researchers trained and tested their new NAS algorithm on an image classification task directly on the ImageNet dataset, which contains millions of images in a thousand classes. They first created a search space that contains all possible candidate CNN \u201cpaths\u201d \u2014 meaning how the layers and filters connect to process the data. This gives the NAS algorithm free rein to find an optimal architecture.<\/p>\n<p>This would typically mean all possible paths must be stored in memory, which would exceed GPU memory limits. To address this, the researchers leverage a technique called \u201cpath-level binarization,\u201d which stores only one sampled path at a time and saves an order of magnitude in memory consumption. They combine this binarization with \u201cpath-level pruning,\u201d a technique that traditionally learns which \u201cneurons\u201d in a neural network can be deleted without affecting the output. Instead of discarding neurons, however, the researchers\u2019 NAS algorithm prunes entire paths, which completely changes the neural network\u2019s architecture.<\/p>\n<p>In training, all paths are initially given the same probability for selection. 
The algorithm then traces the paths \u2014 storing only one at a time \u2014 to note the accuracy and loss (a numerical penalty assigned for incorrect predictions) of their outputs. It then adjusts the probabilities of the paths to optimize both accuracy and efficiency. In the end, the algorithm prunes away all the low-probability paths and keeps only the path with the highest probability \u2014 which is the final CNN architecture.<\/p>\n<p><strong>Hardware-aware<\/strong><\/p>\n<p>Another key innovation was making the NAS algorithm \u201chardware-aware,\u201d Han says, meaning it uses the latency on each hardware platform as a feedback signal to optimize the architecture. To measure this latency on mobile devices, for instance, big companies such as Google will employ a \u201cfarm\u201d of mobile devices, which is very expensive. The researchers instead built a model that predicts the latency using only a single mobile phone.<\/p>\n<p>For each chosen layer of the network, the algorithm samples the architecture on that latency-prediction model. It then uses that information to design an architecture that runs as quickly as possible, while achieving high accuracy. In experiments, the researchers\u2019 CNN ran nearly twice as fast as a gold-standard model on mobile devices.<\/p>\n<p>One interesting result, Han says, was that their NAS algorithm designed CNN architectures that were long dismissed as being too inefficient \u2014 but, in the researchers\u2019 tests, they were actually optimized for certain hardware. For instance, engineers have essentially stopped using 7&#215;7 filters, because they\u2019re computationally more expensive than multiple, smaller filters. Yet, the researchers\u2019 NAS algorithm found architectures with some layers of 7&#215;7 filters ran optimally on GPUs. 
That\u2019s because GPUs have high parallelization \u2014\u00a0meaning they compute many calculations simultaneously \u2014 so they can process a single large filter at once more efficiently than processing multiple small filters one at a time.<\/p>\n<p>\u201cThis goes against previous human thinking,\u201d Han says. \u201cThe larger the search space, the more unknown things you can find. You don\u2019t know if something will be better than the past human experience. Let the AI figure it out.\u201d<\/p>\n<p>The work was supported, in part, by the MIT Quest for Intelligence, the MIT-IBM Watson AI Lab, SenseTime, and Xilinx.<\/p>\n<\/div>\n<p><a href=\"http:\/\/news.mit.edu\/2019\/convolutional-neural-network-automation-0321\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Rob Matheson | MIT News Office A new area in artificial intelligence involves using algorithms to automatically design machine-learning systems known as neural networks, [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/03\/21\/kicking-neural-network-automation-into-high-gear\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":465,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1907"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1907"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1907\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/460"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1907"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1907"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1907"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}