{"id":2214,"date":"2019-05-31T14:21:03","date_gmt":"2019-05-31T14:21:03","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/cracking-open-the-black-box-of-automated-machine-learning\/"},"modified":"2019-05-31T14:21:03","modified_gmt":"2019-05-31T14:21:03","slug":"cracking-open-the-black-box-of-automated-machine-learning","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/cracking-open-the-black-box-of-automated-machine-learning\/","title":{"rendered":"Cracking open the black box of automated machine learning"},"content":{"rendered":"<p>Author: Rob Matheson | MIT News Office<\/p>\n<div>\n<p>Researchers from MIT and elsewhere have developed an interactive tool that, for the first time, lets users see and control how automated machine-learning systems work. The aim is to build confidence in these systems and find ways to improve them.<\/p>\n<p>Designing a machine-learning model for a certain task \u2014 such as image classification, disease diagnoses, and stock market prediction \u2014 is an arduous, time-consuming process. Experts first choose from among many different algorithms to build the model around. Then, they manually tweak \u201chyperparameters\u201d \u2014 which determine the model\u2019s overall structure \u2014 before the model starts training.<\/p>\n<p>Recently developed automated machine-learning (AutoML) systems iteratively test and modify algorithms and those hyperparameters, and select the best-suited models. But the systems operate as \u201cblack boxes,\u201d meaning their selection techniques are hidden from users. 
Therefore, users may not trust the results and can find it difficult to tailor the systems to their search needs.<\/p>\n<p>In a paper presented at the ACM CHI Conference on Human Factors in Computing Systems, researchers from MIT, the Hong Kong University of Science and Technology (HKUST), and Zhejiang University describe a tool that puts the analysis and control of AutoML methods into users\u2019 hands. Called ATMSeer, the tool takes as input an AutoML system, a dataset, and some information about a user\u2019s task. Then, it visualizes the search process in a user-friendly interface, which presents in-depth information on the models\u2019 performance.<\/p>\n<p>\u201cWe let users pick and see how the AutoML system works,\u201d says co-author Kalyan Veeramachaneni, a principal research scientist in the MIT Laboratory for Information and Decision Systems (LIDS), who leads the Data to AI group. \u201cYou might simply choose the top-performing model, or you might have other considerations or use domain expertise to guide the system to search for some models over others.\u201d<\/p>\n<p>In case studies with science graduate students, who were AutoML novices, the researchers found about 85 percent of participants who used ATMSeer were confident in the models selected by the system. Nearly all participants said using the tool made them comfortable enough to use AutoML systems in the future.<\/p>\n<p>\u201cWe found people were more likely to use AutoML as a result of opening up that black box and seeing and controlling how the system operates,\u201d says Micah Smith, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and a researcher in LIDS.<\/p>\n<p>\u201cData visualization is an effective approach toward better collaboration between humans and machines. ATMSeer exemplifies this idea,\u201d says lead author Qianwen Wang of HKUST. 
\u201cATMSeer will mostly benefit machine-learning practitioners, regardless of their domain, [who] have a certain level of expertise. It can relieve the pain of manually selecting machine-learning algorithms and tuning hyperparameters.\u201d<\/p>\n<p>Joining Smith, Veeramachaneni, and Wang on the paper are: Yao Ming, Qiaomu Shen, Dongyu Liu, and Huamin Qu, all of HKUST; and Zhihua Jin of Zhejiang University.<\/p>\n<p><strong>Tuning the model<\/strong><\/p>\n<p>At the core of the new tool is a custom AutoML system, called \u201c<a href=\"http:\/\/github.com\/HDI-Project\/ATM\">Auto-Tuned Models<\/a>\u201d (ATM), developed by Veeramachaneni and other researchers in 2017. Unlike traditional AutoML systems, ATM fully catalogues all search results as it tries to fit models to data.<\/p>\n<p>ATM takes as input any dataset and an encoded prediction task. The system randomly selects an algorithm class \u2014 such as neural networks, decision trees, random forest, and logistic regression \u2014 and the model\u2019s hyperparameters, such as the size of a decision tree or the number of neural network layers.<\/p>\n<p>Then, the system runs the model against the dataset, iteratively tunes the hyperparameters, and measures performance. It uses what it has learned about that model\u2019s performance to select another model, and so on. In the end, the system outputs several top-performing models for a task.<\/p>\n<p>The trick is that each model can essentially be treated as one data point with a few variables: algorithm, hyperparameters, and performance. Building on that work, the researchers designed a system that plots the data points and variables on designated graphs and charts. From there, they developed a separate technique that also lets them reconfigure that data in real time. 
\u201cThe trick is that, with these tools, anything you can visualize, you can also modify,\u201d Smith says.<\/p>\n<p>Similar visualization tools are tailored toward analyzing only one specific machine-learning model, and allow limited customization of the search space. \u201cTherefore, they offer limited support for the AutoML process, in which the configurations of many searched models need to be analyzed,\u201d Wang says. \u201cIn contrast, ATMSeer supports the analysis of machine-learning models generated with various algorithms.\u201d<\/p>\n<p><strong>User control and confidence<\/strong><\/p>\n<p>ATMSeer\u2019s interface consists of three parts. A control panel allows users to upload datasets and an AutoML system, and start or pause the search process. Below that is an overview panel that shows basic statistics \u2014 such as the number of algorithms and hyperparameters searched \u2014 and a \u201cleaderboard\u201d of top-performing models in descending order of performance. \u201cThis might be the view you\u2019re most interested in if you\u2019re not an expert diving into the nitty gritty details,\u201d Veeramachaneni says.<\/p>\n<p>ATMSeer includes an \u201cAutoML Profiler,\u201d with panels containing in-depth information about the algorithms and hyperparameters, which can all be adjusted. One panel shows a histogram for each algorithm class: a bar chart of the distribution of that algorithm\u2019s performance scores, on a scale of 0 to 10, across its hyperparameter settings. A separate panel displays scatter plots that visualize the tradeoffs in performance for different hyperparameters and algorithm classes.<\/p>\n<p>Case studies with machine-learning experts, who had no AutoML experience, revealed that user control does help improve the performance and efficiency of AutoML selection. 
User studies with 13 graduate students in diverse scientific fields \u2014 such as biology and finance \u2014 were also revealing. Results indicate three major factors \u2014 number of algorithms searched, system runtime, and finding the top-performing model \u2014 determined how users customized their AutoML searches. That information can be used to tailor the systems to users, the researchers say.<\/p>\n<p>\u201cWe are just starting to see the beginning of the different ways people use these systems and make selections,\u201d Veeramachaneni says. \u201cThat\u2019s because now that this information is all in one place, and people can see what\u2019s going on behind the scenes and have the power to control it.\u201d<\/p>\n<\/div>\n<p><a href=\"http:\/\/news.mit.edu\/2019\/atmseer-machine-learning-black-box-0531\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Rob Matheson | MIT News Office Researchers from MIT and elsewhere have developed an interactive tool that, for the first time, lets users see [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/cracking-open-the-black-box-of-automated-machine-learning\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":472,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2214"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2214"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2214\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/472"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2214"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2214"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2214"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}