{"id":2212,"date":"2019-05-31T06:35:34","date_gmt":"2019-05-31T06:35:34","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/real-time-computer-vision-is-likely-to-be-the-next-killer-app-but-were-going-to-need-new-chips\/"},"modified":"2019-05-31T06:35:34","modified_gmt":"2019-05-31T06:35:34","slug":"real-time-computer-vision-is-likely-to-be-the-next-killer-app-but-were-going-to-need-new-chips","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/real-time-computer-vision-is-likely-to-be-the-next-killer-app-but-were-going-to-need-new-chips\/","title":{"rendered":"Real Time Computer Vision is Likely to be the Next Killer App but We\u2019re Going to Need New Chips"},"content":{"rendered":"<p>Author: William Vorhies<\/p>\n<div>\n<p><strong><em>Summary:<\/em><\/strong> <em>\u00a0Real Time Computer Vision (RTCV) that requires processing video DNNs at the edge is likely to be the next killer app that powers a renewed love affair with our mobile devices.\u00a0 The problem is that current GPUs won\u2019t cut it and we have to wait once again for the hardware to catch up.<\/em><\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677059945?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677059945?profile=RESIZE_710x\" width=\"600\" class=\"align-center\"><\/a><\/p>\n<p>\u00a0The entire history of machine learning and artificial intelligence (AI\/ML) has been a story about the race between techniques and hardware.\u00a0 There have been times when we had the techniques but the hardware couldn\u2019t keep up.\u00a0 Conversely there have been times when hardware has outstripped technique.\u00a0 Candidly though, it\u2019s been mostly about waiting for the hardware to catch up.<\/p>\n<p>You may not have thought about it, but we\u2019re in one of those wait-for-tech hardware 
valleys right now.\u00a0 Sure, there\u2019s plenty of cloud-based compute and ever faster GPU chips to make CNNs and RNNs work.\u00a0 But the barrier that we\u2019re up against is latency, particularly in computer vision.\u00a0<\/p>\n<p>If you want to utilize computer vision on your cell phone or any other edge device (did you ever think of self-driving cars as edge devices?) then the data has to make the full round trip from your local camera to the cloud compute and back again before anything can happen.<\/p>\n<p>There are some nifty applications that are just fine with delays of, say, 200 ms or even longer.\u00a0 Most healthcare applications are fine with that, as are chatbots and text\/voice apps.\u00a0 Certainly search and ecommerce don\u2019t mind.<\/p>\n<p>But the fact is that those apps are rapidly approaching saturation.\u00a0 Maturity, if you\u2019d like a kinder word.\u00a0 They don\u2019t thrill us any longer.\u00a0 Been there, done that.\u00a0 Competing apps are working for incremental share of attention, and the economics are starting to work against them.\u00a0 If you\u2019re innovating with this round-trip data model, you\u2019re probably too late.<\/p>\n<p>What everyone really wants to know is what\u2019s the next big thing.\u00a0 What will cause us to become even more attached to our cell phones, or perhaps our smart earbuds or augmented glasses?\u00a0 That thing is most likely to be \u2018<strong>edge<\/strong>\u2019 or \u2018<strong>real time computer vision<\/strong>\u2019 (RTCV).<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>What Can RTCV Do that Regular Computer Vision Can\u2019t<\/strong><\/span><\/p>\n<p>Pretty much any task that relies on movie-like live vision (30 fps, roughly 33 ms response or better) can\u2019t really be addressed by the round-trip data model.\u00a0 Here are just a few: drones, autonomous vehicles, augmented reality, robotics, cashierless checkout, video monitoring for retail theft, monitoring for driver 
alertness.<\/p>\n<p>Wait a minute, you say.\u00a0 All those things currently exist and function.\u00a0 Well, yes and no.\u00a0 Everyone\u2019s working on chips to make this faster, but there\u2019s a cliff coming in two forms.\u00a0 First, the sort of Moore\u2019s law that applies to this round-trip speed, and second, the sheer volume of devices that want to utilize RTCV.<\/p>\n<p>What about 5G?\u00a0 That\u2019s a temporary plug.\u00a0 The fact is that the round-trip data architecture is at some level always going to be inefficient, and what we need is a chip that can do RTCV that is small enough, cheap enough, and fast enough to live directly on our phone or other edge device.\u00a0<\/p>\n<p>By the way, if the processing is happening on your phone, then the potential for intercepting the data stream in transit largely goes away.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>AR Enhanced Navigation is Most Likely to be the Killer App\u00a0<\/strong><\/span><\/p>\n<p>Here\u2019s a screenshot of an app called Phiar, shown by co-founder and CEO Chen Ping Yu at last week\u2019s \u2018Inside AI Live\u2019 event.\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677081148?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677081148?profile=RESIZE_710x\" width=\"500\" class=\"align-center\"><\/a><\/p>\n<p>Chen Ping\u2019s point is simple.\u00a0 Since the camera on our phone is most likely already facing forward, why not just overlay the instructions directly on the image AR-style and eliminate the confusion caused by having to glance away at a 2D map?<\/p>\n<p>This type of app requires at least that 30 fps processing speed.\u00a0 Essentially, AR overlays on real-time images are the core of RTCV, and you can begin to see the attraction.<\/p>\n<p><span style=\"font-size: 
12pt;\">\u00a0<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><strong>The Needed Compute is Just Going Wild<\/strong><\/span><\/p>\n<p>Even without RTCV, our most successful DNNs are winning by using ever larger amounts of compute.\u00a0<\/p>\n<p>Gordon Wilson, CEO of RAIN Neuromorphics, used this graph at that same \u2018Inside AI Live\u2019 event to illustrate these points.\u00a0 From AlexNet in 2012 to AlphaGo in 2018, the breakthrough winners have done it with a <strong>300,000x<\/strong> increase in needed compute.\u00a0<\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677104771?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2677104771?profile=RESIZE_710x\" width=\"450\" class=\"align-center\"><\/a><\/p>\n<p>That doubling every 3 \u00bd months is driven by ever larger DNNs with more neurons processing more features and ever larger datasets.\u00a0 And to be specific, it\u2019s not that our DNNs are just getting deeper with more layers; the real problem is that they need to get wider, with more input neurons corresponding to more features.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>What\u2019s Wrong with Simply More and Faster GPUs?<\/strong><\/span><\/p>\n<p>Despite the best efforts of Nvidia, the clear king of GPUs for DNNs, the hardware stack needed to train and run computer vision apps is about the size of three or four laptops stacked on top of each other.\u00a0 That\u2019s not going to fit in your phone.<\/p>\n<p>For some really interesting technical reasons that we\u2019ll discuss in our next article, it\u2019s not likely that GPUs are going to cut it for next-gen edge processing of computer vision in real time.<\/p>\n<p>The good news is that there are on the order of 100 companies working on AI-specific silicon, ranging from giants like Nvidia down to a host 
of startups.\u00a0 Perhaps most interesting is that this app will introduce the era of the neuromorphic chip.\u00a0 More on the details next week.<\/p>\n<p>\u00a0<\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blog\/list?user=0h5qapp2gbuf8\"><em><u>Other articles by Bill Vorhies<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p><strong>About the author:<\/strong>\u00a0 Bill is Contributing Editor for Data Science Central.\u00a0 Bill is also President &#038; Chief Data Scientist at Data-Magnum and has practiced as a data scientist since 2001.\u00a0 His articles have been read more than 1.5 million times.<\/p>\n<p>He can be reached at:<\/p>\n<p><a href=\"mailto:Bill@DataScienceCentral.com\">Bill@DataScienceCentral.com<\/a> <span>or<\/span> <a href=\"mailto:Bill@Data-Magnum.com\">Bill@Data-Magnum.com<\/a><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:831243\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: William Vorhies Summary: \u00a0Real Time Computer Vision (RTCV) that requires processing video DNNs at the edge is likely to be the next killer app [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/05\/31\/real-time-computer-vision-is-likely-to-be-the-next-killer-app-but-were-going-to-need-new-chips\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":463,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2212"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2212"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2212\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/466"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2212"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2212"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2212"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}