{"id":2230,"date":"2019-06-05T06:32:37","date_gmt":"2019-06-05T06:32:37","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/06\/05\/neuromorphic-chips-and-the-future-of-your-cell-phone\/"},"modified":"2019-06-05T06:32:37","modified_gmt":"2019-06-05T06:32:37","slug":"neuromorphic-chips-and-the-future-of-your-cell-phone","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/06\/05\/neuromorphic-chips-and-the-future-of-your-cell-phone\/","title":{"rendered":"Neuromorphic Chips and the Future of Your Cell Phone"},"content":{"rendered":"<p>Author: William Vorhies<\/p>\n<div>\n<p><strong><em>Summary:<\/em><\/strong><em>\u00a0 The ability to train large scale CNNs directly on your cell phone without sending the data round trip to the cloud is the key to next gen AI applications like real time computer vision and safe self-driving cars.\u00a0 Problem is our current GPU AI chips won\u2019t get us there.\u00a0 But neuromorphic chips look like they will.<\/em><\/p>\n<p>\u00a0<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760555134?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760555134?profile=RESIZE_710x\" width=\"350\" class=\"align-right\"><\/a>This article is particularly fun for me since it brings together two developments that I didn\u2019t see coming together: real time computer vision (RTCV) and neuromorphic neural nets (aka spiking neural nets).<\/p>\n<p>We\u2019ve been following neuromorphic nets for a few years now (additional references at the end of this article) and viewed them as the next generation (3<sup>rd<\/sup> generation) of neural nets.\u00a0 This was mostly in the context of the pursuit of Artificial General Intelligence (AGI) which is the holy grail (or terrifying terminator) of all we\u2019ve been doing.<\/p>\n<p>Where we got off track was in thinking that 
neuromorphic nets that are just in their infancy were only for AGI.\u00a0 Turns out that they facilitate a lot of closer-in capabilities, and among them could be real time computer vision (RTCV).\u00a0 Why that\u2019s true turns out to have more to do with how neuromorphics are structured than what fancy things they may be able to do.\u00a0 Here\u2019s the story.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Real Time Computer Vision<\/strong><\/span><\/p>\n<p>In our last article we argued that <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/real-time-computer-vision-is-likely-to-be-the-next-killer-app-but\"><em><u>RTCV will be the next killer app<\/u><\/em><\/a> that makes us once again fall in love with our phones and other mobile devices.\u00a0 Problem is that the road map to get there can\u2019t be supported by our current GPU chips or the architecture that requires that the signal go from the phone to the cloud and back again.\u00a0<\/p>\n<p>That round trip data flow won\u2019t support the roughly 30 frames per second processing RTCV needs, and current chips can\u2019t be made small enough to fit directly on the phone.\u00a0 This is a benchmark case where extremely powerful CNN edge compute is needed.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Problem with Current GPUs<\/strong><\/span><\/p>\n<p>This story was told quite eloquently by Gordon Wilson, CEO of RAIN Neuromorphics at the \u2018Inside AI Live\u2019 conference and I will borrow from his explanation.<\/p>\n<p>The first part of the story is to remember that CNNs are based on matrix multiplication.\u00a0 Basically this requires that all the elements in the matrix be multiplied by all the values in the additional dimension.\u00a0 In other words lots of small independent operations.<\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760562685?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img 
decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760562685?profile=RESIZE_710x\" width=\"450\" class=\"align-center\"><\/a><\/p>\n<p>So the matrix algebra can calculate all the values in the three-dimensional matrix necessary for the classification of images through the convolution + ReLU and pooling steps of a CNN.<\/p>\n<p>The happy breakthrough we had several years ago was in recognizing that GPU chips (graphics processing units) were doing a very similar thing in which each of the elements of the matrix was a pixel on a screen that needed to be recalculated very rapidly to keep up with the movie-like animation.\u00a0<\/p>\n<p>GPUs did that by using a chip architecture in which a separate and very simple processing and storage unit was created for each element of the matrix.\u00a0 Doing these calculations in parallel made the screen refresh very fast.<\/p>\n<p>Turns out that pixels on a screen are a very close analog to the weights of the neurons in the convolution value capture step of the CNN.\u00a0 GPUs could do this in parallel for each neuron making it finally fast enough to keep up with our compute needs.<\/p>\n<p>GPU architecture has been largely unchanged since at least 2006 and it\u2019s likely that most cloud compute AI data center chips will continue to look like this for years to come.\u00a0 This works quite well for the types of image compute AI problems we\u2019ve been encountering.\u00a0 Problem is it won\u2019t scale up to the level needed for RTCV.<\/p>\n<p>It\u2019s simple mechanics.\u00a0 To process problems that have greater dimensions (both width \u2013 features, and depth \u2013 layers), the chips would have to grow at a rate of n<sup>2<\/sup> (number of nodes squared).\u00a0 But the GPUs we have based on the simple von Neumann architecture are already approaching a kind of Moore\u2019s law limit.\u00a0<\/p>\n<p>Last week, Intel announced that their premier CPU-based AI stack had outperformed 
NVIDIA\u2019s best GPU stack by processing 7878 images per second on the ResNet 50 test compared to NVIDIA\u2019s 7844 images.\u00a0 That\u2019s where we are: improvements of 0.4% are considered a big deal.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Another Architecture You May Not Have Heard Of<\/strong><\/span><\/p>\n<p>Another chip architecture that has been getting a lot of attention lately is analog crossbar CNN accelerators.\u00a0 That\u2019s a mouthful, but as simply as I can explain it, this is a basic tiny wire grid where voltages representing the input values enter on one axis and the signals representing the values passed on to the next level exit on the other axis.\u00a0 The values at the intersections of the two sets of grid wires can be read directly (analog) as representing the weights of the nodes.<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760574488?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760574488?profile=RESIZE_710x\" width=\"450\" class=\"align-center\"><\/a><\/p>\n<p>Typically these are found on very small silicon devices called memristors and don\u2019t rely on the von Neumann architecture of process + storage at each intersection (node).\u00a0 As a result they are extremely fast, small, and low power.<\/p>\n<p>The problem is that the physical representation of the CNN on the memristor requires as many input and output nodes as the CNN you want to train.\u00a0 As that number is very large for RTCV, there simply isn\u2019t enough space on a chip.\u00a0 So like GPUs, they fail to scale adequately for this type of application.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Sparsity<\/strong><\/span><\/p>\n<p>Over about the last 18 months the academic research community has been demonstrating that traditional CNNs can be compressed by as much as 90% by simply 
leaving out the nodes that don\u2019t add to the final calculation.\u00a0 The concept is known as sparsity.<\/p>\n<p>Our CNNs (and our crossbar networks) are all fully connected, meaning that every node in a layer is connected to every node in the next layer.\u00a0 The brain doesn\u2019t work this way and we\u2019ve long suspected that CNNs could be made more compact, and therefore require less compute, if only we could figure out which nodes to eliminate.<\/p>\n<p>So far researchers haven\u2019t been able to anticipate which 90% of nodes to leave out during training but they have made some advances in slimming down pretrained CNNs by eliminating the nodes that add no value.\u00a0 Indications are that sparse CNNs could be 90% smaller and therefore faster and less power hungry.\u00a0 Some tests show they can even be more accurate.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Neuromorphic Chips<\/strong><\/span><\/p>\n<p>Neuromorphic chips as a group are based on design elements that more closely mimic how the brain actually works.\u00a0 That includes the concept of sparsity (not all brain neurons are \u2018fully connected\u2019).\u00a0 It also includes the concept of interpreting the spiking signal train produced by the neurons.\u00a0 Is there information in the amplitude, the number of recurring spikes, the time delay between spikes, or all of these?\u00a0<\/p>\n<p>IBM created the TrueNorth neuromorphic chip for DARPA from a project that started in 2008.\u00a0 Long term startups like Numenta are exploring even more fundamental brain phenomena like hierarchical temporal memory and have commercially licensed products. 
\u00a0BrainChip Holdings out of California is listed on the Australian stock exchange and has neuromorphic chips, already in use in Las Vegas casinos to spot cheaters, that learn casino games like blackjack from video feeds.<\/p>\n<p>The point is that the developing field of neuromorphic chips is not a single concept or architecture and the goals of the individual startups are not the same.\u00a0 However, RAIN Neuromorphics, which presented at the \u2018Inside AI Live\u2019 conference, is focused on sparsity and edge compute applications like RTCV.<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760587371?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/2760587371?profile=RESIZE_710x\" width=\"200\" class=\"align-right\"><\/a>A picture of their first product, composed of a network of randomly generated nano-wires, looks surprisingly like the sparse connections in brain neurons.\u00a0 Their reported performance also has the breakthrough characteristics of representing a very large CNN in an extremely miniaturized and low power chip.\u00a0<\/p>\n<p>Per Gordon Wilson, their CEO, it may take a few more years to make the chip as small as they want but it appears their roadmap will take them to a chip that could indeed fit in your cell phone and train a large scale CNN on-the-fly.\u00a0<\/p>\n<p>And that is how neuromorphic chips may well rekindle our love affair with our cell phones by enabling the next generation of video-like applications such as AR enhanced navigation.\u00a0 There are 4 billion cell phone users in the world and whole new generations of self-driving vehicles just waiting for this.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Other articles on Neuromorphic Neural Nets<\/strong><\/span><\/p>\n<p><a 
href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/a-wetware-approach-to-artificial-general-intelligence-agi\"><em><u>A Wetware Approach to Artificial General Intelligence (AGI)<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/off-the-beaten-path-htm-based-strong-ai-beats-rnns-and-cnns-at-pr\"><em><u>Off the Beaten Path &#8211; HTM-based Strong AI Beats RNNs and CNNs at Prediction and Anomaly Detection<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/the-three-way-race-to-the-future-of-ai-quantum-vs-neuromorphic-vs\"><em><u>The Three Way Race to the Future of AI. Quantum vs. Neuromorphic vs. High Performance Computing<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/in-search-of-artificial-general-intelligence-agi\"><em><u>In Search of Artificial General Intelligence (AGI)<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/more-on-3rd-generation-spiking-neural-nets\"><em><u>More on 3rd Generation Spiking Neural Nets<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/beyond-deep-learning-3rd-generation-neural-nets\"><em><u>Beyond Deep Learning \u2013 3rd Generation Neural Nets<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blog\/list?user=0h5qapp2gbuf8\"><em><u>Other articles by Bill Vorhies<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p>About the author:\u00a0 Bill is Contributing Editor for Data Science Central.\u00a0 Bill is also President &#038; Chief Data Scientist at Data-Magnum and has practiced as a data scientist since 2001.\u00a0 His articles have been read more than 1.5 million times.<\/p>\n<p>He can be reached at:<\/p>\n<p><a href=\"mailto:Bill@DataScienceCentral.com\">Bill@DataScienceCentral.com<\/a> <span>or<\/span> <a 
href=\"mailto:Bill@Data-Magnum.com\">Bill@Data-Magnum.com<\/a><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:833972\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: William Vorhies Summary:\u00a0 The ability to train large scale CNNs directly on your cell phone without send the data round trip to the cloud [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/06\/05\/neuromorphic-chips-and-the-future-of-your-cell-phone\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":459,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2230"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2230"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2230\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/458"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2230"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/cat
egories?post=2230"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2230"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}