{"id":901,"date":"2018-08-14T06:41:26","date_gmt":"2018-08-14T06:41:26","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/14\/off-the-beaten-path-htm-based-strong-ai-beats-rnns-and-cnns-at-prediction-and-anomaly-detection\/"},"modified":"2018-08-14T06:41:26","modified_gmt":"2018-08-14T06:41:26","slug":"off-the-beaten-path-htm-based-strong-ai-beats-rnns-and-cnns-at-prediction-and-anomaly-detection","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/14\/off-the-beaten-path-htm-based-strong-ai-beats-rnns-and-cnns-at-prediction-and-anomaly-detection\/","title":{"rendered":"Off the Beaten Path &#8211; HTM-based Strong AI Beats RNNs and CNNs at Prediction and Anomaly Detection"},"content":{"rendered":"<p>Author: William Vorhies<\/p>\n<div>\n<p><strong><em>Summary:<\/em><\/strong> <em>This is the second in our \u201cOff the Beaten Path\u201d series looking at innovators in machine learning who have elected strategies and methods outside of the mainstream.\u00a0 In this article we look at Numenta\u2019s unique approach to scalar prediction and anomaly detection based on their own brain research.<\/em><\/p>\n<p><em>\u00a0<\/em><\/p>\n<p><a href=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLFB32eSOd6g1EGn852ASU5W-cTwlh-E3E49ScFM5B1wgLtW2Qud3iF9pf4j9NndgldBtmRFLtBXN9XS1rnG1RYn\/strongvsweakaismall.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLFB32eSOd6g1EGn852ASU5W-cTwlh-E3E49ScFM5B1wgLtW2Qud3iF9pf4j9NndgldBtmRFLtBXN9XS1rnG1RYn\/strongvsweakaismall.png?width=250\" width=\"250\" class=\"align-right\"><\/a>Numenta, the machine intelligence company founded in 2005 by Jeff Hawkins of Palm Pilot fame might well be the poster child for \u2018off the beaten path\u2019.\u00a0 More a research laboratory than commercial venture, Hawkins has been pursuing a strong-AI model of computation that will at once directly model the human brain, and as a result be a general purpose solution to 
all types of machine learning problems.<\/p>\n<p>After swimming against the tide of the \u2018narrow\u2019 or \u2018weak\u2019 AI approaches represented by deep learning\u2019s CNNs and RNN\/LSTMs, his bet is starting to pay off.\u00a0 There are now benchmarked studies showing that Numenta\u2019s strong AI computational approach can outperform CNN\/RNN based deep learning (which Hawkins characterizes as \u2018classic\u2019 AI) at scalar predictions (future values of things like commodity, energy, or stock prices) and at anomaly detection.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>How Is It Different from Current Deep Learning<\/strong><\/span><\/p>\n<p>The \u2018strong AI\u2019 approach pursued by Numenta relies on computational models drawn directly from the brain\u2019s own architecture.\u00a0<\/p>\n<p>\u2018Weak AI\u2019, by contrast, represented by the full panoply of deep neural nets, acknowledges that it is only suggestive of true brain function, but it gets results.\u00a0<\/p>\n<p>We are all aware of the successes in image, video, text, and speech analysis that CNNs and RNN\/LSTMs have achieved, and that is their primary defense.\u00a0 They work.\u00a0 They give good commercial results.\u00a0 But we are also beginning to recognize their weaknesses: large training sets, susceptibility to noise, long training times, complex setup, inability to adapt to changing data, and time invariance.\u00a0 These weaknesses begin to show us where the limits of their development lie.<\/p>\n<p>Numenta\u2019s computational approach has a few similarities to these and many unique contributions that require those of us involved in deep learning to consider a wholly different computational paradigm.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Hierarchical Temporal Memory (HTM)<\/strong><\/span><\/p>\n<p>It would take several articles of this length to do justice to the methods introduced by Numenta.\u00a0 Here are the 
highlights.<\/p>\n<p><strong><a href=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLHbqCHjYoO*AdRN5pQc2s83lKvmw3TBbP46a2zKdTms0ARHAKVPu2Eu4D8wjkteelZ4lppvSsckkU7CLt*uXv2F\/HTMneuron.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLHbqCHjYoO*AdRN5pQc2s83lKvmw3TBbP46a2zKdTms0ARHAKVPu2Eu4D8wjkteelZ4lppvSsckkU7CLt*uXv2F\/HTMneuron.png?width=300\" width=\"300\" class=\"align-right\"><\/a>HTM and Time:<\/strong>\u00a0 Hawkins uses the term <strong>Hierarchical Temporal Memory (HTM)<\/strong> to describe his overall approach.\u00a0 The first key to understanding is that HTM relies on data that streams over time.\u00a0 Quoting from previous interviews, Hawkins says,<\/p>\n<p><em>&#8220;The brain does two things: it does inference, which is recognizing patterns, and it does behavior, which is generating patterns or generating motor behavior.\u00a0 Ninety-nine percent of inference is time-based \u2013 language, audition, touch \u2013 it&#8217;s all time-based. You can&#8217;t understand touch without moving your hand. 
The order in which patterns occur is very important.&#8221;<\/em><\/p>\n<p>So the \u2018memory\u2019 element of HTM is how the brain differently interprets or relates each sequential input, also called \u2018sequence memory\u2019.<\/p>\n<p>By contrast, conventional deep learning uses static data and is therefore time invariant.\u00a0 Even RNN\/LSTMs that process speech, which is time based, actually do so on static datasets.<\/p>\n<p>\u00a0<\/p>\n<p><strong>Sparse Distributed Representations (SDRs):<\/strong>\u00a0 Each momentary input to the brain, for example from the eye or from touch, is understood to be some subset of all the available neurons in that sensor (eye, ear, finger) firing and forwarding that signal upward to other neurons for processing.<\/p>\n<p>Since not all the available neurons fire for each input, the signal sent forward can be seen as a Sparse Distributed Representation (SDR) of those that have fired (the 1s in the binary representation) versus the hundreds or thousands that have not (the 0s in the binary representation).\u00a0 We know from research that on average only about 2% of neurons fire with any given event, which gives meaning to the term \u2018sparse\u2019.<\/p>\n<p>SDRs lend themselves to vector arrays and, because they are sparse, have the interesting characteristics that they can be extensively compressed without losing meaning and are very resistant to noise and false positives.<\/p>\n<p>By comparison, deep neural nets fire all neurons in each layer, or at least those that have reached the impulse threshold.\u00a0 Current researchers acknowledge this as a drawback in moving DNNs much beyond where they are today.<\/p>\n<p>\u00a0<\/p>\n<p><strong>Learning is Continuous and Unsupervised:\u00a0<\/strong> Like CNNs and RNNs, this is a feature generating system that learns from unlabeled data.<\/p>\n<p>When we look at diagrams of CNNs and RNNs they are typically shown as multiple (deep) layers of neurons which decrease in pyramidal fashion 
as the signal progresses, presumably discovering features in this self-constricting architecture down to the final classifying layer.<\/p>\n<p>HTM architecture, by contrast, is simply columnar, with columns of computational neurons passing the information upward to layers in which pattern discovery and recognition occurs organically by the comparison of one SDR (a single time signal) to the others in the signal train.\u00a0<\/p>\n<p>HTM has the characteristic that it discovers these patterns very rapidly, with as few as roughly 1,000 SDR observations.\u00a0 This compares with the hundreds of thousands or millions of observations necessary to train CNNs or RNNs.<\/p>\n<p>Also, the pattern recognition is unsupervised and can recognize and generalize about changes in the pattern based on changing inputs as soon as they occur.\u00a0 This results in a system that not only trains remarkably quickly but also is self-learning, adaptive, and not confused by changes in the data or by noise.<\/p>\n<p>Numenta offers a deep library of explanatory papers and YouTube videos for those wanting to experiment hands-on.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Where Does HTM Excel<\/strong><\/span><\/p>\n<p>HTM has for many years been a work in progress.\u00a0 That has recently changed.\u00a0 Numenta has published several peer-reviewed performance papers and established benchmarking in areas of its strength that highlight its superiority over traditional DNNs and other ML methods on particular types of problems.<\/p>\n<p>In general, Numenta says that the current state of its technology, represented by its open source project NuPIC (Numenta Platform for Intelligent Computing), excels in three areas:<\/p>\n<p><strong>Anomaly Detection in streaming data<\/strong>.\u00a0 For example:<\/p>\n<ul>\n<li>Highlighting anomalies in the behavior of moving objects, such as tracking a fleet\u2019s movements on a truck by truck basis using geospatial 
data.<\/li>\n<li>Understanding whether human behavior is normal or abnormal on a securities trading floor.<\/li>\n<li>Predicting failure in a complex machine based on data from many sensors.<\/li>\n<\/ul>\n<p><strong>Scalar Predictions<\/strong>, for example:<\/p>\n<ul>\n<li>Predicting energy usage for a utility on a customer by customer basis.<\/li>\n<li>Predicting New York City taxi passenger demand 2 \u00bd hours in advance based on a public data stream provided by the New York City Transportation Authority.<\/li>\n<\/ul>\n<p><strong>Highly Accurate Semantic Search<\/strong> on static and streaming data <em>(these examples are from Cortical.io, a Numenta commercial partner using the SDR concept but not NuPIC)<\/em>.<\/p>\n<ul>\n<li>Automate extraction of key information from contracts and other legal documents.<\/li>\n<li>Quickly find similar cases to efficiently solve support requests.<\/li>\n<li>Extract topics from different data sources (e.g. emails, social media) and determine customers\u2019 intent.<\/li>\n<li>Terrorism Prevention: Monitor all social media messages alluding to terrorist activity even if they don\u2019t use known keywords.<\/li>\n<li>Reputation Management: Track all social media posts mentioning a business area or product type without having to type hundreds of keywords.<\/li>\n<\/ul>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Two Specific Examples of Performance Better Than DNNs<\/strong><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Taxi Demand Forecast<\/strong><\/span><\/p>\n<p>In this project, the objective was to predict the demand for New York City taxi service 2 \u00bd hours in advance based on a public data stream provided by the New York City Transportation Authority.\u00a0 This was based on historical streaming data at 30-minute intervals using the previous 1,000, 3,000, or 6,000 observations as the basis for the forward projection 5 periods (2 \u00bd hours) in advance.\u00a0 The study (<a 
href=\"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/NECO_a_00893\"><em><u>which you can see here<\/u><\/em><\/a>) compared ARIMA, TDNN, and LSTM to the HTM method where HTM demonstrated the lowest error rate.<\/p>\n<p>\u00a0<a href=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLEid4zDAFwhYYjDCrvVDobGjfYXmGqRIubIfk0F8r1OHu3KE73TWNuPCArenN1PMpz8Tg2KdVzrpEvVnOz798NB\/taxistudy.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLEid4zDAFwhYYjDCrvVDobGjfYXmGqRIubIfk0F8r1OHu3KE73TWNuPCArenN1PMpz8Tg2KdVzrpEvVnOz798NB\/taxistudy.png?width=450\" width=\"450\" class=\"align-center\"><\/a><\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Machine Failure Prediction (Anomaly)<\/strong><\/span><\/p>\n<p>The objective of this test was to compare two of the most popular anomaly detection routines (Twitter ADVec and Etsy Skyline) against HTM in a machine failure scenario.\u00a0 In this type of IoT application it\u2019s important that the analytics detect all the anomalies present, detect them as far before occurrence as possible, trigger no false alarms (false positives), and work with real time data.\u00a0 A full description of the study can be <a href=\"https:\/\/arxiv.org\/ftp\/arxiv\/papers\/1510\/1510.03336.pdf\"><em><u>found here<\/u><\/em><\/a>.<\/p>\n<p>The results showed that the Numenta HTM outperformed the other methods by a significant margin.\u00a0<a href=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLHaXD3n5SnYIR6BJ8476R0JARkZpWYqARz8LCKX2SmLq6EXt7mEi1HaypYtseGXA0AeCJDBGWDnYYlbPXMqtybq\/anomalytable.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLHaXD3n5SnYIR6BJ8476R0JARkZpWYqARz8LCKX2SmLq6EXt7mEi1HaypYtseGXA0AeCJDBGWDnYYlbPXMqtybq\/anomalytable.png?width=350\" width=\"350\" class=\"align-center\"><\/a><\/p>\n<p>Even more significantly, as noted in the caption below, the Numenta HTM method identified the potential failure a full 3 hours before the other 
techniques.<\/p>\n<p>\u00a0<a href=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLGUFFZKyAaKzs-TGzSRNbecni7z5Dj3t9dNqofvq64WqiLHBaepiMezMF7qERYmCO-arETfIbW*VxHOusI6emZv\/anomalychart.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/4QC9xOMBQLGUFFZKyAaKzs-TGzSRNbecni7z5Dj3t9dNqofvq64WqiLHBaepiMezMF7qERYmCO-arETfIbW*VxHOusI6emZv\/anomalychart.png?width=450\" width=\"450\" class=\"align-center\"><\/a><\/p>\n<p>You can find other benchmark studies on the Numenta site.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>The Path Forward<\/strong><\/span><\/p>\n<p>Several things are worthy of note here since, as we mentioned earlier, the Numenta HTM platform is still a work in progress.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Commercialization<\/strong><\/span><\/p>\n<p>Numenta\u2019s business model currently calls for it to be the center of a commercial ecosystem while retaining its primary research focus.\u00a0 Currently Numenta has two commercially licensed partners.\u00a0 The first is Cortical.io, which focuses on streaming text and semantic interpretation.\u00a0 The second is Grok (Grokstream.com), which has adapted the NuPIC core platform for anomaly detection in all types of IT operational scenarios.\u00a0 The core NuPIC platform is open source if you\u2019re motivated to experiment with potential commercial applications.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Image and Text Classification<\/strong><\/span><\/p>\n<p>A notable absence from the current list of capabilities is image and text classification.\u00a0 There are no current plans for Numenta to develop image classification from static data since that is not on the critical path defined by streaming data.\u00a0 It\u2019s worth noting that others have demonstrated the use of HTM as a superior technique for image classification without using the NuPIC platform.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Near Term 
Innovation<\/strong><\/span><\/p>\n<p>In my conversation with Christy Maver, Numenta\u2019s VP of Marketing, she said that they\u00a0are confident that they will have a fairly complete framework for how the neocortex works\u00a0within the timeframe of perhaps a year.\u00a0 This last push is in the area of sensorimotor integration that would be the core concept in applying the HTM architecture to robotics.<\/p>\n<p>For commercial development, the focus will be on partners to license the IP.\u00a0 Even IBM established a Cortical Research Center a few years back staffed with about 100 researchers to examine the Numenta HTM approach.\u00a0 Like so many others now trying to advance AI by more closely modeling the brain, IBM, like Intel and others, has moved off in the direction of specialty chips that fall in the category of neuromorphic or spiking chips.\u00a0 Brainchip out of Irvine already has a spiking neuromorphic chip in commercial use.\u00a0 As Maver alluded, there may be a silicon representation of Numenta\u2019s HTM in the future.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>Other articles in this series:<\/p>\n<p><em><u><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/off-the-beaten-path-using-deep-forests-to-outperform-cnns-and-rnn\">Off the Beaten path \u2013 Using Deep Forests to Outperform CNNs and RNNs<\/a><\/u><\/em><\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>About the author:\u00a0 Bill Vorhies is Editorial Director for Data Science Central and has practiced as a data scientist since 2001.\u00a0 He can be reached at:<\/p>\n<p><a href=\"mailto:Bill@DataScienceCentral.com\">Bill@DataScienceCentral.com<\/a><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:695581\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: William Vorhies Summary: This is the second in our \u201cOff the Beaten Path\u201d series looking at innovators in machine learning who have elected 
strategies [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/14\/off-the-beaten-path-htm-based-strong-ai-beats-rnns-and-cnns-at-prediction-and-anomaly-detection\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":459,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/901"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=901"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/901\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/458"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=901"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=901"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=901"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}