{"id":1062,"date":"2018-09-19T06:47:17","date_gmt":"2018-09-19T06:47:17","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/19\/the-fourth-way-to-practice-data-science-purpose-built-analytic-modules\/"},"modified":"2018-09-19T06:47:17","modified_gmt":"2018-09-19T06:47:17","slug":"the-fourth-way-to-practice-data-science-purpose-built-analytic-modules","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/19\/the-fourth-way-to-practice-data-science-purpose-built-analytic-modules\/","title":{"rendered":"The Fourth Way to Practice Data Science \u2013 Purpose Built Analytic Modules"},"content":{"rendered":"<p>Author: William Vorhies<\/p>\n<div>\n<p><strong><em>Summary:<\/em><\/strong><em>\u00a0 Purpose Built Analytic Modules (PBAMs) such as those for Fraud Detection represent a fourth way to practice data science, a new model for the good use of Citizen Data Scientists, and a new market for AI-first companies.<\/em><\/p>\n<p>\u00a0<\/p>\n<p><a href=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBJHQM0Qm4DGetIN58leuL9Cb2Hnsnp2iJB956ZVLU1aH*R-m*WsIt93O5PX0SNEw-sEYIFDAE8m6BNe8aRTobsO\/moonshot.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBJHQM0Qm4DGetIN58leuL9Cb2Hnsnp2iJB956ZVLU1aH*R-m*WsIt93O5PX0SNEw-sEYIFDAE8m6BNe8aRTobsO\/moonshot.png?width=350\" width=\"350\" class=\"align-right\"><\/a>It appears that data science has exited its age of exploration and entered into its <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/what-comes-after-deep-learning\"><em><u>age of consolidation and refinement<\/u><\/em><\/a>.\u00a0 That doesn\u2019t mean that we aren\u2019t making improvements but increasingly these tend to be incremental and not the big exciting breakthroughs we made through about 2016.\u00a0 More like a mission on the International Space Station and less like those original walks on the moon.<\/p>\n<p>Nothing wrong with that.\u00a0 In fact in our maturity 
we\u2019re doing two things at once, becoming more specialized and at the same time setting a larger place at the table for new members, representing the team sport that data science has become.\u00a0 No more unicorns, and at the same time achieving what we wanted all along, much wider and deeper adoption of advanced analytics by companies of all sizes and types.<\/p>\n<p>There seem to be four distinct schools emerging for how to practice data science.\u00a0 No they\u2019re not exclusive.\u00a0 There\u2019s plenty of crossover.\u00a0 But see if you don\u2019t recognize these as four fairly unique tribes of practitioners.<\/p>\n<ol>\n<li><strong>Write the code:<\/strong> Do it from the ground up in Python, R, Scala or whatever you like.\u00a0 Make it just the way you like it.<\/li>\n<li><strong>Drag-and-Drop:<\/strong> Plenty of platforms offer the efficiency and simplicity of drag-and-drop environments (e.g. SAS, SPSS, Alteryx, etc.).\u00a0 Many use cases can get the same level of accuracy much more quickly, with fewer resources, and greater standardization.<\/li>\n<li><strong>Automated Machine Learning (AML):<\/strong> Skip the drag-and-drop altogether and let ML tune and select the champion models.\u00a0 Some now include even data cleaning and feature engineering. (e.g. 
DataRobot, Tazi, etc.).<\/li>\n<\/ol>\n<p>And now a fourth category that\u2019s existed in plain sight for some time.<\/p>\n<ol start=\"4\">\n<li><strong>Purpose Built Analytic Modules (PBAMs):<\/strong> Highly tuned special purpose modules such as those for fraud detection.\u00a0 These are practically plug-and-play in the industries and applications for which they\u2019re targeted.\u00a0 And they allow Citizen Data Scientists (aka business analysts and some LOB managers) to operate advanced ML without the need to extensively configure the underlying DS techniques.<\/li>\n<\/ol>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Fraud Detection as the Paradigm<\/strong><\/span><\/p>\n<p>PBAMs for fraud detection have been in the market for several years but it was the most recent Forrester report on Enterprise Fraud Management applications that caused me to stop and reflect on this.<\/p>\n<p>As we all recognize, you can build your own fraud detection program from any number of platforms and techniques, including coding it up from scratch.\u00a0 Graph databases, various anomaly detection routines, good old-fashioned supervised models, and now even deep learning techniques can all be harnessed for this.<\/p>\n<p>If you identify as a data scientist you may scoff a little at this.\u00a0 Pushing the buttons on a highly simplified UI and reading preformatted reports and dashboards may not seem much like the data science we\u2019re used to.\u00a0 I get it.\u00a0 But frankly we\u2019re going to see more and more of this.\u00a0<\/p>\n<p>If you\u2019re old enough to remember business in the 90s, this is the ERPification of the reengineering movement that was originally built entirely on custom code.\u00a0 Sorry for the made-up word, but the continued simplification of DS into simplified analytic packages is a trend we will not be able to resist.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Market Drivers of Enterprise Fraud 
Management<\/strong><\/span><\/p>\n<p>This is a product that appeals to banks and financial services that provide credit, especially consumer.\u00a0 If you issue a credit card, or for that matter if you accept credit cards, the whole profitability of your business can turn on these losses.\u00a0<\/p>\n<p>In financial services the benchmark is 3 to 6 basis points for card fraud and 10 basis points for non-card fraud.\u00a0 Now those are extremely small percentages and you want to make sure you\u2019ve got absolutely the best detection system there is.\u00a0<\/p>\n<p>Those basis points of actual loss are only the tip of the iceberg.\u00a0 There are very significant costs associated with maintaining a staff of fraud investigators who review and approve pretty much every item flagged by the system.\u00a0 High false positive rates balloon these costs and also chase away your good customers when their legitimate transactions are falsely flagged as fraud.<\/p>\n<p>In other words, you want world class data scientists to build those for you and keep them updated.\u00a0 The question is, are your internal data scientists up to that level and is do-it-yourself really the way to go?<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>PBAMs Require Highly Standardized Processes Across Similar Businesses<\/strong><\/span><\/p>\n<p>The good news is that the business processes of banks and non-bank card lenders and acceptors are extremely similar, allowing for significant standardization of the module.\u00a0<\/p>\n<p>Yes the data inputs are going to be somewhat different and in fact, users of these PBAM fraud modules report that implementation adds 20% to 35% to the total project cost.\u00a0 But hey, if you\u2019ve ever implemented an ERP, that ratio is more like 100% so these numbers look pretty attractive.<\/p>\n<p>The standardization is critical since these have to be extremely easy to use.\u00a0 And that\u2019s because the users, Security &#038; Risk professionals 
(S&#038;R) are not data scientists.\u00a0 But they are the very definition of the Citizen Data Scientist.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Forrester and Enterprise Fraud Management (EFM)<\/strong><\/span><\/p>\n<p>Here\u2019s a little of what Forrester had to say.<\/p>\n<p>Old rules-based detections are dead and gone.\u00a0 ML is where it\u2019s at.<\/p>\n<p>Although the S&#038;R professionals that are the primary users are not data scientists they are quite tuned into the techniques used to identify potential fraud.\u00a0 They need the ability to further tune the sensitivity of the system which means understanding, if not building, the underlying data science.<\/p>\n<p>This means the EFM can\u2019t be just a black box.\u00a0 They have to provide for some user adjustment.\u00a0 They have to have sufficient transparency to explain themselves and also offer trends.\u00a0 Many of these institutions are subject to regulatory audits in which they have to not only \u2018show their work\u2019 but also show that they\u2019re getting better over time.<\/p>\n<p>Here\u2019s the <a href=\"https:\/\/www.sas.com\/en_us\/news\/analyst-viewpoints\/forrester-names-sas-leader-in-enterprise-fraud-management.html\"><em><u>Forrester Wave chart for Enterprise Fraud Management<\/u><\/em><\/a> Q3 2018.<\/p>\n<p>\u00a0<a href=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBILwc88v3KRUvutH6J7T2i3fqxo4bTsU3AoD*mUUgPgXFm9eybdnruei2Y2EitAno4pXZyDFWRoVmquI9OR4M2A\/ForresterEFMQ32018.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBILwc88v3KRUvutH6J7T2i3fqxo4bTsU3AoD*mUUgPgXFm9eybdnruei2Y2EitAno4pXZyDFWRoVmquI9OR4M2A\/ForresterEFMQ32018.png?width=400\" width=\"400\" class=\"align-center\"><\/a><\/p>\n<p><span style=\"font-size: 12pt;\"><strong>What This Tells Us About the PBAM Market Segment<\/strong><\/span><\/p>\n<p>There are several things interesting about this result.\u00a0 First of all, there are only seven vendors that 
met the minimum requirements for inclusion.\u00a0 They range from specialty analytics companies like NICE Actimize, a smaller company focused on fighting financial crime, to the bigs like SAS and IBM which promote themselves as one-stop-shops for everything data science.<\/p>\n<p>Second, it\u2019s interesting that the competitors are both small specialty companies and just a few of the large providers of comprehensive analytic platforms.\u00a0 I would have expected to see more of the large competitors, and especially the drag-and-drops like Alteryx or the AMLs like DataRobot enter here.\u00a0 And where are Amazon, Microsoft, and Google that should be able to bring significant AI resources to bear on this?<\/p>\n<p>I think what this is telling us is that the movement to PBAMs is fairly new.\u00a0 I would have expected a much deeper bench of competitors, both large and small.\u00a0 I see this as an emerging trend and an opportunity for AI-first startups to establish a competitive position.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Isn\u2019t This Just AI?<\/strong><\/span><\/p>\n<p>Using the broader consumer definition, AI is any predictive analytic function that has been fully automated to provide a valuable action that replaces or augments a human action.\u00a0 Some of these do indeed involve our new deep learning capabilities of image and speech classification, but many can be created from classic predictive analytics, now allowed to take an automatic action.<\/p>\n<p><a href=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBKwx3*gv8FgsA3IQFSd1B9SqHZU4Vcw-iAIcG*OXXdcASToKdsjRTCSgFYTOuc4mofAcHexAmXe1u-ggDEv77bX\/analyticcontinuum.png\" target=\"_self\"><img decoding=\"async\" src=\"http:\/\/api.ning.com\/files\/cBpR2VtbQBKwx3*gv8FgsA3IQFSd1B9SqHZU4Vcw-iAIcG*OXXdcASToKdsjRTCSgFYTOuc4mofAcHexAmXe1u-ggDEv77bX\/analyticcontinuum.png?width=350\" width=\"350\" class=\"align-right\"><\/a>The easiest way to understand this is to simply see AI as a logical 
extension that started with predictive analytics (what is likely to happen), to prescriptive analytics (what should happen), and now to AI (the automation of the optimized prescriptive analytic).<\/p>\n<p>The key to the difference between AI and PBAMs is the automation.\u00a0 AI as we understand it is fully automated.\u00a0 Our deep learning AI looks at a part on an assembly line, detects that it is defective, and removes it.\u00a0 <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/what-makes-a-successful-ai-company\"><em><u>Blue River<\/u><\/em><\/a>, the company that has automated \u2018see and spray\u2019 optimization technology in agriculture uses image classification to inspect a lettuce plant and instantly take action to fertilize it or kill it.<\/p>\n<p>PBAMs however allow the user some control over the underlying analytics, and provide them with output that is typically evaluated with human judgement, not fully automated.\u00a0 In fraud, the control allows for adjusting the sensitivity of certain cases.\u00a0 Although specific transactions may be automatically held up, almost all are passed by a human S&#038;R reviewer before action to reject is taken.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Tools for the Citizen Data Scientist (CDS)<\/strong><\/span><\/p>\n<p>Ever since the phrase \u2018<a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/citizen-data-scientist-care-feeding-and-control\"><em><u>Citizen Data Scientist<\/u><\/em><\/a>\u2019 entered our lexicon we\u2019ve been gnashing our teeth over what it means to give non-data scientists access to data science tools.\u00a0 Originally it was almost unthinkable because of the experience and expertise required to prevent amateurs from making really serious mistakes with big financial consequences.<\/p>\n<p>The analytic platform vendors have continued to embrace the CDS market.\u00a0 Their motive however is that there are far more CDSs than data scientists 
and therefore many more seats to sell.\u00a0 The drag-and-drop and AML markets have at least in part been driven by this, in addition to legitimate goals of efficiency and standardization.<\/p>\n<p>It\u2019s old news that there are not enough data scientists to go around.\u00a0 That\u2019s one part of it.\u00a0 The other part however is that the cadres of business analysts have been looking for a legitimate role in the advanced analytics world, beyond their traditional BI domain.<\/p>\n<p>Businesses themselves have been embracing this.\u00a0 Just this week I hosted a <a href=\"https:\/\/www.datasciencecentral.com\/video\/dsc-webinar-series-embedded-analytics-data-science-shell-oil-case\"><em><u>DataScienceCentral webinar<\/u><\/em><\/a> in which the centerpiece was a case study of Shell Oil that has gone out of its way to leverage the CDS capabilities of analysts and LOB managers based on a sophisticated structure of support from their data science COE.\u00a0 They are not alone.<\/p>\n<p>Data scientists need to do what we do best.\u00a0 I\u2019m thinking that PBAMs, purpose built analytic modules, are the foundation for expanding the role of citizen data scientists surrounding the rare and more expensive data science core.<\/p>\n<p>Fraud is perhaps the most obvious specialty area.\u00a0 It\u2019s likely though that other PBAM process targets will arise providing companies with new and safe ways to leverage their citizen data scientists and provide both large and small AI-first vendors new markets to dominate.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><strong>Other articles on AI Strategy<\/strong><\/p>\n<p><em><u><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/from-strategy-to-implementation-planning-an-ai-first-company\">From Strategy to Implementation \u2013 Planning an AI-First Company<\/a><\/u><\/em><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/comparing-the-four-major-ai-strategies\"><em><u>Comparing the Four Major AI 
Strategies<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/comparing-ai-strategies-systems-of-intelligence\"><em><u>Comparing AI Strategies \u2013 Systems of Intelligence<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/comparing-ai-strategies-vertical-vs-horizontal\"><em><u>Comparing AI Strategies \u2013 Vertical versus Horizontal.<\/u><\/em><\/a><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/what-makes-a-successful-ai-company\"><em><u>What Makes a Successful AI Company<\/u><\/em><\/a> <span><em><u>\u2013 Data Dominance<\/u><\/em><\/span><\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/ai-strategies-incremental-and-fundamental-improvements\"><em><u>AI Strategies \u2013 Incremental and Fundamental Improvements<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blog\/list?user=0h5qapp2gbuf8\"><em><u>Other articles by Bill Vorhies.<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p>About the author:\u00a0 Bill Vorhies is Editorial Director for Data Science Central and has practiced as a data scientist since 2001.\u00a0 He can be reached at:<\/p>\n<p><a href=\"mailto:Bill@Data-Magnum.com\">Bill@Data-Magnum.com<\/a> <span>or<\/span> <a href=\"mailto:Bill@DataScienceCentral.com\">Bill@DataScienceCentral.com<\/a><\/p>\n<p><span>\u00a0<\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:761448\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: William Vorhies Summary:\u00a0 Purpose Built Analytic Modules (PBAMs) such as those for Fraud Detection represent a fourth way to practice data science, a new [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/19\/the-fourth-way-to-practice-data-science-purpose-built-analytic-modules\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":465,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1062"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1062"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1062\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/463"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1062"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1062"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1062"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}