{"id":2340,"date":"2019-07-09T06:40:18","date_gmt":"2019-07-09T06:40:18","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/09\/privacy-and-ai-how-much-should-we-really-care\/"},"modified":"2019-07-09T06:40:18","modified_gmt":"2019-07-09T06:40:18","slug":"privacy-and-ai-how-much-should-we-really-care","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/09\/privacy-and-ai-how-much-should-we-really-care\/","title":{"rendered":"Privacy and AI &#8211; How Much Should We Really Care"},"content":{"rendered":"<p>Author: William Vorhies<\/p>\n<div>\n<p><strong><em>Summary:<\/em><\/strong> <em>\u00a0More data means better models but we may be crossing over a line into what the public can tolerate, both in the types of data collected and our use of it.\u00a0 The public seems divided.\u00a0 Targeted advertising is good but the increased invasion of privacy is bad.<\/em><\/p>\n<p>\u00a0<\/p>\n<p>Headlines are full of alarm.\u00a0 The public is up in arms.\u00a0 The internet is stealing their privacy.\u00a0 Indeed, the Future of Humanity Institute at Oxford rates this as the most severe problem we will face over the next 10 years.<\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211777990?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211777990?profile=RESIZE_710x\" width=\"500\" class=\"align-center\"><\/a><\/p>\n<p>As data scientists how much should we care?\u00a0 Well more data means better models and less data means less accurate models.\u00a0 So in a sense the value we bring to the table will be directly impacted if government regulation takes many of our data sources off the table.\u00a0 So the answer is likely we should care a lot.<\/p>\n<p>However, \u201cprivacy\u201d has become what Marvin Minsky described as a \u2018suitcase word\u2019.\u00a0 That is it can carry such 
a variety of meanings that it can refer to many different experiences.\u00a0 After all, who doesn\u2019t want more privacy?\u00a0 Give me more.\u00a0 Similarly, \u2018freedom\u2019 or \u2018democracy\u2019 or \u2018safety\u2019 are words used so broadly that, if asked, the public will respond from their own personal point of view, not necessarily with what any survey is seeking to discover.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>Some Obvious Exceptions \u2013 Some Benefits<\/strong><\/span><\/p>\n<p>So is the public really alarmed about data privacy?\u00a0 Let\u2019s carve out a few obvious issues:<\/p>\n<p><strong>Government:<\/strong>\u00a0 The government doesn\u2019t sell us anything and we\u2019re probably right to be suspicious of government gathering too much of our personal data.\u00a0 Will it be used for us or against us?<\/p>\n<p><strong>Children:<\/strong> \u00a0We can probably all agree that children\u2019s data should be off limits for commercial use.\u00a0 Children don\u2019t yet have the judgement needed to keep from being manipulated by targeted advertising.\u00a0 It\u2019s hard enough for some adults.<\/p>\n<p><strong>Really Personal Stuff:<\/strong>\u00a0 For example, I don\u2019t think revealing my personal healthcare data is going to benefit me if I make it available for targeted advertising.\u00a0 We can probably make a short list of what constitutes information that should remain private.<\/p>\n<p>But aside from these limited carve outs you\u2019d think the public would welcome targeted advertising driven by our increasingly accurate models.\u00a0 Thanks to us, ecommerce has dramatically reduced advertising and distribution costs.<\/p>\n<p>Mary Meeker, in her well-known annual review of the internet, concluded in 2018 that ecommerce (and our targeted advertising) had actually driven down the online cost of goods by 3% in 2 \u00bc years and was probably responsible for another 1% reduction in the cost of offline goods 
through competition.<\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211779269?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211779269?profile=RESIZE_710x\" width=\"500\" class=\"align-center\"><\/a><\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>So What Does the Public Think About Targeted Advertising?<\/strong><\/span><\/p>\n<p>Let\u2019s separate out the issue of data privacy from the perceived value of targeted advertising driven by our models.\u00a0 It seems to depend a lot on who\u2019s asking the question and just as important, how the question is being asked.\u00a0 Here\u2019s a <a href=\"https:\/\/today.yougov.com\/opi\/surveys\/results#\/survey\/01894b4a-6c27-11e9-8012-358611e41108\/question\/73ad43ea-6c27-11e9-9bf2-ef0cdf3620c6\/social\"><em><u>survey conducted by YouGov<\/u><\/em><\/a> representing the negative point of view.<\/p>\n<p>\u00a0<a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211779834?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3211779834?profile=RESIZE_710x\" width=\"500\" class=\"align-center\"><\/a><\/p>\n<p>OK data scientists.\u00a0 This looks bad but you should have spotted that on average 22% of responses are missing and are in the \u2018don\u2019t know\u2019 (or more likely don\u2019t care) category.\u00a0 Second, whenever you have an \u2018opt in\u2019 survey you are much more likely to get strongly opinionated responses, not representative of the average Jack and Jill.<\/p>\n<p>Here\u2019s a sampling of findings from other surveys.<\/p>\n<p>Mary Meeker\u2019s 2019 Internet Review Study:<\/p>\n<ul>\n<li>91% prefer brands that provide personalized offers \/ recommendations.<\/li>\n<li>83% willing to passively share data in exchange for 
personalized experiences.<\/li>\n<li>74% willing to actively share data in return for personalized experiences.<\/li>\n<\/ul>\n<p>EMarketer.com did a <a href=\"https:\/\/www.emarketer.com\/content\/do-people-actually-want-personalized-ads\"><em><u>meta-analysis of other recent surveys<\/u><\/em><\/a> and reported:<\/p>\n<ul>\n<li>Adlucent found that 7 out of 10 customers yearn for personalized ads.<\/li>\n<li>A survey by Epsilon found that 80% of customers were more likely to make a purchase when brands offered a more personalized experience.<\/li>\n<\/ul>\n<p>Adobe backs this up with compelling stats:<\/p>\n<ul>\n<li>67% of respondents said it&#8217;s important for brands to automatically adjust content based on their current context.<\/li>\n<li>42% get annoyed when their content isn&#8217;t personalized.<\/li>\n<\/ul>\n<p>So if you come at the question from the standpoint of value, the public seems overwhelmingly to say yes.\u00a0 When you come at it from the standpoint of privacy, the result is more likely to be 50\/50 or even negative.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>What\u2019s the Disconnect?<\/strong><\/span><\/p>\n<p>From a quick survey of other studies on the topic, it seems we\u2019re doing our job too well, or at least our advertising departments are when they make use of our models.\u00a0 A few customer responses:<\/p>\n<ul>\n<li>Digital ads are starting to feel psychic.<\/li>\n<li>Too many, too intrusive, too creepy (especially retargeting ads).<\/li>\n<\/ul>\n<p>Is the public simply of two minds on this topic?\u00a0 Is it possible to push too far in gathering more data?<\/p>\n<p>It\u2019s clear for starters that the public wants to know exactly how they are being tracked.\u00a0 There\u2019s <a href=\"https:\/\/aori.com\/blog\/targeted-ads-creepy\"><em><u>an apocryphal story<\/u><\/em><\/a> making its way around the internet that Facebook may be monitoring our conversations through our phones even when they\u2019re 
not in use.\u00a0<\/p>\n<p>The story involves a casual conversation between old friends at a bar where it comes to light that friend 1 has been working out and friend 2 has not.\u00a0 Upon leaving the bar friend 2 suddenly receives an ad for a gym membership with 20% off if he signs up today.\u00a0 True?\u00a0 Unconfirmed.\u00a0 Could just have been a coincidence based on equally creepy location based analysis.<\/p>\n<p>Too many targeted ads is the other half of this story and likely the reason that ad blockers are on the rise.\u00a0 <a href=\"https:\/\/blog.hubspot.com\/news-trends\/why-people-block-ads-and-what-it-means-for-marketers-and-advertisers?__hstc=93759874.28a32b82d4a257360ca6ce204bcb4d43.1552401358358.1552401358358.1552406693632.2&#038;__hssc=93759874.1.1552406693632&#038;__hsfp=1290674413\"><em><u>Hubspot reports<\/u><\/em><\/a> these top 6 reasons:<\/p>\n<ul>\n<li>annoying\/intrusive (64%)<\/li>\n<li>disrupt what I&#8217;m doing (54%)<\/li>\n<li>create security concerns (39%)<\/li>\n<li>better page load time\/reduced bandwidth use (36%)<\/li>\n<li>offensive\/inappropriate ad content (33%)<\/li>\n<li>privacy concerns (32%)<\/li>\n<\/ul>\n<p>These responses tend to support the idea that it\u2019s frequency and intrusiveness that drives resistance, not necessarily the privacy of the data used.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>How is Privacy Legislation Progressing?<\/strong><\/span><\/p>\n<p>GDPR is now just over a year old and by all reports it\u2019s not going well at all.\u00a0 A recent <a 
href=\"https:\/\/www.datainnovation.org\/2019\/06\/what-the-evidence-shows-about-the-impact-of-the-gdpr-after-one-year\/?utm_medium=email&#038;utm_source=topic+optin&#038;utm_campaign=awareness&#038;utm_content=20190626+data+nl&#038;mkt_tok=eyJpIjoiWW1KaU1EbGpNMk0xT0dSaiIsInQiOiJmTmRLRUhpRkd6MUJIV1lGRzR4eHlSZkw1aEJ5RCtmVnFsajNySEdyR253cTloaEJPdk9sSjNJcGNoblI4cDdaOFVJeXNZSzN1YWVZUDVEMWkzd3NaRkQ2WW1KQXdQbWtkbWxPcHYzK3BZYmJmb3UrSFlMSmZrVCtTS2dQdlhVaCJ9\"><em><u>review by the Center for Data Innovation<\/u><\/em><\/a> reported a long list of unintended consequences, finding that GDPR:<\/p>\n<ul>\n<li>Negatively affects the EU economy and businesses.<\/li>\n<li>Drains company resources.<\/li>\n<li>Hurts European tech startups.<\/li>\n<li>Reduces competition in digital advertising.<\/li>\n<li>Is too complicated for businesses to implement.<\/li>\n<li>Fails to increase trust among users.<\/li>\n<li>Negatively impacts users\u2019 online access.<\/li>\n<li>Is too complicated for consumers to understand.<\/li>\n<li>Is not consistently implemented across member states.<\/li>\n<li>Strains resources of regulators.<\/li>\n<\/ul>\n<p>The specifics include companies spending on average $1.8 million each on compliance, the weaponizing of GDPR authorization requests (which must be handled within 30 days) to attack competitors by wasting their time, a 30% falloff in tech company formation, and as many as 30% of previously available news and information services being withdrawn from the market because they were unable to comply.<\/p>\n<p>Not a good track record and not getting better.<\/p>\n<p>Our own Congress has also recognized the complexity of the task, and the Commerce Committee charged with drafting legislation has paused for now.\u00a0 That doesn\u2019t mean this good news will last indefinitely.<\/p>\n<p>\u00a0<\/p>\n<p><span style=\"font-size: 12pt;\"><strong>From All This, What Can We Conclude?<\/strong><\/span><\/p>\n<p>As data scientists we should care about potential privacy 
restrictions since the withdrawal of data will add cost and reduce model accuracy.\u00a0 That will reflect directly on us.<\/p>\n<p>The American people seem legitimately divided on the topic.\u00a0 In general, targeted advertising is good except:<\/p>\n<ul>\n<li>When it\u2019s too frequent.<\/li>\n<li>When it\u2019s too creepy.<\/li>\n<\/ul>\n<p>To the extent that you can guide the conversation in your own company, you can give your executives this balanced view and take a common-sense approach to staying on the right side of that line.\u00a0 To paraphrase a famous mission statement: \u201cdon\u2019t be creepy\u201d.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blog\/list?user=0h5qapp2gbuf8\"><em><u>Other articles by Bill Vorhies<\/u><\/em><\/a><\/p>\n<p>\u00a0<\/p>\n<p>About the author:\u00a0 Bill is Contributing Editor for Data Science Central.\u00a0 Bill is also President &#038; Chief Data Scientist at Data-Magnum and has practiced as a data scientist since 2001.\u00a0 His articles have been read more than 1.5 million times.<\/p>\n<p>He can be reached at:<\/p>\n<p><a href=\"mailto:Bill@DataScienceCentral.com\">Bill@DataScienceCentral.com<\/a> <span>or<\/span> <a href=\"mailto:Bill@Data-Magnum.com\">Bill@Data-Magnum.com<\/a><\/p>\n<p><span>\u00a0<\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:854205\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: William Vorhies Summary: \u00a0More data means better models but we may be crossing over a line into what the public can tolerate, both in [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/07\/09\/privacy-and-ai-how-much-should-we-really-care\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":459,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2340"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2340"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2340\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/470"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2340"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2340"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2340"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}