{"id":551,"date":"2018-05-27T07:19:27","date_gmt":"2018-05-27T07:19:27","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/05\/27\/what-is-machine-learning-why-machine-learning\/"},"modified":"2018-05-27T07:19:27","modified_gmt":"2018-05-27T07:19:27","slug":"what-is-machine-learning-why-machine-learning","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/05\/27\/what-is-machine-learning-why-machine-learning\/","title":{"rendered":"What is Machine Learning? Why Machine Learning?"},"content":{"rendered":"<p>Author: Keshav Dhandhania<\/p>\n<div>\n<p class=\"graf graf--p\">Sometimes we encounter problems that are really hard to solve by writing a computer program. For example, let\u2019s say we wanted to program a computer to recognize handwritten digits:<\/p>\n<p class=\"graf graf--p\">\n<p class=\"graf graf--p\"><a href=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*nvgbHtaAADlwHHTQ.\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*nvgbHtaAADlwHHTQ.\" class=\"align-center\"><\/a><\/p>\n<p class=\"graf graf--p\" style=\"text-align: center;\">Source: MNIST handwritten digit database<\/p>\n<p class=\"graf graf--p\">You could imagine trying to devise a set of rules to distinguish each individual digit. Zeros, for instance, are basically one closed loop. But what if the person didn\u2019t perfectly close the loop? 
Or what if the right top of the loop closes below where the left top of the loop starts?<\/p>\n<p class=\"graf graf--p\"><a href=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*XeorWgOMuyHf8qLE.\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*XeorWgOMuyHf8qLE.\" class=\"align-center\"><\/a><\/p>\n<p class=\"graf graf--p\" style=\"text-align: center;\">A zero that\u2019s difficult to distinguish from a six<\/p>\n<p class=\"graf graf--p\">In this case, we have difficulty differentiating zeroes from sixes. We could establish some sort of cutoff, but how would you decide the cutoff in the first place? As you can see, it quickly becomes quite complicated to compile a list of heuristics (i.e., rules and guesses) that accurately classifies handwritten digits.<\/p>\n<p class=\"graf graf--p\">And there are so many more classes of problems that fall into this category. Recognizing objects, understanding concepts, comprehending speech. We don\u2019t know what program to write because we still don\u2019t know how it\u2019s done by our own brains. And even if we did have a good idea about how to do it, the program might be horrendously complicated.<\/p>\n<p class=\"graf graf--p\">So instead of trying to write a program, we try to develop an algorithm that a computer can use to look at hundreds or thousands of examples (and the correct answers), and then the computer uses that experience to solve the same problem in new situations. 
Essentially, our goal is to teach the computer to solve problems by example, much as we might teach a young child to distinguish a cat from a dog.<\/p>\n<p class=\"graf graf--p\">\n<h2 class=\"graf graf--h2\">What is Machine Learning?\u200a\u2014\u200aDefinition<\/h2>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">The field itself:<\/strong> Machine learning (ML) is a field of study that harnesses principles of computer science and statistics to create statistical models. These models are generally used to do two things:<\/p>\n<ol class=\"postList\">\n<li class=\"graf graf--li\"><strong class=\"markup--strong markup--li-strong\">Prediction<\/strong>: make predictions about the future based on data about the past<\/li>\n<li class=\"graf graf--li\"><strong class=\"markup--strong markup--li-strong\">Inference<\/strong>: discover patterns in data<\/li>\n<\/ol>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Difference between ML and AI:<\/strong> There is no universally agreed-upon distinction between ML and artificial intelligence (AI). AI usually concentrates on programming computers to make <em class=\"markup--em markup--p-em\">decisions<\/em> (based on ML models and sets of logical rules), whereas ML focuses more on making <em class=\"markup--em markup--p-em\">predictions<\/em> about the future. 
They are highly interconnected fields, and, for most non-technical purposes, they can be treated as the same.<\/p>\n<p class=\"graf graf--p\"><a href=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*HHiBFh__iV9LMhl_.\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/cdn-images-1.medium.com\/max\/800\/0*HHiBFh__iV9LMhl_.\" class=\"align-center\"><\/a><\/p>\n<p class=\"graf graf--p\" style=\"text-align: center;\">&#8212;\u00a0 \u00a0&#8212;\u00a0 \u00a0&#8212;<\/p>\n<h2 class=\"graf graf--h2\">What\u2019s a statistical model?<\/h2>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Models:<\/strong> Teaching a computer to make predictions involves feeding data into machine learning <strong class=\"markup--strong markup--p-strong\">models<\/strong>, which are representations of how the world supposedly works. If I tell a statistical model that the world works a certain way (say, for example, that taller people make more money than shorter people), then the model can tell me which of two people it thinks will make more money: Cathy, who is 5\u20192\u201d, or Jill, who is 5\u20199\u201d.<\/p>\n<p class=\"graf graf--p\">What does a model actually look like? Surely the concept of a model makes sense in the abstract, but knowing this is just half of the battle. You should also know how it\u2019s represented inside a computer, or what it would look like if you wrote it down on paper.<\/p>\n<p class=\"graf graf--p\">A model is just a mathematical function, which, as you probably already know, is a relationship between a set of inputs and a set of outputs. Here\u2019s an example:<\/p>\n<p class=\"graf graf--p\"><em class=\"markup--em markup--p-em\">f(x) = x\u00b2<\/em><\/p>\n<p class=\"graf graf--p\">This is a function that takes a number as input and returns that number squared. So, f(1) = 1, f(2) = 4, f(3) = 9.<\/p>\n<p class=\"graf graf--p\">Let\u2019s briefly return to the example of the model that predicts income from height. 
I may believe, based on what I\u2019ve seen in the corporate world, that a given human\u2019s annual income is, on average, equal to her height (in inches) times 1,000. So, if you\u2019re 60 inches tall (5 feet), then I\u2019ll guess that you probably make $60,000 a year. If you\u2019re a foot taller, I think you\u2019ll make $72,000 a year.<\/p>\n<p class=\"graf graf--p\">This model can be represented mathematically as follows:<\/p>\n<p class=\"graf graf--p\"><em class=\"markup--em markup--p-em\">Income = Height \u00d7 $1,000<\/em><\/p>\n<p class=\"graf graf--p\">In other words, income is a function of height.<\/p>\n<blockquote class=\"graf graf--pullquote\"><p><span style=\"font-size: 12pt;\"><strong class=\"markup--strong markup--pullquote-strong\"><em class=\"markup--em markup--pullquote-em\">Here\u2019s the main point:<\/em><\/strong> <em class=\"markup--em markup--pullquote-em\">Machine learning refers to a set of techniques for estimating functions (like the one involving income) based on datasets (pairs of heights and their associated incomes). These functions, which are called models, can then be used for predictions of future\u00a0data.<\/em><\/span><\/p><\/blockquote>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Algorithms:<\/strong> These functions are estimated using <strong class=\"markup--strong markup--p-strong\">algorithms<\/strong>. In this context, an algorithm is a predefined set of steps that takes as input a bunch of data and then transforms it through mathematical operations. You can think of an algorithm like a recipe\u200a\u2014\u200afirst do <em class=\"markup--em markup--p-em\">this<\/em>, then do <em class=\"markup--em markup--p-em\">that<\/em>, then do <em class=\"markup--em markup--p-em\">this<\/em>. 
Done.<\/p>\n<p class=\"graf graf--p\">Machine learning of all types uses models and algorithms as its building blocks to make predictions and inferences about the world.<\/p>\n<h2 class=\"graf graf--h2\">What exactly is being\u00a0learnt?<\/h2>\n<p class=\"graf graf--p\">To explain what is being <em class=\"markup--em markup--p-em\">learnt<\/em> in machine learning, let\u2019s start with an example application: spam classification. One approach to writing a computer program that separates spam emails from non-spam emails is to split each email into individual words and maintain a list of words that appear more frequently in spam emails. Examples of such words might be \u2018loan\u2019, \u2018$\u2019, \u2018credit\u2019, \u2018discount\u2019, \u2018offer\u2019, \u2018password\u2019, \u2018viagra\u2019, and so on. Then, if an email has a substantial number of these words, it should be classified as spam.<\/p>\n<p class=\"graf graf--p\">Although the strategy above might give fairly good results (say, detecting spam with an accuracy of 80%), the accuracy depends in large part on the list of words we maintain, and on the precise threshold we choose to classify an email as spam.<\/p>\n<p class=\"graf graf--p\">In machine learning, the strategy is to learn the list of words and the threshold from examples. In fact, in addition to which words are bad words, we could also learn <em class=\"markup--em markup--p-em\">how<\/em> bad each word is. (This example is quite realistic, and is how many spam classification algorithms work.)<\/p>\n<p class=\"graf graf--p\">So in this case, the thing being learnt is a notion of how bad each word is. Note that this is not the only way to frame the problem. We framed it this way because we noticed a pattern, namely that spam emails often contain specific words, and then we came up with a strategy that would analyze <strong class=\"markup--strong markup--p-strong\">every<\/strong> possible word as a possible suspect. 
This strategy might give inaccurate results for other tasks, or be too inefficient.<\/p>\n<p class=\"graf graf--p\">\n<h2 class=\"graf graf--h2\">Desirable properties of machine\u00a0learning<\/h2>\n<p class=\"graf graf--p\">You might notice that using machine learning to learn how bad each word is has several advantages over maintaining this list manually.<\/p>\n<ol class=\"postList\">\n<li class=\"graf graf--li\">It <strong class=\"markup--strong markup--li-strong\">reduces the amount of manual work<\/strong> involved in creating the list. Think about how long this list could get if you built it by hand. Also, if you\u2019re trying to maintain the list manually, how would you deal with hundreds of languages across the world? This task can easily become infeasible without machine learning.<\/li>\n<li class=\"graf graf--li\"><strong class=\"markup--strong markup--li-strong\">The same strategy works for other similar tasks<\/strong>. Say we wanted to classify whether a movie review is speaking positively or negatively about a movie. If we were creating lists of words manually, we would have to build a new list from scratch. But if we learn the list, the same algorithm works, provided we already have some data (say, ratings and reviews left by users on IMDb).<\/li>\n<li class=\"graf graf--li\"><strong class=\"markup--strong markup--li-strong\">It updates automatically<\/strong>. Let\u2019s say tomorrow the spammers become more advanced and start typing the word \u2018password\u2019 as \u2018passw0rd\u2019. Or they might try to sell you insurance, something we haven\u2019t yet encountered. 
We can simply set the machine learning algorithm to be trained daily, and it will use the new data available and keep adapting over time to changing behavior.<\/li>\n<\/ol>\n<p class=\"graf graf--p\" style=\"text-align: center;\">&#8212;\u00a0 \u00a0&#8212;\u00a0 \u00a0&#8212;<\/p>\n<p class=\"graf graf--p\"><em class=\"markup--em markup--p-em\">Co-authored by Noah Yonack, Keshav Dhandhania and Nikhil Buduma.<\/em><\/p>\n<p class=\"graf graf--p\"><em class=\"markup--em markup--p-em\">Originally published <a href=\"http:\/\/www.commonlounge.com\/\" target=\"_blank\" rel=\"noopener\">here<\/a><\/em>.\u00a0<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:724139\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Keshav Dhandhania Sometimes we encounter problems for which it\u2019s really hard to write a computer program to solve. For example, let\u2019s say we wanted [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/05\/27\/what-is-machine-learning-why-machine-learning\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":463,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/551"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=551"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/551\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/466"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=551"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=551"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=551"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}