{"id":914,"date":"2018-08-17T06:31:33","date_gmt":"2018-08-17T06:31:33","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/17\/general-myths-to-avoid-in-data-science-and-machine-learning\/"},"modified":"2018-08-17T06:31:33","modified_gmt":"2018-08-17T06:31:33","slug":"general-myths-to-avoid-in-data-science-and-machine-learning","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/17\/general-myths-to-avoid-in-data-science-and-machine-learning\/","title":{"rendered":"General Myths to avoid in Data Science and Machine\u00a0Learning"},"content":{"rendered":"<p>Author: Vibhor Nigam<\/p>\n<div>\n<div class=\"section-divider\"><\/div>\n<div class=\"section-content\">\n<div class=\"section-inner sectionLayout--insetColumn\">\n<p class=\"graf graf--p\">What is Machine Learning, Data Science or Artificial Intelligence? is one of the most common questions which I have faced from people. Be it newcomers, recruiters or even people in leadership positions, this is a question which is puzzling everyone in its own way.<\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">For beginners it takes the form of how to become a data scientist. For leaders it becomes a question of whether it has a significant business impact, and for people in the field it takes the form of what I should call myself: a data scientist, a data engineer or a data analyst<\/strong>.<\/p>\n<p class=\"graf graf--p\">This post is an attempt to clear up some of the myths and develop a basic understanding of what Data Science is and of its different interpretations in the corporate world.<\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Myth 1: Data Scientist\/Engineer\/Analyst are one and the same<\/strong>.<\/p>\n<p class=\"graf graf--p\">This is a warped myth which I have faced many times in my career and which does harm to both the employee and the company. 
It\u2019s like calling a software engineer and QA the same thing.<\/p>\n<p class=\"graf graf--p\">To put things in perspective, a <strong class=\"markup--strong markup--p-strong\">Data Scientist<\/strong> is someone who has experience and knowledge in at least two of these three fields: Statistics, Programming and Machine Learning. The primary expectation of such an employee is to be able to work on a challenging business problem where they can use their knowledge to find solutions. Such a person would love to spend a major portion of their time building predictive models and performing statistical experiments to obtain a working solution. It\u2019s a mixture of a research job and a programming job, and the nature of the work and the workload differ depending on the size of the company\/team.<\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Data Engineering<\/strong> is a job where a person focuses on building the infrastructure for deploying applications that perform jobs like predictive modeling, updating dashboards with streaming data, running daily jobs to generate reports and maintaining a continuous flow of data. A really good knowledge of SQL is fast becoming a necessity for a good data engineer, followed by knowledge of Spark.<\/p>\n<p class=\"graf graf--p\">A <strong class=\"markup--strong markup--p-strong\">Data Analyst<\/strong> is a person with more of a bent towards interpreting and analyzing business results rather than being involved in the process of their creation. Such a person will prefer to use tools to generate those results and will spend a major portion of their time interpreting them and deriving business value from them. Data Analysts were in the industry long before data scientists came into the picture, and their primary tool of choice has been Excel. In fact, even today, for small amounts of data Excel is the most efficient tool. At present, there are tools like PowerBI and Azure which provide the ability to perform analytics on Big Data. 
The primary focus of this position, however, is accurately communicating day-to-day results as well as the results of the new hypotheses which they test. These inputs are critical and form the basis for important business decisions.<\/p>\n<p><img decoding=\"async\" class=\"graf-image\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*FYzz7o1QLQJA8ommi0LnlQ.png\"><\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Myth 2: Deep Learning is Machine Learning or AI<\/strong><\/p>\n<p class=\"graf graf--p\">Deep learning has no doubt become a big name nowadays, and all the hype and marketing around it have led people to believe that deep learning is the ultimate solution to every data science\/machine learning problem. <strong class=\"markup--strong markup--p-strong\">Nothing could be further from the truth.<\/strong><\/p>\n<p class=\"graf graf--p\">Deep learning is, no doubt, one of the most complex concepts to understand in machine learning today, but that is all it is. Deep learning gets its name because the \u201cneural network\u201d used in this approach contains multiple layers and is hence called a \u201cdeep\u201d network. What TensorFlow, PyTorch or Keras offer is just a framework for applying this concept easily.<\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">No doubt, learning the framework is hard and the framework is efficient as well, but it is not equivalent to gaining expertise in machine learning<\/strong>. Machine learning is a vast field which draws on concepts and algorithms from a number of fields such as statistics, information theory, optimization, information retrieval, neural networks etc. 
and has an abundance of algorithms, each of which is more useful than the others in particular use cases.<\/p>\n<p class=\"graf graf--p\">Deep learning, for instance, has been extremely effective in computer vision and speech recognition, but it is absolute overkill to use it for sentiment analysis or a simple prediction problem which can be solved with linear regression.<\/p>\n<p class=\"graf graf--p\">It is always a wise decision to invest time in exploratory analysis and in understanding the scope of a problem before settling on the algorithm to use.<\/p>\n<p class=\"graf graf--p\">This picture explains it best.<\/p>\n<p><img decoding=\"async\" class=\"graf-image\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*NIclAJqzR1Uutmk6l1Ezzw.jpeg\"><\/p>\n<p class=\"graf graf--p\"><strong class=\"markup--strong markup--p-strong\">Myth 3: Data Science can be picked up in 3 months.<\/strong><\/p>\n<p class=\"graf graf--p\">As much as I wish this were true, it is not the case. To be an effective data scientist one needs to know a lot more than importing libraries such as scikit-learn and TensorFlow and calling their train and predict functions.<\/p>\n<p class=\"graf graf--p\">It is one of those elusive fields where the results are not deterministic, meaning the same sequence of steps will not always end in the same result. The outcome depends heavily on the quality and quantity of the data provided, and a lot needs to happen before calling the \u201ctrain\u201d function.<\/p>\n<p class=\"graf graf--p\">Sure, you can learn how to call libraries and write the sequence of steps to generate a model, but that model will not always be effective. To understand things properly one needs a considerable understanding of the workings and dependencies of the algorithm being applied. 
It is imperative to have this knowledge, or else tweaking models or explaining the results to leadership becomes a real pain.<\/p>\n<p class=\"graf graf--p\">I always remember this answer to the question of how to learn coding in a single night:<\/p>\n<p><img decoding=\"async\" class=\"graf-image\" src=\"https:\/\/cdn-images-1.medium.com\/max\/1600\/1*itR8QaE3tPffIOrjYdGPgQ.png\"><\/p>\n<p class=\"graf graf--p\">This is a small attempt to highlight and clear up the prevalent myths in the field of machine learning and data science. I hope it helps.<\/p>\n<\/div>\n<\/div>\n<p><\/p>\n<div class=\"section-divider\">\n<hr class=\"section-divider\"><\/div>\n<div class=\"section-content\">\n<div class=\"section-inner sectionLayout--insetColumn\">\n<p class=\"graf graf--p\"><em class=\"markup--em markup--p-em\">If you enjoyed this article, please be sure to follow me on\u00a0<\/em><em class=\"markup--em markup--p-em\">\u00a0<\/em><a href=\"https:\/\/twitter.com\/vibs01\" class=\"markup--anchor markup--p-anchor\" rel=\"noopener\" target=\"_blank\"><em class=\"markup--em markup--p-em\">Twitter<\/em><\/a><em class=\"markup--em markup--p-em\">,<\/em> <a href=\"https:\/\/medium.com\/@nigam.vibhor01\" class=\"markup--anchor markup--p-anchor\" target=\"_blank\" rel=\"noopener\"><em class=\"markup--em markup--p-em\">Medium<\/em><\/a> <em class=\"markup--em markup--p-em\">or<\/em> <a href=\"https:\/\/www.linkedin.com\/in\/nigamvibhor\/\" class=\"markup--anchor markup--p-anchor\" rel=\"noopener\" target=\"_blank\"><em class=\"markup--em markup--p-em\">find me on LinkedIn.<\/em><\/a><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:751398\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Vibhor Nigam What is Machine Learning, Data Science or Artificial Intelligence? is one of the most common questions which I have faced from people. 
[&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/08\/17\/general-myths-to-avoid-in-data-science-and-machine-learning\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":915,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/914"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=914"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/914\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/915"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=914"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=914"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=914"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}