How Millennials Should View the World of Data Science

Author: Bill Schmarzo

Many of my MBA students are perplexed about how much data science capability and skill they need in order to get hired.  It appears that in many job interviews, employers are looking for students who have both business subject matter expertise and data science expertise. Here is an example of such a conversation:

MBA Student: I get lots of questions about my data science skills. I don’t understand why employers need that from me. I will never be as good as a data scientist who has spent years mastering the art. My intelligence comes from my critical thinking and ability to synthesize the analytic results, not from number crunching.

Me: Agreed!  Technology is eventually going to replace many data science and data engineering tasks.  Think about what’s happening with AutoML[1].  Ultimately, the value isn’t in building models; the value is in knowing how to monetize the results of the model!

MBA Student: Data Science is going to be obsolete in 10 years. We almost have self-driving cars today…

Me:  While that may be a bit of an overstatement, it is also why we emphasize that tomorrow’s business leaders must understand economics and how data science can be the foundation for the creation of wealth or value.  Economics is a much better discipline for MBA students than data science, though every business student should understand how to “think like a data scientist”.  Your value to the organization isn’t finding insights buried in the data.  Your value is knowing what to do with those insights once the data scientists find them.  For example, in sports science, it would be about leveraging the player performance and health/recovery insights to optimize player development programs and in-game decisions.

MBA Student:  You don’t win a battle by knowing how to make a sword. You win a battle by knowing how to use a sword.  That’s why the samurai weren’t the swordsmiths.

 

As I’ve written before, I think creating MBA students who are also data scientists is the wrong focus: it chases the proverbial “Data Science Unicorn” who has equally mastered software development, data science, and business. That’s like seeking Major League Baseball prospects who are both outstanding infielders and outstanding pitchers. While there may be the rare bird who can both pitch and hit (the New York Yankees’ Babe Ruth, the Los Angeles Angels’ Shohei Ohtani), it is more common for successful Major League Baseball teams to draft the BEST infielder and the BEST pitcher and have them fulfill different roles on the team.

So here is what we teach our MBA students (and future business and society leaders) about Data Science:

1)  Data Science is a Team Sport

Data science is a team sport composed of Data Engineers, Data Scientists and Business Stakeholders.  And just as a baseball team can’t function effectively with only shortstops and catchers, a data science initiative can’t function with only one of these roles. It MUST clearly articulate the roles, responsibilities and expectations of the Data Engineers, Data Scientists and Business Stakeholders (see Figure 1).

Figure 1:  Data Science Team Roles

If the goal of your organization is to become more effective at leveraging data and analytics to power your business models and drive digital transformation, you can’t win that game with a team full of pitchers. 

See the blog “A Winning Game Plan For Building Your Data Science Team” for more details on creating a winning data science team.

2)  Embrace “Thinking Like A Data Scientist”

It is critical to the effectiveness of your data science strategy that your business stakeholders not only intimately understand the business, but also know how to “think like a data scientist.”  That is, the business stakeholders must understand the data science process in order to not only collaborate, but ultimately to lead the data science efforts to ensure that the precious data science resources are focused on the most important business opportunities (see Figure 2).

Figure 2:  The “Thinking Like A Data Scientist” Process

See the blog “Refined Thinking like a Data Scientist Series” for more details on each of the steps in the Thinking Like A Data Scientist (TLADS) process.

Note: as you can see from the differences in Figure 2 versus the process laid out in the blog, we continue to refine and update the TLADS process based upon customer engagements as well as class work.  That refinement effort has led to the following development – the Hypothesis Development Canvas.

3)  Hypothesis Development Canvas Is Your Monetization Guide

Our most recent development to cap the “Thinking Like A Data Scientist” Process is the Hypothesis Development Canvas. The Hypothesis Development Canvas uses a common design thinking technique to summarize in a single document all the critical aspects of the TLADS process (see Figure 3).

Figure 3: Hypothesis Development Canvas Ties Business Strategy with Data Science Execution

And while we are still in the early stages of testing and refining the canvas, the early feedback and results are very encouraging.  See the blog “Data Science ‘Paint by the Numbers’ with the Hypothesis Development Canvas” for more details on the Hypothesis Development Canvas, as well as the Business Model Canvas and the modified Machine Learning Canvas.

4)  Gain High-level Understanding of Advanced Analytics

While we don’t expect business students to master data science, it is very important that they understand what data science (and advanced analytics) can do to power the organization’s business models.  Let’s start the advanced analytics conversation with an overly simplified definition of Artificial Intelligence or AI:

AI is about codifying customer, product, operational or market patterns and relationships in order to learn, act and/or automate.

The supporting advanced analytics can then be categorized or layered into the 3 levels of advanced analytics (see Figure 4).

Figure 4:  Three Levels of Advanced Analytics

Components of Advanced Analytics include:

  • Level 1: Statistics & Predictive Analytics quantifies relationships between variables (correlation coefficient) and goodness of fit (Chi-squared test)
  • Level 2: Deep Learning (Neural Networks) learns from a training data set and then applies those learnings to new data sets (photos, images, audio, handwriting)
  • Level 2: Supervised Machine Learning identifies “known unknown” relationships that drive “labeled” or known outcomes (e.g., fraud, attrition, product failure, spam)
  • Level 2: Unsupervised Machine Learning identifies “unknown unknown” relationships (clusters, segments, associations) hidden in the data
  • Level 3: Reinforcement Learning & Artificial Intelligence learns (mostly through trial and error) and adapts in order to operate within a continuously changing environment (robots, autonomous vehicles)
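
For business readers who want to see what Levels 1 and 2 look like in practice, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not how a data science team would build production models: every number is made up, and the function names (pearson_r, chi_squared_stat, learn_threshold, two_segments) are hypothetical helpers invented for this example. The supervised piece learns a simple cut-off from labeled outcomes, while the unsupervised piece finds two segments without being given any labels.

```python
import math

# NOTE: every number below is a made-up illustrative value, not real data.

def pearson_r(xs, ys):
    """Level 1: correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def chi_squared_stat(observed, expected):
    """Level 1: Chi-squared goodness-of-fit statistic (larger = worse fit)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def learn_threshold(amounts, labels):
    """Level 2, supervised: learn a cut-off from labeled outcomes (1 = fraud).

    The cut-off is the midpoint between the two class averages.
    """
    flagged = [a for a, y in zip(amounts, labels) if y == 1]
    normal = [a for a, y in zip(amounts, labels) if y == 0]
    return (sum(flagged) / len(flagged) + sum(normal) / len(normal)) / 2

def two_segments(points, iterations=10):
    """Level 2, unsupervised: split values into two clusters, no labels.

    A tiny one-dimensional version of k-means with k = 2.
    Assumes the points are not all identical.
    """
    low, high = min(points), max(points)  # starting cluster centers
    for _ in range(iterations):
        near_low = [p for p in points if abs(p - low) <= abs(p - high)]
        near_high = [p for p in points if abs(p - low) > abs(p - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high

# Hypothetical inputs for illustration only.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = pearson_r(ad_spend, sales)

visits_per_quarter = [18, 22, 20, 40]   # observed counts
even_split = [25, 25, 25, 25]           # expected if all quarters were alike
chi2 = chi_squared_stat(visits_per_quarter, even_split)

amounts = [10, 12, 15, 200, 220, 250]
is_fraud = [0, 0, 0, 1, 1, 1]
threshold = learn_threshold(amounts, is_fraud)
segments = two_segments(amounts)

print(round(r, 3), round(chi2, 2), round(threshold, 2), segments)
```

The point for a business stakeholder is not the arithmetic. The supervised function needs someone to supply the labeled outcome (fraud or not), while the unsupervised function surfaces groupings that someone then has to interpret and act on. Both are places where business knowledge drives the value.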

 

See the blog “Artificial Intelligence is not ‘Fake’ Intelligence” for more details on the 3 levels of advanced analytics.

Summary

So to summarize, here is what I feel MBA students (and business leaders) need to understand about the growing capabilities and power of Data Science:

  1. Data Science is a team sport that equally includes data engineers (who gather, prepare and enrich the data for advanced analytics), data scientists (who build analytic models that codify cause and effect and measure goodness of fit), and business stakeholders.
  2. Embrace the “Thinking Like A Data Scientist” approach in order to determine what problems to target with data science and how to apply the resulting customer, product and operational insights to derive and drive business value.
  3. Understand how to collaborate with the data science team around the Hypothesis Development Canvas that cements the relationship between the organization’s business strategy and specific AI and Machine Learning efforts.
  4. Gain a high-level understanding of “what” advanced analytic capabilities, such as deep learning, machine learning and reinforcement learning, can do in uncovering customer, product and operational insights buried in the organization’s data (and be less concerned about “how” they do it; that’s why we have those giant-brained data scientists!).

[1] Automated machine learning (AutoML) is the process of automating the end-to-end process of applying machine learning to real-world problems. “Automated Machine Learning: In Depth Guide.”  Disclosure:  I sit on the Technology Advisory Board of the AutoML Startup Big Squid.
