State of Data Science & Machine Learning in 2018

Author: Vincent Granville

The results of this Kaggle survey were published recently. The report covers the following topics:

  • Introduction
  • Survey Methodology
  • Survey Participants - Basic Profile
  • Educational profile of participants
  • Job profile of participants
  • What do they do at work?
  • What tools are used for data analysis?
  • Do you consider yourself a Data Scientist?
  • Coding Exposure
  • Programming Language Used
  • Data Visualization Libraries Used
  • What is more important: academic achievements or independent projects?
  • Time spent on activities in Data Science projects
  • Types of Data handled
  • Do you use Machine learning methods at work?
  • Experience in using Machine Learning methods at work
  • Machine Learning Frameworks Used
  • Metrics for Model Success
  • ML algorithms - Importance of topics
  • Difficulties in projects that explore unfair bias in datasets/algorithms
  • Projects involving exploring model insights
  • Confidence in Explaining ML models
  • Methods used for explaining ML Models
  • Reproducibility
  • Integrated Development Environment (IDE) usage
  • Hosted Notebook Usage
  • Cloud Computing services usage
  • Cloud Computing Products Usage
  • Machine Learning Products Usage
  • Relational Database Products Usage
  • Big Data and Analytics Products Usage
  • Training on Machine Learning/Data Science
  • Online Training
  • Source of Public datasets
  • Media Sources
  • Summary of Findings

One of the numerous charts from this survey

Kaggle conducted an industry-wide global survey that presents a comprehensive view of the state of data science and machine learning. The survey aims to give a broad picture of participants from about 147 countries: their profile, work activities, the nature of the projects they undertake, the programming languages they use, and the machine learning methods and frameworks they use at work. It also reports on the usage of various products and services in the field, such as cloud computing and hosted notebooks. Finally, it sheds light on the state of training and training methods, and on the public datasets and media sources that participants rely on to build their knowledge in the area.

The survey was live for one week in October and, after data cleaning, ended up with responses from 23,859 participants worldwide.

Read the results here
