Data Science Coding in a weekend series of books …

Author: Ajit Jaokar

After testing this idea for the last few months, we have formally launched this concept.

 

The idea of ‘Data Science Coding in a weekend’ originated from meetups we conducted in London.

 

The idea is simple but effective.

 

We choose a complex section of code and try to learn it in detail over the weekend.

 

We work backwards, i.e. we drill down into the concepts behind the main ideas.

 

This led to the philosophy which I articulated in “Learn machine learning coding basics in a weekend: a new approach”.

 

And to the first free book, “Classification and regression in a weekend”.

 

The “in a weekend” series of books on Data Science Central can be seen as an online version of our London-based meetups. All the books have a single community HERE. Like a meetup, the books are free to use. The code is open source. We have drawn upon many sources, which we have referenced in the books.

 

For this first book, the steps in the code are:

 

Regression

Load and describe the data

Exploratory Data Analysis

      Exploratory data analysis – numerical

      Exploratory data analysis – visual

      Analyse the target variable

      Compute the correlation

Pre-process the data

      Dealing with missing values

      Treatment of categorical values

      Remove the outliers

      Normalise the data

Split the data

Choose a baseline algorithm

Define and instantiate the baseline model

Fit the baseline model to our training set

Define the evaluation metric

Predict against our test set and assess how good the predictions are

Refine our dataset with additional columns

Test alternative models

Choose the best model and optimise its parameters

Grid search
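
The regression steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code from the book: it assumes scikit-learn is installed, uses a synthetic dataset from make_regression as a stand-in for the book's data, and picks RMSE as the evaluation metric and Ridge as the alternative model purely for illustration. The names rmse and run_regression_workflow are hypothetical. Exploratory analysis, missing values and outlier removal are left out to keep the sketch short.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def rmse(y_true, y_pred):
    """Evaluation metric (assumed here): root mean squared error, lower is better."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


def run_regression_workflow(seed=0):
    # Load the data: synthetic stand-in, NOT the dataset used in the book.
    X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=seed)

    # Split the data first, so the test set stays unseen during fitting.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Baseline: always predict the mean of the training target.
    baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)

    # A first real model. Normalisation sits inside the pipeline so the
    # scaler is fitted on the training data only.
    linear = make_pipeline(StandardScaler(), LinearRegression()).fit(X_train, y_train)

    # Grid search: try several ridge penalties with 5-fold cross-validation.
    search = GridSearchCV(
        make_pipeline(StandardScaler(), Ridge()),
        param_grid={"ridge__alpha": [0.01, 0.1, 1.0, 10.0]},
        scoring="neg_mean_squared_error",
        cv=5,
    ).fit(X_train, y_train)

    # Predict against the test set and score every model with the same metric.
    return {
        "baseline": rmse(y_test, baseline.predict(X_test)),
        "linear": rmse(y_test, linear.predict(X_test)),
        "ridge_tuned": rmse(y_test, search.predict(X_test)),
        "best_alpha": search.best_params_["ridge__alpha"],
    }


if __name__ == "__main__":
    for name, value in run_regression_workflow().items():
        print(name, value)
```

The baseline score is the number every later model has to beat; if a tuned model is not clearly better than predicting the mean, the extra complexity is not paying for itself.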

 

Classification

Load the data

Exploratory data analysis

      Analyse the target variable

      Check if the data is balanced

      Check the correlations

Split the data

Choose a baseline algorithm

Train and test the model

Choose an evaluation metric

Refine our dataset

Feature engineering

Test alternative models

Ensemble models

Choose the best model and optimise its parameters
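
The classification steps above follow the same pattern and can be sketched as follows. Again, this is an illustrative sketch rather than the book's code: it assumes scikit-learn is installed, generates a deliberately imbalanced synthetic dataset with make_classification, and uses the F1 score of the minority class as the evaluation metric. The choice of models and the name run_classification_workflow are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split


def run_classification_workflow(seed=0):
    # Load the data: synthetic and imbalanced (roughly 80% / 20%), NOT the book's dataset.
    X, y = make_classification(
        n_samples=1000, n_features=10, n_informative=5,
        weights=[0.8, 0.2], random_state=seed,
    )

    # Check if the data is balanced: share of each class in the target.
    class_share = np.bincount(y) / len(y)

    # Split the data; stratify keeps the class shares equal in both parts.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Baseline, a simple linear model, and an ensemble model.
    models = {
        "baseline": DummyClassifier(strategy="most_frequent"),
        "logistic": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }

    # Train and test every model with the same metric (F1 on class 1).
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = f1_score(y_test, model.predict(X_test), zero_division=0)

    # Optimise the parameters of the ensemble model with a small grid search.
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
        scoring="f1",
        cv=5,
    ).fit(X_train, y_train)
    scores["forest_tuned"] = f1_score(y_test, search.predict(X_test), zero_division=0)

    return class_share, scores, search.best_params_


if __name__ == "__main__":
    share, scores, best = run_classification_workflow()
    print("class share:", share)
    print("F1 scores:", scores)
    print("best parameters:", best)
```

F1 is used instead of accuracy because of the imbalance: a baseline that always predicts the majority class reaches about 80% accuracy on this data while never finding a single minority-class example, which F1 exposes as a score of zero.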

 

The second book – coming by next week – is entitled “Azure machine learning in a weekend”.

 

I introduced the book in this blog post: Azure machine learning concepts – an introduction. Most of us start learning development using a language like Python or R. But when you work professionally, you typically end up working with a cloud platform. The top three cloud platforms today in terms of market share are AWS, Azure and GCP (Google). These platforms are similar. Our goal is to learn how to develop for these platforms. We start with Azure and continue with Google next month.

 

We welcome your comments on the books and the approach.

You can download the first free book, “Classification and regression in a weekend”, and join the community HERE.
