# Data science cookbook style code reference in Python for beginners

Author: Ajit Jaokar

Here is another resource I use to teach my students on the AI for Edge Computing course.

I like this resource because I like the cookbook style of learning to code.

The resource is based on the book Machine Learning With Python Cookbook and on Machine Learning Flashcards by the same author (I have bought both and recommend them).

I like the approach of using a simple simulated dataset, as we see in the recipes on LDA for dimensionality reduction and on pandas functions.
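To give a flavour of that approach, here is a minimal sketch (not copied from the site) that simulates a small labelled dataset with scikit-learn and reduces it with LDA. The sample size, feature counts, and variable names are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Simulate a small labelled dataset: 100 rows, 4 features, 3 classes.
# All sizes here are illustrative assumptions.
features, target = make_classification(
    n_samples=100,
    n_features=4,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# LDA can keep at most (number of classes - 1) components, so 2 here.
lda = LinearDiscriminantAnalysis(n_components=2)
features_lda = lda.fit_transform(features, target)

print(features.shape)      # (100, 4)
print(features_lda.shape)  # (100, 2)
```

Because the data is simulated, the example runs without downloading anything, which is what makes this style convenient for beginners.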

The site chrisalbon.com also contains other sections, such as Linux and Postgres, which I have not tried.

I have used the ones below because they relate to machine learning and deep learning.

Machine Learning

Basics

• Make Simulated Data For Classification
• Make Simulated Data For Clustering
• Make Simulated Data For Regression
• Perceptron In Scikit
• Saving Machine Learning Models
• Vectors, Matrices, And Arrays

Preprocessing Structured Data

• Transpose A Vector Or Matrix
• Selecting Elements In An Array
• Reshape An Array
• Invert A Matrix
• Getting The Diagonal Of A Matrix
• Flatten A Matrix
• Find The Rank Of A Matrix
• Find The Maximum And Minimum
• Describe An Array
• Create A Vector
• Create A Sparse Matrix
• Create A Matrix
• Converting A Dictionary Into A Matrix
• Calculate The Trace Of A Matrix
• Calculate The Determinant Of A Matrix
• Calculate The Average, Variance, And Standard Deviation
• Calculate Dot Product Of Two Vectors
• Apply Operations To Elements
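Several of the matrix recipes listed above come down to one-line NumPy calls. The following is a rough sketch under that assumption; the 2x2 matrix and the variable names are made up for illustration and are not taken from the site.

```python
import numpy as np

# A small example matrix and vector (illustrative values only).
matrix = np.array([[1, 2],
                   [3, 4]])
vector = np.array([1, 2])

transposed = matrix.T                       # swap rows and columns
flattened = matrix.flatten()                # 1-D copy: [1, 2, 3, 4]
diagonal = matrix.diagonal()                # main diagonal: [1, 4]
trace = matrix.trace()                      # sum of the diagonal
rank = np.linalg.matrix_rank(matrix)        # number of independent rows
determinant = np.linalg.det(matrix)         # floating-point determinant
inverse = np.linalg.inv(matrix)             # matrix inverse
dot_product = np.dot(vector, vector)        # 1*1 + 2*2

print(trace, rank, dot_product)
print(np.mean(matrix), np.var(matrix), np.std(matrix))
```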

Preprocessing Images

• Binarize Images
• Blurring Images
• Cropping Images
• Detect Edges
• Enhance Contrast Of Color Image
• Enhance Contrast Of Greyscale Image
• Harris Corner Detector
• Installing OpenCV
• Isolate Colors
• Remove Backgrounds
• Save Images
• Sharpen Images
• Shi-Tomasi Corner Detector
• Using Mean Color As A Feature

Preprocessing Dates And Times

• Break Up Dates And Times Into Multiple Features
• Calculate Difference Between Dates And Times
• Convert Strings To Dates
• Convert pandas Columns Time Zone
• Encode Days Of The Week
• Handling Missing Values In Time Series
• Handling Time Zones
• Lag A Time Feature
• Rolling Time Window
• Select Date And Time Ranges
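As a hedged illustration of the date and time recipes above, the sketch below converts strings to dates, breaks them into features, lags a column, and computes a rolling window with pandas. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical daily data; the dates start out as plain strings.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "value": [10.0, 20.0, 30.0, 40.0],
})

# Convert strings to dates.
df["date"] = pd.to_datetime(df["date"])

# Break the date up into multiple features.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["weekday"] = df["date"].dt.day_name()

# Lag a feature by one row (the first row has no previous value).
df["previous_value"] = df["value"].shift(1)

# Rolling mean over a window of two rows.
df["rolling_mean"] = df["value"].rolling(window=2).mean()

# Select a date range with a boolean mask.
selected = df[(df["date"] >= "2024-01-02") & (df["date"] <= "2024-01-03")]

print(df)
print(selected)
```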

Feature Engineering

• Dimensionality Reduction On Sparse Feature Matrix
• Dimensionality Reduction With Kernel PCA
• Dimensionality Reduction With PCA
• Feature Extraction With PCA
• Group Observations Using K-Means Clustering
• Selecting The Best Number Of Components For LDA
• Selecting The Best Number Of Components For TSVD
• Using Linear Discriminant Analysis For Dimensionality Reduction

Feature Selection

Model Evaluation

Model Selection

Linear Regression

Logistic Regression

Trees And Forests

Nearest Neighbors

Support Vector Machines

Naive Bayes

Clustering

Deep Learning

Keras

Data Wrangling

• Apply Functions By Group In Pandas
• Apply Operations To Groups In Pandas
• Applying Operations Over pandas Dataframes
• Assign A New Column To A Pandas DataFrame
• Break A List Into N-Sized Chunks
• Breaking Up A String Into Columns Using Regex In pandas
• Columns Shared By Two Data Frames
• Construct A Dictionary From Multiple Lists
• Convert A CSV Into Python Code To Recreate It
• Convert A Categorical Variable Into Dummy Variables
• Convert A String Categorical Variable To A Numeric Variable
• Convert A Variable To A Time Variable In pandas
• Count Values In Pandas Dataframe
• Create A Pipeline In Pandas
• Create A pandas Column With A For Loop
• Create Counts Of Items
• Create a Column Based on a Conditional in pandas
• Creating Lists From Dictionary Keys And Values
• Crosstabs In pandas
• Delete Duplicates In pandas
• Descriptive Statistics For pandas Dataframe
• Dropping Rows And Columns In pandas Dataframe
• Enumerate A List
• Expand Cells Containing Lists Into Their Own Variables In Pandas
• Filter pandas Dataframes
• Find Largest Value In A Dataframe Column
• Find Unique Values In Pandas Dataframes
• Geocoding And Reverse Geocoding
• Geolocate A City And Country
• Geolocate A City Or Country
• Group A Time Series With pandas
• Group Data By Time
• Group Pandas Data By Hour Of The Day
• Grouping Rows In pandas
• Hierarchical Data In pandas
• Join And Merge Pandas Dataframe
• List Unique Values In A pandas Column
• Load A JSON File Into Pandas
• Load An Excel File Into Pandas
• Long To Wide Format
• Lower Case Column Names In Pandas Dataframe
• Make New Columns Using Functions
• Map External Values To Dataframe Values in pandas
• Missing Data In pandas Dataframes
• Moving Averages In pandas
• Normalize A Column In pandas
• Pivot Tables In pandas
• Quickly Change A Column Of Strings In Pandas
• Random Sampling Dataframe
• Ranking Rows Of Pandas Dataframes
• Regular Expression Basics
• Regular Expression By Example
• Reindexing pandas Series And Dataframes
• Rename Column Headers In pandas
• Rename Multiple pandas Dataframe Column Names
• Replacing Values In pandas
• Saving A pandas Dataframe As A CSV
• Search A pandas Column For A Value
• Select Rows When Columns Contain Certain Values
• Select Rows With A Certain Value
• Select Rows With Multiple Filters
• Selecting pandas DataFrame Rows Based On Conditions
• Simple Example Dataframes In pandas
• Sorting Rows In pandas Dataframes
• Split Lat/Long Coordinate Variables Into Separate Variables
• Streaming Data Pipeline
• String Munging In Dataframe
• Using List Comprehensions With pandas
• Using Seaborn To Visualize A pandas Dataframe
• pandas Data Structures
• pandas Time Series Basics
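Many of the data wrangling recipes above follow the same pattern: build a small DataFrame, then apply one pandas operation to it. Here is a minimal sketch of that pattern, using a made-up `city`/`sales` table that is an assumption for illustration only.

```python
import pandas as pd

# A small example DataFrame (hypothetical values).
df = pd.DataFrame({
    "city": ["Paris", "Paris", "Lima", "Lima", "Lima"],
    "sales": [10, 10, 5, 7, 9],
})

# Delete duplicate rows (the second Paris/10 row is dropped).
deduplicated = df.drop_duplicates()

# Apply an operation by group: mean sales per city.
mean_sales = df.groupby("city")["sales"].mean()

# Convert a categorical variable into dummy variables.
dummies = pd.get_dummies(df["city"])

# Select rows based on a condition.
high_sales = df[df["sales"] > 6]

# Sort rows by a column, largest first.
sorted_df = df.sort_values("sales", ascending=False)

print(mean_sales)
print(dummies.columns.tolist())
```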

Data Visualization
