Machine Learning Tutorials
This repository contains a topic-wise curated list of Machine Learning tutorials, articles and other resources.
Table of Contents
- General Stuff
- Interview Resources
- Artificial Intelligence
- Genetic Algorithms
- Statistics
- Useful Blogs
- Resources on Quora
- Resources on Kaggle
- Cheat Sheets
- Classification
- Linear Regression
- Logistic Regression
- Model Validation using Resampling
- Deep Learning
- Natural Language Processing
- Computer Vision
- Support Vector Machine
- Reinforcement Learning
- Decision Trees
- Random Forest / Bagging
- Boosting
- Ensembles
- Stacking Models
- VC Dimension
- Bayesian Machine Learning
- Semi Supervised Learning
- Optimization
General Stuff
- A curated list of awesome Machine Learning frameworks, libraries and software
- A curated list of awesome data visualization libraries and resources.
- An awesome Data Science repository to learn and apply for real world problems
- The Open Source Data Science Masters
- Machine Learning FAQs on Cross Validated
- List of Machine Learning University Courses
- Machine Learning algorithms that you should always have a strong understanding of
- Difference between Linearly Independent, Orthogonal, and Uncorrelated Variables
- List of Machine Learning Concepts
- Slides on Several Machine Learning Topics
- MIT Machine Learning Lecture Slides
- Comparison Supervised Learning Algorithms
- Learning Data Science Fundamentals
- Machine Learning mistakes to avoid
- Statistical Machine Learning Course
- TheAnalyticsEdge edX Notes and Codes
Interview Resources
- How can a computer science graduate student prepare himself for data scientist interviews?
- How do I learn Machine Learning?
- FAQs about Data Science Interviews
- What are the key skills of a data scientist?
Artificial Intelligence
- Awesome Artificial Intelligence (GitHub Repo)
- edX course | Klein & Abbeel
- Udacity Course | Norvig & Thrun
- TED talks on AI
Genetic Algorithms
- Genetic Algorithms Wikipedia Page
- Simple Implementation of Genetic Algorithms in Python (Part 1), Part 2
- Genetic Algorithms vs Artificial Neural Networks
- Genetic Algorithms Explained in Plain English
- Genetic Programming
Statistics
- Stat Trek Website - A dedicated website to teach yourself Statistics
- Learn Statistics Using Python - Learn Statistics using an application-centric programming approach
- Statistics for Hackers | Slides | @jakevdp - Slides by Jake VanderPlas
- Online Statistics Book - An Interactive Multimedia Course for Studying Statistics
- What is a Sampling Distribution?
- Tutorials
- What is an Unbiased Estimator?
- Goodness of Fit Explained
- What are QQ Plots?
Useful Blogs
- Edwin Chen's Blog - A blog about Math, stats, ML, crowdsourcing, data science
- The Data School Blog - Data science for beginners!
- ML Wave - A blog for Learning Machine Learning
- Andrej Karpathy - A blog about Deep Learning and Data Science in general
- Colah's Blog - Awesome Neural Networks Blog
- Alex Minnaar's Blog - A blog about Machine Learning and Software Engineering
- Statistically Significant - Andrew Landgraf's Data Science Blog
- Simply Statistics - A blog by three biostatistics professors
- Yanir Seroussi's Blog - A blog about Data Science and beyond
- fastML - Machine learning made easy
- Trevor Stephens Blog - Trevor Stephens Personal Page
- no free hunch | kaggle - The Kaggle Blog about all things Data Science
- A Quantitative Journey | outlace - learning quantitative applications
- r4stats - analyzing the world of data science and helping people learn to use R
- Variance Explained - David Robinson's Blog
- AI Junkie - a blog about Artificial Intelligence
Resources on Quora
- Most Viewed Machine Learning writers
- Data Science Topic on Quora
- William Chen's Answers
- Michael Hochster's Answers
- Ricardo Vladimiro's Answers
- Storytelling with Statistics
- Data Science FAQs on Quora
- Machine Learning FAQs on Quora
Kaggle Competitions WriteUp
- How to almost win Kaggle Competitions
- Convolution Neural Networks for EEG detection
- Facebook Recruiting III Explained
- Predicting CTR with Online ML
Cheat Sheets
Classification
- Does Balancing Classes Improve Classifier Performance?
- What is Deviance?
- When to choose which machine learning classifier?
- What are the advantages of different classification algorithms?
- ROC and AUC Explained
- An introduction to ROC analysis
- Simple guide to confusion matrix terminology
Linear Regression
- General
- Assumptions of Linear Regression, Stack Exchange
- Linear Regression Comprehensive Resource
- Applying and Interpreting Linear Regression
- What does having constant variance in a linear regression model mean?
- Difference between linear regression on y with x and x with y
- Is linear regression valid when the dependent variable is not normally distributed?
- Multicollinearity and VIF
Logistic Regression
- Logistic Regression Wiki
- Geometric Intuition of Logistic Regression
- Obtaining predicted categories (choosing threshold)
- Residuals in logistic regression
- Difference between logit and probit models, Logistic Regression Wiki, Probit Model Wiki
- Pseudo R2 for Logistic Regression, How to calculate, Other Details
Model Validation using Resampling
- Bootstrapping
Deep Learning
- A curated list of awesome Deep Learning tutorials, projects and communities
- Lots of Deep Learning Resources
- Interesting Deep Learning and NLP Projects (Stanford), Website
- Core Concepts of Deep Learning
- Understanding Natural Language with Deep Neural Networks Using Torch
- Stanford Deep Learning Tutorial
- Deep Learning FAQs on Quora
- Google+ Deep Learning Page
- Recent Reddit AMAs related to Deep Learning, Another AMA
- Where to Learn Deep Learning?
- Deep Learning nvidia concepts
- Introduction to Deep Learning Using Python (GitHub), Good Introduction Slides
- Video Lectures Oxford 2015, Video Lectures Summer School Montreal
- Deep Learning Software List
- Hacker's guide to Neural Nets
- Top arXiv Deep Learning Papers explained
- Geoff Hinton YouTube Videos on Deep Learning
- Awesome Deep Learning Reading List
- Deep Learning Comprehensive Website, Software
- deeplearning Tutorials
- AWESOME! Deep Learning Tutorial
- Deep Learning Basics
- Stanford Tutorials
- Train, Validation & Test in Artificial Neural Networks
- Artificial Neural Networks Tutorials
- Neural Networks FAQs on Stack Overflow
- Neural Machine Translation
- Deep Learning Frameworks
- Torch vs. Theano
- dl4j vs. torch7 vs. theano
- Caffe
- TensorFlow
- Feed Forward Networks
- Implementing a Neural Network from scratch, Code
- Speeding up your Neural Network with Theano and the GPU, Code
- Basic ANN Theory
- Role of Bias in Neural Networks
- Choosing number of hidden layers and nodes, 2, 3
- Backpropagation Explained
- ANN implemented in C++ | AI Junkie
- Simple Implementation
- NN for Beginners
- Regression and Classification with NNs (Slides)
- Another Intro
- Recurrent and LSTM Networks
- awesome-rnn: list of resources (GitHub Repo)
- Recurrent Neural Net Tutorial Part 1, Part 2, Part 3, Code
- NLP RNN Representations
- The Unreasonable effectiveness of RNNs, Torch Code, Python Code
- Intro to RNN, LSTM
- An application of RNN
- Optimizing RNN Performance
- Simple RNN
- Auto-Generating Clickbait with RNN
- Sequence Learning using RNN (Slides)
- Machine Translation using RNN (Paper)
- Music generation using RNNs (Keras)
- Using RNN to create on-the-fly dialogue (Keras)
- Long Short Term Memory (LSTM)
- Understanding LSTM Networks
- LSTM explained
- LSTM
- Implementing LSTM from scratch, Python/Theano code
- Torch Code, Torch
- LSTM for Sentiment Analysis in Theano
- Deep Learning for Visual Q&A | LSTM | CNN, Code
- Computer Responds to email | Google
- LSTM dramatically improves Google Voice Search, 2
- Understanding Natural Language with Deep Neural Networks Using Torch
- Gated Recurrent Units (GRU)
- Restricted Boltzmann Machine
- Autoencoders: Unsupervised (applies BackProp after setting target = input)
- Convolution Networks
- Awesome Deep Vision: List of Resources (GitHub)
- Intro to CNNs
- Understanding CNN for NLP
- Stanford Notes, Codes, GitHub
- JavaScript Library (Browser Based) for CNNs
- Using CNNs to detect facial keypoints
- Deep learning to classify business photos at Yelp
- Interview with Yann LeCun | Kaggle
- Visualising and Understanding CNNs
Natural Language Processing
- A curated list of speech and natural language processing resources
- Understanding Natural Language with Deep Neural Networks Using Torch
- tf-idf explained
- Interesting Deep Learning NLP Projects Stanford, Website
- NLP from Scratch | Google Paper
- Graph Based Semi Supervised Learning for NLP
- Bag of Words
- Topic Modeling
- LDA, LSA, Probabilistic LSA
- Awesome LDA Explanation!, Another good explanation
- The LDA Buffet - Intuitive Explanation
- Difference between LSI and LDA
- Original LDA Paper
- alpha and beta in LDA
- Intuitive explanation of the Dirichlet distribution
- Topic modeling made just simple enough
- Online LDA, Online LDA with Spark
- LDA in Scala, Part 2
- Segmentation of Twitter Timelines via Topic Modeling
- Topic Modeling of Twitter Followers
- word2vec
- Google word2vec
- Bag of Words Model Wiki
- A closer look at Skip Gram Modeling
- Skip Gram Model Tutorial, CBoW Model
- Word Vectors Kaggle Tutorial Python, Part 2
- Making sense of word2vec
- word2vec explained on deeplearning4j
- Quora word2vec
- Other Quora Resources, 2, 3
- word2vec, DBN, RNTN for Sentiment Analysis
- Text Clustering
- Text Classification
- Kaggle Tutorial Bag of Words and Word vectors, Part 2, Part 3
- What would Shakespeare say (NLP Tutorial)
- A closer look at Skip Gram Modeling
Computer Vision
Support Vector Machine
- Highest Voted Questions about SVMs on Cross Validated
- Help me Understand SVMs!
- SVM in Layman's terms
- How does SVM Work | Comparisons
- A tutorial on SVMs
- Practical Guide to SVC, Slides
- Introductory Overview of SVMs
- Comparisons
- Optimization Algorithms in Support Vector Machines
- Variable Importance from SVM
- Software
- Kernels
- Probabilities post SVM
Reinforcement Learning
Decision Trees
- Wikipedia Page - Lots of Good Info
- FAQs about Decision Trees
- Brief Tour of Trees and Forests
- Tree Based Models in R
- How Decision Trees work?
- Weak side of Decision Trees
- Thorough Explanation and different algorithms
- What is entropy and information gain in the context of building decision trees?
- Slides Related to Decision Trees
- How do decision tree learning algorithms deal with missing values?
- Using Surrogates to Improve Datasets with Missing Values
- Good Article
- Are decision trees almost always binary trees?
- Pruning Decision Trees, Grafting of Decision Trees
- What is Deviance in context of Decision Trees?
- Comparison of Different Algorithms
- CART
- CTREE
- CHAID
- MARS
- Probabilistic Decision Trees
Random Forest / Bagging
- Awesome Random Forest (GitHub)
- How to tune RF parameters in practice?
- Measures of variable importance in random forests
- Compare R-squared from two different Random Forest models
- OOB Estimate Explained | RF vs LDA
- Evaluating Random Forests for Survival Analysis Using Prediction Error Curve
- Why doesn't Random Forest handle missing values in predictors?
- How to build random forests in R with missing (NA) values?
- FAQs about Random Forest, More FAQs
- Obtaining knowledge from a random forest
- Some Questions for R implementation, 2, 3
Boosting
- Boosting for Better Predictions
- Boosting Wikipedia Page
- Introduction to Boosted Trees | Tianqi Chen
- Gradient Boosting Machine
- xgboost
- AdaBoost
Ensembles
- Wikipedia Article on Ensemble Learning
- Kaggle Ensembling Guide
- The Power of Simple Ensembles
- Ensemble Learning Intro
- Ensemble Learning Paper
- Ensembling models with R, Ensembling Regression Models in R, Intro to Ensembles in R
- Ensembling Models with caret
- Bagging vs Boosting vs Stacking
- Good Resources | Kaggle Africa Soil Property Prediction
- Boosting vs Bagging
- Resources for learning how to implement ensemble methods
- How are classifications merged in an ensemble classifier?
Stacking Models
- Stacking, Blending and Stacked Generalization
- Stacked Generalization (Stacking)
- Stacked Generalization: when does it work?
- Stacked Generalization Paper
Vapnik–Chervonenkis Dimension
- Wikipedia article on VC Dimension
- Intuitive Explanation of VC Dimension
- Video explaining VC Dimension
- Introduction to VC Dimension
- FAQs about VC Dimension
- Do ensemble techniques increase VC-dimension?
Bayesian Machine Learning
- Bayesian Methods for Hackers (using pyMC)
- Should all Machine Learning be Bayesian?
- Tutorial on Bayesian Optimisation for Machine Learning
- Bayesian Reasoning and Deep Learning, Slides
- Bayesian Statistics Made Simple
- Kalman & Bayesian Filters in Python
- Markov Chain Wikipedia Page
Semi Supervised Learning
- Wikipedia article on Semi Supervised Learning
- Tutorial on Semi Supervised Learning
- Graph Based Semi Supervised Learning for NLP
- Taxonomy
- Video Tutorial Weka
- Unsupervised, Supervised and Semi Supervised learning
- Research Papers 1, 2, 3
Optimization
- Mean Variance Portfolio Optimization with R and Quadratic Programming
- Algorithms for Sparse Optimization and Machine Learning
- Optimization Algorithms in Machine Learning, Video Lecture
- Optimization Algorithms for Data Analysis
- Video Lectures on Optimization
- Optimization Algorithms in Support Vector Machines
- The Interplay of Optimization and Machine Learning Research