COL 341: Fundamentals of Machine Learning

After successfully completing the course, students are expected to develop:
- Understanding of the basic concepts in machine learning
- Deeper mathematical understanding of the foundations of machine learning methods
- Skills in using popular tools, libraries and software for problem solving
- Hands-on experience in solving problems that may be encountered in industry
- Introductory understanding of, and exposure to, selected research topics in machine learning
Weekly schedule (topics and references):
Week 1: Introduction to the course and basic concepts in machine learning
- Pattern Recognition and Machine Learning (PRML), Christopher M. Bishop, Chapter 1.1
Week 2: Linear regression, feature creation, ridge regression, cross-validation
- Andrew Ng's course (CS229) notes on linear and logistic regression
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe
- Linear Algebra Review and Reference, Zico Kolter
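As a concrete illustration of the Week 2 material, here is a minimal NumPy sketch of closed-form ridge regression together with a simple k-fold cross-validation loop for choosing the regularization strength. The function names and the toy data are illustrative assumptions, not part of the course material:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: theta = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_error(X, y, lam, k=5):
    """Mean squared validation error over k contiguous folds."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    errs = []
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        theta = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[val] @ theta - y[val]) ** 2))
    return np.mean(errs)

# Toy data: y = 2*x0 - x1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=100)

theta = ridge_fit(X, y, lam=0.1)
```

In practice one evaluates `cv_error` over a grid of lambda values and keeps the minimizer; here a small lambda wins because the toy data is nearly noise-free.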
Week 3: Lasso regression, logistic regression, optimization basics, gradient descent and its variants, step size selection
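A minimal sketch of plain gradient descent with a fixed step size, applied to the least-squares objective from the regression lectures (the data and parameter values below are illustrative assumptions):

```python
import numpy as np

def gradient_descent(X, y, step=0.1, iters=500):
    """Plain gradient descent on J(theta) = (1/2n) * ||X theta - y||^2."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / n   # gradient of J at the current theta
        theta -= step * grad
    return theta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta                          # noiseless toy data

theta = gradient_descent(X, y)
```

The fixed step size works here because the objective's curvature is moderate; the course's step-size-selection material (e.g. backtracking line search) replaces this constant with an adaptive choice.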
Week 4: Multi-class classification, evaluation metrics, introduction to neural networks
- PRML, Bishop, Chapter 4.1
- Classification model metrics: precision and recall, sensitivity and specificity, ROC curves
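The evaluation metrics listed for this week reduce to counts from the confusion matrix; a small sketch (labels and predictions below are made-up examples):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall (sensitivity) and specificity from 0/1 labels."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # a.k.a. sensitivity
        "specificity": tn / (tn + fp),
    }

y_true = np.array([1, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1])
m = binary_metrics(y_true, y_pred)
```

An ROC curve is obtained by sweeping the classifier's decision threshold and plotting recall against (1 - specificity) at each setting.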
Week 5: Neural networks, the Hessian, overfitting in neural networks, convolutional neural networks (CNNs) and deep learning
- PRML, Bishop, Chapters 5.1, 5.2, 5.3, 5.4
- Neural networks and back-propagation
- Roi Livni's (Princeton) lecture notes on back-propagation
- CS231n (Stanford) slides on back-propagation
- Matt Gormley's (CMU) slides on back-propagation
- Convolutional neural networks: CNN notes from CS231n (Stanford), Understanding and Visualizing CNNs, transfer learning, CS231n slides on CNNs
- Barnabás Póczos's lecture notes on CNNs
Week 6: Generative models, Gaussian Discriminant Analysis (GDA), naïve Bayes classification
- Andrew Ng's (CS229) lecture notes on generative models
Week 7: Understanding multivariate Gaussian distributions, linear regression revisited
- The multivariate Gaussian distribution
- A Short Reference on Multivariate Gaussian Distributions, Leon Gu, CMU
- Multivariate Gaussian Distributions, Chuong B. Do, Stanford University
- Probabilistic Modelling – Linear Regression and Gaussian Processes, Fredrik Lindsten, Thomas B. Schön, Andreas Svensson and Niklas Wahlström
Week 8: Estimation theory
- Estimation theory slides, Cristiano Porciani
- Point estimation notes, Herman Bennett
Week 9: Density estimation, Parzen windows, k-nearest-neighbour (kNN) classifier
- PRML, Bishop, Chapter 2.5
- Lecture Notes on Nonparametrics, Bruce E. Hansen
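A Parzen-window density estimate places a kernel on every sample and averages; a short NumPy sketch with a Gaussian kernel (the data and bandwidth below are illustrative assumptions):

```python
import numpy as np

def parzen_density(x_query, samples, h):
    """Parzen-window density estimate with a Gaussian kernel of width h."""
    diffs = (x_query[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / h       # average kernels, rescale by bandwidth

rng = np.random.default_rng(2)
samples = rng.normal(size=5000)          # draws from N(0, 1)
xs = np.array([0.0, 1.0])
est = parzen_density(xs, samples, h=0.2)
```

With enough samples the estimate approaches the true N(0, 1) density, roughly 0.399 at x = 0 and 0.242 at x = 1; the bandwidth h plays the same bias/variance role as k in the kNN estimator.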
Week 10: Decision trees, random forests
- Pattern Classification, 2nd edition, Duda, Hart and Stork, Chapters 8.1–8.4
- Chapter on decision trees, Lior Rokach and Oded Maimon
- Leo Breiman's classical paper on random forests
Week 11: Support Vector Machines (SVMs)
- Andrew Ng's course (CS229) lecture notes on Support Vector Machines (SVMs)
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe, Chapters 5.1–5.5
Week 12: Unsupervised learning, Principal Component Analysis (PCA)
- A Tutorial on Principal Component Analysis, Jonathon Shlens
- Slides on PCA, Barnabás Póczos
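PCA can be computed in a few lines from the SVD of the centered data matrix; a minimal sketch (the synthetic data below is an illustrative assumption):

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # top-k principal directions
    explained = S[:k] ** 2 / np.sum(S ** 2)      # fraction of variance explained
    return Xc @ components.T, components, explained

rng = np.random.default_rng(3)
# Synthetic data with almost all variance along the first axis
Z = rng.normal(size=(500, 2)) * np.array([5.0, 0.3])
scores, comps, explained = pca(Z, k=1)
```

Because the data is built with one dominant direction, the first component captures nearly all the variance and aligns with the first coordinate axis.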
Week 13: Non-negative Matrix Factorization (NMF), k-means, non-linear dimensionality reduction
- Lee, Daniel D., and H. Sebastian Seung. "Algorithms for Non-negative Matrix Factorization." In Advances in Neural Information Processing Systems, pp. 556–562, 2001.
- Lee, Daniel D., and H. Sebastian Seung. "Learning the Parts of Objects by Non-negative Matrix Factorization." Nature 401, no. 6755 (1999): 788.
- Y. Koren, R. M. Bell, and C. Volinsky. "Matrix Factorization Techniques for Recommender Systems." IEEE Computer, 42(8):30–37, 2009.
- PRML, Bishop, Chapter 9.1
- "A Global Geometric Framework for Nonlinear Dimensionality Reduction," Joshua B. Tenenbaum, Vin de Silva and John C. Langford; "Nonlinear Dimensionality Reduction by Locally Linear Embedding," Sam T. Roweis and Lawrence K. Saul
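The k-means part of this week reduces to Lloyd's algorithm: alternate between assigning points to their nearest centroid and recomputing each centroid as its cluster mean. A minimal sketch (the two-blob data is an illustrative assumption):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each non-empty cluster's centroid as the cluster mean
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(4)
# Two well-separated blobs around (0, 0) and (5, 5)
X = np.vstack([rng.normal(0, 0.3, size=(100, 2)),
               rng.normal(5, 0.3, size=(100, 2))])
centers, labels = kmeans(X, k=2)
```

Lloyd's algorithm only finds a local optimum; in practice one runs it from several random initializations and keeps the lowest-distortion result.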
Week 14: Boosting, generative adversarial networks
- Yoav Freund and Robert E. Schapire. "A Short Introduction to Boosting." Journal of Japanese Society for Artificial Intelligence, 14(5):771–780, September 1999.
- VC Dimension and the Fundamental Theorem, lecture notes from Roi Livni's class
- Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative Adversarial Nets." In Advances in Neural Information Processing Systems, pp. 2672–2680, 2014.
Grading:
- Exams — 45 (Minors — 25, Major — 20)
- Assignments/project — 45 (Assignments — 30, Project — 15)
- Quizzes/homework/class participation — 10
Weekly exercises:

Week 1
- Learn Python, NumPy and SciPy
- Write a matrix multiplication program in Python (a) using for loops and (b) using NumPy/SciPy. Plot the Gflop/s achieved by your programs as a function of matrix size.
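One possible starting point for this exercise, timing both versions at a single matrix size (sweeping over sizes and plotting is left as in the exercise; all names here are illustrative):

```python
import time
import numpy as np

def matmul_loops(A, B):
    """Triple-loop matrix multiplication, C = A @ B."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

def gflops(n, seconds):
    """An n x n matmul performs about 2*n^3 floating-point operations."""
    return 2 * n ** 3 / seconds / 1e9

n = 100
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))

t0 = time.perf_counter()
C_loops = matmul_loops(A, B)
t_loops = time.perf_counter() - t0

t0 = time.perf_counter()
C_np = A @ B
t_np = time.perf_counter() - t0

print(f"loops: {gflops(n, t_loops):.4f} Gflop/s, NumPy: {gflops(n, t_np):.2f} Gflop/s")
```

The gap between the two is typically several orders of magnitude, since NumPy dispatches to an optimized BLAS routine while the Python loops pay interpreter overhead on every scalar operation.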
Week 2
- Prove that J(θ) is convex in θ (refer to the Andrew Ng lecture notes on linear and logistic regression for the definition of J(θ)).
- Read Linear Algebra Review and Reference by Zico Kolter
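One standard route for the convexity exercise, assuming J(θ) is the least-squares cost from the notes (positive scaling constants do not affect convexity), is to show that the Hessian is positive semidefinite:

```latex
J(\theta) = \tfrac{1}{2}\,\lVert X\theta - y\rVert^{2},
\qquad
\nabla J(\theta) = X^{\top}(X\theta - y),
\qquad
\nabla^{2} J(\theta) = X^{\top}X .
```

For any vector v, vᵀ(XᵀX)v = ‖Xv‖² ≥ 0, so the Hessian is positive semidefinite at every θ, and a twice-differentiable function with an everywhere-PSD Hessian is convex.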
Week 3
- Read Chapters 2.1, 2.2, 2.3, 2.5, 9.2, 9.3 and optionally 9.4 of the Convex Optimization textbook
Week 4
- Review probability and statistics
- Read about 1D, 2D and 3D convolutions
- Learn Keras/TensorFlow
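The 1D and 2D convolutions mentioned in the Week 4 reading can be sketched directly in NumPy: 1D convolution is built in as `np.convolve`, and the hand-rolled `conv2d` below is an illustrative (not optimized) "valid"-mode 2D version:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution via explicit sliding-window loops."""
    kh, kw = kernel.shape
    k = kernel[::-1, ::-1]               # flip the kernel for true convolution
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

# 1-D convolution with a difference kernel gives finite differences
x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([1.0, -1.0])
y1 = np.convolve(x, w, mode="valid")     # -> [1., 1., 1.]

# 2-D: a 3x3 averaging kernel blurs a toy 4x4 "image"
img = np.arange(16, dtype=float).reshape(4, 4)
blur = conv2d(img, np.ones((3, 3)) / 9.0)
```

CNN layers use the same sliding-window operation, but learn the kernel weights (and typically compute cross-correlation, i.e. without the kernel flip).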