COL774: Machine Learning
General Information
Semester: Sem I, 2021-22.
Instructor: Parag Singla (email: parags AT cse.iitd.ac.in)
Class Timings (Slot C):
- Tue, 8:00 am - 8:50am
- Wed, 8:00 am - 8:50am
- Fri, 8:00 am - 8:50am
- Sat, 8:00 am - 8:50am (On some of the days)
Venue: Online:
Click Here
Sign up for Piazza
Code: As (will be) announced in class.
Seating Arrangement: Major Exam
Seating Arrangement: Minor Exam
Students Registered: List of Students Registered in the Course (as per
Eadmin Portal)
Announcements
[Oct 30, 2021]: Assignment 4 is out. Due Date: Friday Nov 26, 11:50 pm.
[Oct 8, 2021]: Assignment 3 is out. Due Date: Monday Oct 25, 11:50 pm.
[Sep 13, 2021]: Assignment 2 is out. Due Date: Wednesday Oct 6, 11:50 pm.
[Aug 26, 2021]: Assignment 1 is out. Due Date: Sunday Sep 12, 11:50 pm.
[Aug 9, 2021]: Welcome! COL 774 classes will start from Tuesday Aug 10, 2021.
Course Objectives
(a) To develop an understanding of the fundamental concepts of Machine
Learning (ML)
(b) To develop an understanding of how a variety of ML algorithms work (both
supervised and unsupervised)
(c) To learn to apply ML algorithms to real-world
data and problems
(d) To keep up to date with some of the latest advances in the field
Course Content
NOTE: The exact list of topics below is tentative (until we are past that week).
We will update it as we go through the lectures in each week. So, stay tuned!
Week | Topic | Supplementary Notes (by Andrew Ng and Others) | Book Chapters |
1 | Introduction | | |
2,3 | Linear Regression (and Its Variants) | lin-log-reg.pdf | Bishop, Chapter 3.1 |
4 | Logistic Regression, Generalized Linear Models | lin-log-reg.pdf | Bishop, Chapter 3.1 |
5 | Gaussian Discriminant Analysis (GDA), Naive Bayes | gda_nb.pdf | Bishop, Chapter 4 |
6,7 | Support Vector Machines | svm.pdf | Bishop, Chapter 7.1 |
8 | Decision Trees, Random Forests | dtrees.pdf | Mitchell, Chapter 3. Online Resource: Random Forests |
9 | Neural Networks | nnets.pdf, nnets-hw.pdf | Mitchell, Chapter 4 |
10 | Deep Learning | deep_learning_slides.pdf, cnn.pdf | Online Resource: Convolutional Neural Networks |
11 | K-Means, Gaussian Mixture Models | kmeans.pdf, gmm.pdf, em.pdf, pca.pdf | |
12 | Expectation Maximization (EM), Principal Component Analysis (PCA) | kmeans.pdf, gmm.pdf, em.pdf, pca.pdf | |
13 | Learning Theory, Model Selection | theory.pdf, model.pdf | Mitchell, Chapter 7 |
14 | Advanced Topics | | |
Notes: Aug 10, Aug 11,
Aug 13, Aug 17,
Aug 18, Aug 21,
Aug 22, Aug 24,
Aug 25, Aug 27,
Aug 28, Aug 31,
Sep 1, Sep 3,
Sep 5, Sep 7,
Sep 8, Sep 10,
Sep 14, Sep 15,
Sep 17, Sep 24,
Sep 28, Sep 30,
Oct 1, Oct 3,
Oct 5, Oct 6,
Oct 12, Oct 13,
Oct 16, Oct 16 (a),
Oct 20, Oct 22,
Oct 23, Oct 26,
Oct 27, Oct 29,
Nov 2, Nov 3,
Nov 9, Nov 10
Videos: Aug 10, Aug 11,
Aug 13, Aug 17,
Aug 18, Aug 21,
Aug 22, Aug 24,
Aug 25, Aug 27,
Aug 28, Aug 31,
Sep 1, Sep 3,
Sep 5, Sep 7,
Sep 8, Sep 10,
Sep 14, Sep 15,
Sep 17, Sep 24,
Sep 28, Sep 29,
Oct 1, Oct 3,
Oct 5, Oct 6,
Oct 12 [use class notes for links to embedded video], Oct 13,
Oct 16, Oct 16 (a) [watch up to 28:15],
Oct 20, Oct 22,
Oct 23, Oct 26,
Oct 27, Oct 29,
Nov 2, Nov 3,
Nov 9, Nov 10
References
- Machine Learning: A Probabilistic Perspective. Kevin Murphy. MIT Press, 2012.
- Pattern Recognition and Machine Learning. Christopher Bishop. First Edition, Springer, 2006.
- Pattern Classification. Richard Duda, Peter Hart and David Stork. Second Edition, Wiley-Interscience, 2000.
- Machine Learning. Tom Mitchell. First Edition, McGraw-Hill, 1997.
Assignment Submission Instructions
- You are free to discuss the assignment problems with other students in the class. But all your code should be
produced independently without looking at/referring to anyone else's code.
- Python is the default programming
language for the course. You should use it for programming your
assignments unless explicitly allowed otherwise.
- Code should be submitted using the Moodle Page.
Make sure to include comments for readability.
- Create a separate directory
for each question, named by the question number. For instance, for question 1,
all your submission files should be put in the directory named
Q1 (and so on for other questions). Put all the question sub-directories in a single
top-level directory. This directory should be named "yourentrynumber_firstname_lastname".
For example, if your entry number is "2020anz7535" and your name is "Nitika Rao", your
submission directory should be named "2020anz7535_nitika_rao". You should zip your
directory and name the resulting file "yourentrynumber_firstname_lastname.zip", e.g., in
the above example it will be "2020anz7535_nitika_rao.zip". This single zip file should
be submitted online.
- Honor Code: Any cases of copying will be awarded a zero on the assignment and a
penalty of -10. More severe penalties may follow.
- Late Policy: You are allowed a total of 5 late (buffer) days across all the assignments. You are
free to decide how you would like to use them.
For every additional late day used beyond the allowed 5 buffer days, you will get a penalty
of a 10% deduction in marks (per day).
This applies to the first 3
assignments in the course
(i.e., leaving out the last assignment, which is based on a competition).
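The directory and zip naming convention described above can be sketched in Python as follows. This is only an illustrative sketch, not an official submission script: the function names `submission_name` and `package_submission` are hypothetical, the lowercasing of names is inferred from the "nitika_rao" example, and the number of questions is a parameter you would set per assignment.

```python
import zipfile
from pathlib import Path


def submission_name(entry_number: str, first_name: str, last_name: str) -> str:
    """Build the top-level name: yourentrynumber_firstname_lastname.

    Lowercasing is an assumption based on the example in the instructions.
    """
    parts = (entry_number, first_name, last_name)
    return "_".join(part.strip().lower() for part in parts)


def package_submission(root: Path, num_questions: int) -> Path:
    """Ensure Q1..Qn exist under `root`, then zip `root` into `<root>.zip`."""
    for q in range(1, num_questions + 1):
        (root / f"Q{q}").mkdir(parents=True, exist_ok=True)

    archive = root.with_name(root.name + ".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Store paths relative to the parent of `root`, so the archive
            # unpacks into a single top-level directory.
            zf.write(path, path.relative_to(root.parent).as_posix())
    return archive
```

For the example in the instructions, `submission_name("2020anz7535", "Nitika", "Rao")` gives the directory name, and passing that directory to `package_submission` writes the zip file next to it. Check the archive contents before uploading to Moodle.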
Practice Questions
Assignments
- Assignment 4
Due Date: Friday November 26, 2021. 11:50 pm
- Assignment 3
Dataset (Part 1 - Decision Trees): bank_dataset.zip
Due Date (updated): Wednesday October 27, 2021. 11:50 pm
- Assignment 2
Due Date: Wednesday October 6, 2021. 11:50 pm
- Assignment 1
Datasets: ass1_data.zip
Due Date: Sunday September 12, 2021. 11:50 pm
Grading Policy (Tentative)
Assignments (4) | Ass1: 7%, Ass2: 9%, Ass3: 9%, Ass4: 10% [Total Assignment Weight: 35%] |
Minor | 25% |
Major | 40% |
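As a quick check of the arithmetic in the tentative grading policy, the weights can be combined as in the sketch below. It assumes each component score is expressed as a fraction of its maximum marks (0.0 to 1.0); the names `WEIGHTS` and `course_total` are hypothetical, and the actual grade computation is up to the instructor.

```python
# Tentative weights (in percent) taken from the grading policy.
WEIGHTS = {
    "Ass1": 7,
    "Ass2": 9,
    "Ass3": 9,
    "Ass4": 10,
    "Minor": 25,
    "Major": 40,
}


def course_total(fractions: dict) -> float:
    """Weighted course total out of 100.

    `fractions` maps a component name to the fraction of marks obtained
    (an assumed input format); missing components count as zero.
    """
    return sum(weight * fractions.get(name, 0.0) for name, weight in WEIGHTS.items())
```

The four assignment weights sum to 35, and together with the Minor and Major the weights sum to 100, which matches the totals stated in the policy.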