COL774: Machine Learning



General Information

Semester: Sem I, 2020-21.

Instructor: Parag Singla (email: parags AT cse.iitd.ac.in)

Class Timings (Slot C):
  • Tue, 8:00 am - 8:50 am
  • Wed, 8:00 am - 8:50 am
  • Fri, 8:00 am - 8:50 am
  • Sat, 8:00 am - 8:50 am (As per academic schedule)
Venue: Online: Click Here

TA Assignment Details: Click here

Sign up for Piazza
Code: As announced over email.

Announcements

  • [Dec. 25, 2020]: Assignment 4 is out!
  • [Dec. 15, 2020]: Dec 14 - Dec 20 lectures are up on the website now!
  • [Dec. 6, 2020]: Assignment 3 has been updated. Due date: Tuesday Dec 22, 2020, 11:59 pm.
  • [Dec. 4, 2020]: Assignment 3 is out. Due date: Tuesday Dec 22, 2020, 11:59 pm.
  • [Nov. 14, 2020]: Assignment 2 is out. Due date: Wednesday Dec 2, 2020, 11:50 pm.
  • [Oct. 19, 2020]: Assignment 1 is out. Due date: Friday Nov 6, 2020, 11:50 pm.
  • [Sep. 29, 2020]: Due to high demand, the course is open to CSE/SIT students only. Others are welcome to sit in on the course.
  • [Sep. 28, 2020]: Welcome! COL 774 classes will start from Tuesday Oct 6th.

    Course Objectives

    (a) To develop an understanding of the fundamental concepts of Machine Learning (ML)
    (b) To develop an understanding of how a variety of ML algorithms work (both supervised and unsupervised)
    (c) To learn to apply ML algorithms to real-world data/problems
    (d) To keep up to date with some of the latest advances in the field

    Course Content

    NOTE: The list of topics below is tentative for each week until that week has passed. We will update it as we go through the lectures each week, so stay tuned!
    Week | Topic | Supplementary Notes (by Andrew Ng and Others) | Book Chapters
    1 | Introduction | |
    2,3 | Linear Regression (and Its Variants) | lin-log-reg.pdf | Bishop, Chapter 3.1
    4 | Logistic Regression, Generalized Linear Models | lin-log-reg.pdf | Bishop, Chapter 3.1
    5 | Gaussian Discriminant Analysis (GDA), Naive Bayes | gda_nb.pdf | Bishop, Chapter 4
    6,7 | Support Vector Machines | svm.pdf | Bishop, Chapter 7.1
    8 | Decision Trees, Random Forests | dtrees.pdf | Mitchell, Chapter 3. Online Resource: Random Forests
    9 | Neural Networks | nnets.pdf, nnets-hw.pdf | Mitchell, Chapter 4
    10 | Deep Learning | deep_learning_slides.pdf, cnn.pdf | Online Resource: Convolutional Neural Networks
    11 | K-Means, Gaussian Mixture Models | kmeans.pdf, gmm.pdf, em.pdf, pca.pdf |
    12 | Expectation Maximization (EM), Principal Component Analysis (PCA) | kmeans.pdf, gmm.pdf, em.pdf, pca.pdf |
    13 | Learning Theory, Model Selection | theory.pdf, model.pdf | Mitchell, Chapter 7
    14 | Advanced Topics | |

    Class Notes/Videos (Date-Wise)

    Notes: Oct 6, Oct 7, Oct 9, Oct 10, Oct 13, Oct 14, Oct 16, Oct 17, Oct 20, Oct 21, Oct 23, Oct 24, Oct 27, Oct 28, Oct 30, Oct 31,
                Nov 4, Nov 6, Nov 7, Nov 13, Nov 17, Nov 18, Nov 20, Nov 21, Nov 24, Nov 25, Nov 27, Nov 29, Self Study - Neural Networks (1), Self Study - Neural Networks (2),
                Dec 15, Dec 16, Dec 18 (a), Dec 18, Dec 19,
                Dec 22, Dec 23, Dec 26, Dec 27,
                Dec 29, Dec 30, Jan 1, Jan 2

    Videos: Oct 6, Oct 7, Oct 9, Oct 10, Oct 13, Oct 14, Oct 16, Oct 17, Oct 20, Oct 21, Oct 23, Oct 24, Oct 27, Oct 28, Oct 30, Oct 31,
                 Nov 4, Nov 6, Nov 7, Nov 13, Nov 17, Nov 18, Nov 20, Nov 21, Nov 24, Nov 25, Nov 27, Nov 29, Self Study - Neural Networks (1), Self Study - Neural Networks (2),
                 Dec 15, Dec 16, Dec 18 (a) [watch from 28:50 onward], Dec 18, Dec 19,
                 Dec 22, Dec 23, Dec 26, Dec 27,
                 Dec 29, Dec 30, Jan 1, Jan 2

    Additional Resources

    Review Material

    Topic | Notes
    Probability | prob.pdf
    Linear Algebra | linalg.pdf
    Gaussian Distribution | gaussians.pdf
    Convex Optimization (1) | convex-1.pdf

    References

    1. Machine Learning: A Probabilistic Perspective. Kevin Murphy. MIT Press, 2012.
    2. Pattern Recognition and Machine Learning. Christopher Bishop. First Edition, Springer, 2006.
    3. Pattern Classification. Richard Duda, Peter Hart and David Stork. Second Edition, Wiley-Interscience, 2000.
    4. Machine Learning. Tom Mitchell. First Edition, McGraw-Hill, 1997.

    Assignment Submission Instructions

    1. You are free to discuss the assignment problems with other students in the class. But all your code should be produced independently without looking at/referring to anyone else's code.
    2. Python is the default programming language for the course. You should use it for programming your assignments unless explicitly allowed otherwise.
    3. Code should be submitted through the Moodle page. Make sure to include comments for readability.
    4. Create a separate directory for each question, named by the question number. For instance, all your submission files for question 1 should go in a directory named Q1 (and so on for other questions). Put all the question sub-directories in a single top-level directory named "yourentrynumber_firstname_lastname". For example, if your entry number is "2017anz7535" and your name is "Nitika Rao", your submission directory should be named "2017anz7535_nitika_rao". Zip this directory and name the resulting file "yourentrynumber_firstname_lastname.zip"; in the above example it will be "2017anz7535_nitika_rao.zip". Submit this single zip file online.
    5. Honor Code: Any cases of copying will be awarded a zero on the assignment and a penalty of -10. More severe penalties may follow.
    6. Late Policy: You are allowed a total of 5 late (buffer) days across all the assignments. You are free to decide how you would like to use them. You will get a penalty of a 10% deduction in marks (per day) for every additional late day used beyond the allowed 5 buffer days. This applies to the first 3 assignments in the course (i.e., leaving out the last assignment, which is based on a competition).
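
The directory layout and zip naming described in item 4 above can be sketched in Python as follows. This is only an illustrative helper, not an official course script: the function names submission_name and package_submission are hypothetical, and lowercasing the name parts is an assumption inferred from the "2017anz7535_nitika_rao" example.

```python
import re
import zipfile
from pathlib import Path


def submission_name(entry_number: str, first_name: str, last_name: str) -> str:
    """Build 'yourentrynumber_firstname_lastname' (lowercased; an assumption)."""
    parts = [entry_number, first_name, last_name]
    return "_".join(p.strip().lower() for p in parts)


def package_submission(source_dir: Path, name: str, out_dir: Path) -> Path:
    """Zip every Q<n> sub-directory of source_dir under a top-level folder `name`."""
    # Only directories named Q1, Q2, ... are treated as question directories.
    question_dirs = sorted(
        d for d in source_dir.iterdir()
        if d.is_dir() and re.fullmatch(r"Q\d+", d.name)
    )
    if not question_dirs:
        raise ValueError(f"no Q1, Q2, ... directories found in {source_dir}")

    zip_path = out_dir / f"{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for qdir in question_dirs:
            for path in sorted(qdir.rglob("*")):
                if path.is_file():
                    # Store each file as <name>/Q<n>/... inside the archive.
                    arcname = Path(name) / path.relative_to(source_dir)
                    zf.write(path, arcname.as_posix())
    return zip_path


if __name__ == "__main__":
    name = submission_name("2017anz7535", "Nitika", "Rao")
    print(package_submission(Path("."), name, Path(".")))
```

Run from the directory that contains Q1, Q2, and so on, this writes a single zip whose top-level folder matches the zip file name. Check the archive contents before uploading to Moodle.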

    Practice Questions

    Assignments

    1. Assignment 4
      Due Date: Sunday Jan 17, 2021, 11:59 pm
    2. Assignment 3 [Updated: Dec 6, 2020 @ 10:40 pm]
      Datasets:
           Part (1): Decision Tree Data
           Part (2): Kannada Digits Data,   MNIST Data [MNIST data provided only for extra fun - no Credits!]
      Due Date: Tuesday December 22, 2020, 11:59 pm
    3. Assignment 2
      Datasets: Yelp , FMNIST
      Due Date: Wednesday December 2, 2020, 11:50 pm
    4. Assignment 1
      Datasets: ass1_data.zip
      Due Date: Friday November 6, 2020, 11:50 pm

    Grading Policy (Tentative)

    Assignments | Ass1: 7%, Ass2: 9%, Ass3: 9%, Ass4: 10% [Total Assignment Weight: 35%]
    Minor | 25%
    Major | 40%
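
As a rough illustration of how the tentative weights above combine, the sketch below computes a weighted course total. The function name course_total, the dictionary keys, and the assumption that every component is scored as a percentage out of 100 are illustrative choices, not part of the course policy. Late penalties and honor-code penalties are not modeled.

```python
# Tentative weights (in percent of the course total) from the grading policy.
WEIGHTS = {
    "ass1": 7,
    "ass2": 9,
    "ass3": 9,
    "ass4": 10,
    "minor": 25,
    "major": 40,
}


def course_total(percent_scores: dict) -> float:
    """Combine per-component scores (each assumed to be 0-100) into a total out of 100."""
    missing = set(WEIGHTS) - set(percent_scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[k] * percent_scores[k] / 100 for k in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores, purely for illustration.
    example = {"ass1": 90, "ass2": 80, "ass3": 85, "ass4": 70, "minor": 60, "major": 75}
    print(f"course total: {course_total(example):.2f} / 100")
```

The weights sum to 100, so the four assignments contribute at most 35 points, the minor 25, and the major 40.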