COL780: Computer Vision



This is the course page for Computer Vision (COL780), Semester I, 2018-2019, taught by Subhashis Banerjee at the Department of Computer Science and Engineering, IIT Delhi, New Delhi.


Notice

  1. Class timings: 1200-1300 on Mondays, Tuesdays and Fridays in Bharti 101.

Honour code

  • All students are expected to follow the highest ethical standards.
  • Collaborations and discussions are encouraged. However, all students are required to write up all solutions entirely on their own. Any collaboration, or help taken, must be declared.
  • Students are encouraged to refer to books, papers and internet resources. They may even consult other individuals. However, the source must be clearly cited if any part of the solution (or even an idea) is taken from such a source.
  • Failure to declare any help taken will be interpreted as academic misconduct.

Topics discussed in class:

General introduction
Pinhole camera: https://en.wikipedia.org/wiki/Pinhole_camera_model
Optic flow: http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalFlow.pdf, http://cs.nyu.edu/~fergus/teaching/vision_2012/13_opticalflow.pdf
Lucas-Kanade tracking: https://www.ri.cmu.edu/pub_files/pub3/baker_simon_2002_3/baker_simon_2002_3.pdf
Scale and pyramids: http://persci.mit.edu/pub_pdfs/RCA84.pdf, http://persci.mit.edu/pub_pdfs/pyramid83.pdf,
http://persci.mit.edu/pub_pdfs/spline83.pdf
SIFT: https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
Vocabulary tree: http://www.cs.ubc.ca/~lowe/525/papers/nisterCVPR06.pdf
Bag of words, topic discovery using pLSA: http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic03.pdf,
http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic05b.pdf
Edge detection: http://www.cse.usf.edu/~r1k/MachineVisionBook/MachineVision.files/MachineVision_Chapter5.pdf
Segmentation: http://ftp.cs.toronto.edu/pub/jepson/teaching/vision/2503/segmentation.pdf,
http://www.cs.ucf.edu/~mtappen/cap5415/lecs/lec11.pdf
Normalised cut: https://people.eecs.berkeley.edu/~malik/papers/SM-ncut.pdf
Energy minimization using graph cut: http://www.cs.cornell.edu/rdz/Papers/BVZ-pami01-final.pdf
Projective geometry for Computer Vision: http://www.cse.iitd.ac.in/~suban/vision/geometry
Multiple views geometry: http://www.cse.iitd.ac.in/~suban/vision/multiple
Bundle adjustment, large-scale 3D reconstruction, SLAM: Tutorial, 3D reconstruction talk, 3D reconstruction demo, LSD SLAM
Deep learning, convolutional neural nets (CNNs): Learnability, Stanford cs231n, http://deeplearning.net/tutorial/deeplearning.pdf, http://deeplearning.net/tutorial/,
http://deeplearning.stanford.edu/tutorial/, https://web.stanford.edu/class/cs294a/sparseAutoencoder_2011new.pdf, https://www.cs.toronto.edu/~hinton/ucltutorial.pdf,
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf, https://arxiv.org/pdf/1409.1556.pdf, http://www.robots.ox.ac.uk/~vgg/practicals/cnn/
Deep learning, CNN applications: http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html, http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Kendall_PoseNet_A_Convolutional_ICCV_2015_paper.pdf
Recurrent neural nets (RNNs), LSTM, reinforcement learning: https://deeplearning4j.org/lstm, http://colah.github.io/posts/2015-08-Understanding-LSTMs/, http://karpathy.github.io/2015/05/21/rnn-effectiveness/, http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf, UCL course on reinforcement learning


Assignments

Problem set 0
  1. Outline a procedure to measure the height of a lamp post and the angle that it may make with the vertical.
  2. Consider the problem of automatically driving a car from the IITD main gate to the JNU gate using a single camera. You may assume that the car can be driven with Python commands corresponding to the metaphorical lateral arrow keys for differential steering, and the metaphorical vertical arrow keys for differential positive and negative acceleration. Decompose the problem into well-identified sub-problems, and identify the assumptions and principles that may be invoked to solve each. Please be as precise as possible.
Please submit on the Moodle page for COL780 (https://moodle.iitd.ac.in).
Deadline: August 3. Cutoff: August 5.
Assignment 0
  1. Become familiar with OpenCV
  2. Implement Gaussian mixture model based background subtraction. See here and here.
  3. Compare your implementation with the one in OpenCV.


Resources

  1. http://www.cse.iitd.ac.in/~suban/vision
  2. Notes on Linear Algebra and Optimization (available only locally).

Subhashis Banerjee / Dept. of Computer Science and Engineering / IIT Delhi / Hauz Khas / New Delhi 110016