COL780: Computer Vision
This is the course page for Computer Vision (COL780), Semester I, 2018-2019, taught by Subhashis Banerjee at the Department of Computer Science and Engineering, IIT Delhi, New Delhi.
Notice
- Class timings: 1200-1300 on Mondays, Tuesdays and Fridays in Bharti 101.
Honour code
- All students are expected to follow the highest ethical standards.
- Collaborations and discussions are encouraged. However, all students are
required to write up all solutions entirely on their own. Any collaboration,
or help taken, must be declared.
- Students are encouraged to refer to books, papers and internet resources.
They may even consult other individuals.
However, the source must be clearly cited if any part of the solution (or
even an idea) is taken from such a source.
- Failure to declare any help taken will be interpreted as academic misconduct.
Topics discussed in class:
General introduction |
Pinhole camera: | https://en.wikipedia.org/wiki/Pinhole_camera_model |
Optic flow: | http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalFlow.pdf, http://cs.nyu.edu/~fergus/teaching/vision_2012/13_opticalflow.pdf |
Lucas-Kanade tracking | https://www.ri.cmu.edu/pub_files/pub3/baker_simon_2002_3/baker_simon_2002_3.pdf |
Scale and pyramids: | http://persci.mit.edu/pub_pdfs/RCA84.pdf, http://persci.mit.edu/pub_pdfs/pyramid83.pdf, http://persci.mit.edu/pub_pdfs/spline83.pdf |
SIFT: | https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf |
Vocabulary tree: | http://www.cs.ubc.ca/~lowe/525/papers/nisterCVPR06.pdf |
Bag of words, topic discovery using pLSA: | http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic03.pdf, http://www.robots.ox.ac.uk/~vgg/publications/papers/sivic05b.pdf |
Edge detection: | http://www.cse.usf.edu/~r1k/MachineVisionBook/MachineVision.files/MachineVision_Chapter5.pdf |
Segmentation: | http://ftp.cs.toronto.edu/pub/jepson/teaching/vision/2503/segmentation.pdf, http://www.cs.ucf.edu/~mtappen/cap5415/lecs/lec11.pdf |
Normalised cut: | https://people.eecs.berkeley.edu/~malik/papers/SM-ncut.pdf |
Energy minimization using graph cut: | http://www.cs.cornell.edu/rdz/Papers/BVZ-pami01-final.pdf |
Projective geometry for Computer Vision: | http://www.cse.iitd.ac.in/~suban/vision/geometry |
Multiple views geometry: | http://www.cse.iitd.ac.in/~suban/vision/multiple |
Bundle adjustment, large scale 3D reconstruction, SLAM | Tutorial, 3D reconstruction talk, 3D reconstruction demo, LSD SLAM |
Deep learning, convolutional neural nets (CNNs) | Learnability, Stanford cs231n, http://deeplearning.net/tutorial/deeplearning.pdf, http://deeplearning.net/tutorial/, http://deeplearning.stanford.edu/tutorial/, https://web.stanford.edu/class/cs294a/sparseAutoencoder_2011new.pdf, https://www.cs.toronto.edu/~hinton/ucltutorial.pdf, https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf, https://arxiv.org/pdf/1409.1556.pdf, http://www.robots.ox.ac.uk/~vgg/practicals/cnn/ |
Deep learning, CNN applications | http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html, http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Kendall_PoseNet_A_Convolutional_ICCV_2015_paper.pdf |
Recurrent neural nets (RNNs), LSTM, Reinforcement learning | https://deeplearning4j.org/lstm, http://colah.github.io/posts/2015-08-Understanding-LSTMs/, http://karpathy.github.io/2015/05/21/rnn-effectiveness/, http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf, UCL course on reinforcement learning |
Assignments
Problem set 0
- Outline a procedure to measure the height of a lamp post and the angle that it may make with the vertical.
- Consider the problem of automatically driving a car from the IITD main gate to the JNU gate using a single camera. You may assume that the car can be driven with Python commands corresponding to the metaphorical lateral arrow keys for differential steering, and the metaphorical vertical arrow keys for differential positive and negative acceleration. Please decompose the problem into well-identified sub-problems and identify the assumptions and the principles that may be invoked for a solution. Please be as precise as possible.
Please submit on the Moodle page for COL780 (https://moodle.iitd.ac.in).
Deadline: August 3. Cutoff: August 5.
Assignment 0
- Become familiar with OpenCV
- Implement Gaussian mixture model based background subtraction. See here and here.
- Compare your implementation with the one in OpenCV.
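As a starting point for the background subtraction task above, here is a minimal sketch of a per-pixel adaptive Gaussian mixture, in the spirit of the Stauffer and Grimson model. It is an illustration under stated assumptions, not a reference solution: the class name `PixelGMM` and all parameter names and defaults are hypothetical, it handles a single grayscale pixel only, and it uses the learning rate `alpha` directly for the mean and variance updates instead of scaling it by the Gaussian likelihood.

```python
import math


class PixelGMM:
    """Adaptive mixture of 1-D Gaussians for one grayscale pixel (sketch)."""

    def __init__(self, k=3, alpha=0.05, init_var=225.0, min_var=4.0,
                 match_sigmas=2.5, bg_ratio=0.7):
        # All names and default values here are illustrative assumptions.
        self.k = k                        # maximum number of components
        self.alpha = alpha                # learning rate
        self.init_var = init_var          # variance given to a new component
        self.min_var = min_var            # floor so a component never collapses
        self.match_sigmas = match_sigmas  # match threshold in standard deviations
        self.bg_ratio = bg_ratio          # weight mass treated as background
        self.weights, self.means, self.vars = [], [], []

    def _background(self):
        """Indices of the components currently considered background."""
        order = sorted(range(len(self.weights)),
                       key=lambda i: self.weights[i] / math.sqrt(self.vars[i]),
                       reverse=True)
        chosen, total = set(), 0.0
        for i in order:
            chosen.add(i)
            total += self.weights[i]
            if total > self.bg_ratio:
                break
        return chosen

    def update(self, x):
        """Absorb intensity x; return True if x is classified as foreground."""
        # Find the closest component within match_sigmas standard deviations.
        matched, best = None, None
        for i, (m, v) in enumerate(zip(self.means, self.vars)):
            d = abs(x - m)
            if d <= self.match_sigmas * math.sqrt(v) and (best is None or d < best):
                matched, best = i, d

        if matched is None:
            # No match: add a new low-weight component, or replace the weakest.
            if len(self.means) < self.k:
                self.weights.append(self.alpha)
                self.means.append(x)
                self.vars.append(self.init_var)
            else:
                j = min(range(self.k), key=lambda i: self.weights[i])
                self.weights[j] = self.alpha
                self.means[j] = x
                self.vars[j] = self.init_var
        else:
            # Raise the matched weight, decay the others.
            for i in range(len(self.weights)):
                hit = 1.0 if i == matched else 0.0
                self.weights[i] = (1 - self.alpha) * self.weights[i] + self.alpha * hit
            # Move the matched mean and variance towards the observation.
            self.means[matched] += self.alpha * (x - self.means[matched])
            diff = x - self.means[matched]
            new_var = (1 - self.alpha) * self.vars[matched] + self.alpha * diff * diff
            self.vars[matched] = max(self.min_var, new_var)

        # Keep the weights summing to one.
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
        return matched is None or matched not in self._background()
```

A full implementation would keep one such model per pixel (typically vectorised over the image) and call `update` on every frame. For the comparison step, OpenCV provides `cv2.createBackgroundSubtractorMOG2`, which is one possible baseline.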
Resources
- http://www.cse.iitd.ac.in/~suban/vision
- Notes on Linear Algebra and Optimization (available only locally).
Subhashis Banerjee / Dept. of Computer Science and Engineering / IIT Delhi /
Hauz Khas / New Delhi 110016