CS 6476 Computer Vision

Fall 2018, MW 4:30 to 5:45, Clough 152
Instructor: James Hays

TAs: Cusuh Ham (head TA), Min-Hung (Steve) Chen, Sean Foley, Jianan Gao, John Lambert, Amit Raj, Sainandan Ramakrishnan, Dilara Soylu, Vijay Upadhya

Computer Vision, art by kirkh.deviantart.com

Course Description

This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.

The Advanced Computer Vision course (CS 7476), normally offered in spring (not offered in 2019), will build on this course and deal with advanced and research-related topics in Computer Vision, including Machine Learning, Graphics, and Robotics topics that impact Computer Vision.

Learning Objectives

Upon completion of this course, students should be able to:


No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:


Your final grade will be made up from:

You will lose 10% each day for late projects. However, you have three "late days" for the whole course. The first 24 hours after the due date and time count as one late day, up to 48 hours as two, and up to 72 hours as three. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in and distributed at the end of the semester so that you get the most points possible.

These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself.
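As an illustration of the late policy arithmetic, here is a minimal Python sketch, not the staff's actual grading script. The function names are hypothetical. It assumes that lateness is rounded up to whole days, that each penalized day costs a flat 10% of a project's score, that all projects are weighted equally (so any assignment of free late days gives the same total), and that the penalty is capped at 100%.

```python
import math

FREE_LATE_DAYS = 3       # pool of late days for the whole course (from the policy above)
PENALTY_PER_DAY = 0.10   # 10% lost per late day (from the policy above)


def days_late(hours_late):
    """Round lateness up to whole days: (0, 24] -> 1, (24, 48] -> 2, and so on."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_days(hours_late_per_project, free_days=FREE_LATE_DAYS):
    """Return the penalty fraction for each project after spending free late days.

    Assumption: free days are spent in submission order. With equal project
    weights and a flat per-day penalty, the order does not change the total.
    """
    penalties = []
    remaining = free_days
    for hours in hours_late_per_project:
        late = days_late(hours)
        covered = min(late, remaining)   # free late days used on this project
        remaining -= covered
        # Assumption: the penalty cannot exceed the full project score.
        penalties.append(min(1.0, (late - covered) * PENALTY_PER_DAY))
    return penalties
```

For example, under these assumptions `apply_late_days([0, 30, 50, 0])` spends two free days on the second project and one on the third, so only the third project is penalized, by 20%.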

Academic Integrity

Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. For quizzes, no supporting materials are allowed (notes, calculators, phones, etc.).

You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. That's fine. Feel free to include results built on other software, as long as you are clear in your hand-in that it is not your own work.

You should not view or edit anyone else's code. You should not post code to Piazza, except for starter code / helper code that isn't related to the core project.

Learning Accommodations

If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the ADAPTS office (www.adapts.gatech.edu).

Contact Info and Office Hours:

If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.

Office Hours


Highlighted projects

All Results

Image Filtering and Hybrid images
Local Feature Matching
Camera Calibration and Fundamental Matrix Estimation with RANSAC
Scene Recognition with Bag of Words
Face Detection with a Sliding Window
Deep Learning

New for this course, all starter code and projects will be in Python and will use various third-party libraries. We will make an effort to support macOS, Windows, and Linux. The course does not teach Python and assumes you have enough familiarity with procedural programming languages to complete the projects.


Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or for purchase.


| Class Date | Topic | Slides | Reading | Projects |
| Mon, Aug 20 | Introduction to computer vision | pdf, pptx | Szeliski 1 | |

Image Formation and Filtering (Szeliski chapters 2 and 3)
| Wed, Aug 22 | Cameras and Optics | pdf, pptx | Szeliski 2.1, especially 2.1.5 | Project 1 out |
| Mon, Aug 27 | Light and Color and Image Filtering | pdf, pptx | Szeliski 2.2 and 2.3 | |
| Wed, Aug 29 | Thinking in Frequency | pdf, pptx | Szeliski 3.2 and 3.5.2 and 8.1.1 and 4.2 | |
| Mon, Sept 3 | No classes, Institute holiday | | | Project 1 due |

Feature Detection and Matching
| Wed, Sept 5 | Interest points and corners | pdf, pptx | Szeliski 4.1.1 and 4.1.2 | Project 2 out |
| Mon, Sept 10 | Guest Lecture: Frank Dellaert | pdf | | |
| Wed, Sept 12 | Local image features | pdf, pptx | Szeliski 4.1.3 and 4.3.2 | |
| Mon, Sept 17 | Model fitting, Hough Transform | pdf, pptx | Szeliski 6.1 and 2.1 | |
| Wed, Sept 19 | RANSAC and transformations | pdf, pptx | | |

Multiple Views and Motion
| Mon, Sept 24 | Stereo intro and Camera Calibration | pdf, pptx | Szeliski 11 and 6.2.1 | Project 2 due |
| Wed, Sept 26 | Epipolar Geometry and Structure from Motion | pdf, pptx | Szeliski 7 | Project 3 out |
| Mon, Oct 1 | Stereo Correspondence and Optical Flow | pdf, pptx | Szeliski 11 and 8.4 | |
| Wed, Oct 3 | Quiz 1 | | | |
| Mon, Oct 8 | No classes, Institute holiday | | | |
| Wed, Oct 10 | Machine learning crash course and recognition overview | pdf, pptx | Szeliski 5.3 and 14 | |
| Mon, Oct 15 | Recognition and Bag of Words | pdf, pptx | Szeliski 14.3.2 | Project 4 out |
| Wed, Oct 17 | Large-scale retrieval: Spatial Verification, TF-IDF, Query Expansion, feature encoding | pdf, pptx | | |
| Mon, Oct 22 | Large-scale category recognition and advanced feature encoding | pdf, pptx | | |
| Wed, Oct 24 | Detection with sliding windows: Viola Jones | pdf, pptx | Szeliski 14.1 | |
| Mon, Oct 29 | Detection with sliding windows: Dalal Triggs and Pascal VOC | pdf, pptx | | |
| Wed, Oct 31 | No classes | | | |
| Mon, Nov 5 | Big Data | pdf, pptx | Szeliski 14.5 and 4.2 | |
| Wed, Nov 7 | Crowdsourcing and Human Computation | pdf, pptx | | |

Deep Learning
| Mon, Nov 12 | Neural Network Basics and Convolutional Networks | pdf, pptx | | |
| Wed, Nov 14 | Object Detectors Emerge in Deep Scene CNNs and Deeper Deep Architectures | pdf, pptx | | |
| Mon, Nov 19 | Structured Output from Deep Networks | pdf, pptx | | |
| Wed, Nov 21 | No classes, Institute holiday | | | |
| Mon, Nov 26 | "Unsupervised" Learning and Colorization | pdf, pptx | | |
| Wed, Nov 28 | Quiz 2 | | | |
| Mon, Dec 3 | (optional) Ongoing research presentations from PhD students | | | |
| Wed, Dec 5 | No classes, reading period | | | |

Final Exam Period - not used


The materials from this class rely significantly on slides prepared by other instructors, especially Derek Hoiem and Svetlana Lazebnik. Each slide set and assignment contains acknowledgements. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgements.