CS 4476-B Computer Vision

Fall 2021, MW 12:30 to 1:45, CCB 16. Synchronous remote lecture on Bluejeans
Instructor: James Hays

TAs: Ben Wilson (head TA), Bharat Mamidibathula, Gunhyun Park, Jonathan Leo, Otis Smith, Pranav Khorana, Sukriti Bhardwaj, Tony Zhang, Xueqing Li, Yash Kothari, Yoonwoo Kim

Computer Vision, art by kirkh.deviantart.com

Course Description

This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll explore methods for depth recovery from stereo images, camera calibration, automated alignment, tracking, boundary detection, and recognition. We'll use both classical machine learning and deep learning to approach these problems. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.

Learning Objectives

Upon completion of this course, students should be able to:

Prerequisites

No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:

Grading

Your final grade will be made up from

Late projects lose 10% per day. However, you have six "late days" for the whole course. The first 24 hours after the due date and time count as one late day, up to 48 hours as two, and up to 72 hours as three. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in and distributed at the end of the semester so that you get the most points possible.

These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself. In fact, we're doubling the pool of late days this semester because of the difficult circumstances.
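
As an illustration only, the late-day arithmetic above can be sketched in Python. The function names, the assumption that every project carries equal weight, and the absence of any cap on the per-project penalty are ours, not part of the official policy; the course staff's actual bookkeeping may differ.

```python
import math
from typing import Iterable

FREE_LATE_DAYS = 6      # pool for the whole course, as stated in the policy
PENALTY_PER_DAY = 0.10  # 10% of a project's grade per penalized day


def late_days_used(hours_late: float) -> int:
    """Round time past the deadline up to whole late days.

    Anything up to 24 hours is 1 day, up to 48 hours is 2, and so on.
    """
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def penalized_days(hours_late_per_project: Iterable[float]) -> int:
    """Late days left over after the free pool is spent.

    Assumption (hypothetical): all projects are weighted equally, so it
    does not matter which project the free days are credited to.
    """
    total = sum(late_days_used(h) for h in hours_late_per_project)
    return max(0, total - FREE_LATE_DAYS)


if __name__ == "__main__":
    hours = [30, 0, 50, 0, 70]  # hours late on five projects
    days = penalized_days(hours)
    print(days, "penalized day(s),", f"{days * PENALTY_PER_DAY:.0%} total deduction")
```

In this example, submissions that are 30, 50, and 70 hours late use 2 + 3 + 3 = 8 late days, so 2 days remain penalized after the six free days are applied.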

Academic Integrity

Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code and Student Code of Conduct, available online at www.honor.gatech.edu. We will use tools to find code sharing in projects.

You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. That's fine. Feel free to include results built on other software, as long as you are clear in your handin that it is not your own work.

You should not view or edit anyone else's code. You should not post code to Canvas, except for starter code / helper code that isn't related to the core project.

Learning Accommodations

If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the Office of Disability Services (disabilityservices.gatech.edu).

Contact Info and Office Hours:

If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff. Office Hours

Projects

Optional Project 0: Test environment setup. Github
Project 1: Convolution and Hybrid images. Github, Handout pdf
Project 2: SIFT Local Feature Matching. Github, Handout pdf
Project 3: Camera Calibration and Fundamental Matrix Estimation with RANSAC. Github, Handout pdf
Project 4: Recognition with deep learning. Github, Handout pdf
Project 5: Semantic Segmentation. Github, Handout pdf

All starter code and projects will be in Python with the use of various third-party libraries. We will make an effort to support macOS, Windows, and Linux. The course does not teach Python and assumes you have enough familiarity with procedural programming languages to complete the projects.

Textbook

Readings will be assigned in "Computer Vision: Algorithms and Applications, 2nd edition" by Richard Szeliski. The book is available for free online or for purchase.

Syllabus

Class Date Topic Slides Reading Projects
Mon, Aug 23 Introduction to computer vision pdf, pptx Szeliski 1
Image Formation and Filtering (Szeliski chapters 2 and 3)
Wed, Aug 25 Camera Projection and Image Filtering pdf, pptx Szeliski 2.1, especially 2.1.4
Mon, Aug 30 Thinking in Frequency pdf, pptx Szeliski 3.2 and 3.4
Wed, Sep 1 Thinking in Frequency, continued
Mon, Sep 6 No classes, Institute holiday
Wed, Sep 8 Light and Color and Sensors pdf, pptx Szeliski 2.2 and 2.3
Feature Detection and Matching
Mon, Sep 13 Interest points and corners pdf, pptx Szeliski 7.1.1 and 7.1.2
Wed, Sep 15 Local image features pdf, pptx Szeliski 7.1.3
Mon, Sep 20 Model fitting, Hough Transform pdf, pptx Szeliski 7.4.2 and 2.1
Wed, Sep 22 RANSAC and transformations pdf, pptx Szeliski 8.1 and 2.1
Multiple Views and Motion
Mon, Sep 27 Stereo intro pdf, pptx Szeliski 12 and 11.2.1
Wed, Sep 29 Camera Calibration, Epipolar Geometry pdf, pptx Szeliski 11
Mon, Oct 4 Dense Stereo Correspondence pdf, pptx Szeliski 12
Wed, Oct 6 Optical Flow pdf, pptx Szeliski 9.4
Recognition
Mon, Oct 11 No classes, Institute holiday
Wed, Oct 13 Machine learning crash course pdf, pptx Szeliski 5.1 and 5.2
Mon, Oct 18 Machine learning crash course, part 2 and neural network intro pdf, pptx Szeliski 5.1 and 5.2 and 5.3
Wed, Oct 20 Convolutional Neural Networks and Network Visualization pdf, pptx Szeliski 5.3 and 5.4
Mon, Oct 25 Classical recognition techniques and Deeper Deep Architectures pdf, pptx Szeliski 6.2.1
Wed, Oct 27 ResNet, Big Data pdf, pptx
Mon, Nov 1 Big Data, Crowdsourcing pdf, pptx Szeliski 6.3
Wed, Nov 3 "Unsupervised" Learning and Colorization pdf, pptx
Mon, Nov 8 Semantic Segmentation pdf, pptx Szeliski 5.5
Wed, Nov 10 Deep Object Detection and Structured Output from Deep Networks pdf, pptx Szeliski 5.4.7
Mon, Nov 15 3D Point Processing and Lidar pdf, pptx
Wed, Nov 17 Transformer architectures pdf, pptx
Mon, Nov 22 No lecture, work on Project 5
Wed, Nov 24 No lecture, Institute holiday
Mon, Nov 29 Guest lectures: John Lambert (multiview geometry) and Amit Raj (neural rendering) John Lambert pdf
Wed, Dec 1 Guest lecture: Sean Foley (remote sensing)
Mon, Dec 6 No lecture
Wed, Dec 8 No lecture
Final Exam Period Not used. No class or final exam

Acknowledgements

The materials from this class rely significantly on slides prepared by other instructors, especially Derek Hoiem and Svetlana Lazebnik. Each slide set and assignment contains acknowledgements. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgements.