CS 4476-B / 6476-A Computer Vision

Spring 2022, TR 3:30 to 4:45, Clough 152. Synchronous lectures on BlueJeans.
Instructor: James Hays

TAs: Cusuh Ham (head TA), Otis Smith, Pranav Khorana, Sukriti Bhardwaj, Xueqing Li, Yash Kothari, Yoonwoo Kim, Wei Xiong Toh

[Header image: Computer Vision, art by kirkh.deviantart.com]

Course Description

This course provides an introduction to computer vision, including the fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification, and scene understanding. We'll explore methods for depth recovery from stereo images, camera calibration, automated alignment, tracking, boundary detection, and recognition, using both classical machine learning and deep learning. The focus of the course is to develop the intuition and mathematics behind these methods in lecture, and then to learn about the difference between theory and practice through the projects.

Learning Objectives

Upon completion of this course, students should be able to:

Prerequisites

No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful. The following skills are necessary for this class:

Compute Requirements

The institute has minimum compute requirements that all students are expected to meet. The projects are somewhat compute intensive, so a faster machine will let you iterate more quickly. Note, however, that the institute requirements say nothing about GPUs. The deep learning projects can benefit from (but will not require) a GPU. Since we cannot rely on students having GPUs, projects may use cloud services such as Google Colab.
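
If you want to check whether your machine (or a Colab session) exposes a usable GPU, here is a minimal sketch, assuming the deep learning projects use PyTorch (confirm the actual framework in each project handout):

    # Minimal sketch, assuming PyTorch; see each project handout for the real setup.
    import torch

    # torch.cuda.is_available() reports whether a CUDA GPU is visible.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Deep learning code will run on: {device}")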

Grading

Your final grade will be made up from:
Final letter grades follow the standard scale: A for 90.00% and above, B for 80.00% and above, C for 70.00% and above, and so on.

Late projects lose 10% per day. However, you have six "late days" to spend across the whole course: the first 24 hours after the due date and time count as one late day, up to 48 hours counts as two, and up to 72 hours counts as the third, and so on. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in at the end of the semester so that you get the most points possible. A late day cannot be split among projects (e.g., half a late day for project 1 and half a late day for project 2); late days are spent in integer amounts.
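
For concreteness, here is a minimal sketch of the counting rule (the late_days_used helper is hypothetical, for illustration only, and is not part of any grading code):

    import math

    def late_days_used(hours_late: float) -> int:
        # Each started 24-hour period after the deadline consumes one late day.
        return math.ceil(hours_late / 24) if hours_late > 0 else 0

    assert late_days_used(0) == 0    # on time: no late days spent
    assert late_days_used(0.5) == 1  # even an hour late spends a full late day
    assert late_days_used(30) == 2   # between 24 and 48 hours late: two days
    assert late_days_used(72) == 3   # exactly 72 hours late: three days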

If you are taking this course, the expectation is that you have set aside the considerable amount of time needed to get your projects done in a timely manner. These late days are intended to cover unexpected clustering of due dates, travel commitments, interviews, hackathons, computer problems, extracurricular commitments, etc. Don't ask for extensions to due dates because we are already giving you a pool of late days to manage yourself. In fact, we're doubling the pool of late days this semester because of the difficult circumstances. If you are seriously ill and need more time on projects beyond what late days can cover, you should submit documentation to the Dean of Students office and they will reach out to us.

Academic Integrity

Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, code copying, or helping others commit a violation of the Honor Code. Plagiarism includes reproducing the words of others without both quotation marks and a citation. Students are reminded of the obligations and expectations associated with the Georgia Tech Academic Honor Code. We will use automated tools to detect code sharing in projects.

You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. That's fine. Feel free to include results built on other software, as long as you are clear in your hand-in that it is not your own work.

You should not view, edit, or copy anyone else's code. You should not post code to Canvas, except for starter code or helper code that isn't related to the core project.

More detailed guidance about what is and is not allowed in terms of collaboration with other students:
Allowed

Not allowed

Getting Help

The projects for this course are substantial, and it will be common for students to need clarifications or guidance. Some of the explanation for each project will come from the lecture materials. Some guidance comes from the project handout itself. If you are stumped or blocked, you can seek help from other students (within the parameters outlined above), ask for help on Piazza (but don't post any code unless your question is private), or attend TA office hours. Do not wait until the last few days of a project to post Piazza questions or attend office hours, because the TAs might be swamped.

The TAs' job is not to debug your code. Do not dump entire source code files onto Piazza asking TAs to fix a bug. Do not hand your laptop to a TA at office hours and ask them to fix your code. The TAs' primary goal is to make sure you understand the computer vision concepts necessary to complete the projects. Sometimes there will need to be discussion or troubleshooting at the code level, but don't be surprised if TAs try to minimize such interactions. If you want help from TAs, you should be ready to present the debugging that you've already done, e.g., "On this type of input, here's a visualization of the output that I'm getting."
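
As a hypothetical example of that kind of preparation, a few lines of matplotlib can capture an intermediate result next to its input (the image and output below are stand-ins for your real data):

    import matplotlib.pyplot as plt
    import numpy as np

    # Stand-ins: replace with your real input and your function's output.
    image = np.random.rand(128, 128)
    my_output = image

    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    axes[0].imshow(image, cmap="gray")
    axes[0].set_title("input")
    axes[1].imshow(my_output, cmap="gray")
    axes[1].set_title("my output")
    plt.savefig("debug_visualization.png")  # attach this to your (private) post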

Over the years, we have refined the projects and added unit tests and autograders so that students can more easily debug their own code. Unfortunately, this sometimes leads to lazy development patterns. Students will express frustration that "my code passes the unit tests but the autograder gives errors" and expect the TAs to fix the situation. Keep in mind that the unit tests are necessary but not sufficient checks of code correctness. The unit tests help catch some common errors, but code that is implemented incorrectly can still pass the unit tests and fail the autograder. Even the autograder cannot catch all problems for all projects, because it can be difficult to check the correctness of certain computer vision algorithms. Your code might pass the unit tests, get full marks from the autograder, and still have subtle errors that manifest elsewhere in the project. You will have more success in this class if you make an effort to understand the computer vision concepts being taught rather than simply optimizing for the autograder.
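
As a hypothetical illustration (not taken from any actual project), here is code that passes a plausible unit test while still being wrong: a 3x3 box filter that never touches the image border looks correct on a constant image, but an autograder testing a non-constant image with a specific border policy would reject it.

    import numpy as np

    def box_filter_3x3(img: np.ndarray) -> np.ndarray:
        # Hypothetical buggy implementation: border pixels are never filtered.
        out = img.astype(float)
        for i in range(1, img.shape[0] - 1):
            for j in range(1, img.shape[1] - 1):
                out[i, j] = img[i - 1:i + 2, j - 1:j + 2].mean()
        return out

    # This test passes: averaging a constant image changes nothing, so the
    # untouched border happens to look correct.
    flat = np.ones((5, 5))
    assert np.allclose(box_filter_3x3(flat), flat)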

Learning Accommodations

If needed, we will make classroom accommodations for students with documented disabilities. These accommodations must be arranged in advance and in accordance with the Office of Disability Services (disabilityservices.gatech.edu).

Contact Info and Office Hours:

If possible, please use Piazza to ask questions and seek clarifications before emailing the instructor or staff.

Office Hours

Projects

Optional Project 0: Test environment setup. GitHub
Project 1: Convolution and Hybrid Images. GitHub
Project 2: SIFT Local Feature Matching. GitHub
Project 3: Camera Calibration and Fundamental Matrix Estimation with RANSAC. GitHub
Project 4: Recognition with Deep Learning. GitHub
Project 5: Semantic Segmentation. GitHub
All starter code and projects will be in Python, using various third-party libraries. We will make an effort to support MacOS, Windows, and Linux. The course does not teach Python and assumes you have enough familiarity with procedural programming languages to complete the projects.
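
If you want a quick sanity check that your environment is ready, here is a minimal sketch, assuming a typical dependency set of numpy, matplotlib, and torch (the authoritative list is in each project's own setup instructions):

    import importlib

    # Assumed, typical dependencies; see each project's setup for the real list.
    for pkg in ("numpy", "matplotlib", "torch"):
        try:
            mod = importlib.import_module(pkg)
            print(f"{pkg} {getattr(mod, '__version__', '?')} OK")
        except ImportError:
            print(f"{pkg} missing; install it per the project setup instructions")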

Textbook

Readings will be assigned from "Computer Vision: Algorithms and Applications, 2nd edition" by Richard Szeliski. The book is available for free online or for purchase.

Syllabus (tentative)

Class Date | Topic | Slides | Reading
Tue, Jan 11 | Introduction to computer vision | pdf, pptx | Szeliski 1
Image Formation and Filtering (Szeliski chapters 2 and 3)
Thu, Jan 13 | Camera Projection and Image Filtering | pdf, pptx | Szeliski 2.1, especially 2.1.4
Tue, Jan 18 | Thinking in Frequency | pdf, pptx | Szeliski 3.2 and 3.4
Thu, Jan 20 | Thinking in Frequency, continued | pdf, pptx
Tue, Jan 25 | Sampling and Aliasing. Light, cameras, eyes, and color | pdf, pptx | Szeliski 2.2 and 2.3
Feature Detection and Matching
Thu, Jan 27 | Interest points and corners | pdf, pptx | Szeliski 7.1.1 and 7.1.2
Tue, Feb 1 | Local image features | pdf, pptx | Szeliski 7.1.3
Thu, Feb 3 | Model fitting, Hough Transform | pdf, pptx | Szeliski 7.4.2 and 2.1
Tue, Feb 8 | RANSAC and transformations | pdf, pptx | Szeliski 8.1 and 2.1
Multiple Views and Motion
Thu, Feb 10 | Stereo intro | pdf, pptx | Szeliski 12 and 11.2.1
Tue, Feb 15 | Remote guest lecture: John Lambert, GTSFM | pdf
Thu, Feb 17 | Camera Calibration, Epipolar Geometry | pdf, pptx | Szeliski 11
Tue, Feb 22 | Dense Stereo Correspondence | pdf, pptx | Szeliski 12
Thu, Feb 24 | Optical Flow | pdf, pptx | Szeliski 9.4
Recognition
Tue, Mar 1 | Machine learning crash course | pdf, pptx | Szeliski 5.1 and 5.2
Thu, Mar 3 | Machine learning crash course, part 2 | pdf, pptx | Szeliski 5.1, 5.2, and 5.3
Tue, Mar 8 | Neural Networks | pdf, pptx | Szeliski 5.3 and 5.4
Thu, Mar 10 | Convolutional Networks and Network Visualization | pdf, pptx | Szeliski 5.3 and 5.4
Tue, Mar 15 | Classical recognition techniques and Deeper Deep Architectures | pdf, pptx | Szeliski 6.2.1
Thu, Mar 17 | ResNet, Big Data | pdf, pptx
Tue, Mar 22 | No lecture, Institute holiday
Thu, Mar 24 | No lecture, Institute holiday
Tue, Mar 29 | Guest lecture: Sean Foley | pdf
Thu, Mar 31 | Guest lecture: Amit Raj and Cusuh Ham
Tue, Apr 5 | Big Data, Crowdsourcing | pdf, pptx | Szeliski 6.3
Thu, Apr 7 | "Unsupervised" Learning and Colorization | pdf, pptx
Tue, Apr 12 | Semantic Segmentation | pdf, pptx | Szeliski 5.5
Thu, Apr 14 | Deep Object Detection and Structured Output from Deep Networks | pdf, pptx | Szeliski 5.4.7
Tue, Apr 19 | 3D Point Processing and Lidar | pdf, pptx
Thu, Apr 21 | Transformer architectures | pdf, pptx
Final Exam Period | Not used. No class or final exam

Acknowledgements

The materials from this class rely significantly on slides prepared by other instructors, especially Derek Hoiem and Svetlana Lazebnik. Each slide set and assignment contains acknowledgements. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgements.