CS 7476 Advanced Computer Vision
Spring 2016, MWF 12:05 to 12:55, Mason 2117.
Instructor: James Hays
TA: Nam Vo

Course Description
This course covers advanced research topics in computer vision. Building on the introductory material in CS 6476 (Computer Vision), this class will prepare graduate students in both the theoretical foundations of computer vision and the practical approaches to building real computer vision systems. The course investigates current research topics with an emphasis on recognition tasks and deep learning. We will examine data sources, features, and learning algorithms useful for understanding and manipulating visual data. Several topics will straddle the boundary between computer vision and computer graphics. Class topics will be pursued through independent reading, class discussion and presentations, and state-of-the-art projects.
The goal of this course is to give students the background and skills necessary to perform research in computer vision and its application domains such as robotics, healthcare, and graphics. Students should understand the strengths and weaknesses of current approaches to research problems and be able to identify interesting open questions and future research directions. Students will hopefully also improve their critical reading and communication skills.
Course Requirements
Reading and Summaries
Students will be expected to read one paper for each class. For each assigned paper, students must write a two- or three-sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries must be posted to the class blog http://cs7476.blogspot.com/ by 11:59pm the day before each class. Feel free to reply to other comments on the blog and help each other understand confusing aspects of the papers. The blog discussion will be the starting point for the class discussion. If you are presenting, you don't need to post a summary to the blog.
Class participation
All students are expected to take part in class discussions. If you do not fully understand a paper, that is OK. We can work through the unclear aspects of a paper together in class. If you are unable to attend a specific class, please let me know ahead of time (and have a good excuse!).
Presentation(s)
Each student will lead the presentation of one paper during the semester. Ideally, students would implement some aspect of the presented material and perform experiments that help us to understand the algorithms. Presentations and all supplemental material should be ready one week before the presentation date so that students can meet with the instructor, go over the presentation, and possibly iterate before the in-class discussion. For the presentations it is fine to use slides and code from outside sources (for example, the paper authors) but be sure to give credit.
Semester group projects
Students will work in pairs to complete a state-of-the-art research project on a topic relevant to the course. Students will propose a research topic early in the semester. After a project topic is finalized, students will meet occasionally with the instructor to discuss progress. Students will present their progress on their semester project twice during the course, and the course will end with final project presentations. Students will also produce a conference-formatted write-up of their project. Projects will be published on this web page. The ideal project is something with a clear enough direction to be completed in a couple of months, and enough novelty that it could be published in a peer-reviewed venue with some refinement and extension.
Prerequisites
Strong mathematical skills (linear algebra, calculus, probability and statistics) are needed. It is strongly recommended that students have taken one of the following courses (or equivalent courses at other institutions):
- Computer Vision (e.g. 4495 / 6476)
- Computer Graphics
- Computational Photography
We will not rely on a textbook, although the free, online textbook "Computer Vision: Algorithms and Applications" by Richard Szeliski is a helpful resource.
Grading
Your final grade will be made up of:
- 20% Reading summaries posted to class blog
- 15% Classroom participation and attendance
- 15% Research presentation(s)
- 50% Semester project
Final project writeups for 2016
Highlighted project:
Avery Allen and Wenchen Li, Generative Adversarial Denoising Autoencoder for Face Completion. [webpage]
All projects:
Cusuh Ham, Sketch-Based Image Synthesis. [webpage]
John Turner and Siddharth Raja, O'FaMACap dataset (Obama Face&Mouth Image/Audio/Caption) and LSTM-based lipreader. [webpage]
Carl Saldanha, Visual Question Generation. [webpage]
Varun Agrawal and Palash Shastri, Deep Learning on the Yelp Image Dataset. [webpage]
Vasavi Gajarla and Aditi Gupta, Emotion Detection and Sentiment Analysis of Images. [pdf]
Avinash Bhaskaran and Anusha Sridhar Rao, Structure from Motion using Uncalibrated Cameras. [pdf]
Huda Alamri and Julia Deeb, Diving Deeper into IM2GPS. [pdf]
Jonathan Suit, Generating Facial Expressions. [pdf]
Punarva Katte and Prabhudev Prakash, Billboard Content Recognition for Driver Assistance Systems. [pdf]
Sam Seifert, Autocomplete Sketch Tool. [pdf]
Shantanu Deshpande and Naman Goyal, Sketch Based Image Retrieval. [pdf]
Stefano Fenu and Carden Bagwell, Image Colorization using Residual Networks. [pdf]
Tentative Schedule
Date | Paper | Links (paper, project page) | Presenter |
Mon, Jan 11 | Introduction | | James |
Wed, Jan 13 | Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3. | project page | James |
Fri, Jan 15 | Microsoft COCO: Common Objects in Context. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollar, and C. Lawrence Zitnick. ECCV 2014. | project page, paper | James |
Mon, Jan 18 | No Classes | | |
Wed, Jan 20 | Photo Clip Art. Jean-Francois Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007). | project page | James |
Fri, Jan 22 | Snow Day | | |
Mon, Jan 25 | Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009. | project page | James |
Wed, Jan 27 | CVPR 2014 Tutorial on Deep Learning. Graham Taylor, Marc'Aurelio Ranzato, and Honglak Lee. Read only the first two sets of slides, labeled Introduction and Supervised Learning. | CVPR 2014 tutorial | James |
Fri, Jan 29 | ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton. NIPS 2012. | | James |
Mon, Feb 1 | ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009 | pdf, project page | Carl |
Wed, Feb 3 | Sketch2Photo: Internet Image Montage. ACM SIGGRAPH ASIA 2009, ACM Transactions on Graphics. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu. | project page | James |
Fri, Feb 5 | How do humans sketch objects? Mathias Eitz, James Hays, and Marc Alexa. Siggraph 2012. | project page | Sam |
Mon, Feb 8 | Project Status Updates. | | Everyone |
Wed, Feb 10 | Project Status Updates. | | Everyone |
Fri, Feb 12 | What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. Siggraph 2012. | project page | Julia |
Mon, Feb 15 | Going Deeper with Convolutions. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. 2014. | arXiv | James |
Wed, Feb 17 | PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. Siggraph 2009. (rescheduled -- not a deep learning paper) | project page | Anusha |
Fri, Feb 19 | Deep Neural Decision Forests. Peter Kontschieder, Madalina Fiterau, Antonio Criminisi, and Samuel Rota Bulo. ICCV 2015. | Project page | Varun |
Mon, Feb 22 | Learning Deep Features for Scene Recognition using Places Database. B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. NIPS 2014. | project page, pdf, demo | John T |
Wed, Feb 24 | Object Detectors Emerge in Deep Scene CNNs. Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. ICLR, 2015. | project page, arXiv | James |
Fri, Feb 26 | Understanding Deep Image Representations by Inverting Them. Aravindh Mahendran, Andrea Vedaldi. CVPR 2015. | arXiv | James |
Mon, Feb 29 | Selective Search for Object Recognition. J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders. IJCV 2013. | project page | Avinash |
Wed, Mar 2 | DeepBox: Learning Objectness with Convolutional Networks. Weicheng Kuo, Bharath Hariharan, Jitendra Malik. ICCV 2015. | arXiv | Punarva |
Fri, Mar 4 | Diagnosing error in object detectors. Derek Hoiem, Yodsawalai Chodpathumwan, and Qieyun Dai. ECCV 2012. | project page | Aditi |
Mon, Mar 7 | Fast R-CNN. Ross Girshick. ICCV 2015. | arXiv, code | Wenchen |
Wed, Mar 9 | Fully Convolutional Networks for Semantic Segmentation. Jonathan Long, Evan Shelhamer, Trevor Darrell. CVPR 2015. | arXiv | Siddharth |
Fri, Mar 11 | Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. MS COCO detection challenge winner 2015. | arXiv | Prateek |
Mon, Mar 14 | Unsupervised Visual Representation Learning by Context Prediction. Carl Doersch, Abhinav Gupta, Alexei A. Efros. ICCV 2015. | project page | Stefano |
Wed, Mar 16 | Learning Visual Similarity for Product Design with Convolutional Neural Networks. Sean Bell, Kavita Bala. Siggraph 2015. | author page, pdf | Palash |
Fri, Mar 18 | AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections. Jun-Yan Zhu, Yong Jae Lee, Alexei Efros. Siggraph 2014. | project page | Avery |
Mon, Mar 21 | No Classes | | |
Wed, Mar 23 | No Classes | | |
Fri, Mar 25 | No Classes | | |
Mon, Mar 28 | Project Status Updates. | | Everyone |
Wed, Mar 30 | Project Status Updates. | | Everyone |
Fri, Apr 1 | Exploring Nearest Neighbor Approaches for Image Captioning. Jacob Devlin, Saurabh Gupta, Ross Girshick, Margaret Mitchell, C Lawrence Zitnick. arXiv, 2015. | arXiv | Naman |
Mon, Apr 4 | VQA: Visual Question Answering. S. Antol*, A. Agrawal*, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. ICCV, 2015. | project page, arXiv | Vasavi |
Wed, Apr 6 | Visual Madlibs: Fill in the blank Description Generation and Question Answering. Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg. ICCV, 2015. | project page, pdf | Huda |
Fri, Apr 8 | Dense Semantic Correspondence Where Every Pixel is a Classifier. Hilton Bristow, Jack Valmadre, Simon Lucey. ICCV 2015. | arXiv | Carden |
Mon, Apr 11 | Learning to Generate Chairs, Tables and Cars with Convolutional Networks. Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox. CVPR 2015. | arXiv | Shantanu |
Wed, Apr 13 | A Neural Algorithm of Artistic Style. Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. 2015. | implementation, arXiv | Cusuh |
Fri, Apr 15 | Class cancelled - work on projects! | | |
Mon, Apr 18 | Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Alec Radford, Luke Metz, Soumith Chintala. 2015. | project page, arXiv | Jonathan S |
Wed, Apr 20 | LSDA: Large Scale Detection Through Adaptation. Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko. 2014. | arXiv | Varun |
Fri, Apr 22 | The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. Patsorn Sangkloy, Nathan Burnell, Cusuh Ham, James Hays. Siggraph 2016 | paper | |
Fri, Apr 29th (8:00 to 10:50 Exam slot) | Final Project Presentations | | Everyone |
