CS 7476 Advanced Computer Vision

Spring 2016, MWF 12:05 to 12:55, Mason 2117.
Instructor: James Hays
TA: Nam Vo

Course Description

This course covers advanced research topics in computer vision. Building on the introductory materials in CS 6476 (Computer Vision), this class will prepare graduate students in both the theoretical foundations of computer vision and the practical approaches to building real computer vision systems. This course investigates current research topics in computer vision with an emphasis on recognition tasks and deep learning. We will examine data sources, features, and learning algorithms useful for understanding and manipulating visual data. Several topics will straddle the boundary between computer vision and computer graphics. Class topics will be pursued through independent reading, class discussion and presentations, and state-of-the-art projects.

The goal of this course is to give students the background and skills necessary to perform research in computer vision and its application domains such as robotics, healthcare, and graphics. Students should understand the strengths and weaknesses of current approaches to research problems and identify interesting open questions and future research directions. Students will hopefully improve their critical reading and communication skills, as well.

Course Requirements

Reading and Summaries

Students will be expected to read one paper for each class. For each assigned paper, students must write a two or three sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries must be posted to the class blog http://cs7476.blogspot.com/ by 11:59pm the day before each class. Feel free to reply to other comments on the blog and help each other understand confusing aspects of the papers. The blog discussion will be the starting point for the class discussion. If you are presenting, you don't need to post a summary to the blog.

Class participation

All students are expected to take part in class discussions. If you do not fully understand a paper, that is OK. We can work through the unclear aspects of a paper together in class. If you are unable to attend a specific class, please let me know ahead of time (and have a good excuse!).

Presentation(s)

Each student will lead the presentation of one paper during the semester. Ideally, students would implement some aspect of the presented material and perform experiments that help us to understand the algorithms. Presentations and all supplemental material should be ready one week before the presentation date so that students can meet with the instructor, go over the presentation, and possibly iterate before the in-class discussion. For the presentations it is fine to use slides and code from outside sources (for example, the paper authors) but be sure to give credit.

Semester group projects

Students will work in pairs to complete a state-of-the-art research project on a topic relevant to the course. Students will propose a research topic early in the semester. After a project topic is finalized, students will meet occasionally with the instructor to discuss progress. Students will present their progress on their semester project twice during the course and the course will end with final project presentations. Students will also produce a conference-formatted write-up of their project. Projects will be published on this web page. The ideal project is something with a clear enough direction to be completed in a couple of months, and enough novelty such that it could be published in a peer-reviewed venue with some refinement and extension.

Prerequisites

Strong mathematical skills (linear algebra, calculus, probability and statistics) are needed. It is strongly recommended that students have taken one of the following courses (or equivalent courses at other institutions):

Computer Vision (e.g. 4495 / 6476)
Computer Graphics
Computational Photography

If you aren't sure whether you have the background needed for the course, you can try reading some of the papers below or you can simply come to class during the first weeks.

Textbook

We will not rely on a textbook, although the free, online textbook "Computer Vision: Algorithms and Applications" by Richard Szeliski is a helpful resource.

Grading

Your final grade will be made up from

20% Reading summaries posted to class blog
15% Classroom participation and attendance
15% Research presentation(s)
50% Semester project

Office Hours:

James Hays, Monday and Wednesday 1-2pm, CCB 315
Nam Vo, Friday 2-4pm, CCB 308L

Final project writeups for 2016

Highlighted project:
Avery Allen and Wenchen Li, Generative Adversarial Denoising Autoencoder for Face Completion. [webpage]

All projects:
Cusuh Ham, Sketch-Based Image Synthesis. [webpage]
John Turner and Siddharth Raja, O'FaMACap dataset (Obama Face&Mouth Image/Audio/Caption) and LSTM-based lipreader. [webpage]
Carl Saldanha, Visual Question Generation. [webpage]
Varun Agrawal and Palash Shastri, Deep Learning on the Yelp Image Dataset. [webpage]
Vasavi Gajarla and Aditi Gupta, Emotion Detection and Sentiment Analysis of Images. [pdf]
Avinash Bhaskaran and Anusha Sridhar Rao, Structure from Motion using Uncalibrated Cameras. [pdf]
Huda Alamri and Julia Deeb, Diving Deeper into IM2GPS. [pdf]
Jonathan Suit, Generating Facial Expressions. [pdf]
Punarva Katte and Prabhudev Prakash, Billboard Content Recognition for Driver Assistance Systems. [pdf]
Sam Seifert, Autocomplete Sketch Tool. [pdf]
Shantanu Deshpande and Naman Goyal, Sketch Based Image Retrieval. [pdf]
Stefano Fenu and Carden Bagwell, Image Colorization using Residual Networks. [pdf]

Tentative Schedule

Date	Paper	Links (paper, project page)	Presenter
Mon, Jan 11	Introduction		James
Wed, Jan 13	Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3.	project page	James
Fri, Jan 15	Microsoft COCO: Common Objects in Context. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollar, and C. Lawrence Zitnick. ECCV 2014.	project page, paper	James
Mon, Jan 18	No Classes
Wed, Jan 20	Photo Clip Art. Jean-Francois Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007).	project page	James
Fri, Jan 22	Snow Day
Mon, Jan 25	Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009.	project page	James
Wed, Jan 27	CVPR 2014 Tutorial on Deep Learning. Graham Taylor, Marc'Aurelio Ranzato, and Honglak Lee. Read only the first two sets of slides, labeled Introduction and Supervised learning.	CVPR 2014 tutorial	James
Fri, Jan 29	ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton. NIPS 2012.	pdf	James
Mon, Feb 1	ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009	pdf, project page	Carl
Wed, Feb 3	Sketch2Photo: Internet Image Montage. ACM SIGGRAPH ASIA 2009, ACM Transactions on Graphics. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu.	project page	James
Fri, Feb 5	How do humans sketch objects? Mathias Eitz, James Hays, and Marc Alexa. Siggraph 2012.	project page	Sam
Mon, Feb 8	Project Status Updates.		Everyone
Wed, Feb 10	Project Status Updates.		Everyone
Fri, Feb 12	What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. Siggraph 2012.	project page	Julia
Learned Representations, ConvNets, Visualizations
Mon, Feb 15	Going Deeper with Convolutions. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. 2014.	arXiv	James
Wed, Feb 17	PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. Siggraph 2009. (rescheduled -- not a deep learning paper)	project page	Anusha
Fri, Feb 19	Deep Neural Decision Forests. Peter Kontschieder, Madalina Fiterau, Antonio Criminisi, and Samuel Rota Bulo. ICCV 2015.	Project page	Varun
Mon, Feb 22	Learning Deep Features for Scene Recognition using Places Database. B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. NIPS 2014.	project page, pdf, demo	John T
Wed, Feb 24	Object Detectors Emerge in Deep Scene CNNs. Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba. ICLR, 2015.	project page, arXiv	James
Fri, Feb 26	Understanding Deep Image Representations by Inverting Them. Aravindh Mahendran, Andrea Vedaldi. CVPR 2015.	arXiv	James
Object Proposals
Mon, Feb 29	Selective Search for Object Recognition. J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders. IJCV 2013.	project page	Avinash
Wed, Mar 2	DeepBox: Learning Objectness with Convolutional Networks. Weicheng Kuo, Bharath Hariharan, Jitendra Malik. ICCV 2015.	arXiv	Punarva
ConvNet detection and segmentation
Fri, Mar 4	Diagnosing error in object detectors. Derek Hoiem, Yodsawalai Chodpathumwan, and Qieyun Dai. ECCV 2012.	project page	Aditi
Mon, Mar 7	Fast R-CNN. Ross Girshick. ICCV 2015.	arXiv, code	Wenchen
Wed, Mar 9	Fully Convolutional Networks for Semantic Segmentation. Jonathan Long, Evan Shelhamer, Trevor Darrell. CVPR 2015.	arXiv	Siddharth
Fri, Mar 11	Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. MS COCO detection challenge winner 2015.	arXiv	Prateek
Weakly Supervised ConvNets
Mon, Mar 14	Unsupervised Visual Representation Learning by Context Prediction. Carl Doersch, Abhinav Gupta, Alexei A. Efros. ICCV 2015.	project page	Stefano
Siamese / Ranking / Triplet ConvNets
Wed, Mar 16	Learning Visual Similarity for Product Design with Convolutional Neural Networks. Sean Bell, Kavita Bala. Siggraph 2015.	author page, pdf	Palash
Data-driven Image Synthesis
Fri, Mar 18	AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections. Jun-Yan Zhu, Yong Jae Lee, Alexei Efros. Siggraph 2014.	project page	Avery
Mon, Mar 21	No Classes
Wed, Mar 23	No Classes
Fri, Mar 25	No Classes
Mon, Mar 28	Project Status Updates.		Everyone
Wed, Mar 30	Project Status Updates.		Everyone
Images and Words
Fri, Apr 1	Exploring Nearest Neighbor Approaches for Image Captioning. Jacob Devlin, Saurabh Gupta, Ross Girshick, Margaret Mitchell, C Lawrence Zitnick. arXiv, 2015.	arXiv	Naman
Mon, Apr 4	VQA: Visual Question Answering. S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh. ICCV, 2015.	project page, arXiv	Vasavi
Wed, Apr 6	Visual Madlibs: Fill in the blank Description Generation and Question Answering. Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg. ICCV, 2015.	project page, pdf	Huda
Texture, Image Statistics, correspondence
Fri, Apr 8	Dense Semantic Correspondence Where Every Pixel is a Classifier. Hilton Bristow, Jack Valmadre, Simon Lucey. ICCV 2015.	arXiv	Carden
Generative ConvNets
Mon, Apr 11	Learning to Generate Chairs, Tables and Cars with Convolutional Networks. Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox. CVPR 2015.	arXiv	Shantanu
Wed, Apr 13	A Neural Algorithm of Artistic Style. Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. 2015.	implementation, arXiv	Cusuh
Fri, Apr 15	Class cancelled - work on projects!
Mon, Apr 18	Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Alec Radford, Luke Metz, Soumith Chintala. 2015.	project page, arXiv	Jonathan S
Wed, Apr 20	LSDA: Large Scale Detection Through Adaptation. Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko. 2014.	arXiv	Varun
Fri, Apr 22	The Sketchy Database: Learning to Retrieve Badly Drawn Bunnies. Patsorn Sangkloy, Nathan Burnell, Cusuh Ham, James Hays. Siggraph 2016	paper
Fri, Apr 29th (8:00 to 10:50 Exam slot)	Final Project Presentations		Everyone

Previous topics (which you should know)

Date	Paper	Links (paper, project page)	Presenter
Fundamental representations
?	Object recognition from local scale-invariant features, David Lowe, ICCV 1999.	pdf, project page	?
?	Video Google: A Text Retrieval Approach to Object Matching in Videos. Sivic, J. and Zisserman, A. Proceedings of the International Conference on Computer Vision (2003)	pdf, project page	?
?	Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs. In Proceedings of IEEE Conference Computer Vision and Pattern Recognition, 2005.	.pdf	?
?	Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006.	pdf, slides	?
Databases
?	LabelMe: a Database and Web-based Tool for Image Annotation. B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman. International Journal of Computer Vision, 2008.	pdf, project page	?
?	80 million tiny images: a large dataset for non-parametric object and scene recognition. A. Torralba, R. Fergus, W. T. Freeman. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30(11), 2008.	pdf, project page	?
?	Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth. CVPR 2009	project page	?
?	SUN Database: Exploring a Large Collection of Scene Categories J. Xiao, K. Ehinger, J. Hays, A. Oliva, and A. Torralba. IJCV 2014.	project page, pdf	?
