Judy Hoffman is an Assistant Professor in the School of Interactive Computing at Georgia Tech and a member of the Machine Learning Center. Her research interests include computer vision, machine learning, domain adaptation, robustness, and fairness.
Prior to joining Georgia Tech, Dr. Hoffman was a Visiting Research Scientist at Facebook AI Research and a postdoctoral scholar at Stanford University and UC Berkeley. She received her PhD in EECS from UC Berkeley in 2016, where she was a member of BAIR and BDD.
Prospective Students: Read before contacting.
If you are interested in joining my group and are not currently at Georgia Tech, please apply directly to the college. Unfortunately, due to the volume of requests I receive, I may not be able to respond to individual inquiries from students outside Tech. If you are already a PhD student at Georgia Tech, feel free to contact me directly via email and include your resume and research interests. For GT MS and undergraduate students, information is posted here when a position is available. No positions are available for Spring 2025. My group is not accepting visitors from outside Georgia Tech at this time.
Bio | CV | Google Scholar | Github | Twitter
Fiona Ryan, George Stoica, Simar Kareer, Pratik Ramesh, Sahil Khose, Mengqi Zhang, Bhavika Devnani, Bogi Ecsedi, Ajay Bati, Vincent Cartillier, Prithvijit Chattopadhyay, Daniel Bolya, Anisha Pal, Vivek Vijaykumar, Viraj Prabhu, Jakob Bjorner, Sriram Venkata Yenamandra, Bharat Goyal, Taylor Hearn, Aaditya Singh, Aayushi Agarwal, Deepanshi, Sean Foley, Sruthi Sudhakar, Deeksha Kartik, Kartik Sarangmath, Sachit Kuhar, Arvind Krishnakumar, Shivam Khare, Rohit Mittapalli, Fu Lin, Luis Bermudez
My research lies at the intersection of computer vision and machine learning and focuses on tackling real-world variation and scale while minimizing human supervision. I develop learning algorithms which facilitate transfer of information through unsupervised and semi-supervised model adaptation and generalization. |
Anisha Pal*, Julia Kruk*, Mansi Phute, Manognya Bhattaram, Diyi Yang, Duen Horng Chau, Judy Hoffman. Neural Information Processing Systems (NeurIPS), 2024 pdf | bibtex | code @inproceedings{2024_SemiTruth, author = {Pal*, Anisha and Kruk*, Julia and Phute, Mansi and Bhattaram, Manognya and Yang, Diyi and Chau, Duen Horng and Hoffman, Judy}, title = {Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image Detectors}, year = 2024, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Sahil Khose*, Anisha Pal*, Aayushi Agarwal*, Deepanshi*, Judy Hoffman, Prithvijit Chattopadhyay. European Conference in Computer Vision (ECCV), 2024 pdf | bibtex | code | project page Press: GT Article | @inproceedings{2024_Skyscenes, author = {Khose*, Sahil and Pal*, Anisha and Agarwal*, Aayushi and Deepanshi*, and Hoffman, Judy and Chattopadhyay, Prithvijit}, title = {SkyScenes: A Synthetic Dataset for Aerial Scene Understanding}, year = 2024, booktitle = {European Conference in Computer Vision (ECCV)} } |
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2024 (Oral Presentation) pdf | bibtex @inproceedings{2024_EgoExo, author = {Grauman, Kristen and Westbury, Andrew and Torresani, Lorenzo and Kitani, Kris and Malik, Jitendra and Afouras, Triantafyllos and Ashutosh, Kumar and Baiyya, Vijay and Bansal, Siddhant and Boote, Bikram and Byrne, Eugene and Chavis, Zach and Chen, Joya and Cheng, Feng and Chu, Fu-Jen and Crane, Sean and Dasgupta, Avijit and Dong, Jing and Escobar, Maria and Forigua, Cristhian and Gebreselasie, Abrham and Haresh, Sanjay and Huang, Jing and Islam, Md Mohaiminul and Jain, Suyog and Khirodkar, Rawal and Kukreja, Devansh and Liang, Kevin J and Liu, Jia-Wei and Majumder, Sagnik and Mao, Yongsen and Martin, Miguel and Mavroudi, Effrosyni and Nagarajan, Tushar and Ragusa, Francesco and Ramakrishnan, Santhosh Kumar and Seminara, Luigi and Somayazulu, Arjun and Song, Yale and Su, Shan and Xue, Zihui and Zhang, Edward and Zhang, Jinxu and Castillo, Angela and Chen, Changan and Fu, Xinzhu and Furuta, Ryosuke and Gonzalez, Cristina and Gupta, Prince and Hu, Jiabo and Huang, Yifei and Huang, Yiming and Khoo, Weslie and Kumar, Anush and Kuo, Robert and Lakhavani, Sach and Liu, Miao and Luo, Mi and Luo, Zhengyi and Meredith, Brighid and Miller, Austin and Oguntola, Oluwatumininu and Pan, Xiaqing and Peng, Penny and Pramanick, Shraman and Ramazanova, Merey and Ryan, Fiona and Shan, Wei and Somasundaram, Kiran and Song, Chenan and Southerland, Audrey and Tateno, Masatoshi and Wang, Huiyu and Wang, Yuchen and Yagi, Takuma and Yan, Mingfei and Yang, Xitong and Yu, Zecheng and Zha, Shengxin Cindy and Zhao, Chen and Zhao, Ziwei and Zhu, Zhifan and Zhuo, Jeff and Arbelaez, Pablo and Bertasius, Gedas and Crandall, David and Damen, Dima and Engel, Jakob and Farinella, Giovanni Maria and Furnari, Antonino and Ghanem, Bernard and Hoffman, Judy and Jawahar, C. V. and Newcombe, Richard and Park, Hyun Soo and Rehg, James M. 
and Sato, Yoichi and Savva, Manolis and Shi, Jianbo and Shou, Mike Zheng and Wray, Michael}, title = {Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives}, year = 2024, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
Prithvijit Chattopadhyay, Bharat Goyal, Bogi Ecsedi, Viraj Prabhu, Judy Hoffman. International Conference on Learning Representations (ICLR), 2024 pdf | bibtex @inproceedings{2024_Augcal, author = {Chattopadhyay, Prithvijit and Goyal, Bharat and Ecsedi, Bogi and Prabhu, Viraj and Hoffman, Judy}, title = {AUGCAL: Sim-to-Real Adaptation by Improving Uncertainty Calibration on Augmented Synthetic Images}, year = 2024, booktitle = {International Conference on Learning Representations (ICLR)} } |
Daniel Bolya, Chaitanya Ryali, Judy Hoffman, Christoph Feichtenhofer. International Conference on Learning Representations (ICLR), 2024 pdf | bibtex @inproceedings{2024_WindowAttention, author = {Bolya, Daniel and Ryali, Chaitanya and Hoffman, Judy and Feichtenhofer, Christoph}, title = {Window Attention is Bugged: How not to Interpolate Position Embeddings}, year = 2024, booktitle = {International Conference on Learning Representations (ICLR)} } |
George Stoica, Daniel Bolya, Jakob Bjorner, Pratik Ramesh, Taylor Hearn, Judy Hoffman. International Conference on Learning Representations (ICLR), 2024 pdf | bibtex | code @inproceedings{2024_Zipit, author = {Stoica, George and Bolya, Daniel and Bjorner, Jakob and Ramesh, Pratik and Hearn, Taylor and Hoffman, Judy}, title = {ZipIt! Merging Models from Different Tasks without Training}, year = 2024, booktitle = {International Conference on Learning Representations (ICLR)} } |
Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Prithvijit Chattopadhyay, Judy Hoffman, Viraj Prabhu. Transactions on Machine Learning Research (TMLR), 2024 pdf | bibtex | code | project page @inproceedings{2024TMLR_videoDA, author = {Kareer, Simar and Vijaykumar, Vivek and Maheshwari, Harsh and Chattopadhyay, Prithvijit and Hoffman, Judy and Prabhu, Viraj}, title = {We're Not Using Videos Effectively: An Updated Video Domain Adaptation Baseline}, year = 2024, booktitle = {Transactions on Machine Learning Research (TMLR)} } |
Simar Kareer, Dhruv Patel*, Ryan Punamiya*, Pranay Mathur*, Shuo Cheng, Chen Wang, Judy Hoffman*, Danfei Xu*. CoRL Workshop, 2024 pdf | bibtex | code | project page | video @inproceedings{2024Egomimic, author = {Kareer, Simar and Patel*, Dhruv and Punamiya*, Ryan and Mathur*, Pranay and Cheng, Shuo and Wang, Chen and Hoffman*, Judy and Xu*, Danfei}, title = {EgoMimic: Scaling Imitation Learning through Egocentric Video}, year = 2024, booktitle = {CoRL Workshop} } |
Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Adrien Bardes, Mark Ibrahim, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein. Neural Information Processing Systems (NeurIPS), 2023 pdf | bibtex | code @inproceedings{2023neurips_backbone, author = {Goldblum, Micah and Souri, Hossein and Ni, Renkun and Shu, Manli and Prabhu, Viraj and Somepalli, Gowthami and Chattopadhyay, Prithvijit and Bardes, Adrien and Ibrahim, Mark and Hoffman, Judy and Chellappa, Rama and Wilson, Andrew Gordon and Goldstein, Tom}, title = {Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Vision Tasks}, year = 2023, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Viraj Prabhu, Sriram Yenamandra, Prithvijit Chattopadhyay, Judy Hoffman. Neural Information Processing Systems (NeurIPS), 2023 pdf | bibtex | code | project page Press: GT Article | @inproceedings{2023neurips_lance, author = {Prabhu, Viraj and Yenamandra, Sriram and Chattopadhyay, Prithvijit and Hoffman, Judy}, title = {LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images}, year = 2023, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman. IEEE/CVF International Conference in Computer Vision (ICCV), 2023 pdf | bibtex | code @inproceedings{2023ICCV_FACTS, author = {Yenamandra, Sriram and Ramesh, Pratik and Prabhu, Viraj and Hoffman, Judy}, title = {FACTS: First Amplify Correlations and Then Slice to Discover Bias}, year = 2023, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Prithvijit Chattopadhyay*, Kartik Sarangmath*, Vivek Vijaykumar, Judy Hoffman. IEEE/CVF International Conference in Computer Vision (ICCV), 2023 pdf | bibtex | code @inproceedings{2023iccv_PASTA, author = {Chattopadhyay*, Prithvijit and Sarangmath*, Kartik and Vijaykumar, Vivek and Hoffman, Judy}, title = {PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization}, year = 2023, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Aaditya Singh*, Kartik Sarangmath*, Prithvijit Chattopadhyay, Judy Hoffman. IEEE/CVF International Conference in Computer Vision (ICCV), 2023 pdf | bibtex | code @inproceedings{2023iccv_lowshotrobust, author = {Singh*, Aaditya and Sarangmath*, Kartik and Chattopadhyay, Prithvijit and Hoffman, Judy}, title = {Benchmarking Low-Shot Robustness to Natural Distribution Shifts}, year = 2023, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau. ACM International Conference on Information and Knowledge Management (CIKM), 2023 pdf | bibtex | code @inproceedings{2023CIKM_conceptevo, author = {Park, Haekyu and Lee, Seongmin and Hoover, Benjamin and Wright, Austin P. and Shaikh, Omar and Duggal, Rahul and Das, Nilaksh and Li, Kevin and Hoffman, Judy and Chau, Duen Horng}, title = {Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries}, year = 2023, booktitle = {ACM International Conference on Information and Knowledge Management (CIKM)} } |
Chaitanya Ryali*, Yuan-Ting Hu*, Daniel Bolya*, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer. International Conference in Machine Learning (ICML), 2023 (Oral Presentation) pdf | bibtex | code @inproceedings{2023ICML_sMVIT, author = {Ryali*, Chaitanya and Hu*, Yuan-Ting and Bolya*, Daniel and Wei, Chen and Fan, Haoqi and Huang, Po-Yao and Aggarwal, Vaibhav and Chowdhury, Arkabandhu and Poursaeed, Omid and Hoffman, Judy and Malik, Jitendra and Li, Yanghao and Feichtenhofer, Christoph}, title = {Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles}, year = 2023, booktitle = {International Conference in Machine Learning (ICML)} } |
Daniel Bolya, Judy Hoffman. CVPR Workshop on Efficient Deep Learning for Computer Vision, 2023 (Oral Presentation) pdf | abstract | bibtex | code The landscape of image generation has been forever changed by open vocabulary diffusion models. However, at their core these models use transformers, which makes generation slow. Better implementations to increase the throughput of these transformers have emerged, but they still evaluate the entire model. In this paper, we instead speed up diffusion models by exploiting natural redundancy in generated images by merging redundant tokens. After making some diffusion-specific improvements to Token Merging (ToMe), our ToMe for Stable Diffusion can reduce the number of tokens in an existing Stable Diffusion model by up to 60% while still producing high quality images without any extra training. In the process, we speed up image generation by up to 2x and reduce memory consumption by up to 5.6x. Furthermore, this speed-up stacks with efficient implementations such as xFormers, minimally impacting quality while being up to 5.4x faster for large images. @inproceedings{2023_cvprw_tomesd, author = {Bolya, Daniel and Hoffman, Judy}, title = {Token Merging for Fast Stable Diffusion}, year = 2023, booktitle = {CVPR Workshop on Efficient Deep Learning for Computer Vision} } |
Sruthi Sudhakar, Viraj Uday Prabhu, Olga Russakovsky, Judy Hoffman. CVPR Workshop on Secure and Safe Autonomous Driving (SSAD), 2023 pdf | bibtex @inproceedings{2023_cvprw_predictiveConfounders, author = {Sudhakar, Sruthi and Prabhu, Viraj Uday and Russakovsky, Olga and Hoffman, Judy}, title = {ICON2: Reliably Benchmarking Predictive Inequity in Object Detection}, year = 2023, booktitle = {CVPR Workshop on Secure and Safe Autonomous Driving (SSAD)} } |
Sachit Kuhar, Alexey Tumanov, Judy Hoffman. 3rd On-Device Intelligence Workshop at MLSys, 2023 bibtex @inproceedings{2023_signedbinary, author = {Kuhar, Sachit and Tumanov, Alexey and Hoffman, Judy}, title = {Signed Binary Weight Networks}, year = 2023, booktitle = {3rd On-Device Intelligence Workshop at MLSys} } |
Arun Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil Katyal, Dinesh Manocha, Celso de Melo, Rama Chellappa. International Conference in Robotics and Automation (ICRA), 2023 pdf | bibtex | code @inproceedings{2023ICRA_Synth, author = {Reddy, Arun and Shah, Ketul and Paul, William and Mocharla, Rohita and Hoffman, Judy and Katyal, Kapil and Manocha, Dinesh and Melo, Celso de and Chellappa, Rama}, title = {Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances}, year = 2023, booktitle = {International Conference in Robotics and Automation (ICRA)} } |
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman. International Conference on Learning Representations (ICLR), 2023 (Notable Top 5%) pdf | bibtex | code @inproceedings{2023ICLR_TokenMerging, author = {Bolya, Daniel and Fu, Cheng-Yang and Dai, Xiaoliang and Zhang, Peizhao and Feichtenhofer, Christoph and Hoffman, Judy}, title = {Token Merging: Your ViT But Faster}, year = 2023, booktitle = {International Conference on Learning Representations (ICLR)} } |
Viraj Prabhu, David Acuna, Yuan-Hong Liao, Rafid Mahmood, Marc T. Law, Judy Hoffman, Sanja Fidler, James Lucas. Transactions on Machine Learning Research (TMLR), 2023 pdf | bibtex @inproceedings{2023TMLR_CARE, author = {Prabhu, Viraj and Acuna, David and Liao, Yuan-Hong and Mahmood, Rafid and Law, Marc T. and Hoffman, Judy and Fidler, Sanja and Lucas, James}, title = {Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting}, year = 2023, booktitle = {Transactions on Machine Learning Research (TMLR)} } |
Chia-Wen Kuo, Chih-Yao Ma, Judy Hoffman, Zsolt Kira. IEEE/CVF Winter Conference on Applications in Computer Vision (WACV), 2023 |
Viraj Prabhu*, Sriram Yenamandra*, Aaditya Singh, Judy Hoffman. Neural Information Processing Systems (NeurIPS), 2022 pdf | bibtex | code Press: GT CoC | @inproceedings{2022NeurIPS_pacmac, author = {Prabhu*, Viraj and Yenamandra*, Sriram and Singh, Aaditya and Hoffman, Judy}, title = {Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency}, year = 2022, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Arjun Majumdar*, Gunjan Aggarwal*, Bhavika Devnani, Judy Hoffman, Dhruv Batra. Neural Information Processing Systems (NeurIPS), 2022 |
George Stoica, Taylor Hearn, Bhavika Devnani, Judy Hoffman. NeurIPS Workshop on Vision Transformers Theory and Applications, 2022 (Best Paper Award) |
Viraj Prabhu*, Shivam Khare*, Deeksha Kartik, Judy Hoffman. Workshop on Computer Vision in the Wild, ECCV, 2022 pdf | bibtex @inproceedings{2022arXiv_Zaugco, author = {Prabhu*, Viraj and Khare*, Shivam and Kartik, Deeksha and Hoffman, Judy}, title = {AUGCO: Augmentation Consistency-guided Self-training for Source-free Domain Adaptive Semantic Segmentation}, year = 2022, booktitle = {Workshop on Computer Vision in the Wild, ECCV} } |
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman. ECCV Workshop on Computational Aspects of Deep Learning, 2022 (Best Paper Award) pdf | bibtex @inproceedings{2022ECCVW_HydraAttention, author = {Bolya, Daniel and Fu, Cheng-Yang and Dai, Xiaoliang and Zhang, Peizhao and Hoffman, Judy}, title = {Hydra Attention: Efficient Attention with Many Heads}, year = 2022, booktitle = {ECCV Workshop on Computational Aspects of Deep Learning} } |
Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik. Computer Vision and Pattern Recognition (CVPR) L3D Workshop, 2022 pdf | abstract | bibtex Despite the rapid progress in deep visual recognition, modern computer vision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from existing datasets to study the Geographical DA problem, and discover new challenges beyond data distribution shift: context shift, wherein object surroundings may change significantly across geographies, and subpopulation shift, wherein the intra-category distributions may shift. We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions to address the challenge of making object recognition work for everyone. @inproceedings{2022CVPR_GeoDA, author = {Prabhu, Viraj and Selvaraju, Ramprasaath R. and Hoffman, Judy and Naik, Nikhil}, title = {Can domain adaptation make object recognition work for everyone?}, year = 2022, booktitle = {Computer Vision and Pattern Recognition (CVPR) L3D Workshop} } |
Seongmin Lee, Zijie J. Wang, Judy Hoffman, Duen Horng (Polo) Chau. Computer Vision and Pattern Recognition (CVPR) Demo Track, 2022 pdf | bibtex | project page @inproceedings{2022CVPR_VisCUIT, author = {Lee, Seongmin and Wang, Zijie J. and Hoffman, Judy and Chau, Duen Horng (Polo)}, title = {VISCUIT: Visual Auditor for Bias in CNN Image Classifier}, year = 2022, booktitle = {Computer Vision and Pattern Recognition (CVPR) Demo Track} } |
Daniel Bolya*, Rohit Mittapali*, Judy Hoffman. Neural Information Processing Systems (NeurIPS), 2021 pdf | abstract | bibtex | code | project page | video With the preponderance of pretrained deep learning models available off-the-shelf from model banks today, finding the best weights to fine-tune to your use-case can be a daunting task. Several methods have recently been proposed to find good models for transfer learning, but they either don't scale well to large model banks or don't perform well on the diversity of off-the-shelf models. Ideally the question we want to answer is, given some data and a source model, can you quickly predict the model's accuracy after fine-tuning? We formalize this setting as Scalable Diverse Model Selection and propose several benchmarks for evaluating on this task. We find that existing model selection and transferability estimation methods perform poorly here and analyze why this is the case. We then introduce simple techniques to improve the performance and speed of these algorithms. Finally, we iterate on existing methods to create PARC, which outperforms all other methods on diverse model selection. We intend to release the benchmarks and method code in hope to inspire future work in model selection for accessible transfer learning. @inproceedings{2021NeurIPSModelFinder, author = {Bolya*, Daniel and Mittapali*, Rohit and Hoffman, Judy}, title = {Scalable Diverse Model Selection for Accessible Transfer Learning}, year = 2021, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Sruthi Sudhakar, Viraj Prabhu, Arvind Krishnakumar, Judy Hoffman. British Machine Vision Conference (BMVC), 2021 pdf | abstract | bibtex | project page As transformer architectures become increasingly prevalent in computer vision, it is critical to understand their fairness implications. We perform the first study of the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work. We visualize the feature space of the transformer self-attention modules and discover that a significant portion of the bias is encoded in the query matrix. With this knowledge, we propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features. We measure performance using Balanced Accuracy and Standard Accuracy, and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT consistently leads to improved fairness over prior work on multiple attribute prediction tasks on the CelebA dataset, without compromising performance. @inproceedings{2021BMVC_TADET, author = {Sudhakar, Sruthi and Prabhu, Viraj and Krishnakumar, Arvind and Hoffman, Judy}, title = {Mitigating Bias in Visual Transformers via Targeted Alignment}, year = 2021, booktitle = {British Machine Vision Conference (BMVC)} } |
Arvind Krishnakumar, Viraj Prabhu, Sruthi Sudhakar, Judy Hoffman. British Machine Vision Conference (BMVC), 2021 pdf | abstract | bibtex | code | project page Deep learning models have been shown to learn spurious correlations from data that sometimes lead to systematic failures for certain subpopulations. Prior work has typically diagnosed this by crowdsourcing annotations for various protected attributes and measuring performance, which is both expensive to acquire and difficult to scale. In this work, we propose UDIS, an unsupervised algorithm for surfacing and analyzing such failure modes. UDIS identifies subpopulations via hierarchical clustering of dataset embeddings and surfaces systematic failure modes by visualizing low performing clusters along with their gradient-weighted class-activation maps. We show the effectiveness of UDIS in identifying failure modes in models trained for image classification on the CelebA and MSCOCO datasets. UDIS is available at https://github.com/akrishna77/bias-discovery. @inproceedings{2021BMVC_UDIS, author = {Krishnakumar, Arvind and Prabhu, Viraj and Sudhakar, Sruthi and Hoffman, Judy}, title = {UDIS: Unsupervised Discovery of Bias in Deep Visual Recognition Models}, year = 2021, booktitle = {British Machine Vision Conference (BMVC)} } |
Viraj Prabhu, Shivam Khare, Deeksha Karthik, Judy Hoffman. IEEE/CVF International Conference in Computer Vision (ICCV), 2021 pdf | abstract | bibtex | code | project page Many existing approaches for unsupervised domain adaptation (UDA) focus on adapting under only data distribution shift and offer limited success under additional cross-domain label distribution shift. Recent work based on self-training using target pseudo-labels has shown promise, but on challenging shifts pseudo-labels may be highly unreliable, and using them for self-training may cause error accumulation and domain misalignment. We propose Selective Entropy Optimization via Committee Consistency (SENTRY), a UDA algorithm that judges the reliability of a target instance based on its predictive consistency under a committee of random image transformations. Our algorithm then selectively minimizes predictive entropy to increase confidence on highly consistent target instances, while maximizing predictive entropy to reduce confidence on highly inconsistent ones. In combination with pseudo-label based approximate target class balancing, our approach leads to significant improvements over the state-of-the-art on 27/31 domain shifts from standard UDA benchmarks as well as benchmarks designed to stress-test adaptation under label distribution shift. @inproceedings{2021arXivSENTRY, author = {Prabhu, Viraj and Khare, Shivam and Karthik, Deeksha and Hoffman, Judy}, title = {Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation}, year = 2021, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
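The committee-consistency rule described in this abstract can be captured in a short sketch. The snippet below is an illustrative reconstruction from the abstract alone, not the authors' released code; the batched augmentation committee, the 0.5 agreement threshold, and the model/transform interfaces are assumptions.

import torch
import torch.nn.functional as F

# Illustrative sketch reconstructed from the abstract; not the released SENTRY implementation.
def sentry_style_loss(model, x_target, committee_transforms):
    # Pseudo-label each unlabeled target image from the un-augmented view.
    with torch.no_grad():
        base_pred = model(x_target).argmax(dim=1)

    votes, entropies = [], []
    for t in committee_transforms:          # e.g. a few random image transformations (assumed batched)
        probs = F.softmax(model(t(x_target)), dim=1)
        votes.append(probs.argmax(dim=1) == base_pred)                       # does this member agree?
        entropies.append(-(probs * probs.clamp_min(1e-8).log()).sum(dim=1))  # predictive entropy

    agreement = torch.stack(votes).float().mean(dim=0)   # fraction of the committee that agrees
    entropy = torch.stack(entropies).mean(dim=0)

    consistent = agreement >= 0.5                        # reliability threshold (assumed)
    # Minimize entropy (gain confidence) on consistent instances,
    # maximize entropy (reduce confidence) on inconsistent ones.
    return (entropy[consistent].sum() - entropy[~consistent].sum()) / x_target.shape[0]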
Prithvijit Chattopadhyay, Judy Hoffman, Roozbeh Mottaghi, Ani Kembhavi. IEEE/CVF International Conference in Computer Vision (ICCV), 2021 (Oral Presentation) |
Viraj Prabhu, Arjun Chandrasekaran, Kate Saenko, Judy Hoffman. IEEE/CVF International Conference in Computer Vision (ICCV), 2021 pdf | abstract | bibtex | project page | video Generalizing deep neural networks to new target domains is critical to their real-world utility. In practice, it may be feasible to get some target data labeled, but to be cost-effective it is desirable to select a maximally-informative subset via active learning (AL). We study the problem of AL under a domain shift, called Active Domain Adaptation (Active DA). We empirically demonstrate how existing AL approaches based solely on model uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), i) identifies target instances for labeling that are both uncertain under the model and diverse in feature space, and ii) leverages the available source and target data for adaptation by optimizing a semi-supervised adversarial entropy loss that is complementary to our active sampling objective. On standard image classification-based domain adaptation benchmarks, ADA-CLUE consistently outperforms competing active adaptation, active learning, and domain adaptation methods across domain shifts of varying severity. @inproceedings{2021CLUE, author = {Prabhu, Viraj and Chandrasekaran, Arjun and Saenko, Kate and Hoffman, Judy}, title = {Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings}, year = 2021, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
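As a rough illustration of the uncertainty-weighted clustering idea described in this abstract, the sketch below selects a labeling batch by clustering target features with predictive-entropy weights and picking the point nearest each cluster center. It is a simplified reconstruction from the abstract; the function name and the use of scikit-learn's weighted KMeans are assumptions.

import numpy as np
from scipy.stats import entropy
from sklearn.cluster import KMeans

# Illustrative sketch based on the abstract; not the authors' released code.
def clue_style_selection(features, probs, budget):
    # features: (N, D) target embeddings; probs: (N, C) softmax predictions; budget: points to label.
    uncertainty = entropy(probs.T)                       # per-sample predictive entropy, shape (N,)
    km = KMeans(n_clusters=budget, n_init=10, random_state=0)
    km.fit(features, sample_weight=uncertainty)          # uncertainty-weighted clustering

    selected = []
    for center in km.cluster_centers_:
        dists = np.linalg.norm(features - center, axis=1)
        dists[selected] = np.inf                         # do not pick the same point twice
        selected.append(int(dists.argmin()))             # most central point of each cluster
    return selected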
Baifeng Shi, Qi Dai, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu. IEEE/CVF International Conference in Computer Vision (ICCV), 2021 abstract | bibtex Training temporal action detection in videos requires large amounts of labeled data, yet such annotation is expensive to collect. Incorporating unlabeled or weakly-labeled data to train action detection model could help reduce annotation cost. In this work, we first introduce the Semi-supervised Action Detection (SSAD) task with a mixture of labeled and unlabeled data and analyze different types of errors in the proposed SSAD baselines which are directly adapted from the semi-supervised classification literature. Identifying that the main source of error is action incompleteness (i.e., missing parts of actions), we alleviate it by designing an unsupervised foreground attention (UFA) module utilizing the conditional independence between foreground and background motion. Then we incorporate weakly-labeled data into SSAD and propose Omni-supervised Action Detection (OSAD) with three levels of supervision. To overcome the accompanying action-context confusion problem in OSAD baselines, an information bottleneck (IB) is designed to suppress the scene information in non-action frames while preserving the action information. We extensively benchmark against the baselines for SSAD and OSAD on our created data splits in THUMOS14 and ActivityNet1.2, and demonstrate the effectiveness of the proposed UFA and IB methods. Lastly, the benefit of our full OSAD-IB model under limited annotation budgets is shown by exploring the optimal annotation strategy for labeled, unlabeled and weakly-labeled data. @inproceedings{2021temporalIccv, author = {Shi, Baifeng and Dai, Qi and Hoffman, Judy and Saenko, Kate and Darrell, Trevor and Xu, Huijuan}, title = {Temporal Action Detection with Multi-level Supervision}, year = 2021, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman. IEEE/CVF Winter Conference on Applications in Computer Vision (WACV), 2021 pdf | abstract | bibtex We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision. Many prior approaches to this problem have focused on learning disentangled representations so that as individual factors vary in a new domain, only a portion of the representation need be updated. In this work, we seek the generalization power of disentangled representations, but relax the requirement of explicit latent disentanglement and instead encourage linearity of individual factors of variation by requiring them to be manipulable by learned linear transformations. We dub these transformations latent canonicalizers, as they aim to modify the value of a factor to a pre-determined (but arbitrary) canonical value (e.g., recoloring the image foreground to black). Assuming a source domain with access to meta-labels specifying the factors of variation within an image, we demonstrate experimentally that our method helps reduce the number of observations needed to generalize to a similar target domain when compared to a number of supervised baselines. @inproceedings{2020WacvLatentCanon, author = {Litany, Or and Morcos, Ari and Sridhar, Srinath and Guibas, Leonidas and Hoffman, Judy}, title = {Representation Learning Through Latent Canonicalizations}, year = 2021, booktitle = {IEEE/CVF Winter Conference on Applications in Computer Vision (WACV)} } |
Yogesh Balaji, Tom Goldstein, Judy Hoffman. arXiv, 2020 pdf | abstract | bibtex | code Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with uniform perturbation radius around every training sample. Samples close to decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produce poor decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training -- a technique that enforces sample-specific perturbation margins around every training sample. We show that using our approach, test accuracy on unperturbed samples improve with a marginal drop in robustness. Extensive experiments on CIFAR-10, CIFAR-100 and Imagenet datasets demonstrate the effectiveness of our proposed approach. @inproceedings{2020arXivInstanceAdaptive, author = {Balaji, Yogesh and Goldstein, Tom and Hoffman, Judy}, title = {Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets}, year = 2020, booktitle = {arXiv} } |
Ningshan Zhang, Mehryar Mohri, Judy Hoffman. Annals of Mathematics and Artificial Intelligence, 2020 |
Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu. Neural Information Processing Systems (NeurIPS), 2020 abstract | bibtex | code | project page | video Supervised learning requires a large amount of training data, limiting its application where labeled data is scarce. To compensate for data scarcity, one possible method is to utilize auxiliary tasks to provide additional supervision for the main task. Assigning and optimizing the importance weights for different auxiliary tasks remains a crucial and largely understudied research question. In this work, we propose a method to automatically reweight auxiliary tasks in order to reduce the data requirement on the main task. Specifically, we formulate the weighted likelihood function of auxiliary tasks as a surrogate prior for the main task. By adjusting the auxiliary task weights to minimize the divergence between the surrogate prior and the true prior of the main task, we obtain a more accurate prior estimation, achieving the goal of minimizing the required amount of training data for the main task and avoiding a costly grid search. In multiple experimental settings (e.g. semi-supervised learning, multi-label classification), we demonstrate that our algorithm can effectively utilize limited labeled data of the main task with the benefit of auxiliary tasks compared with previous task reweighting methods. We also show that under extreme cases with only a few extra examples (e.g. few-shot domain adaptation), our algorithm results in significant improvement over the baseline. @inproceedings{2020NeurIPSAux, author = {Shi, Baifeng and Hoffman, Judy and Saenko, Kate and Darrell, Trevor and Xu, Huijuan}, title = {Auxiliary Task Reweighting for Minimum-data Learning}, year = 2020, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Samyak Datta, Oleksandr Maksymets, Judy Hoffman, Stefan Lee, Dhruv Batra, Devi Parikh. Conference on Robot Learning (CoRL), 2020 abstract | bibtex Recent work has presented embodied agents that can navigate to point-goal targets in novel indoor environments with near-perfect accuracy. However, these agents are equipped with idealized sensors for localization and take deterministic actions. This setting is practically sterile by comparison to the dirty reality of noisy sensors and actuations in the real world -- wheels can slip, motion sensors have error, actuations can rebound. In this work, we take a step towards this noisy reality, developing point-goal navigation agents that rely on visual estimates of egomotion under noisy action dynamics. We find these agents outperform naive adaptions of current point-goal agents to this setting as well as those incorporating classic localization baselines. Further, our model conceptually divides learning agent dynamics or odometry (where am I?) from task-specific navigation policy (where do I want to go?). This enables a seamless adaption to changing dynamics (a different robot or floor type) by simply re-calibrating the visual odometry model -- circumventing the expense of re-training of the navigation policy. Our agent was the runner-up in the PointNav track of CVPR 2020 Habitat Challenge. @inproceedings{2020CorlEgo, author = {Datta, Samyak and Maksymets, Oleksandr and Hoffman, Judy and Lee, Stefan and Batra, Dhruv and Parikh, Devi}, title = {Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents}, year = 2020, booktitle = {Conference on Robot Learning (CoRL)} } |
Harish Haresamudram, Apoorva Beedu, Varun Agrawal, Patrick L Grady, Irfan Essa, Judy Hoffman, Thomas Ploetz. International Symposium on Wearable Computers (ISWC), 2020 pdf | bibtex @inproceedings{2020ISWC, author = {Haresamudram, Harish and Beedu, Apoorva and Agrawal, Varun and Grady, Patrick L and Essa, Irfan and Hoffman, Judy and Ploetz, Thomas}, title = {Masked Reconstruction based Self-Supervision for Human Activity Recognition}, year = 2020, booktitle = {International Symposium on Wearable Computers (ISWC)} } |
Prithvijit Chattopadhyay, Yogesh Balaji, Judy Hoffman. European Conference in Computer Vision (ECCV), 2020 pdf | abstract | bibtex | code | video We introduce Domain-specific Masks for Generalization, a model for improving both in-domain and out-of-domain generalization performance. For domain generalization, the goal is to learn from a set of source domains to produce a single model that will best generalize to an unseen target domain. As such, many prior approaches focus on learning representations which persist across all source domains with the assumption that these domain agnostic representations will generalize well. However, often individual domains contain characteristics which are unique and when leveraged can significantly aid in-domain recognition performance. To produce a model which best generalizes to both seen and unseen domains, we propose learning domain specific masks. The masks are encouraged to learn a balance of domain-invariant and domain-specific features, thus enabling a model which can benefit from the predictive power of specialized features while retaining the universal applicability of domain-invariant features. We demonstrate competitive performance compared to naive baselines and state-of-the-art methods on both PACS and DomainNet. @inproceedings{2020EccvDMG, author = {Chattopadhyay, Prithvijit and Balaji, Yogesh and Hoffman, Judy}, title = {Learning to Balance Specificity and Invariance for In and Out of Domain Generalization}, year = 2020, booktitle = {European Conference in Computer Vision (ECCV)} } |
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman. European Conference in Computer Vision (ECCV), 2020 (Spotlight Presentation) pdf | abstract | bibtex | code | project page | video We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms. Importantly, our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system. Thus, our framework can be used as a drop-in replacement for the standard mAP computation while providing a comprehensive analysis of each model's strengths and weaknesses. We segment errors into six types and, crucially, are the first to introduce a technique for measuring the contribution of each error in a way that isolates its effect on overall performance. We show that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models. @inproceedings{2020EccvTIDE, author = {Bolya, Daniel and Foley, Sean and Hays, James and Hoffman, Judy}, title = {TIDE: A General Toolbox for Identifying Object Detection Errors}, year = 2020, booktitle = {European Conference in Computer Vision (ECCV)} } |
Fu Lin, Rohit Mittapali, Prithvijit Chattopadhyay, Daniel Bolya, Judy Hoffman. Adversarial Robustness in the Real World (AROW), ECCV, 2020 (Best paper runner up) pdf | abstract | bibtex Convolutional Neural Networks (CNNs) have been shown to be vulnerable to adversarial examples, which are known to locate in subspaces close to where normal data lies but are not naturally occurring and have low probability. In this work, we investigate the potential effect defense techniques have on the geometry of the likelihood landscape - likelihood of the input images under the trained model. We first propose a way to visualize the likelihood landscape by leveraging an energy-based model interpretation of discriminative classifiers. Then we introduce a measure to quantify the flatness of the likelihood landscape. We observe that a subset of adversarial defense techniques results in a similar effect of flattening the likelihood landscape. We further explore directly regularizing towards a flat landscape for adversarial robustness. @inproceedings{2020EccvWLikelihood, author = {Lin, Fu and Mittapali, Rohit and Chattopadhyay, Prithvijit and Bolya, Daniel and Hoffman, Judy}, title = {Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses}, year = 2020, booktitle = {Adversarial Robustness in the Real World (AROW), ECCV} } |
Daniel Gordon, Abhishek Kadian, Devi Parikh, Judy Hoffman, Dhruv Batra. IEEE/CVF International Conference in Computer Vision (ICCV), 2019 pdf | abstract | bibtex | code We propose SplitNet, a method for decoupling visual perception and policy learning. By incorporating auxiliary tasks and selective learning of portions of the model, we explicitly decompose the learning objectives for visual navigation into perceiving the world and acting on that perception. We show dramatic improvements over baseline models on transferring between simulators, an encouraging step towards Sim2Real. Additionally, SplitNet generalizes better to unseen environments from the same simulator and transfers faster and more effectively to novel embodied navigation tasks. Further, given only a small sample from a target domain, SplitNet can match the performance of traditional end-to-end pipelines which receive the entire dataset. @inproceedings{2019iccvsplitnet, author = {Gordon, Daniel and Kadian, Abhishek and Parikh, Devi and Hoffman, Judy and Batra, Dhruv}, title = {SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation}, year = 2019, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Judy Hoffman, Daniel A. Roberts, Sho Yaida. Conference on the Mathematical Theory of Deep Learning (DeepMath), 2019 pdf | abstract | bibtex | code Design of reliable systems must guarantee stability against input perturbations. In machine learning, such guarantee entails preventing overfitting and ensuring robustness of models against corruption of input data. In order to maximize stability, we analyze and develop a computationally efficient implementation of Jacobian regularization that increases classification margins of neural networks. The stabilizing effect of the Jacobian regularizer leads to significant improvements in robustness, as measured against both random and adversarial input perturbations, without severely degrading generalization properties on clean data. @inproceedings{2019DeepMathJacobian, author = {Hoffman, Judy and Roberts, Daniel A. and Yaida, Sho}, title = {Robust Learning with Jacobian Regularization}, year = 2019, booktitle = {Conference on the Mathematical Theory of Deep Learning (DeepMath)} } |
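For concreteness, the regularizer analyzed in this paper augments the standard classification loss with the squared Frobenius norm of the network's input-output Jacobian. Writing f(x) for the logits and lambda for the regularization strength (notation chosen here for illustration), the objective takes the form

\mathcal{L}(x, y) = \mathcal{L}_{CE}(f(x), y) + \frac{\lambda}{2} \left\lVert \frac{\partial f(x)}{\partial x} \right\rVert_F^2

where the efficient implementation referenced in the abstract avoids materializing the full Jacobian, typically by estimating this norm with random projections.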
Benjamin Wilson, Judy Hoffman, Jamie Morgenstern. Workshop on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR, 2019 pdf | abstract | bibtex | code Press: Vox | Business Insider | The Guardian | NBC News | In this work, we investigate whether state-of-the-art object detection systems have equitable predictive performance on pedestrians with different skin tones. This work is motivated by many recent examples of ML and vision systems displaying higher error rates for certain demographic groups than others. We annotate an existing large scale dataset which contains pedestrians, BDD100K, with Fitzpatrick skin tones in ranges [1-3] or [4-6]. We then provide an in-depth comparative analysis of performance between these two skin tone groupings, finding that neither time of day nor occlusion explain this behavior, suggesting this disparity is not merely the result of pedestrians in the 4-6 range appearing in more difficult scenes for detection. We investigate to what extent time of day, occlusion, and reweighting the supervised loss during training affect this predictive bias. @inproceedings{2019FateCV, author = {Wilson, Benjamin and Hoffman, Judy and Morgenstern, Jamie}, title = {Predictive Inequity in Object Detection}, year = 2019, booktitle = {Workshop on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR} } |
Judy Hoffman, Mehryar Mohri, Ningshan Zhang. Neural Information Processing Systems (NeurIPS), 2018 pdf | abstract | bibtex We present a number of novel contributions to the multiple-source adaptation problem. We derive new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits. @inproceedings{2018neurips_madap, author = {Hoffman, Judy and Mohri, Mehryar and Zhang, Ningshan}, title = {Algorithms and Theory for Multiple-Source Adaptation}, year = 2018, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alyosha Efros, Trevor Darrell. International Conference in Machine Learning (ICML), 2018 pdf | abstract | bibtex | code | slides Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models have shown tremendous progress towards adapting to new environments by focusing either on discovering domain invariant representations or by mapping between unpaired image domains. While feature space methods are difficult to interpret and sometimes fail to capture pixel-level and low-level domain shifts, image space methods sometimes fail to incorporate high level semantic knowledge relevant for the end task. We propose a model which adapts between domains using both generative image space alignment and latent representation space alignment. Our approach, Cycle-Consistent Adversarial Domain Adaptation (CyCADA), guides transfer between domains according to a specific discriminatively trained task and avoids divergence by enforcing consistency of the relevant semantics before and after adaptation. We evaluate our method on a variety of visual recognition and prediction settings, including digit classification and semantic segmentation of road scenes, advancing state-of-the-art performance for unsupervised adaptation from synthetic to real world driving domains. @inproceedings{2018icmlcycada, author = {Hoffman, Judy and Tzeng, Eric and Park, Taesung and Zhu, Jun-Yan and Isola, Phillip and Saenko, Kate and Efros, Alyosha and Darrell, Trevor}, title = {CyCADA: Cycle Consistent Adversarial Domain Adaptation}, year = 2018, booktitle = {International Conference in Machine Learning (ICML)} } |
Andreea Bobu, Eric Tzeng, Judy Hoffman, Trevor Darrell. International Conference on Learning Representations (ICLR) Workshop Track, 2018 pdf | abstract | bibtex Domain adaptation typically focuses on adapting a model from a single source domain to a target domain. However, in practice, this paradigm of adapting from one source to one target is limiting, as different aspects of the real world such as illumination and weather conditions vary continuously and cannot be effectively captured by two static domains. Approaches that attempt to tackle this problem by adapting from a single source to many different target domains simultaneously are consistently unable to learn across all domain shifts. Instead, we propose an adaptation method that exploits the continuity between gradually varying domains by adapting in sequence from the source to the most similar target domain. By incrementally adapting while simultaneously efficiently regularizing against prior examples, we obtain a single strong model capable of recognition within all observed domains. @inproceedings{2018iclrwBobu, author = {Bobu, Andreea and Tzeng, Eric and Hoffman, Judy and Darrell, Trevor}, title = {Adapting to Continuously Shifting Domains}, year = 2018, booktitle = {International Conference on Learning Representations (ICLR) Workshop Track} } |
Liyue Shen, Serena Yeung, Judy Hoffman, Greg Mori, Li Fei-Fei. IEEE/CVF Winter Conference on Applications in Computer Vision (WACV), 2018 pdf | abstract | bibtex Recognizing human object interactions (HOI) is an important part of distinguishing the rich variety of human action in the visual world. While recent progress has been made in improving HOI recognition in the fully supervised setting, the space of possible human-object interactions is large and it is impractical to obtain labeled training data for all interactions of interest. In this work, we tackle the challenge of scaling HOI recognition to the long tail of categories through a zero-shot learning approach. We introduce a factorized model for HOI detection that disentangles reasoning on verbs and objects, and at test-time can therefore produce detections for novel verb-object pairs. We present experiments on the recently introduced large-scale HICO-DET dataset, and show that our model is able to both perform comparably to state-of-the-art in fully-supervised HOI detection, while simultaneously achieving effective zero-shot detection of new HOI categories. @inproceedings{2018wacv_hico, author = {Shen, Liyue and Yeung, Serena and Hoffman, Judy and Mori, Greg and Fei-Fei, Li}, title = {Scaling Human-Object Interaction Recognition through Zero-Shot Learning}, year = 2018, booktitle = {IEEE/CVF Winter Conference on Applications in Computer Vision (WACV)} } |
Zelun Luo, Yuliang Zou, Judy Hoffman, Li Fei-Fei. Neural Information Processing Systems (NeurIPS), 2017 pdf | abstract | bibtex | project page We propose a framework that learns a representation transferable across different domains and tasks in a label efficient manner. Our approach battles domain shift with a domain adversarial loss, and generalizes the embedding to novel task using a metric learning-based approach. Our model is simultaneously optimized on labeled source data and unlabeled or sparsely labeled data in the target domain. Our method shows compelling results on novel classes within a new domain even when only a few labeled examples per class are available, outperforming the prevalent fine-tuning approach. In addition, we demonstrate the effectiveness of our framework on the transfer learning task from image object recognition to video action recognition. @inproceedings{2017NipsLuo, author = {Luo, Zelun and Zou, Yuliang and Hoffman, Judy and Fei-Fei, Li}, title = {Label Efficient Learning of Transferable Representations across Domains and Tasks}, year = 2017, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
Timnit Gebru, Judy Hoffman, Li Fei-Fei. IEEE/CVF International Conference in Computer Vision (ICCV), 2017 pdf | abstract | bibtex While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild. These fully supervised models need additional annotated images to classify objects in every new scenario, a task that is infeasible. However, sources such as e-commerce websites and field guides provide annotated images for many classes. In this work, we study fine-grained domain adaptation as a step towards overcoming the dataset shift between easily acquired annotated images and the real world. Adaptation has not been studied in the fine-grained setting where annotations such as attributes could be used to increase performance. Our work uses an attribute based multi-task adaptation loss to increase accuracy from a baseline of 4.1% to 19.1% in the semi-supervised adaptation case. Prior domain adaptation works have been benchmarked on small datasets such as [46] with a total of 795 images for some domains, or simplistic datasets such as [41] consisting of digits. We perform experiments on a subset of a new challenging fine-grained dataset consisting of 1,095,021 images of 2,657 car categories drawn from e-commerce websites and Google Street View. @inproceedings{2017iccvGebru, author = {Gebru, Timnit and Hoffman, Judy and Fei-Fei, Li}, title = {Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach}, year = 2017, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick. IEEE/CVF International Conference in Computer Vision (ICCV), 2017 (Oral Presentation) pdf | abstract | bibtex | code | project page Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer. Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings. @inproceedings{2017iccvJohnson, author = {Johnson, Justin and Hariharan, Bharath and Maaten, Laurens van der and Hoffman, Judy and Fei-Fei, Li and Zitnick, C. Lawrence and Girshick, Ross}, title = {Inferring and Executing Programs for Visual Reasoning}, year = 2017, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
Eric Tzeng, Judy Hoffman, Trevor Darrell, Kate Saenko. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2017 pdf | abstract | bibtex | code Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They also can improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. We propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new more difficult cross-modality object classification task. @inproceedings{2017cvprAdda, author = {Tzeng, Eric and Hoffman, Judy and Darrell, Trevor and Saenko, Kate}, title = {Adversarial Discriminative Domain Adaptation}, year = 2017, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
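A condensed sketch of the adversarial stage described in this abstract appears below: a target encoder, initialized from the source encoder but with untied weights, is updated to fool a domain discriminator via a standard GAN-style loss while the source encoder and classifier stay frozen. This is a schematic reconstruction from the abstract; the module names, optimizers, and label conventions are placeholders, not the released implementation.

import torch
import torch.nn as nn

# Schematic reconstruction from the abstract; interfaces below are assumed.
def adda_adversarial_step(src_enc, tgt_enc, disc, x_src, x_tgt, opt_tgt, opt_disc):
    ce = nn.CrossEntropyLoss()

    # 1) Discriminator learns to separate source features (label 0) from target features (label 1).
    with torch.no_grad():
        f_src, f_tgt = src_enc(x_src), tgt_enc(x_tgt)
    d_logits = disc(torch.cat([f_src, f_tgt], dim=0))
    d_labels = torch.cat([torch.zeros(len(f_src), dtype=torch.long),
                          torch.ones(len(f_tgt), dtype=torch.long)]).to(d_logits.device)
    opt_disc.zero_grad()
    d_loss = ce(d_logits, d_labels)
    d_loss.backward()
    opt_disc.step()

    # 2) Target encoder (untied from the frozen source encoder) is updated to fool the
    #    discriminator, i.e. to make its features look like source features (label 0).
    g_logits = disc(tgt_enc(x_tgt))
    g_labels = torch.zeros(len(g_logits), dtype=torch.long, device=g_logits.device)
    opt_tgt.zero_grad()
    g_loss = ce(g_logits, g_labels)
    g_loss.backward()
    opt_tgt.step()
    return d_loss.item(), g_loss.item()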
|
Evan Shelhamer*, Kate Rakelly*, Judy Hoffman*, Trevor Darrell. Video Semantic Segmentation Workshop at European Conference in Computer Vision, 2016 pdf | abstract | bibtex Recent years have seen tremendous progress in still-image segmentation; however the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic content of a scene evolves more slowly, and 2) execution can be viewed as an aspect of architecture, yielding purpose-fit computation schedules for networks. We define a novel family of “clockwork” convnets driven by fixed or adaptive clock signals that schedule the processing of different layers at different update rates according to their semantic stability. We design a pipeline schedule to reduce latency for real-time recognition and a fixed-rate schedule to reduce overall computation. Finally, we extend clockwork scheduling to adaptive video processing by incorporating data-driven clocks that can be tuned on unlabeled video. The accuracy and efficiency of clockwork convnets are evaluated on the YouTube-Objects, NYUD, and Cityscapes video datasets. @inproceedings{2016eccvw_clockwork, author = {Shelhamer*, Evan and Rakelly*, Kate and Hoffman*, Judy and Darrell, Trevor}, title = {Clockwork Convnets for Video Semantic Segmentation}, year = 2016, booktitle = {Video Semantic Segmentation Workshop at European Conference in Computer Vision} } |
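A toy sketch of the fixed-rate scheduling idea from this abstract is shown below: shallow layers run on every frame while a deeper stage is refreshed only every few frames and its cached output is reused in between. The stage definitions, update period, and segmentation head are placeholders, not the released network.

```python
# Illustrative fixed-rate "clockwork" schedule for per-frame segmentation.
import torch
import torch.nn as nn

shallow = nn.Conv2d(3, 16, 3, padding=1)                           # fast, every frame
deep = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())   # slow stage
head = nn.Conv2d(32, 21, 1)                                        # per-pixel scores

def segment_video(frames, period=4):
    cached_deep = None
    outputs = []
    for t, frame in enumerate(frames):
        f_shallow = shallow(frame)
        # Fixed clock: refresh the expensive deep features every `period` frames,
        # otherwise reuse the cached result from an earlier frame.
        if cached_deep is None or t % period == 0:
            cached_deep = deep(f_shallow)
        outputs.append(head(cached_deep))
    return outputs

frames = [torch.randn(1, 3, 64, 64) for _ in range(8)]   # placeholder video clip
preds = segment_video(frames)
```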
|
Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell. Workshop on Algorithmic Foundations in Robotics (WAFR), 2016 pdf | abstract | bibtex Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large easy-to-obtain source dataset (e.g. synthetic images) to a target real-world domain, without requiring expensive manual data annotation of real world data before policy search. Supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation. However, they require manual alignment of such image pairs. Fully unsupervised adaptation methods rely on minimizing the discrepancy between the feature distributions across domains. We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains. Focusing on adapting from simulation to real world data using a PR2 robot, we evaluate our approach on a manipulation task and show that by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world. @inproceedings{2016wafrTzeng, author = {Tzeng, Eric and Devin, Coline and Hoffman, Judy and Finn, Chelsea and Abbeel, Pieter and Levine, Sergey and Saenko, Kate and Darrell, Trevor}, title = {Adapting deep visuomotor representations with weak pairwise constraints}, year = 2016, booktitle = {Workshop on Algorithmic Foundations in Robotics (WAFR)} } |
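The sketch below illustrates, with assumed placeholder networks, the combination of losses this abstract describes: a supervised task loss on cheap simulated data, a distribution-alignment term between domains (a simple mean-feature distance stands in for the full discrepancy loss), and a pairwise term on weakly aligned source/target image pairs.

```python
# Sketch of task + distribution-alignment + weak pairwise-alignment losses.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU())
task_head = nn.Linear(128, 7)          # placeholder, e.g. pose prediction
task_loss_fn = nn.MSELoss()

def adaptation_loss(x_sim, y_sim, x_real, pairs_sim, pairs_real,
                    lam_dist=0.1, lam_pair=0.1):
    f_sim, f_real = encoder(x_sim), encoder(x_real)

    # Supervised task loss on the labeled simulated (source) data.
    task = task_loss_fn(task_head(f_sim), y_sim)

    # Distribution alignment: match mean features across domains
    # (a first-moment stand-in for the full discrepancy objective).
    dist = (f_sim.mean(0) - f_real.mean(0)).pow(2).sum()

    # Pairwise alignment on weakly aligned simulated/real image pairs.
    pair = (encoder(pairs_sim) - encoder(pairs_real)).pow(2).sum(1).mean()

    return task + lam_dist * dist + lam_pair * pair
```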
|
Xingchao Peng, Judy Hoffman, Stella Yu, Kate Saenko. International Conference on Image Processing (ICIP), 2016 pdf | abstract | bibtex We address the difficult problem of distinguishing fine-grained object categories in low resolution images. We propose a simple and effective deep learning approach that transfers fine-grained knowledge gained from high resolution training data to the coarse low-resolution test scenario. Such fine-to-coarse knowledge transfer has many real world applications, such as identifying objects in surveillance photos or satellite images where the image resolution at the test time is very low but plenty of high resolution photos of similar objects are available. Our extensive experiments on two standard benchmark datasets containing fine-grained car models and bird species demonstrate that our approach can effectively transfer fine-detail knowledge to coarse-detail imagery. @inproceedings{2016icipPeng, author = {Peng, Xingchao and Hoffman, Judy and Yu, Stella and Saenko, Kate}, title = {Fine-To-Coarse Knowledge Transfer For Low-Res Image Classification}, year = 2016, booktitle = {International Conference on Image Processing (ICIP)} } |
|
Saurabh Gupta, Judy Hoffman, Jitendra Malik. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2016 pdf | abstract | bibtex | code In this work we propose a technique that transfers supervision between images from different modalities. We use learned representations from a large labeled modality as a supervisory signal for training representations for a new unlabeled paired modality. Our method enables learning of rich representations for unlabeled modalities and can be used as a pre-training procedure for new modalities with limited labeled data. We show experimental results where we transfer supervision from labeled RGB images to unlabeled depth and optical flow images and demonstrate large improvements for both these cross modal supervision transfers. @inproceedings{2016cvprGupta, author = {Gupta, Saurabh and Hoffman, Judy and Malik, Jitendra}, title = {Cross Modal Distillation for Supervision Transfer}, year = 2016, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
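A minimal sketch of the supervision-transfer idea in this abstract, with assumed toy networks: a frozen model trained on the labeled modality (RGB) provides mid-level features that a network for the unlabeled paired modality (depth) learns to regress, yielding a useful initialization without depth labels.

```python
# Sketch: transfer supervision from a frozen RGB network to a depth network
# using only paired, unlabeled RGB-D images.
import torch
import torch.nn as nn

rgb_net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())    # pretrained, frozen
depth_net = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())  # to be trained
for p in rgb_net.parameters():
    p.requires_grad = False

mse = nn.MSELoss()
opt = torch.optim.SGD(depth_net.parameters(), lr=1e-2)

def distill_step(rgb, depth):
    # rgb and depth form a paired but *unlabeled* image pair.
    target = rgb_net(rgb).detach()          # supervisory signal from RGB features
    loss = mse(depth_net(depth), target)    # depth network mimics those features
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```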
|
Judy Hoffman, Saurabh Gupta, Trevor Darrell. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight Presentation) pdf | abstract | bibtex | slides We present a modality hallucination architecture for training an RGB object detection model which incorporates depth side information at training time. Our convolutional hallucination network learns a new and complementary RGB image representation which is taught to mimic convolutional mid-level features from a depth network. At test time images are processed jointly through the RGB and hallucination networks to produce improved detection performance. Thus, our method transfers information commonly extracted from depth training data to a network which can extract that information from the RGB counterpart. We present results on the standard NYUDv2 dataset and report improvement on the RGB detection task. @inproceedings{2016cvprHoffman, author = {Hoffman, Judy and Gupta, Saurabh and Darrell, Trevor}, title = {Learning with Side Information through Modality Hallucination}, year = 2016, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
|
Judy Hoffman, Saurabh Gupta, Jian Leong, Sergio Guadarrama, Trevor Darrell. International Conference in Robotics and Automation (ICRA), 2016 pdf | bibtex | slides @inproceedings{2016icraHoffman, author = {Hoffman, Judy and Gupta, Saurabh and Leong, Jian and Guadarrama, Sergio and Darrell, Trevor}, title = {Cross-Modal Adaptation for RGB-D Detection}, year = 2016, booktitle = {International Conference in Robotics and Automation (ICRA)} } |
|
Oscar Beijbom, Judy Hoffman, Evan Yao, Trevor Darrell, Alberto Rodriguez-Ramirez, Manuel González-Rivero, Ove Hoegh-Guldberg. Transfer and Multi-Task Learning: Trends and New Perspectives, Workshop at NeurIPS, 2015 pdf | bibtex @inproceedings{2015nipswBeijbom, author = {Beijbom, Oscar and Hoffman, Judy and Yao, Evan and Darrell, Trevor and Rodriguez-Ramirez, Alberto and González-Rivero, Manuel and Hoegh-Guldberg, Ove}, title = {Quantification in-the-wild: data-sets and baselines}, year = 2015, booktitle = {Transfer and Multi-Task Learning: Trends and New Perspectives, Workshop at NeurIPS} } |
|
Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell. IEEE/CVF International Conference in Computer Vision (ICCV), 2015 pdf | abstract | bibtex Large scale object detection with thousands of classes introduces the problem of many contradicting false positive detections, which have to be suppressed. Class-independent non-maximum suppression has traditionally been used for this step, but it does not scale well as the number of classes grows. Traditional non-maximum suppression does not consider label- and instance-level relationships nor does it allow an exploitation of the spatial layout of detection proposals. We propose a new multi-class spatial semantic regularisation method based on affinity propagation clustering, which simultaneously optimises across all categories and all proposed locations in the image, to improve both the localisation and categorisation of selected detection proposals. Constraints are shared across the labels through the semantic WordNet hierarchy. Our approach proves to be especially useful in large scale settings with thousands of classes, where spatial and semantic interactions are very frequent and only weakly supervised detectors can be built due to a lack of bounding box annotations. Detection experiments are conducted on the ImageNet and COCO dataset, and in settings with thousands of detected categories. Our method provides a significant precision improvement by reducing false positives, while simultaneously improving the recall. @inproceedings{2015iccvMrowca, author = {Mrowca, Damian and Rohrbach, Marcus and Hoffman, Judy and Hu, Ronghang and Saenko, Kate and Darrell, Trevor}, title = {Spatial Semantic Regularisation for Large Scale Object Detection}, year = 2015, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
|
Eric Tzeng*, Judy Hoffman*, Trevor Darrell, Kate Saenko. IEEE/CVF International Conference in Computer Vision (ICCV), 2015 pdf | abstract | bibtex | code Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias. Fine-tuning deep models in a new domain can require a significant amount of labeled data, which for many applications is simply not available. We propose a new CNN architecture to exploit unlabeled and sparsely labeled target domain data. Our approach simultaneously optimizes for domain invariance to facilitate domain transfer and uses a soft label distribution matching loss to transfer information between tasks. Our proposed adaptation method offers empirical performance which exceeds previously published results on two standard benchmark visual domain adaptation tasks, evaluated across supervised and semi-supervised adaptation settings. @inproceedings{2015iccvTzeng, author = {Tzeng*, Eric and Hoffman*, Judy and Darrell, Trevor and Saenko, Kate}, title = {Simultaneous Deep Transfer Across Domains and Tasks}, year = 2015, booktitle = {IEEE/CVF International Conference in Computer Vision (ICCV)} } |
|
Judy Hoffman, Deepak Pathak, Trevor Darrell, Kate Saenko. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2015 pdf | abstract | bibtex We develop methods for detector learning which exploit joint training over both weak and strong labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks. Previous methods for weak-label learning often learn detector models independently using latent variable optimization, but fail to share deep representation knowledge across classes and usually require strong initialization. Other previous methods transfer deep representations from domains with strong labels to those with only weak labels, but do not optimize over individual latent boxes, and thus may miss specific salient structures for a particular category. We propose a model that subsumes these previous approaches, and simultaneously trains a representation and detectors for categories with either weak or strong labels present. We provide a novel formulation of a joint multiple instance learning method that includes examples from classification-style data when available, and also performs domain transfer learning to improve the underlying detector representation. Our model outperforms known methods on ImageNet-200 detection with weak labels. @inproceedings{2015cvprHoffman, author = {Hoffman, Judy and Pathak, Deepak and Darrell, Trevor and Saenko, Kate}, title = {Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning}, year = 2015, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
|
Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko. Neural Information Processing Systems (NeurIPS), 2014 pdf | abstract | bibtex | project page | slides A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2fps for the 7.6K detector). @inproceedings{2014nipsHoffman, author = {Hoffman, Judy and Guadarrama, Sergio and Tzeng, Eric and Hu, Ronghang and Donahue, Jeff and Girshick, Ross and Darrell, Trevor and Saenko, Kate}, title = {LSDA: Large Scale Detection through Adaptation}, year = 2014, booktitle = {Neural Information Processing Systems (NeurIPS)} } |
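The numpy toy below illustrates the core transfer step sketched in this abstract: estimate the classifier-to-detector difference on categories that have bounding-box supervision, then apply it to categories that have only classification data. The shapes, the single averaged offset, and the random placeholder weights are deliberate simplifications of the layer-wise adaptation the paper actually learns.

```python
# Toy numpy sketch of turning classifiers into detectors via a learned offset.
import numpy as np

rng = np.random.default_rng(0)
d = 4096                                    # feature dimension (placeholder)
W_cls_known = rng.normal(size=(20, d))      # classifier weights, categories with boxes
W_det_known = W_cls_known + rng.normal(0.1, 0.01, size=(20, d))  # their detectors
W_cls_new = rng.normal(size=(5, d))         # categories with classification data only

# Learn a category-independent estimate of the classifier->detector difference...
delta = (W_det_known - W_cls_known).mean(axis=0)

# ...and transfer it to produce approximate detectors for the weakly labeled classes.
W_det_new = W_cls_new + delta
```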
|
Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell. International Conference in Machine Learning (ICML), 2014 pdf | abstract | bibtex | code We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms. @inproceedings{2014icmlDonahue, author = {Donahue, Jeff and Jia, Yangqing and Vinyals, Oriol and Hoffman, Judy and Zhang, Ning and Tzeng, Eric and Darrell, Trevor}, title = {DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition}, year = 2014, booktitle = {International Conference in Machine Learning (ICML)} } |
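As a rough modern analogue of the recipe in this abstract, the sketch below extracts fixed activations from a supervised pretrained CNN and trains a simple classifier on top for a new task. torchvision's ResNet-18 and scikit-learn's logistic regression stand in for the original AlexNet-style features and classifiers, purely as assumptions for illustration; the random tensors are placeholder data.

```python
# Sketch: reuse activations of a fixed pretrained CNN as generic features.
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # downloads weights
backbone.fc = torch.nn.Identity()   # expose the penultimate activations
backbone.eval()

@torch.no_grad()
def extract(images):                # images: (N, 3, 224, 224), normalized
    return backbone(images).numpy()

# Train a simple linear classifier on the fixed deep features of the new task.
X_train = extract(torch.randn(32, 3, 224, 224))     # placeholder images
y_train = torch.randint(0, 5, (32,)).numpy()        # placeholder labels
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```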
|
Judy Hoffman, Trevor Darrell, Kate Saenko. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2014 pdf | abstract | bibtex | project page | video We pose the following question: what happens when test data not only differs from training data, but differs from it in a continually evolving way? The classic domain adaptation paradigm considers the world to be separated into stationary domains with clear boundaries between them. However, in many real-world applications, examples cannot be naturally separated into discrete domains, but arise from a continuously evolving underlying process. Examples include video with gradually changing lighting and spam email with evolving spammer tactics. We formulate a novel problem of adapting to such continuous domains, and present a solution based on smoothly varying embeddings. Recent work has shown the utility of considering discrete visual domains as fixed points embedded in a manifold of lower-dimensional subspaces. Adaptation can be achieved via transforms or kernels learned between such stationary source and target subspaces. We propose a method to consider non-stationary domains, which we refer to as Continuous Manifold Adaptation (CMA). We treat each target sample as potentially being drawn from a different subspace on the domain manifold, and present a novel technique for continuous transform-based adaptation. Our approach can learn to distinguish categories using training data collected at some point in the past, and continue to update its model of the categories for some time into the future, without receiving any additional labels. Experiments on two visual datasets demonstrate the value of our approach for several popular feature representations. @inproceedings{2014cvprHoffman, author = {Hoffman, Judy and Darrell, Trevor and Saenko, Kate}, title = {Continuous Manifold Based Adaptation for Evolving Visual Domains}, year = 2014, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
|
Daniel Goehring, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell. International Conference in Robotics and Automation (ICRA), 2014 pdf | bibtex | project page @inproceedings{2014icraGoering, author = {Goehring, Daniel and Hoffman, Judy and Rodner, Erik and Saenko, Kate and Darrell, Trevor}, title = {Interactive Adaptation of Real-Time Object Detectors}, year = 2014, booktitle = {International Conference in Robotics and Automation (ICRA)} } |
|
Judy Hoffman, Erik Rodner, Jeff Donahue, Brian Kulis, Kate Saenko. International Journal in Computer Vision (IJCV), 2013 pdf | abstract | bibtex We address the problem of visual domain adaptation for transferring object models from one dataset or visual domain to another. We introduce a unified flexible model for both supervised and semi-supervised learning that allows us to learn transformations between domains. Additionally, we present two instantiations of the model, one for general feature adaptation/alignment, and one specifically designed for classification. First, we show how to extend metric learning methods for domain adaptation, allowing for learning metrics independent of the domain shift and the final classifier used. Furthermore, we go beyond classical metric learning by extending the method to asymmetric, category independent transformations. Our framework can adapt features even when the target domain does not have any labeled examples for some categories, and when the target and source features have different dimensions. Finally, we develop a joint learning framework for adaptive classifiers, which outperforms competing methods in terms of multi-class accuracy and scalability. We demonstrate the ability of our approach to adapt object recognition models under a variety of situations, such as differing imaging conditions, feature types, and codebooks. The experiments show its strong performance compared to previous approaches and its applicability to large-scale scenarios. @article{2013ijcvHoffman, author = {Hoffman, Judy and Rodner, Erik and Donahue, Jeff and Kulis, Brian and Saenko, Kate}, title = {Asymmetric and Category Invariant Feature Transformations for Domain Adaptation}, year = 2013, journal = {International Journal in Computer Vision (IJCV)} } |
|
Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell. IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2013 pdf | bibtex | poster @inproceedings{2013cvprDonahue, author = {Donahue, Jeff and Hoffman, Judy and Rodner, Erik and Saenko, Kate and Darrell, Trevor}, title = {Semi-Supervised Domain Adaptation with Instance Constraints}, year = 2013, booktitle = {IEEE/CVF Computer Vision and Pattern Recognition (CVPR)} } |
|
Judy Hoffman, Erik Rodner, Jeff Donahue, Kate Saenko, Trevor Darrell. International Conference on Learning Representations (ICLR), 2013 (Oral Presentation) pdf | abstract | bibtex | code | slides We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, and introduce an efficient cost function based on misclassification loss. Our method combines several features previously unavailable in a single algorithm: multi-class adaptation through representation learning, ability to map across heterogeneous feature spaces, and scalability to large datasets. We present experiments on several image datasets that demonstrate improved accuracy and computational advantages compared to previous approaches. @inproceedings{2013iclrHoffman, author = {Hoffman, Judy and Rodner, Erik and Donahue, Jeff and Saenko, Kate and Darrell, Trevor}, title = {Efficient Learning of Domain-invariant Image Representations}, year = 2013, booktitle = {International Conference on Learning Representations (ICLR)} } |
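A compact sketch of the joint objective this abstract describes, with assumed feature dimensions: a linear transform maps target features into the source feature space and is optimized together with a shared linear classifier under a misclassification (softmax) loss on the labeled data from both domains.

```python
# Sketch: jointly learn a target->source feature transform and a classifier.
import torch
import torch.nn as nn

d_src, d_tgt, num_classes = 128, 64, 10    # heterogeneous feature dimensions
W = nn.Linear(d_tgt, d_src, bias=False)    # target -> source feature transform
clf = nn.Linear(d_src, num_classes)        # shared linear classifier
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(list(W.parameters()) + list(clf.parameters()), lr=1e-3)

def joint_step(x_src, y_src, x_tgt, y_tgt):
    # Source examples are classified directly; target examples are first
    # mapped into the source space by W, then classified by the same model.
    loss = loss_fn(clf(x_src), y_src) + loss_fn(clf(W(x_tgt)), y_tgt)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```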
|
Judy Hoffman, Brian Kulis, Trevor Darrell, Kate Saenko. European Conference in Computer Vision (ECCV), 2012 pdf | bibtex | code | slides | poster | video @inproceedings{2012eccvHoffman, author = {Hoffman, Judy and Kulis, Brian and Darrell, Trevor and Saenko, Kate}, title = {Discovering Latent Domains For Multisource Domain Adaptation}, year = 2012, booktitle = {European Conference in Computer Vision (ECCV)} } |
|
Glen Hartmann, Matthias Grundmann, Judy Hoffman, David Tsai, Vivek Kwatra, Omid Madani, Sudheendra Vijayanarasimhan, Irfan Essa, James Rehg, Rahul Sukthankar. Workshop on Web-Scale Vision and Social Media at European Conference in Computer Vision (ECCV), 2012 (Best Paper Award) pdf | bibtex @inproceedings{2012eccvwHartmann, author = {Hartmann, Glen and Grundmann, Matthias and Hoffman, Judy and Tsai, David and Kwatra, Vivek and Madani, Omid and Vijayanarasimhan, Sudheendra and Essa, Irfan and Rehg, James and Sukthankar, Rahul}, title = {Weakly Supervised Learning of Object Segmentations from Web-Scale Video}, year = 2012, booktitle = {Workshop on Web-Scale Vision and Social Media at European Conference in Computer Vision (ECCV)} } |
|
Judy Hoffman, Kate Saenko, Brian Kulis, Trevor Darrell. Domain Adaptation Workshop at Neural Information Processing Systems (NeurIPS), 2011 (Best Student Paper Award) bibtex @inproceedings{2011nipswHoffman, author = {Hoffman, Judy and Saenko, Kate and Kulis, Brian and Darrell, Trevor}, title = {Domain Adaptation with Multiple Latent Domains}, year = 2011, booktitle = {Domain Adaptation Workshop at Neural Information Processing Systems (NeurIPS)} } |
|
Leonard Jaillet, Judy Hoffman, Jur van den Berg, Pieter Abbeel, Josep M. Porta, Ken Goldberg. International Conference on Intelligent Robotics and Systems (IROS), 2011 pdf | abstract | bibtex Existing sampling-based robot motion planning methods are often inefficient at finding trajectories for kinodynamic systems, especially in the presence of narrow passages between obstacles and uncertainty in control and sensing. To address this, we propose EG-RRT, an Environment-Guided variant of RRT designed for kinodynamic robot systems that combines elements from several prior approaches and may incorporate a cost model based on the LQG-MP framework to estimate the probability of collision under uncertainty in control and sensing. We compare the performance of EG-RRT with several prior approaches on challenging sample problems. Results suggest that EG-RRT offers significant improvements in performance. @inproceedings{2011irosJaillet, author = {Jaillet, Leonard and Hoffman, Judy and Berg, Jur van den and Abbeel, Pieter and Porta, Josep M. and Goldberg, Ken}, title = {EG-RRT: Environment-Guided Random Trees for Kinodynamic Motion Planning with Uncertainty and Obstacles}, year = 2011, booktitle = {International Conference on Intelligent Robotics and Systems (IROS)} } |
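For readers unfamiliar with the planner family this work extends, the numpy snippet below is a bare-bones 2D RRT loop with one circular obstacle; EG-RRT's environment guidance, kinodynamic extension, and LQG-MP collision-probability cost are intentionally omitted, and the workspace setup is a made-up toy example.

```python
# Bare-bones 2D RRT: sample, extend toward the sample, reject collisions.
import numpy as np

rng = np.random.default_rng(0)
obstacle, obstacle_r = np.array([0.5, 0.5]), 0.2      # one circular obstacle
start, goal = np.array([0.1, 0.1]), np.array([0.9, 0.9])
step, goal_tol = 0.05, 0.05

nodes, parents = [start], [-1]
for _ in range(5000):
    sample = goal if rng.random() < 0.1 else rng.random(2)   # goal-biased sampling
    nearest = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - sample))
    direction = sample - nodes[nearest]
    new = nodes[nearest] + step * direction / (np.linalg.norm(direction) + 1e-9)
    if np.linalg.norm(new - obstacle) < obstacle_r:           # reject collisions
        continue
    nodes.append(new); parents.append(nearest)
    if np.linalg.norm(new - goal) < goal_tol:
        break

# Backtrack parent pointers from the last node to recover the planned path.
path, i = [], len(nodes) - 1
while i != -1:
    path.append(nodes[i]); i = parents[i]
path.reverse()
```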
|
|
Georgia Tech - Spring 2024 (Instructor) |
|
Georgia Tech - Fall 2023 (Instructor) |
|
Georgia Tech - Spring 2023 (Instructor) |
|
Georgia Tech - Fall 2022 (Instructor) |
|
Georgia Tech - Spring 2022 (Instructor) |
|
Georgia Tech - Fall 2021 (Instructor) |
|
Georgia Tech - Spring 2021 (Instructor) |
|
Georgia Tech - Spring 2020 (Instructor) |
|
Georgia Tech - Fall 2019 (Instructor) |
|
UC Berkeley - Spring 2013 (Teaching Assistant) |
|
UC Berkeley - Fall 2009 (Teaching Assistant) |
|
|
My research is made possible by the generous support of the following organizations. |