Judy Hoffman
Email judy (at) gatech.edu

Assistant Professor in the School of Interactive Computing at Georgia Tech and a member of the Machine Learning Center. Research interests include computer vision, machine learning, domain adaptation, robustness, and fairness.

Prior to joining Georgia Tech, Dr. Hoffman was a Visiting Research Scientist at Facebook AI Research and a postdoctoral scholar at Stanford University and UC Berkeley. She received her PhD in EECS from UC Berkeley in 2016, where she was a member of BAIR and BDD.

Prospective Students: Read before contacting.

If you are interested in joining my group and are not currently at Georgia Tech, please apply directly to the college. Unfortunately, due to the volume of requests I receive, I may not be able to respond to individual requests from students outside Tech. If you are already a PhD student at Georgia Tech, feel free to contact me directly via email and include your resume and research interests. For GT MS or undergraduate students, we list information here when a position is available. No positions are available for Spring 2025. My group is not accepting visitors from outside Georgia Tech at this time.

Bio | CV | Google Scholar | Github | Twitter

Hoffman AI Research Lab

Fiona Ryan
PhD Student
Co-advised by Jim Rehg (2020-)

George Stoica
PhD Student
(2021-)

Simar Kareer
PhD Student
(2022-)

Pratik Ramesh
PhD Student
(2023-)

Sahil Khose
PhD Student
(2024-)

Mengqi Zhang
PhD Student
(2024-)

Bhavika Devnani
PhD Student
(2024-)

Bogi Ecsedi
BS Student
(2023-)

Ajay Bati
BS Student
(2023-)

Alumni

Vincent Cartillier
Postdoc Co-advised by Irfan Essa (2024)

Prithvijit Chattopadhyay
PhD (2019-2024)
Next NVIDIA AI Research

Daniel Bolya
PhD (2019-2024)
Next Meta FAIR

Anisha Pal
MS (2023-2024)

Sahil Khose
MS (2023-2024)
Next PhD GT

Vivek Vijaykumar
BS/MS (2021-2024)

Viraj Prabhu
PhD (2019-2023)
Next Salesforce AI Research

Jakob Bjorner
BS, Co-advised by Kartik Goyal

Bharat Goyal
BS, Spring 2023

Taylor Hearn
MS 2022-2023

Aaditya Singh
MS 2022-2023
Next AWS

Aayushi Agarwal
MS 2021-2023

Deepanshi Deepanshi
MS 2021-2023
Next Emory

Sean Foley
MS, Co-advised by James Hays

Sruthi Sudhakar
BS 2021-2022
Next PhD Student Columbia

Deeksha Kartik
MS 2021-2022
Next PathAI

Bhavika Devnani
MS 2021-2022
Next Apple AI

Kartik Sarangmath
BS / MS 2021-2022
Next Third AI

Sachit Kuhar
MS Spring 2022

Arvind Krishnakumar
MS 2020-2021
Next AWS

Shivam Khare
MS 2020-2021
Next Twitter AI

Rohit Mittapalli
BS 2020-2021
Next Startup

Fu Lin
MS Spring 2020
Next AWS Beijing

Luis Bermudez
MS Spring 2020
Next Intel

Research

My research lies at the intersection of computer vision and machine learning and focuses on tackling real-world variation and scale while minimizing human supervision. I develop learning algorithms that facilitate the transfer of information through unsupervised and semi-supervised model adaptation and generalization.

[NEW] Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors
Anisha Pal*, Julia Kruk*, Mansi Phute, Manognya Bhattaram, Diyi Yang, Duen Horng Chau, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2024
pdf | bibtex | code

@inproceedings{2024_SemiTruth, 
author = {Pal*, Anisha and Kruk*, Julia and 
	Phute, Mansi and Bhattaram, 
	Manognya and Yang, Diyi and 
	Chau, Duen Horng and Hoffman, 
	Judy},
title = {Semi-Truths: A Large-Scale Dataset 
	of AI-Augmented Images for 
	Evaluating Robustness of 
	AI-Generated Image detectors},
year = 2024,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
SkyScenes: A Synthetic Dataset for Aerial Scene Understanding
Sahil Khose*, Anisha Pal*, Aayushi Agarwal*, Deepanshi*, Judy Hoffman, Prithvijit Chattopadhyay.
European Conference on Computer Vision (ECCV), 2024
pdf | bibtex | code | project page
Press: GT Article

@inproceedings{2024_Skyscenes, 
author = {Khose*, Sahil and Pal*, Anisha and 
	Agarwal*, Aayushi and 
	{Deepanshi*} and Hoffman, Judy 
	and Chattopadhyay, Prithvijit},
title = {SkyScenes: A Synthetic Dataset for 
	Aerial Scene Understanding},
year = 2024,
booktitle = {European Conference on Computer 
	Vision (ECCV)}
}
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2024 (Oral Presentation)
pdf | bibtex

@inproceedings{2024_EgoExo, 
author = {Grauman, Kristen and Westbury, 
	Andrew and Torresani, Lorenzo 
	and Kitani, Kris and Malik, 
	Jitendra and Afouras, 
	Triantafyllos and Ashutosh, 
	Kumar and Baiyya, Vijay and 
	Bansal, Siddhant and Boote, 
	Bikram and Byrne, Eugene and 
	Chavis, Zach and Chen, Joya 
	and Cheng, Feng and Chu, 
	Fu-Jen and Crane, Sean and 
	Dasgupta, Avijit and Dong, 
	Jing and Escobar, Maria and 
	Forigua, Cristhian and 
	Gebreselasie, Abrham and 
	Haresh, Sanjay and Huang, Jing 
	and Islam, Md Mohaiminul and 
	Jain, Suyog and Khirodkar, 
	Rawal and Kukreja, Devansh and 
	Liang, Kevin J and Liu, 
	Jia-Wei and Majumder, Sagnik 
	and Mao, Yongsen and Martin, 
	Miguel and Mavroudi, Effrosyni 
	and Nagarajan, Tushar and 
	Ragusa, Francesco and 
	Ramakrishnan, Santhosh Kumar 
	and Seminara, Luigi and 
	Somayazulu, Arjun and Song, 
	Yale and Su, Shan and Xue, 
	Zihui and Zhang, Edward and 
	Zhang, Jinxu and Castillo, 
	Angela and Chen, Changan and 
	Fu, Xinzhu and Furuta, Ryosuke 
	and Gonzalez, Cristina and 
	Gupta, Prince and Hu, Jiabo 
	and Huang, Yifei and Huang, 
	Yiming and Khoo, Weslie and 
	Kumar, Anush and Kuo, Robert 
	and Lakhavani, Sach and Liu, 
	Miao and Luo, Mi and Luo, 
	Zhengyi and Meredith, Brighid 
	and Miller, Austin and 
	Oguntola, Oluwatumininu and 
	Pan, Xiaqing and Peng, Penny 
	and Pramanick, Shraman and 
	Ramazanova, Merey and Ryan, 
	Fiona and Shan, Wei and 
	Somasundaram, Kiran and Song, 
	Chenan and Southerland, Audrey 
	and Tateno, Masatoshi and 
	Wang, Huiyu and Wang, Yuchen 
	and Yagi, Takuma and Yan, 
	Mingfei and Yang, Xitong and 
	Yu, Zecheng and Zha, Shengxin 
	Cindy and Zhao, Chen and Zhao, 
	Ziwei and Zhu, Zhifan and 
	Zhuo, Jeff and Arbelaez, Pablo 
	and Bertasius, Gedas and 
	Crandall, David and Damen, 
	Dima and Engel, Jakob and 
	Farinella, Giovanni Maria and 
	Furnari, Antonino and Ghanem, 
	Bernard and Hoffman, Judy and 
	Jawahar, C. V. and Newcombe, 
	Richard and Park, Hyun Soo and 
	Rehg, James M. and Sato, 
	Yoichi and Savva, Manolis and 
	Shi, Jianbo and Shou, Mike 
	Zheng and Wray, Michael},
title = {Ego-Exo4D: Understanding Skilled 
	Human Activity from First- and 
	Third-Person Perspectives},
year = 2024,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
AUGCAL: Sim-to-Real Adaptation by Improving Uncertainty Calibration on Augmented Synthetic Images
Prithvijit Chattopadhyay, Bharat Goyal, Bogi Ecsedi, Viraj Prabhu, Judy Hoffman.
International Conference on Learning Representations (ICLR), 2024
pdf | bibtex

@inproceedings{2024_Augcal, 
author = {Chattopadhyay, Prithvijit and 
	Goyal, Bharat and Ecsedi, Bogi 
	and Prabhu, Viraj and Hoffman, 
	Judy},
title = {AUGCAL: Sim-to-Real Adaptation by 
	Improving Uncertainty 
	Calibration on Augmented 
	Synthetic Images},
year = 2024,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya, Chaitanya Ryali, Judy Hoffman, Christoph Feichtenhofer.
International Conference on Learning Representations (ICLR), 2024
pdf | bibtex

@inproceedings{2024_WindowAttention, 
author = {Bolya, Daniel and Ryali, Chaitanya 
	and Hoffman, Judy and 
	Feichtenhofer, Christoph},
title = {Window Attention is Bugged: How 
	not to Interpolate Position 
	Embeddings},
year = 2024,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
ZipIt! Merging Models from Different Tasks without Training
George Stoica, Daniel Bolya, Jakob Bjorner, Pratik Ramesh, Taylor Hearn, Judy Hoffman.
International Conference on Learning Representations (ICLR), 2024
pdf | bibtex | code

@inproceedings{2024_Zipit, 
author = {Stoica, George and Bolya, Daniel 
	and Bjorner, Jakob and Ramesh, 
	Pratik and Hearn, Taylor and 
	Hoffman, Judy},
title = {ZipIt! Merging Models from 
	Different Tasks without 
	Training},
year = 2024,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
We're Not Using Videos Effectively: An Updated Video Domain Adaptation Baseline
Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Prithvijit Chattopadhyay, Judy Hoffman, Viraj Prabhu.
Transactions on Machine Learning Research (TMLR), 2024
pdf | bibtex | code | project page

@article{2024TMLR_videoDA, 
author = {Kareer, Simar and Vijaykumar, 
	Vivek and Maheshwari, Harsh 
	and Chattopadhyay, Prithvijit 
	and Hoffman, Judy and Prabhu, 
	Viraj},
title = {We're Not Using Videos 
	Effectively: An Updated Video 
	Domain Adaptation Baseline},
year = 2024,
journal = {Transactions on Machine Learning 
	Research (TMLR)}
}
[NEW] EgoMimic | Scaling Imitation Learning through Egocentric Video
Simar Kareer, Dhruv Patel*, Ryan Punamiya*, Pranay Mathur*, Shuo Cheng, Chen Wang, Judy Hoffman*, Danfei Xu*.
Conference on Robot Learning (CoRL) Workshop, 2024
pdf | bibtex | code | project page | video

@inproceedings{2024Egomimic, 
author = {Kareer, Simar and Patel*, Dhruv 
	and Punamiya*, Ryan and 
	Mathur*, Pranay and Cheng, 
	Shuo and Wang, Chen and 
	Hoffman*, Judy and Xu*, Danfei},
title = {EgoMimic | Scaling Imitation 
	Learning through Egocentric 
	Video},
year = 2024,
booktitle = {Conference on Robot Learning 
	(CoRL) Workshop}
}
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Vision Tasks
Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Adrien Bardes, Mark Ibrahim, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein.
Neural Information Processing Systems (NeurIPS), 2023
pdf | bibtex | code

@inproceedings{2023neurips_backbone, 
author = {Goldblum, Micah and Souri, Hossein 
	and Ni, Renkun and Shu, Manli 
	and Prabhu, Viraj and 
	Somepalli, Gowthami and 
	Chattopadhyay, Prithvijit and 
	Bardes, Adrien and Ibrahim, 
	Mark and Hoffman, Judy and 
	Chellappa, Rama and Wilson, 
	Andrew Gordon and Goldstein, 
	Tom},
title = {Battle of the Backbones: A 
	Large-Scale Comparison of 
	Pretrained Models across 
	Vision Tasks},
year = 2023,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images
Viraj Prabhu, Sriram Yenamandra, Prithvijit Chattopadhyay, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2023
pdf | bibtex | code | project page
Press: GT Article

@inproceedings{2023neurips_lance, 
author = {Prabhu, Viraj and Yenamandra, 
	Sriram and Chattopadhyay, 
	Prithvijit and Hoffman, Judy},
title = {LANCE: Stress-testing Visual 
	Models by Generating 
	Language-guided Counterfactual 
	Images},
year = 2023,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
FACTS: First Amplify Correlations and Then Slice to Discover Bias
Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
pdf | bibtex | code

@inproceedings{2023ICCV_FACTS, 
author = {Yenamandra, Sriram and Ramesh, 
	Pratik and Prabhu, Viraj and 
	Hoffman, Judy},
title = {FACTS: First Amplify Correlations 
	and Then Slice to Discover 
	Bias},
year = 2023,
booktitle = {IEEE/CVF International Conference 
	on Computer Vision (ICCV)}
}
PASTA: Proportional Amplitude Spectrum Training Augmentation for Syn-to-Real Domain Generalization
Prithvijit Chattopadhyay*, Kartik Sarangmath*, Vivek Vijaykumar, Judy Hoffman.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
pdf | bibtex | code

@inproceedings{2023iccv_PASTA, 
author = {Chattopadhyay*, Prithvijit and 
	Sarangmath*, Kartik and 
	Vijaykumar, Vivek and Hoffman, 
	Judy},
title = {PASTA: Proportional Amplitude 
	Spectrum Training Augmentation 
	for Syn-to-Real Domain 
	Generalization},
year = 2023,
booktitle = {IEEE/CVF International Conference 
	on Computer Vision (ICCV)}
}
Benchmarking Low-Shot Robustness to Natural Distribution Shifts
Aaditya Singh*, Kartik Sarangmath*, Prithvijit Chattopadhyay, Judy Hoffman.
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
pdf | bibtex | code

@inproceedings{2023iccv_lowshotrobust, 
author = {Singh*, Aaditya and Sarangmath*, 
	Kartik and Chattopadhyay, 
	Prithvijit and Hoffman, Judy},
title = {Benchmarking Low-Shot Robustness 
	to Natural Distribution Shifts},
year = 2023,
booktitle = {IEEE/CVF International Conference 
	on Computer Vision (ICCV)}
}
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau.
ACM International Conference on Information and Knowledge Management (CIKM), 2023
pdf | bibtex | code

@inproceedings{2023CIKM_conceptevo, 
author = {Park, Haekyu and Lee, Seongmin and 
	Hoover, Benjamin and Wright, 
	Austin P. and Shaikh, Omar and 
	Duggal, Rahul and Das, Nilaksh 
	and Li, Kevin and Hoffman, 
	Judy and Chau, Duen Horng},
title = {Concept Evolution in Deep Learning 
	Training: A Unified 
	Interpretation Framework and 
	Discoveries},
year = 2023,
booktitle = {ACM International Conference on 
	Information and Knowledge 
	Management (CIKM)}
}
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Chaitanya Ryali*, Yuan-Ting Hu*, Daniel Bolya*, Chen Wei, Haoqi Fan, Po-Yao Huang, Vaibhav Aggarwal, Arkabandhu Chowdhury, Omid Poursaeed, Judy Hoffman, Jitendra Malik, Yanghao Li, Christoph Feichtenhofer.
International Conference on Machine Learning (ICML), 2023 (Oral Presentation)
pdf | bibtex | code

@inproceedings{2023ICML_sMVIT, 
author = {Ryali*, Chaitanya and Hu*, 
	Yuan-Ting and Bolya*, Daniel 
	and Wei, Chen and Fan, Haoqi 
	and Huang, Po-Yao and 
	Aggarwal, Vaibhav and 
	Chowdhury, Arkabandhu and 
	Poursaeed, Omid and Hoffman, 
	Judy and Malik, Jitendra and 
	Li, Yanghao and Feichtenhofer, 
	Christoph},
title = {Hiera: A Hierarchical Vision 
	Transformer without the 
	Bells-and-Whistles},
year = 2023,
booktitle = {International Conference on 
	Machine Learning (ICML)}
}
Token Merging for Fast Stable Diffusion
Daniel Bolya, Judy Hoffman.
CVPR Workshop on Efficient Deep Learning for Computer Vision, 2023 (Oral Presentation)
pdf | abstract | bibtex | code

The landscape of image generation has been forever changed by open vocabulary diffusion models. However, at their core these models use transformers, which makes generation slow. Better implementations to increase the throughput of these transformers have emerged, but they still evaluate the entire model. In this paper, we instead speed up diffusion models by exploiting natural redundancy in generated images by merging redundant tokens. After making some diffusion-specific improvements to Token Merging (ToMe), our ToMe for Stable Diffusion can reduce the number of tokens in an existing Stable Diffusion model by up to 60% while still producing high quality images without any extra training. In the process, we speed up image generation by up to 2x and reduce memory consumption by up to 5.6x. Furthermore, this speed-up stacks with efficient implementations such as xFormers, minimally impacting quality while being up to 5.4x faster for large images.

@inproceedings{2023_cvprw_tomesd, 
author = {Bolya, Daniel and Hoffman, Judy},
title = {Token Merging for Fast Stable 
	Diffusion},
year = 2023,
booktitle = {CVPR Workshop on Efficient Deep 
	Learning for Computer Vision}
}
ICON2: Reliably Benchmarking Predictive Inequity in Object Detection
Sruthi Sudhakar, Viraj Uday Prabhu, Olga Russakovsky, Judy Hoffman.
CVPR Workshop on Secure and Safe Autonomous Driving (SSAD), 2023
pdf | bibtex

@inproceedings{2023_cvprw_predictiveConfounders, 
author = {Sudhakar, Sruthi and Prabhu, Viraj 
	Uday and Russakovsky, Olga and 
	Hoffman, Judy},
title = {ICON2: Reliably Benchmarking 
	Predictive Inequity in Object 
	Detection},
year = 2023,
booktitle = {CVPR Workshop on Secure and Safe 
	Autonomous Driving (SSAD)}
}
Signed Binary Weight Networks
Sachit Kuhar, Alexey Tumanov, Judy Hoffman.
3rd On-Device Intelligence Workshop at MLSys, 2023
bibtex

@inproceedings{2023_signedbinary, 
author = {Kuhar, Sachit and Tumanov, Alexey 
	and Hoffman, Judy},
title = {Signed Binary Weight Networks},
year = 2023,
booktitle = {3rd On-Device Intelligence 
	Workshop at MLSys}
}
Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances
Arun Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil Katyal, Dinesh Manocha, Celso de Melo, Rama Chellappa.
International Conference on Robotics and Automation (ICRA), 2023
pdf | bibtex | code

@inproceedings{2023ICRA_Synth, 
author = {Reddy, Arun and Shah, Ketul and 
	Paul, William and Mocharla, 
	Rohita and Hoffman, Judy and 
	Katyal, Kapil and Manocha, 
	Dinesh and de Melo, Celso and 
	Chellappa, Rama},
title = {Synthetic-to-Real Domain 
	Adaptation for Action 
	Recognition: A Dataset and 
	Baseline Performances},
year = 2023,
booktitle = {International Conference on 
	Robotics and Automation (ICRA)}
}
Token Merging: Your ViT But Faster
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman.
International Conference on Learning Representations (ICLR), 2023 (Notable Top 5%)
pdf | bibtex | code

@inproceedings{2023ICLR_TokenMerging, 
author = {Bolya, Daniel and Fu, Cheng-Yang 
	and Dai, Xiaoliang and Zhang, 
	Peizhao and Feichtenhofer, 
	Christoph and Hoffman, Judy},
title = {Token Merging: Your ViT But Faster},
year = 2023,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting
Viraj Prabhu, David Acuna, Yuan-Hong Liao, Rafid Mahmood, Marc T. Law, Judy Hoffman, Sanja Fidler, James Lucas.
Transactions on Machine Learning Research (TMLR), 2023
pdf | bibtex

@article{2023TMLR_CARE, 
author = {Prabhu, Viraj and Acuna, David and 
	Liao, Yuan-Hong and Mahmood, 
	Rafid and Law, Marc T. and 
	Hoffman, Judy and Fidler, 
	Sanja and Lucas, James},
title = {Bridging the Sim2Real gap with 
	CARE: Supervised Detection 
	Adaptation with Conditional 
	Alignment and Reweighting},
year = 2023,
journal = {Transactions on Machine Learning 
	Research (TMLR)}
}
Structure-Encoding Auxiliary Tasks for Improved Visual Representation in Vision-and-Language Navigation
Chia-Wen Kuo, Chih-Yao Ma, Judy Hoffman, Zsolt Kira.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023
pdf | bibtex

@inproceedings{2023wacv_vislangnav, 
author = {Kuo, Chia-Wen and Ma, Chih-Yao and 
	Hoffman, Judy and Kira, Zsolt},
title = {Structure-Encoding Auxiliary Tasks 
	for Improved Visual 
	Representation in 
	Vision-and-Language Navigation},
year = 2023,
booktitle = {IEEE/CVF Winter Conference on 
	Applications of Computer 
	Vision (WACV)}
}
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
Viraj Prabhu*, Sriram Yenamandra*, Aaditya Singh, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2022
pdf | bibtex | code
Press: GT CoC

@inproceedings{2022NeurIPS_pacmac, 
author = {Prabhu*, Viraj and Yenamandra*, 
	Sriram and Singh, Aaditya and 
	Hoffman, Judy},
title = {Adapting Self-Supervised Vision 
	Transformers by Probing 
	Attention-Conditioned Masking 
	Consistency},
year = 2022,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings
Arjun Majumdar*, Gunjan Aggarwal*, Bhavika Devnani, Judy Hoffman, Dhruv Batra.
Neural Information Processing Systems (NeurIPS), 2022
pdf | bibtex

@inproceedings{2023NeurIPS_zeroNav, 
author = {Majumdar*, Arjun and Aggarwal*, 
	Gunjan and Devnani, Bhavika 
	and Hoffman, Judy and Batra, 
	Dhruv},
title = {ZSON: Zero-Shot Object-Goal 
	Navigation using Multimodal 
	Goal Embeddings},
year = 2022,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Bi-Directional Self-Attention for Vision Transformers
George Stoica, Taylor Hearn, Bhavika Devnani, Judy Hoffman.
NeurIPS Workshop on Vision Transformers Theory and Applications, 2022 (Best Paper Award)
pdf | bibtex

@inproceedings{2022NeurIPSW_BiSA, 
author = {Stoica, George and Hearn, Taylor 
	and Devnani, Bhavika and 
	Hoffman, Judy},
title = {Bi-Directional Self-Attention for 
	Vision Transformers},
year = 2022,
booktitle = {NeurIPS Workshop on Vision 
	Transformers Theory and 
	Applications}
}
AUGCO: Augmentation Consistency-guided Self-training for Source-free Domain Adaptive Semantic Segmentation
Viraj Prabhu*, Shivam Khare*, Deeksha Kartik, Judy Hoffman.
Workshop on Computer Vision in the Wild, ECCV, 2022
pdf | bibtex

@inproceedings{2022arXiv_Zaugco, 
author = {Prabhu*, Viraj and Khare*, Shivam 
	and Kartik, Deeksha and 
	Hoffman, Judy},
title = {AUGCO: Augmentation 
	Consistency-guided 
	Self-training for Source-free 
	Domain Adaptive Semantic 
	Segmentation},
year = 2022,
booktitle = {Workshop on Computer Vision in the 
	Wild, ECCV}
}
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman.
ECCV Workshop on Computational Aspects of Deep Learning, 2022 (Best Paper Award)
pdf | bibtex

@inproceedings{2022ECCVW_HydraAttention, 
author = {Bolya, Daniel and Fu, Cheng-Yang 
	and Dai, Xiaoliang and Zhang, 
	Peizhao and Hoffman, Judy},
title = {Hydra Attention: Efficient 
	Attention with Many Heads},
year = 2022,
booktitle = {ECCV Workshop on Computational 
	Aspects of Deep Learning}
}
Can domain adaptation make object recognition work for everyone?
Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik.
Computer Vision and Pattern Recognition (CVPR) L3D Workshop, 2022
pdf | abstract | bibtex

Despite the rapid progress in deep visual recognition, modern computer vision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from existing datasets to study the Geographical DA problem, and discover new challenges beyond data distribution shift: context shift, wherein object surroundings may change significantly across geographies, and subpopulation shift, wherein the intra-category distributions may shift. We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions to address the challenge of making object recognition work for everyone.

@inproceedings{2022CVPR_GeoDA, 
author = {Prabhu, Viraj and Selvaraju, 
	Ramprasaath R. and Hoffman, 
	Judy and Naik, Nikhil},
title = {Can domain adaptation make object 
	recognition work for everyone?},
year = 2022,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR) L3D 
	Workshop}
}
VISCUIT: Visual Auditor for Bias in CNN Image Classifier
Seongmin Lee, Zijie J. Wang, Judy Hoffman, Duen Horng (Polo) Chau.
Computer Vision and Pattern Recognition (CVPR) Demo Track, 2022
pdf | bibtex | project page

@inproceedings{2022CVPR_VisCUIT, 
author = {Lee, Seongmin and Wang, Zijie J. 
	and Hoffman, Judy and Chau, 
	Duen Horng (Polo)},
title = {VISCUIT: Visual Auditor for Bias 
	in CNN Image Classifier},
year = 2022,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR) Demo Track}
}
Scalable Diverse Model Selection for Accessible Transfer Learning
Daniel Bolya*, Rohit Mittapalli*, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2021
pdf | abstract | bibtex | code | project page | video

With the preponderance of pretrained deep learning models available off-the-shelf from model banks today, finding the best weights to fine-tune to your use-case can be a daunting task. Several methods have recently been proposed to find good models for transfer learning, but they either don't scale well to large model banks or don't perform well on the diversity of off-the-shelf models. Ideally the question we want to answer is, given some data and a source model, can you quickly predict the model's accuracy after fine-tuning? We formalize this setting as Scalable Diverse Model Selection and propose several benchmarks for evaluating on this task. We find that existing model selection and transferability estimation methods perform poorly here and analyze why this is the case. We then introduce simple techniques to improve the performance and speed of these algorithms. Finally, we iterate on existing methods to create PARC, which outperforms all other methods on diverse model selection. We intend to release the benchmarks and method code in hope to inspire future work in model selection for accessible transfer learning.

@inproceedings{2021NeurIPSModelFinder, 
author = {Bolya*, Daniel and Mittapalli*, 
	Rohit and Hoffman, Judy},
title = {Scalable Diverse Model Selection 
	for Accessible Transfer 
	Learning},
year = 2021,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Mitigating Bias in Visual Transformers via Targeted Alignment
Sruthi Sudhakar, Viraj Prabhu, Arvind Krishnakumar, Judy Hoffman.
British Machine Vision Conference (BMVC), 2021
pdf | abstract | bibtex | project page

As transformer architectures become increasingly prevalent in computer vision, it is critical to understand their fairness implications. We perform the first study of the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work. We visualize the feature space of the transformer self-attention modules and discover that a significant portion of the bias is encoded in the query matrix. With this knowledge, we propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features. We measure performance using Balanced Accuracy and Standard Accuracy, and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT consistently leads to improved fairness over prior work on multiple attribute prediction tasks on the CelebA dataset, without compromising performance.

@inproceedings{2021BMVC_TADET, 
author = {Sudhakar, Sruthi and Prabhu, Viraj 
	and Krishnakumar, Arvind and 
	Hoffman, Judy},
title = {Mitigating Bias in Visual 
	Transformers via Targeted 
	Alignment},
year = 2021,
booktitle = {British Machine Vision Conference 
	(BMVC)}
}
UDIS: Unsupervised Discovery of Bias in Deep Visual Recognition Models
Arvind Krishnakumar, Viraj Prabhu, Sruthi Sudhakar, Judy Hoffman.
British Machine Vision Conference (BMVC), 2021
pdf | abstract | bibtex | code | project page

Deep learning models have been shown to learn spurious correlations from data that sometimes lead to systematic failures for certain subpopulations. Prior work has typically diagnosed this by crowdsourcing annotations for various protected attributes and measuring performance, which is both expensive to acquire and difficult to scale. In this work, we propose UDIS, an unsupervised algorithm for surfacing and analyzing such failure modes. UDIS identifies subpopulations via hierarchical clustering of dataset embeddings and surfaces systematic failure modes by visualizing low performing clusters along with their gradient-weighted class-activation maps. We show the effectiveness of UDIS in identifying failure modes in models trained for image classification on the CelebA and MSCOCO datasets. UDIS is available at https://github.com/akrishna77/bias-discovery.

@inproceedings{2021BMVC_UDIS, 
author = {Krishnakumar, Arvind and Prabhu, 
	Viraj and Sudhakar, Sruthi and 
	Hoffman, Judy},
title = {UDIS: Unsupervised Discovery of 
	Bias in Deep Visual 
	Recognition Models},
year = 2021,
booktitle = {British Machine Vision Conference 
	(BMVC)}
}
Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation
Viraj Prabhu, Shivam Khare, Deeksha Kartik, Judy Hoffman.
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
pdf | abstract | bibtex | code | project page

Many existing approaches for unsupervised domain adaptation (UDA) focus on adapting under only data distribution shift and offer limited success under additional cross-domain label distribution shift. Recent work based on self-training using target pseudo-labels has shown promise, but on challenging shifts pseudo-labels may be highly unreliable, and using them for self-training may cause error accumulation and domain misalignment. We propose Selective Entropy Optimization via Committee Consistency (SENTRY), a UDA algorithm that judges the reliability of a target instance based on its predictive consistency under a committee of random image transformations. Our algorithm then selectively minimizes predictive entropy to increase confidence on highly consistent target instances, while maximizing predictive entropy to reduce confidence on highly inconsistent ones. In combination with pseudo-label based approximate target class balancing, our approach leads to significant improvements over the state-of-the-art on 27/31 domain shifts from standard UDA benchmarks as well as benchmarks designed to stress-test adaptation under label distribution shift.

@inproceedings{2021arXivSENTRY, 
author = {Prabhu, Viraj and Khare, Shivam 
	and Kartik, Deeksha and 
	Hoffman, Judy},
title = {Selective Entropy Optimization via 
	Committee Consistency for 
	Unsupervised Domain Adaptation},
year = 2021,
booktitle = {IEEE/CVF International Conference 
	on Computer Vision (ICCV)}
}
RobustNav: Towards Benchmarking Robustness in Embodied Navigation
Prithvijit Chattopadhyay, Judy Hoffman, Roozbeh Mottaghi, Ani Kembhavi.
IEEE/CVF International Conference on Computer Vision (ICCV), 2021 (Oral Presentation)
pdf | abstract | bibtex | code | project page

As an attempt towards assessing the robustness of embodied navigation agents, we propose RobustNav, a framework to quantify the performance of embodied navigation agents when exposed to a wide variety of visual – affecting RGB inputs – and dynamics – affecting transition dynamics – corruptions. Most recent efforts in visual navigation have typically focused on generalizing to novel target environments with similar appearance and dynamics characteristics. With RobustNav, we find that some standard embodied navigation agents significantly underperform (or fail) in the presence of visual or dynamics corruptions. We systematically analyze the kind of idiosyncrasies that emerge in the behavior of such agents when operating under corruptions. Finally, for visual corruptions in RobustNav, we show that while standard techniques to improve robustness such as data-augmentation and self-supervised adaptation offer some zero-shot resistance and improvements in navigation performance, there is still a long way to go in terms of recovering lost performance relative to clean “non-corrupt” settings, warranting more research in this direction.

@inproceedings{2021RobustNav, 
author = {Chattopadhyay, Prithvijit and 
	Hoffman, Judy and Mottaghi, 
	Roozbeh and Kembhavi, Ani},
title = {RobustNav: Towards Benchmarking 
	Robustness in Embodied 
	Navigation},
year = 2021,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
Viraj Prabhu, Arjun Chandrasekaran, Kate Saenko, Judy Hoffman.
IEEE/CVF International Conference in Computer Vision (ICCV), 2021
pdf | abstract | bibtex | project page | video

Generalizing deep neural networks to new target domains is critical to their real-world utility. In practice, it may be feasible to get some target data labeled, but to be cost-effective it is desirable to select a maximally-informative subset via active learning (AL). We study the problem of AL under a domain shift, called Active Domain Adaptation (Active DA). We empirically demonstrate how existing AL approaches based solely on model uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), i) identifies target instances for labeling that are both uncertain under the model and diverse in feature space, and ii) leverages the available source and target data for adaptation by optimizing a semi-supervised adversarial entropy loss that is complementary to our active sampling objective. On standard image classification-based domain adaptation benchmarks, ADA-CLUE consistently outperforms competing active adaptation, active learning, and domain adaptation methods across domain shifts of varying severity.

@inproceedings{2021CLUE, 
author = {Prabhu, Viraj and Chandrasekaran, 
	Arjun and Saenko, Kate and 
	Hoffman, Judy},
title = {Active Domain Adaptation via 
	Clustering 
	Uncertainty-weighted 
	Embeddings},
year = 2021,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Temporal Action Detection with Multi-level Supervision
Baifeng Shi, Qi Dai, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu.
IEEE/CVF International Conference in Computer Vision (ICCV), 2021
abstract | bibtex

Training temporal action detection in videos requires large amounts of labeled data, yet such annotation is expensive to collect. Incorporating unlabeled or weakly-labeled data to train action detection models could help reduce annotation cost. In this work, we first introduce the Semi-supervised Action Detection (SSAD) task with a mixture of labeled and unlabeled data and analyze different types of errors in the proposed SSAD baselines which are directly adapted from the semi-supervised classification literature. Identifying that the main source of error is action incompleteness (i.e., missing parts of actions), we alleviate it by designing an unsupervised foreground attention (UFA) module utilizing the conditional independence between foreground and background motion. Then we incorporate weakly-labeled data into SSAD and propose Omni-supervised Action Detection (OSAD) with three levels of supervision. To overcome the accompanying action-context confusion problem in OSAD baselines, an information bottleneck (IB) is designed to suppress the scene information in non-action frames while preserving the action information. We extensively benchmark against the baselines for SSAD and OSAD on our created data splits in THUMOS14 and ActivityNet1.2, and demonstrate the effectiveness of the proposed UFA and IB methods. Lastly, the benefit of our full OSAD-IB model under limited annotation budgets is shown by exploring the optimal annotation strategy for labeled, unlabeled and weakly-labeled data.

@inproceedings{2021temporalIccv, 
author = {Shi, Baifeng and Dai, Qi and 
	Hoffman, Judy and Saenko, Kate 
	and Darrell, Trevor and Xu, 
	Huijuan},
title = {Temporal Action Detection with 
	Multi-level Supervision},
year = 2021,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Representation Learning Through Latent Canonicalizations
Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman.
IEEE/CVF Winter Conference on Applications in Computer Vision (WACV), 2021
pdf | abstract | bibtex

We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision. Many prior approaches to this problem have focused on learning disentangled representations so that as individual factors vary in a new domain, only a portion of the representation need be updated. In this work, we seek the generalization power of disentangled representations, but relax the requirement of explicit latent disentanglement and instead encourage linearity of individual factors of variation by requiring them to be manipulable by learned linear transformations. We dub these transformations latent canonicalizers, as they aim to modify the value of a factor to a pre-determined (but arbitrary) canonical value (e.g., recoloring the image foreground to black). Assuming a source domain with access to meta-labels specifying the factors of variation within an image, we demonstrate experimentally that our method helps reduce the number of observations needed to generalize to a similar target domain when compared to a number of supervised baselines.

@inproceedings{2020WacvLatentCanon, 
author = {Litany, Or and Morcos, Ari and 
	Sridhar, Srinath and Guibas, 
	Leonidas and Hoffman, Judy},
title = {Representation Learning Through 
	Latent Canonicalizations},
year = 2021,
booktitle = {IEEE/CVF Winter Conference on 
	Applications in Computer 
	Vision (WACV)}
}
sym Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Yogesh Balaji, Tom Goldstein, Judy Hoffman.
arXiv, 2020
pdf | abstract | bibtex | code

Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to the unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with a uniform perturbation radius around every training sample. Samples close to the decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produces poor decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training -- a technique that enforces sample-specific perturbation margins around every training sample. We show that using our approach, test accuracy on unperturbed samples improves with a marginal drop in robustness. Extensive experiments on CIFAR-10, CIFAR-100 and ImageNet datasets demonstrate the effectiveness of our proposed approach.

@inproceedings{2020arXivInstanceAdaptive, 
author = {Balaji, Yogesh and Goldstein, Tom 
	and Hoffman, Judy},
title = {Instance adaptive adversarial 
	training: Improved accuracy 
	tradeoffs in neural nets},
year = 2020,
booktitle = {arXiv}
}
sym Multiple-Source Adaptation Theory and Algorithms
Ningshan Zhang, Mehryar Mohri, Judy Hoffman.
Annals of Mathematics and Artificial Intelligence, 2020
pdf | bibtex

@article{2020amai, 
author = {Zhang, Ningshan and Mohri, Mehryar 
	and Hoffman, Judy},
title = {Multiple-Source Adaptation Theory 
	and Algorithms},
year = 2020,
journal = {Annals of Mathematics and 
	Artificial Intelligence}
}
sym Auxiliary Task Reweighting for Minimum-data Learning
Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu.
Neural Information Processing Systems (NeurIPS), 2020
abstract | bibtex | code | project page | video

Supervised learning requires a large amount of training data, limiting its application where labeled data is scarce. To compensate for data scarcity, one possible method is to utilize auxiliary tasks to provide additional supervision for the main task. Assigning and optimizing the importance weights for different auxiliary tasks remains a crucial and largely understudied research question. In this work, we propose a method to automatically reweight auxiliary tasks in order to reduce the data requirement on the main task. Specifically, we formulate the weighted likelihood function of auxiliary tasks as a surrogate prior for the main task. By adjusting the auxiliary task weights to minimize the divergence between the surrogate prior and the true prior of the main task, we obtain a more accurate prior estimation, achieving the goal of minimizing the required amount of training data for the main task and avoiding a costly grid search. In multiple experimental settings (e.g. semi-supervised learning, multi-label classification), we demonstrate that our algorithm can effectively utilize limited labeled data of the main task with the benefit of auxiliary tasks compared with previous task reweighting methods. We also show that under extreme cases with only a few extra examples (e.g. few-shot domain adaptation), our algorithm results in significant improvement over the baseline.

@inproceedings{2020NeurIPSAux, 
author = {Shi, Baifeng and Hoffman, Judy and 
	Saenko, Kate and Darrell, 
	Trevor and Xu, Huijuan},
title = {Auxiliary Task Reweighting for 
	Minimum-data Learning},
year = 2020,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
sym Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents
Samyak Datta, Oleksandr Maksymets, Judy Hoffman, Stefan Lee, Dhruv Batra, Devi Parikh.
Conference on Robot Learning (CoRL), 2020
abstract | bibtex

Recent work has presented embodied agents that can navigate to point-goal targets in novel indoor environments with near-perfect accuracy. However, these agents are equipped with idealized sensors for localization and take deterministic actions. This setting is practically sterile by comparison to the dirty reality of noisy sensors and actuations in the real world -- wheels can slip, motion sensors have error, actuations can rebound. In this work, we take a step towards this noisy reality, developing point-goal navigation agents that rely on visual estimates of egomotion under noisy action dynamics. We find these agents outperform naive adaptions of current point-goal agents to this setting as well as those incorporating classic localization baselines. Further, our model conceptually divides learning agent dynamics or odometry (where am I?) from task-specific navigation policy (where do I want to go?). This enables a seamless adaption to changing dynamics (a different robot or floor type) by simply re-calibrating the visual odometry model -- circumventing the expense of re-training of the navigation policy. Our agent was the runner-up in the PointNav track of CVPR 2020 Habitat Challenge.
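To make the role of egomotion estimates concrete, here is a minimal sketch of how a point-goal could be tracked from per-step egomotion. The abstract does not say how estimates are integrated, so the frame convention (x forward, y left), the function name, and the update rule are assumptions; in the paper the egomotion would come from a learned visual odometry model rather than being given.

```python
import math

def update_goal(goal, egomotion):
    """Re-express a goal in the agent's new frame after one step.

    goal: (x, y) of the goal in the agent's frame before the step
          (x forward, y left).
    egomotion: (dx, dy, dtheta), the estimated translation expressed in the
               old frame and the counterclockwise rotation in radians.
    """
    gx, gy = goal
    dx, dy, dtheta = egomotion
    # Shift the origin to the agent's new position, then undo its rotation.
    tx, ty = gx - dx, gy - dy
    cos_t, sin_t = math.cos(dtheta), math.sin(dtheta)
    return (tx * cos_t + ty * sin_t, -tx * sin_t + ty * cos_t)

goal = (2.0, 0.0)                                   # goal is 2 units ahead
goal = update_goal(goal, (1.0, 0.0, 0.0))           # step forward by 1
goal = update_goal(goal, (0.0, 0.0, math.pi / 2))   # turn left by 90 degrees
```

After these two steps the goal sits one unit to the agent's right. Because the navigation policy only consumes the updated goal, swapping in a re-calibrated odometry model would not require touching the policy in this sketch.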

@inproceedings{2020CorlEgo, 
author = {Datta, Samyak and Maksymets, 
	Oleksandr and Hoffman, Judy 
	and Lee, Stefan and Batra, 
	Dhruv and Parikh, Devi},
title = {Integrating Egocentric 
	Localization for More 
	Realistic Point-Goal 
	Navigation Agents},
year = 2020,
booktitle = {Conference on Robot Learning 
	(CoRL)}
}
sym Masked Reconstruction based Self-Supervision for Human Activity Recognition
Harish Haresamudram, Apoorva Beedu, Varun Agrawal, Patrick L Grady, Irfan Essa, Judy Hoffman, Thomas Ploetz.
International Symposium on Wearable Computers (ISWC), 2020
pdf | bibtex

@inproceedings{2020ISWC, 
author = {Haresamudram, Harish and Beedu, 
	Apoorva and Agrawal, Varun and 
	Grady, Patrick L and Essa, 
	Irfan and Hoffman, Judy and 
	Ploetz, Thomas},
title = {Masked Reconstruction based 
	Self-Supervision for Human 
	Activity Recognition},
year = 2020,
booktitle = {International Symposium on 
	Wearable Computers (ISWC)}
}
sym Learning to Balance Specificity and Invariance for In and Out of Domain Generalization
Prithvijit Chattopadhyay, Yogesh Balaji, Judy Hoffman.
European Conference in Computer Vision (ECCV), 2020
pdf | abstract | bibtex | code | video

We introduce Domain-specific Masks for Generalization, a model for improving both in-domain and out-of-domain generalization performance. For domain generalization, the goal is to learn from a set of source domains to produce a single model that will best generalize to an unseen target domain. As such, many prior approaches focus on learning representations which persist across all source domains with the assumption that these domain agnostic representations will generalize well. However, often individual domains contain characteristics which are unique and when leveraged can significantly aid in-domain recognition performance. To produce a model which best generalizes to both seen and unseen domains, we propose learning domain specific masks. The masks are encouraged to learn a balance of domain-invariant and domain-specific features, thus enabling a model which can benefit from the predictive power of specialized features while retaining the universal applicability of domain-invariant features. We demonstrate competitive performance compared to naive baselines and state-of-the-art methods on both PACS and DomainNet.

@inproceedings{2020EccvDMG, 
author = {Chattopadhyay, Prithvijit and 
	Balaji, Yogesh and Hoffman, 
	Judy},
title = {Learning to Balance Specificity 
	and Invariance for In and Out 
	of Domain Generalization},
year = 2020,
booktitle = {European Conference in Computer 
	Vision (ECCV)}
}
sym TIDE: A General Toolbox for Identifying Object Detection Errors
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman.
European Conference in Computer Vision (ECCV), 2020 (Spotlight Presentation)
pdf | abstract | bibtex | code | project page | video

We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms. Importantly, our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system. Thus, our framework can be used as a drop-in replacement for the standard mAP computation while providing a comprehensive analysis of each model’s strengths and weaknesses. We segment errors into six types and, crucially, are the first to introduce a technique for measuring the contribution of each error in a way that isolates its effect on overall performance. We show that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models.

@inproceedings{2020EccvTIDE, 
author = {Bolya, Daniel and Foley, Sean and 
	Hays, James and Hoffman, Judy},
title = {TIDE: A General Toolbox for 
	Identifying Object Detection 
	Errors},
year = 2020,
booktitle = {European Conference in Computer 
	Vision (ECCV)}
}
sym Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses
Fu Lin, Rohit Mittapali, Prithvijit Chattopadhyay, Daniel Bolya, Judy Hoffman.
Adversarial Robustness in the Real World (AROW), ECCV, 2020 (Best paper runner up)
pdf | abstract | bibtex

Convolutional Neural Networks (CNNs) have been shown to be vulnerable to adversarial examples, which are known to locate in subspaces close to where normal data lies but are not naturally occurring and have low probability. In this work, we investigate the potential effect defense techniques have on the geometry of the likelihood landscape - likelihood of the input images under the trained model. We first propose a way to visualize the likelihood landscape by leveraging an energy-based model interpretation of discriminative classifiers. Then we introduce a measure to quantify the flatness of the likelihood landscape. We observe that a subset of adversarial defense techniques results in a similar effect of flattening the likelihood landscape. We further explore directly regularizing towards a flat landscape for adversarial robustness.

@inproceedings{2020EccvWLikelihood, 
author = {Lin, Fu and Mittapali, Rohit and 
	Chattopadhyay, Prithvijit and 
	Bolya, Daniel and Hoffman, 
	Judy},
title = {Likelihood Landscapes: A Unifying 
	Principle Behind Many 
	Adversarial Defenses},
year = 2020,
booktitle = {Adversarial Robustness in the Real 
	World (AROW), ECCV}
}
sym SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation
Daniel Gordon, Abhishek Kadian, Devi Parikh, Judy Hoffman, Dhruv Batra.
IEEE/CVF International Conference in Computer Vision (ICCV), 2019
pdf | abstract | bibtex | code

We propose SplitNet, a method for decoupling visual perception and policy learning. By incorporating auxiliary tasks and selective learning of portions of the model, we explicitly decompose the learning objectives for visual navigation into perceiving the world and acting on that perception. We show dramatic improvements over baseline models on transferring between simulators, an encouraging step towards Sim2Real. Additionally, SplitNet generalizes better to unseen environments from the same simulator and transfers faster and more effectively to novel embodied navigation tasks. Further, given only a small sample from a target domain, SplitNet can match the performance of traditional end-to-end pipelines which receive the entire dataset.

@inproceedings{2019iccvsplitnet, 
author = {Gordon, Daniel and Kadian, 
	Abhishek and Parikh, Devi and 
	Hoffman, Judy and Batra, Dhruv},
title = {SplitNet: Sim2Sim and Task2Task 
	Transfer for Embodied Visual 
	Navigation},
year = 2019,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Robust Learning with Jacobian Regularization
Judy Hoffman, Daniel A. Roberts, Sho Yaida.
Conference on the Mathematical Theory of Deep Learning (DeepMath), 2019
pdf | abstract | bibtex | code

Design of reliable systems must guarantee stability against input perturbations. In machine learning, such a guarantee entails preventing overfitting and ensuring robustness of models against corruption of input data. In order to maximize stability, we analyze and develop a computationally efficient implementation of Jacobian regularization that increases classification margins of neural networks. The stabilizing effect of the Jacobian regularizer leads to significant improvements in robustness, as measured against both random and adversarial input perturbations, without severely degrading generalization properties on clean data.
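One way to make a Jacobian penalty cheap is to estimate the squared Frobenius norm of the input-output Jacobian from a few random projections instead of computing every row. The sketch below illustrates that identity on an explicit matrix; it is an assumption that this mirrors the paper's implementation, and in a real network the product of v with J would come from a backward pass rather than a stored Jacobian.

```python
import math
import random

def frobenius_sq(J):
    """Exact squared Frobenius norm of a matrix given as a list of rows."""
    return sum(v * v for row in J for v in row)

def projected_estimate(J, n_proj, rng):
    """Estimate ||J||_F^2 as C * mean ||v^T J||^2, v uniform on the unit sphere in R^C."""
    C = len(J)
    n_inputs = len(J[0])
    total = 0.0
    for _ in range(n_proj):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(C)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
        vJ = [sum(v[c] * J[c][i] for c in range(C)) for i in range(n_inputs)]
        total += sum(a * a for a in vJ)
    return C * total / n_proj

J = [[1.0, 2.0], [3.0, 4.0]]   # toy Jacobian: 2 outputs, 2 inputs
exact = frobenius_sq(J)
estimate = projected_estimate(J, n_proj=5000, rng=random.Random(0))
```

In training, the estimate (usually with very few projections per batch) would be scaled by a hyperparameter and added to the supervised loss.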

@inproceedings{2019DeepMathJacobian, 
author = {Hoffman, Judy and Roberts, Daniel 
	A. and Yaida, Sho},
title = {Robust Learning with Jacobian 
	Regularization},
year = 2019,
booktitle = {Conference on the Mathematical 
	Theory of Deep Learning 
	(DeepMath)}
}
sym Predictive Inequity in Object Detection
Benjamin Wilson, Judy Hoffman, Jamie Morgenstern.
Workshop on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR, 2019
pdf | abstract | bibtex | code
Press: Vox | Business Insider | The Guardian | NBC News

In this work, we investigate whether state-of-the-art object detection systems have equitable predictive performance on pedestrians with different skin tones. This work is motivated by many recent examples of ML and vision systems displaying higher error rates for certain demographic groups than others. We annotate an existing large scale dataset which contains pedestrians, BDD100K, with Fitzpatrick skin tones in ranges [1-3] or [4-6]. We then provide an in-depth comparative analysis of performance between these two skin tone groupings, finding that neither time of day nor occlusion explains this behavior, suggesting this disparity is not merely the result of pedestrians in the 4-6 range appearing in more difficult scenes for detection. We investigate to what extent time of day, occlusion, and reweighting the supervised loss during training affect this predictive bias.

@inproceedings{2019FateCV, 
author = {Wilson, Benjamin and Hoffman, Judy 
	and Morgenstern, Jamie},
title = {Predictive Inequity in Object 
	Detection},
year = 2019,
booktitle = {Workshop on Fairness 
	Accountability Transparency 
	and Ethics in Computer Vision 
	at CVPR}
}
sym Algorithms and Theory for Multiple-Source Adaptation
Judy Hoffman, Mehryar Mohri, Ningshan Zhang.
Neural Information Processing Systems (NeurIPS), 2018
pdf | abstract | bibtex

We present a number of novel contributions to the multiple-source adaptation problem. We derive new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits.
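For readers unfamiliar with the term, a distribution-weighted combination mixes per-source predictors using each source's density at the query point. The sketch below shows the basic unnormalized form as commonly stated in the multiple-source adaptation literature; the abstract gives no formula, so the exact form, the function name, and the toy numbers are assumptions, and the normalized cross-entropy variant derived in the paper is not reproduced here.

```python
def distribution_weighted_combination(z, densities, predictions):
    """Combine per-source predictions h_k(x), weighting source k by z_k * D_k(x).

    z: mixture weights over sources (non-negative, summing to 1).
    densities: D_k(x), the density of the query point under each source.
    predictions: h_k(x), each source predictor's output at the query point.
    """
    weights = [zk * dk for zk, dk in zip(z, densities)]
    total = sum(weights)
    if total == 0:
        raise ValueError("x has zero density under every weighted source")
    return sum(w * h for w, h in zip(weights, predictions)) / total

# The query point is far more likely under source 0, so its predictor dominates.
prediction = distribution_weighted_combination(
    z=[0.5, 0.5], densities=[0.9, 0.1], predictions=[1.0, 0.0]
)
```

The algorithmic question addressed in the abstract is how to choose z; this sketch takes z as given.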

@inproceedings{2018neurips_madap, 
author = {Hoffman, Judy and Mohri, Mehryar 
	and Zhang, Ningshan},
title = {Algorithms and Theory for 
	Multiple-Source Adaptation},
year = 2018,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
sym CyCADA: Cycle Consistent Adversarial Domain Adaptation
Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alyosha Efros, Trevor Darrell.
International Conference in Machine Learning (ICML), 2018
pdf | abstract | bibtex | code | slides

Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models have shown tremendous progress towards adapting to new environments by focusing either on discovering domain invariant representations or by mapping between unpaired image domains. While feature space methods are difficult to interpret and sometimes fail to capture pixel-level and low-level domain shifts, image space methods sometimes fail to incorporate high level semantic knowledge relevant for the end task. We propose a model which adapts between domains using both generative image space alignment and latent representation space alignment. Our approach, Cycle-Consistent Adversarial Domain Adaptation (CyCADA), guides transfer between domains according to a specific discriminatively trained task and avoids divergence by enforcing consistency of the relevant semantics before and after adaptation. We evaluate our method on a variety of visual recognition and prediction settings, including digit classification and semantic segmentation of road scenes, advancing state-of-the-art performance for unsupervised adaptation from synthetic to real world driving domains.

@inproceedings{2018icmlcycada, 
author = {Hoffman, Judy and Tzeng, Eric and 
	Park, Taesung and Zhu, Jun-Yan 
	and Isola, Phillip and Saenko, 
	Kate and Efros, Alyosha and 
	Darrell, Trevor},
title = {CyCADA: Cycle Consistent 
	Adversarial Domain Adaptation},
year = 2018,
booktitle = {International Conference in 
	Machine Learning (ICML)}
}
sym Adapting to Continuously Shifting Domains
Andreea Bobu, Eric Tzeng, Judy Hoffman, Trevor Darrell.
International Conference on Learning Representations (ICLR) Workshop Track, 2018
pdf | abstract | bibtex

Domain adaptation typically focuses on adapting a model from a single source domain to a target domain. However, in practice, this paradigm of adapting from one source to one target is limiting, as different aspects of the real world such as illumination and weather conditions vary continuously and cannot be effectively captured by two static domains. Approaches that attempt to tackle this problem by adapting from a single source to many different target domains simultaneously are consistently unable to learn across all domain shifts. Instead, we propose an adaptation method that exploits the continuity between gradually varying domains by adapting in sequence from the source to the most similar target domain. By incrementally adapting while simultaneously efficiently regularizing against prior examples, we obtain a single strong model capable of recognition within all observed domains.
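The idea of adapting in sequence toward the most similar domain can be sketched as a greedy ordering. The distance function, the scalar stand-in for a domain, and the greedy chaining rule are assumptions for illustration; the abstract does not specify how similarity is measured.

```python
def adaptation_order(source, targets, distance):
    """Greedily order target domains, always moving to the closest remaining one."""
    order, current, remaining = [], source, list(targets)
    while remaining:
        nearest = min(remaining, key=lambda d: distance(current, d))
        order.append(nearest)
        remaining.remove(nearest)
        current = nearest
    return order

# Toy domains described by a single brightness level (hypothetical values).
source_brightness = 1.0
target_brightness = [0.2, 0.8, 0.5]
order = adaptation_order(
    source_brightness, target_brightness, distance=lambda a, b: abs(a - b)
)
```

A model would then be adapted to each domain in `order`, with some regularization against earlier domains so that performance on them is retained.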

@inproceedings{2018iclrwBobu, 
author = {Bobu, Andreea and Tzeng, Eric and 
	Hoffman, Judy and Darrell, 
	Trevor},
title = {Adapting to Continuously Shifting 
	Domains},
year = 2018,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR) Workshop Track}
}
sym Scaling Human-Object Interaction Recognition through Zero-Shot Learning
Liyue Shen, Serena Yeung, Judy Hoffman, Greg Mori, Li Fei-Fei.
IEEE/CVF Winter Conference on Applications in Computer Vision (WACV), 2018
pdf | abstract | bibtex

Recognizing human-object interactions (HOI) is an important part of distinguishing the rich variety of human action in the visual world. While recent progress has been made in improving HOI recognition in the fully supervised setting, the space of possible human-object interactions is large and it is impractical to obtain labeled training data for all interactions of interest. In this work, we tackle the challenge of scaling HOI recognition to the long tail of categories through a zero-shot learning approach. We introduce a factorized model for HOI detection that disentangles reasoning on verbs and objects, and at test-time can therefore produce detections for novel verb-object pairs. We present experiments on the recently introduced large-scale HICO-DET dataset, and show that our model is able to both perform comparably to state-of-the-art in fully-supervised HOI detection, while simultaneously achieving effective zero-shot detection of new HOI categories.
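A factorized model can score a verb-object pair it never saw together because the verb and object scores are produced separately. The sketch below assumes the two scores are combined by a simple product and uses made-up labels and probabilities; the paper's actual combination rule may differ.

```python
def score_interactions(verb_probs, object_probs):
    """Score every verb-object pair as the product of its verb and object scores."""
    return {
        (verb, obj): pv * po
        for verb, pv in verb_probs.items()
        for obj, po in object_probs.items()
    }

# Hypothetical outputs of separate verb and object branches for one detection.
verb_probs = {"ride": 0.8, "feed": 0.2}
object_probs = {"horse": 0.1, "elephant": 0.9}

scores = score_interactions(verb_probs, object_probs)
best_pair = max(scores, key=scores.get)
```

Here ("ride", "elephant") receives the top score even if that pair never appeared in training, as long as "ride" and "elephant" were each seen in other combinations.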

@inproceedings{2018wacv_hico, 
author = {Shen, Liyue and Yeung, Serena and 
	Hoffman, Judy and Mori, Greg 
	and Fei-Fei, Li},
title = {Scaling Human-Object Interaction 
	Recognition through Zero-Shot 
	Learning},
year = 2018,
booktitle = {IEEE/CVF Winter Conference on 
	Applications in Computer 
	Vision (WACV)}
}
sym Label Efficient Learning of Transferable Representations across Domains and Tasks
Zelun Luo, Yuliang Zou, Judy Hoffman, Li Fei-Fei.
Neural Information Processing Systems (NeurIPS), 2017
pdf | abstract | bibtex | project page

We propose a framework that learns a representation transferable across different domains and tasks in a label efficient manner. Our approach battles domain shift with a domain adversarial loss, and generalizes the embedding to novel tasks using a metric learning-based approach. Our model is simultaneously optimized on labeled source data and unlabeled or sparsely labeled data in the target domain. Our method shows compelling results on novel classes within a new domain even when only a few labeled examples per class are available, outperforming the prevalent fine-tuning approach. In addition, we demonstrate the effectiveness of our framework on the transfer learning task from image object recognition to video action recognition.

@inproceedings{2017NipsLuo, 
author = {Luo, Zelun and Zou, Yuliang and 
	Hoffman, Judy and Fei-Fei, Li},
title = {Label Efficient Learning of 
	Transferable Representations 
	across Domains and Tasks},
year = 2017,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
sym Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach
Timnit Gebru, Judy Hoffman, Li Fei-Fei.
IEEE/CVF International Conference in Computer Vision (ICCV), 2017
pdf | abstract | bibtex

While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild. These fully supervised models need additional annotated images to classify objects in every new scenario, a task that is infeasible. However, sources such as e-commerce websites and field guides provide annotated images for many classes. In this work, we study fine-grained domain adaptation as a step towards overcoming the dataset shift between easily acquired annotated images and the real world. Adaptation has not been studied in the fine-grained setting where annotations such as attributes could be used to increase performance. Our work uses an attribute based multi-task adaptation loss to increase accuracy from a baseline of 4.1% to 19.1% in the semi-supervised adaptation case. Prior domain adaptation works have been benchmarked on small datasets such as [46] with a total of 795 images for some domains, or simplistic datasets such as [41] consisting of digits. We perform experiments on a subset of a new challenging fine-grained dataset consisting of 1,095,021 images of 2,657 car categories drawn from e-commerce websites and Google Street View.

@inproceedings{2017iccvGebru, 
author = {Gebru, Timnit and Hoffman, Judy 
	and Fei-Fei, Li},
title = {Fine-grained Recognition in the 
	Wild: A Multi-Task Domain 
	Adaptation Approach},
year = 2017,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Inferring and Executing Programs for Visual Reasoning
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick.
IEEE/CVF International Conference in Computer Vision (ICCV), 2017 (Oral Presentation)
pdf | abstract | bibtex | code | project page

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer. Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.

@inproceedings{2017iccvJohnson, 
author = {Johnson, Justin and Hariharan, 
	Bharath and van der Maaten, 
	Laurens and Hoffman, Judy and 
	Fei-Fei, Li and Zitnick, C. 
	Lawrence and Girshick, Ross},
title = {Inferring and Executing Programs 
	for Visual Reasoning},
year = 2017,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
sym Adversarial Discriminative Domain Adaptation
Eric Tzeng, Judy Hoffman, Trevor Darrell, Kate Saenko.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2017
pdf | abstract | bibtex | code

Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They also can improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. We propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new more difficult cross-modality object classification task.

@inproceedings{2017cvprAdda, 
author = {Tzeng, Eric and Hoffman, Judy and 
	Darrell, Trevor and Saenko, 
	Kate},
title = {Adversarial Discriminative Domain 
	Adaptation},
year = 2017,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
Clockwork Convnets for Video Semantic Segmentation
Evan Shelhamer*, Kate Rakelly*, Judy Hoffman*, Trevor Darrell.
Video Semantic Segmentation Workshop at European Conference in Computer Vision, 2016
pdf | abstract | bibtex

Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic content of a scene evolves more slowly, and 2) execution can be viewed as an aspect of architecture, yielding purpose-fit computation schedules for networks. We define a novel family of “clockwork” convnets driven by fixed or adaptive clock signals that schedule the processing of different layers at different update rates according to their semantic stability. We design a pipeline schedule to reduce latency for real-time recognition and a fixed-rate schedule to reduce overall computation. Finally, we extend clockwork scheduling to adaptive video processing by incorporating data-driven clocks that can be tuned on unlabeled video. The accuracy and efficiency of clockwork convnets are evaluated on the Youtube-Objects, NYUD, and Cityscapes video datasets.

@inproceedings{2016eccvw_clockwork, 
author = {Shelhamer*, Evan and Rakelly*, 
	Kate and Hoffman*, Judy and 
	Darrell, Trevor},
title = {Clockwork Convnets for Video 
	Semantic Segmentation},
year = 2016,
booktitle = {Video Semantic Segmentation 
	Workshop at European 
	Conference in Computer Vision}
}
Adapting deep visuomotor representations with weak pairwise constraints
Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell.
Workshop on Algorithmic Foundations in Robotics (WAFR), 2016
pdf | abstract | bibtex

Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large easy-to-obtain source dataset (e.g. synthetic images) to a target real-world domain, without requiring expensive manual data annotation of real world data before policy search. Supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation. However, they require manual alignment of such image pairs. Fully unsupervised adaptation methods rely on minimizing the discrepancy between the feature distributions across domains. We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains. Focusing on adapting from simulation to real world data using a PR2 robot, we evaluate our approach on a manipulation task and show that by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world.

@inproceedings{2016wafrTzeng, 
author = {Tzeng, Eric and Devin, Coline and 
	Hoffman, Judy and Finn, 
	Chelsea and Abbeel, Pieter and 
	Levine, Sergey and Saenko, 
	Kate and Darrell, Trevor},
title = {Adapting deep visuomotor 
	representations with weak 
	pairwise constraints},
year = 2016,
booktitle = {Workshop on Algorithmic 
	Foundations in Robotics (WAFR)}
}
Fine-To-Coarse Knowledge Transfer For Low-Res Image Classification
Xingchao Peng, Judy Hoffman, Stella Yu, Kate Saenko.
International Conference on Image Processing (ICIP), 2016
pdf | abstract | bibtex

We address the difficult problem of distinguishing fine-grained object categories in low resolution images. We propose a simple and effective deep learning approach that transfers fine-grained knowledge gained from high resolution training data to the coarse low-resolution test scenario. Such fine-to-coarse knowledge transfer has many real world applications, such as identifying objects in surveillance photos or satellite images where the image resolution at the test time is very low but plenty of high resolution photos of similar objects are available. Our extensive experiments on two standard benchmark datasets containing fine-grained car models and bird species demonstrate that our approach can effectively transfer fine-detail knowledge to coarse-detail imagery.

@inproceedings{2016icipPeng, 
author = {Peng, Xingchao and Hoffman, Judy 
	and Yu, Stella and Saenko, 
	Kate},
title = {Fine-To-Coarse Knowledge Transfer 
	For Low-Res Image 
	Classification},
year = 2016,
booktitle = {International Conference on Image 
	Processing (ICIP)}
}
Cross Modal Distillation for Supervision Transfer
Saurabh Gupta, Judy Hoffman, Jitendra Malik.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2016
pdf | abstract | bibtex | code

In this work we propose a technique that transfers supervision between images from different modalities. We use learned representations from a large labeled modality as a supervisory signal for training representations for a new unlabeled paired modality. Our method enables learning of rich representations for unlabeled modalities and can be used as a pre-training procedure for new modalities with limited labeled data. We show experimental results where we transfer supervision from labeled RGB images to unlabeled depth and optical flow images and demonstrate large improvements for both these cross modal supervision transfers.

@inproceedings{2016cvprGupta, 
author = {Gupta, Saurabh and Hoffman, Judy 
	and Malik, Jitendra},
title = {Cross Modal Distillation for 
	Supervision Transfer},
year = 2016,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
Learning with Side Information through Modality Hallucination
Judy Hoffman, Saurabh Gupta, Trevor Darrell.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight Presentation)
pdf | abstract | bibtex | slides

We present a modality hallucination architecture for training an RGB object detection model which incorporates depth side information at training time. Our convolutional hallucination network learns a new and complementary RGB image representation which is taught to mimic convolutional mid-level features from a depth network. At test time images are processed jointly through the RGB and hallucination networks to produce improved detection performance. Thus, our method transfers information commonly extracted from depth training data to a network which can extract that information from the RGB counterpart. We present results on the standard NYUDv2 dataset and report improvement on the RGB detection task.

@inproceedings{2016cvprHoffman, 
author = {Hoffman, Judy and Gupta, Saurabh 
	and Darrell, Trevor},
title = {Learning with Side Information 
	through Modality Hallucination},
year = 2016,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
Cross-Modal Adaptation for RGB-D Detection
Judy Hoffman, Saurabh Gupta, Jian Leong, Sergio Guadarrama, Trevor Darrell.
International Conference in Robotics and Automation (ICRA), 2016
pdf | abstract | bibtex | slides

@inproceedings{2016icraHoffman, 
author = {Hoffman, Judy and Gupta, Saurabh 
	and Leong, Jian and 
	Guadarrama, Sergio and 
	Darrell, Trevor},
title = {Cross-Modal Adaptation for RGB-D 
	Detection},
year = 2016,
booktitle = {International Conference in 
	Robotics and Automation (ICRA)}
}
Quantification in-the-wild: data-sets and baselines
Oscar Beijbom, Judy Hoffman, Evan Yao, Trevor Darrell, Alberto Rodriguez-Ramirez, Manuel González-Rivero, Ove Hoegh-Guldberg.
Transfer and Multi-Task Learning: Trends and New Perspectives, Workshop at NeurIPS, 2015
pdf | bibtex

@inproceedings{2015nipswBeijbom, 
author = {Beijbom, Oscar and Hoffman, Judy 
	and Yao, Evan and Darrell, 
	Trevor and Rodriguez-Ramirez, 
	Alberto and Gonz{\'a}lez-Rivero, 
	Manuel and Hoegh-Guldberg, Ove},
title = {Quantification in-the-wild: 
	data-sets and baselines},
year = 2015,
booktitle = {Transfer and Multi-Task Learning: 
	Trends and New Perspectives, 
	Workshop at NeurIPS}
}
Spatial Semantic Regularisation for Large Scale Object Detection
Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell.
IEEE/CVF International Conference in Computer Vision (ICCV), 2015
pdf | abstract | bibtex

Large scale object detection with thousands of classes introduces the problem of many contradicting false positive detections, which have to be suppressed. Class-independent non-maximum suppression has traditionally been used for this step, but it does not scale well as the number of classes grows. Traditional non-maximum suppression does not consider label- and instance-level relationships nor does it allow an exploitation of the spatial layout of detection proposals. We propose a new multi-class spatial semantic regularisation method based on affinity propagation clustering, which simultaneously optimises across all categories and all proposed locations in the image, to improve both the localisation and categorisation of selected detection proposals. Constraints are shared across the labels through the semantic WordNet hierarchy. Our approach proves to be especially useful in large scale settings with thousands of classes, where spatial and semantic interactions are very frequent and only weakly supervised detectors can be built due to a lack of bounding box annotations. Detection experiments are conducted on the ImageNet and COCO dataset, and in settings with thousands of detected categories. Our method provides a significant precision improvement by reducing false positives, while simultaneously improving the recall.

@inproceedings{2015iccvMrowca, 
author = {Mrowca, Damian and Rohrbach, 
	Marcus and Hoffman, Judy and 
	Hu, Ronghang and Saenko, Kate 
	and Darrell, Trevor},
title = {Spatial Semantic Regularisation 
	for Large Scale Object 
	Detection},
year = 2015,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
Simultaneous Deep Transfer Across Domains and Tasks
Eric Tzeng*, Judy Hoffman*, Trevor Darrell, Kate Saenko.
IEEE/CVF International Conference in Computer Vision (ICCV), 2015
pdf | abstract | bibtex | code

Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias. Fine-tuning deep models in a new domain can require a significant amount of labeled data, which for many applications is simply not available. We propose a new CNN architecture to exploit unlabeled and sparsely labeled target domain data. Our approach simultaneously optimizes for domain invariance to facilitate domain transfer and uses a soft label distribution matching loss to transfer information between tasks. Our proposed adaptation method offers empirical performance which exceeds previously published results on two standard benchmark visual domain adaptation tasks, evaluated across supervised and semi-supervised adaptation settings.

@inproceedings{2015iccvTzeng, 
author = {Tzeng*, Eric and Hoffman*, Judy 
	and Darrell, Trevor and 
	Saenko, Kate},
title = {Simultaneous Deep Transfer Across 
	Domains and Tasks},
year = 2015,
booktitle = {IEEE/CVF International Conference 
	in Computer Vision (ICCV)}
}
Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning
Judy Hoffman, Deepak Pathak, Trevor Darrell, Kate Saenko.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2015
pdf | abstract | bibtex

We develop methods for detector learning which exploit joint training over both weak and strong labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks. Previous methods for weak-label learning often learn detector models independently using latent variable optimization, but fail to share deep representation knowledge across classes and usually require strong initialization. Other previous methods transfer deep representations from domains with strong labels to those with only weak labels, but do not optimize over individual latent boxes, and thus may miss specific salient structures for a particular category. We propose a model that subsumes these previous approaches, and simultaneously trains a representation and detectors for categories with either weak or strong labels present. We provide a novel formulation of a joint multiple instance learning method that includes examples from classification-style data when available, and also performs domain transfer learning to improve the underlying detector representation. Our model outperforms known methods on ImageNet-200 detection with weak labels.

@inproceedings{2015cvprHoffman, 
author = {Hoffman, Judy and Pathak, Deepak 
	and Darrell, Trevor and 
	Saenko, Kate},
title = {Detector Discovery in the Wild: 
	Joint Multiple Instance and 
	Representation Learning},
year = 2015,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
LSDA: Large Scale Detection through Adaptation
Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko.
Neural Information Processing Systems (NeurIPS), 2014
pdf | abstract | bibtex | project page | slides

A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2fps for the 7.6K detector).

@inproceedings{2014nipsHoffman, 
author = {Hoffman, Judy and Guadarrama, 
	Sergio and Tzeng, Eric and Hu, 
	Ronghang and Donahue, Jeff and 
	Girshick, Ross and Darrell, 
	Trevor and Saenko, Kate},
title = {LSDA: Large Scale Detection 
	through Adaptation},
year = 2014,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell.
International Conference in Machine Learning (ICML), 2014
pdf | abstract | bibtex | code

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms.

@inproceedings{2014icmlDonahue, 
author = {Donahue, Jeff and Jia, Yangqing 
	and Vinyals, Oriol and 
	Hoffman, Judy and Zhang, Ning 
	and Tzeng, Eric and Darrell, 
	Trevor},
title = {DeCAF: A Deep Convolutional 
	Activation Feature for Generic 
	Visual Recognition},
year = 2014,
booktitle = {International Conference in 
	Machine Learning (ICML)}
}
Continuous Manifold Based Adaptation for Evolving Visual Domains
Judy Hoffman, Trevor Darrell, Kate Saenko.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2014
pdf | abstract | bibtex | project page | video

We pose the following question: what happens when test data not only differs from training data, but differs from it in a continually evolving way? The classic domain adaptation paradigm considers the world to be separated into stationary domains with clear boundaries between them. However, in many real-world applications, examples cannot be naturally separated into discrete domains, but arise from a continuously evolving underlying process. Examples include video with gradually changing lighting and spam email with evolving spammer tactics. We formulate a novel problem of adapting to such continuous domains, and present a solution based on smoothly varying embeddings. Recent work has shown the utility of considering discrete visual domains as fixed points embedded in a manifold of lower-dimensional subspaces. Adaptation can be achieved via transforms or kernels learned between such stationary source and target subspaces. We propose a method to consider non-stationary domains, which we refer to as Continuous Manifold Adaptation (CMA). We treat each target sample as potentially being drawn from a different subspace on the domain manifold, and present a novel technique for continuous transform-based adaptation. Our approach can learn to distinguish categories using training data collected at some point in the past, and continue to update its model of the categories for some time into the future, without receiving any additional labels. Experiments on two visual datasets demonstrate the value of our approach for several popular feature representations.

@inproceedings{2014cvprHoffman, 
author = {Hoffman, Judy and Darrell, Trevor 
	and Saenko, Kate},
title = {Continuous Manifold Based 
	Adaptation for Evolving Visual 
	Domains},
year = 2014,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
Interactive Adaptation of Real-Time Object Detectors
Daniel Goehring, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell.
International Conference in Robotics and Automation (ICRA), 2014
pdf | bibtex | project page

@inproceedings{2014icraGoering, 
author = {Goehring, Daniel and Hoffman, Judy 
	and Rodner, Erik and Saenko, 
	Kate and Darrell, Trevor},
title = {Interactive Adaptation of 
	Real-Time Object Detectors},
year = 2014,
booktitle = {International Conference in 
	Robotics and Automation (ICRA)}
}
Asymmetric and Category Invariant Feature Transformations for Domain Adaptation
Judy Hoffman, Erik Rodner, Jeff Donahue, Brian Kulis, Kate Saenko.
International Journal in Computer Vision (IJCV), 2013
pdf | abstract | bibtex

We address the problem of visual domain adaptation for transferring object models from one dataset or visual domain to another. We introduce a unified flexible model for both supervised and semi-supervised learning that allows us to learn transformations between domains. Additionally, we present two instantiations of the model, one for general feature adaptation/alignment, and one specifically designed for classification. First, we show how to extend metric learning methods for domain adaptation, allowing for learning metrics independent of the domain shift and the final classifier used. Furthermore, we go beyond classical metric learning by extending the method to asymmetric, category independent transformations. Our framework can adapt features even when the target domain does not have any labeled examples for some categories, and when the target and source features have different dimensions. Finally, we develop a joint learning framework for adaptive classifiers, which outperforms competing methods in terms of multi-class accuracy and scalability. We demonstrate the ability of our approach to adapt object recognition models under a variety of situations, such as differing imaging conditions, feature types, and codebooks. The experiments show its strong performance compared to previous approaches and its applicability to large-scale scenarios.

@article{2013ijcvHoffman, 
author = {Hoffman, Judy and Rodner, Erik and 
	Donahue, Jeff and Kulis, Brian 
	and Saenko, Kate},
title = {Asymmetric and Category Invariant 
	Feature Transformations for 
	Domain Adaptation},
year = 2013,
journal = {International Journal in Computer 
	Vision (IJCV)}
}
Semi-Supervised Domain Adaptation with Instance Constraints
Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell.
IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2013
pdf | bibtex | poster

@inproceedings{2013cvprDonahue, 
author = {Donahue, Jeff and Hoffman, Judy 
	and Rodner, Erik and Saenko, 
	Kate and Darrell, Trevor},
title = {Semi-Supervised Domain Adaptation 
	with Instance Constraints},
year = 2013,
booktitle = {IEEE/CVF Computer Vision and 
	Pattern Recognition (CVPR)}
}
Efficient Learning of Domain-invariant Image Representations
Judy Hoffman, Erik Rodner, Jeff Donahue, Kate Saenko, Trevor Darrell.
International Conference on Learning Representations (ICLR), 2013 (Oral Presentation)
pdf | abstract | bibtex | code | slides

We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, and introduce an efficient cost function based on misclassification loss. Our method combines several features previously unavailable in a single algorithm: multi-class adaptation through representation learning, ability to map across heterogeneous feature spaces, and scalability to large datasets. We present experiments on several image datasets that demonstrate improved accuracy and computational advantages compared to previous approaches.

@inproceedings{2013iclrHoffman, 
author = {Hoffman, Judy and Rodner, Erik and 
	Donahue, Jeff and Saenko, Kate 
	and Darrell, Trevor},
title = {Efficient Learning of 
	Domain-invariant Image 
	Representations},
year = 2013,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
Discovering Latent Domains For Multisource Domain Adaptation
Judy Hoffman, Brian Kulis, Trevor Darrell, Kate Saenko.
European Conference in Computer Vision (ECCV), 2012
pdf | bibtex | code | slides | poster | video

@inproceedings{2012eccvHoffman, 
author = {Hoffman, Judy and Kulis, Brian and 
	Darrell, Trevor and Saenko, 
	Kate},
title = {Discovering Latent Domains For 
	Multisource Domain Adaptation},
year = 2012,
booktitle = {European Conference in Computer 
	Vision (ECCV)}
}
Weakly Supervised Learning of Object Segmentations from Web-Scale Video
Glen Hartmann, Matthias Grundmann, Judy Hoffman, David Tsai, Vivek Kwatra, Omid Madani, Sudheendra Vijayanarasimhan, Irfan Essa, James Rehg, Rahul Sukthankar.
eccvw-webscale, 2012 (Best Paper Award)
pdf | abstract | bibtex

@inproceedings{2012eccvwHartmann, 
author = {Hartmann, Glen and Grundmann, 
	Matthias and Hoffman, Judy and 
	Tsai, David and Kwatra, Vivek 
	and Madani, Omid and 
	Vijayanarasimhan, Sudheendra 
	and Essa, Irfan and Rehg, 
	James and Sukthankar, Rahul},
title = {Weakly Supervised Learning of 
	Object Segmentations from 
	Web-Scale Video},
year = 2012,
booktitle = {eccvw-webscale}
}
Domain Adaptation with Multiple Latent Domains
Judy Hoffman, Kate Saenko, Brian Kulis, Trevor Darrell.
Domain Adaptation Workshop at Neural Information Processing Systems (NeurIPS), 2011 (Best Student Paper Award)
bibtex

@inproceedings{2011nipswHoffman, 
author = {Hoffman, Judy and Saenko, Kate and 
	Kulis, Brian and Darrell, 
	Trevor},
title = {Domain Adaptation with Multiple 
	Latent Domains},
year = 2011,
booktitle = {Domain Adaptation Workshop at 
	Neural Information Processing 
	Systems (NeurIPS)}
}
EG-RRT: Environment-Guided Random Trees for Kinodynamic Motion Planning with Uncertainty and Obstacles
Leonard Jaillet, Judy Hoffman, Jur van den Berg, Pieter Abbeel, Josep M. Porta, Ken Goldberg.
International Conference on Intelligent Robotics and Systems (IROS), 2011
pdf | abstract | bibtex

Existing sampling-based robot motion planning methods are often inefficient at finding trajectories for kinodynamic systems, especially in the presence of narrow passages between obstacles and uncertainty in control and sensing. To address this, we propose EG-RRT, an Environment-Guided variant of RRT designed for kinodynamic robot systems that combines elements from several prior approaches and may incorporate a cost model based on the LQG-MP framework to estimate the probability of collision under uncertainty in control and sensing. We compare the performance of EG-RRT with several prior approaches on challenging sample problems. Results suggest that EG-RRT offers significant improvements in performance.

@inproceedings{2011irosJaillet, 
author = {Jaillet, Leonard and Hoffman, Judy 
	and Berg, Jur van den and 
	Abbeel, Pieter and Porta, 
	Josep M. and Goldberg, Ken},
title = {EG-RRT: Environment-Guided Random 
	Trees for Kinodynamic Motion 
	Planning with Uncertainty and 
	Obstacles},
year = 2011,
booktitle = {International Conference on 
	Intelligent Robotics and 
	Systems (IROS)}
}
Teaching
CS 6476-A: Computer Vision
Georgia Tech - Spring 2024 (Instructor)
CS 7647: Machine Learning with Limited Supervision
Georgia Tech - Fall 2023 (Instructor)
CS 4476-A: Introduction to Computer Vision
Georgia Tech - Spring 2023 (Instructor)
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2022 (Instructor)
CS 4476-A: Introduction to Computer Vision
Georgia Tech - Spring 2022 (Instructor)
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2021 (Instructor)
CS 4476: Introduction to Computer Vision
Georgia Tech - Spring 2021 (Instructor)
CS 4476/6476: Computer Vision
Georgia Tech - Spring 2020 (Instructor)
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2019 (Instructor)
CS 188: Introduction to Artificial Intelligence
UC Berkeley - Spring 2013 (Teaching Assistant)
EE 20N: Signals and Systems
UC Berkeley - Fall 2009 (Teaching Assistant)
Selected Awards
Selected Service and Outreach

  • Mentoring and Outreach
    • Mentor to junior researchers at CVPR Main Conference 2021, 2022
    • Panelist at Woodward Academy High School Discussion on Bias in Machine Learning, 2021
    • African Masters in Machine Intelligence Research Mentor, 2020
    • Mentor at Women in Computer Vision (2018-2022)
    • Mentor at Doctoral Consortium, ICCV 2019, CVPR 2022
    • Mentor at Women in Machine Learning (2018)
  • Leadership
    • Co-founder and continued advisor for Women in Computer Vision (WiCV), 2015 - Present
      Provides support for junior researchers who identify as women to attend (travel support), present (exposure), and receive mentorship at top Computer Vision Conferences.
    • Co-organizer Workshop on Responsible Computer Vision at ECCV, 2022
    • Co-organizer Workshop on Learning from Limited and Imperfect Data at ECCV, 2022
    • Co-organizer Workshop on Adversarial Robustness in the Real World at ECCV, 2022
    • Co-organizer LVIS Workshop and Challenge at ICCV, 2021
    • Co-organizer Adversarial Robustness in the Real World Workshop at ICCV, 2021
    • Co-organizer Responsible Computer Vision Workshop at CVPR, 2021
    • Co-organizer Adversarial Machine Learning in Computer Vision Tutorial at CVPR, 2021
    • Co-organizer Learning from Limited and Imperfect Data Workshop at CVPR, 2021
    • Co-organizer Tutorial on Learning with Limited Labels at ICCV, 2019
    • Co-organizer TASK-CV Workshop, ICCV 2019
  • Recent Editorial Service
    • Program Chair CVPR, 2023
    • Tutorial Chair ICCV, 2023
    • Associate Editor T-PAMI, 2021 - Present
    • Associate Editor IJCV, 2020 - 2022
    • Area Chair CVPR, 2019 - 2021
    • Area Chair NeurIPS, 2021
    • Area Chair ICLR, 2019-2020
    • Area Chair ICCV, 2019, 2021
    • Area Chair ICML, 2020
    • See CV for full reviewing list

Sponsors

My research is made possible by the generous support of the following organizations.