Judy Hoffman
Email judy (at) gatech.edu

Assistant Professor in the School of Interactive Computing at Georgia Tech and a member of the Machine Learning Center. Research interests include computer vision, machine learning, domain adaptation, robustness, and fairness.

Prior to joining Georgia Tech, Dr. Hoffman was a Visiting Research Scientist at Facebook AI Research and a postdoctoral scholar at Stanford University and UC Berkeley. She received her PhD in EECS from UC Berkeley in 2016, where she was a member of BAIR and BDD.

Prospective Students: Read before contacting. If you are interested in joining my group and are not currently at Georgia Tech, please apply directly to the college. Unfortunately, I cannot respond to individual requests from students outside Tech. If you are already a PhD student at Georgia Tech, feel free to contact me directly via email and include your resume and research interests. For Georgia Tech MS and undergraduates interested in research experience, please fill out this application form. Note, we have no availability for Fall 2022 and will review applications for Spring 2023 in late Fall. Unfortunately, my group is not accepting visitors from outside Georgia Tech at this time.

Bio | CV | Google Scholar | Github | Twitter

News
Research Group

Viraj Prabhu
PhD Student

Daniel Bolya
PhD Student

Sean Foley
PhD Student
(Co-advised w/ James Hays)

George Stoica
PhD Student

Simar Kareer
PhD Student

Bhavika Devnani
MS Student

Kartik Sarangmath
MS Student

Aayushi Agarwal
MS Student

Taylor Hearn
MS Student

Aaditya Singh
MS Student

Vivek Vijaykumar
BS Student

Jakob Bjorner
BS Student

Alumni

Luis Bermudez
MS student
Next Intel

Arvind Krishnakumar
MS Spring 2021

Shivam Khare
MS Spring 2021
Next Twitter AI

Rohit Mittapalli
BS Spring 2021
Next Startup

Fu Lin
MS Spring 2020
Next AWS Beijing

Sachit Kuhar
MS Spring 2021

Deeksha Kartik
MS Spring 2022
Next PathAI

Sruthi Sudhakar
BS Spring 2022
Next PhD Student Columbia

Research

My research lies at the intersection of computer vision and machine learning and focuses on tackling real-world variation and scale while minimizing human supervision. I develop learning algorithms which facilitate transfer of information through unsupervised and semi-supervised model adaptation and generalization.

AUGCO: Augmentation Consistency-guided Self-training for Source-free Domain Adaptive Semantic Segmentation
Viraj Prabhu*, Shivam Khare*, Deeksha Kartik, Judy Hoffman.
arXiv, 2022
pdf | bibtex

@inproceedings{2022arXiv_Zaugco, 
author = {Prabhu*, Viraj and Khare*, Shivam 
	and Kartik, Deeksha and 
	Hoffman, Judy},
title = {AUGCO: Augmentation 
	Consistency-guided 
	Self-training for Source-free 
	Domain Adaptive Semantic 
	Segmentation},
year = 2022,
booktitle = {arXiv}
}
[NEW] Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
Viraj Prabhu*, Sriram Yenamandra*, Aaditya Singh, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2022
pdf | bibtex

@inproceedings{2022NeurIPS_pacmac, 
author = {Prabhu*, Viraj and Yenamandra*, 
	Sriram and Singh, Aaditya and 
	Hoffman, Judy},
title = {Adapting Self-Supervised Vision 
	Transformers by Probing 
	Attention-Conditioned Masking 
	Consistency},
year = 2022,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
[NEW] ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings
Arjun Majumdar*, Gunjan Aggarwal*, Bhavika Devnani, Judy Hoffman, Dhruv Batra.
Neural Information Processing Systems (NeurIPS), 2022
pdf | bibtex

@inproceedings{2023NeurIPS_zeroNav, 
author = {Majumdar*, Arjun and Aggarwal*, 
	Gunjan and Devnani, Bhavika 
	and Hoffman, Judy and Batra, 
	Dhruv},
title = {ZSON: Zero-Shot Object-Goal 
	Navigation using Multimodal 
	Goal Embeddings},
year = 2022,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
[NEW] Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman.
ECCV Workshop on Computational Aspects of Deep Learning, 2022
pdf | bibtex

@inproceedings{2022ECCVW_HydraAttention, 
author = {Bolya, Daniel and Fu, Cheng-Yang 
	and Dai, Xiaoliang and Zhang, 
	Peizhao and Hoffman, Judy},
title = {Hydra Attention: Efficient 
	Attention with Many Heads},
year = 2022,
booktitle = {ECCV Workshop on Computational 
	Aspects of Deep Learning}
}
Can domain adaptation make object recognition work for everyone?
Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik.
Computer Vision and Pattern Recognition (CVPR) L3D Workshop, 2022
pdf | abstract | bibtex

Despite the rapid progress in deep visual recognition, modern computer vision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from existing datasets to study the Geographical DA problem, and discover new challenges beyond data distribution shift: context shift, wherein object surroundings may change significantly across geographies, and subpopulation shift, wherein the intra-category distributions may shift. We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions to address the challenge of making object recognition work for everyone.

@inproceedings{2022CVPR_GeoDA, 
author = {Prabhu, Viraj and Selvaraju, 
	Ramprasaath R. and Hoffman, 
	Judy and Naik, Nikhil},
title = {Can domain adaptation make object 
	recognition work for everyone?},
year = 2022,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR) L3D 
	Workshop}
}
VISCUIT: Visual Auditor for Bias in CNN Image Classifier
Seongmin Lee, Zijie J. Wang, Judy Hoffman, Duen Horng (Polo) Chau.
Computer Vision and Pattern Recognition (CVPR) Demo Track, 2022
pdf | bibtex | project page

@inproceedings{2022CVPR_VisCUIT, 
author = {Lee, Seongmin and Wang, Zijie J. 
	and Hoffman, Judy and Chau, 
	Duen Horng (Polo)},
title = {VISCUIT: Visual Auditor for Bias 
	in CNN Image Classifier},
year = 2022,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR) Demo Track}
}
Scalable Diverse Model Selection for Accessible Transfer Learning
Daniel Bolya*, Rohit Mittapalli*, Judy Hoffman.
Neural Information Processing Systems (NeurIPS), 2021
pdf | abstract | bibtex | code | project page | video

With the preponderance of pretrained deep learning models available off-the-shelf from model banks today, finding the best weights to fine-tune to your use-case can be a daunting task. Several methods have recently been proposed to find good models for transfer learning, but they either don't scale well to large model banks or don't perform well on the diversity of off-the-shelf models. Ideally the question we want to answer is, given some data and a source model, can you quickly predict the model's accuracy after fine-tuning? We formalize this setting as Scalable Diverse Model Selection and propose several benchmarks for evaluating on this task. We find that existing model selection and transferability estimation methods perform poorly here and analyze why this is the case. We then introduce simple techniques to improve the performance and speed of these algorithms. Finally, we iterate on existing methods to create PARC, which outperforms all other methods on diverse model selection. We intend to release the benchmarks and method code in hope to inspire future work in model selection for accessible transfer learning.

@inproceedings{2021NeurIPSModelFinder, 
author = {Bolya*, Daniel and Mittapalli*, 
	Rohit and Hoffman, Judy},
title = {Scalable Diverse Model Selection 
	for Accessible Transfer 
	Learning},
year = 2021,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Mitigating Bias in Visual Transformers via Targeted Alignment
Sruthi Sudhakar, Viraj Prabhu, Arvind Krishnakumar, Judy Hoffman.
British Machine Vision Conference (BMVC), 2021
pdf | abstract | bibtex | project page

As transformer architectures become increasingly prevalent in computer vision, it is critical to understand their fairness implications. We perform the first study of the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work. We visualize the feature space of the transformer self-attention modules and discover that a significant portion of the bias is encoded in the query matrix. With this knowledge, we propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features. We measure performance using Balanced Accuracy and Standard Accuracy, and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT consistently leads to improved fairness over prior work on multiple attribute prediction tasks on the CelebA dataset, without compromising performance.
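Two of the metrics named in this abstract, Balanced Accuracy and Balanced Accuracy Difference, can be illustrated with a short sketch. The function names, the binary-label setting, the two-group protected attribute, and the toy arrays are all illustrative assumptions, not code from the paper.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and true-negative rate (binary labels)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tpr = np.mean(y_pred[y_true])      # fraction of positives predicted positive
    tnr = np.mean(~y_pred[~y_true])    # fraction of negatives predicted negative
    return 0.5 * (tpr + tnr)

def balanced_accuracy_difference(y_true, y_pred, group):
    """Absolute gap in balanced accuracy between group 0 and group 1.

    Hypothetical helper: assumes exactly two groups encoded as 0 and 1.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    g0, g1 = group == 0, group == 1
    return abs(balanced_accuracy(y_true[g0], y_pred[g0])
               - balanced_accuracy(y_true[g1], y_pred[g1]))

# Toy data: the classifier is perfect on group 0 and at chance on group 1.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 1])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

overall = balanced_accuracy(y_true, y_pred)
gap = balanced_accuracy_difference(y_true, y_pred, group)
```

A gap near zero means the two groups are served about equally well under this metric; the toy data is built so that the gap is large.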

@inproceedings{2021BMVC_TADET, 
author = {Sudhakar, Sruthi and Prabhu, Viraj 
	and Krishnakumar, Arvind and 
	Hoffman, Judy},
title = {Mitigating Bias in Visual 
	Transformers via Targeted 
	Alignment},
year = 2021,
booktitle = {British Machine Vision Conference 
	(BMVC)}
}
UDIS: Unsupervised Discovery of Bias in Deep Visual Recognition Models
Arvind Krishnakumar, Viraj Prabhu, Sruthi Sudhakar, Judy Hoffman.
British Machine Vision Conference (BMVC), 2021
pdf | abstract | bibtex | code | project page

Deep learning models have been shown to learn spurious correlations from data that sometimes lead to systematic failures for certain subpopulations. Prior work has typically diagnosed this by crowdsourcing annotations for various protected attributes and measuring performance, which is both expensive to acquire and difficult to scale. In this work, we propose UDIS, an unsupervised algorithm for surfacing and analyzing such failure modes. UDIS identifies subpopulations via hierarchical clustering of dataset embeddings and surfaces systematic failure modes by visualizing low performing clusters along with their gradient-weighted class-activation maps. We show the effectiveness of UDIS in identifying failure modes in models trained for image classification on the CelebA and MSCOCO datasets. UDIS is available at https://github.com/akrishna77/bias-discovery.

@inproceedings{2021BMVC_UDIS, 
author = {Krishnakumar, Arvind and Prabhu, 
	Viraj and Sudhakar, Sruthi and 
	Hoffman, Judy},
title = {UDIS: Unsupervised Discovery of 
	Bias in Deep Visual 
	Recognition Models},
year = 2021,
booktitle = {British Machine Vision Conference 
	(BMVC)}
}
Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation
Viraj Prabhu, Shivam Khare, Deeksha Kartik, Judy Hoffman.
International Conference on Computer Vision (ICCV), 2021
pdf | abstract | bibtex | code | project page

Many existing approaches for unsupervised domain adaptation (UDA) focus on adapting under only data distribution shift and offer limited success under additional cross-domain label distribution shift. Recent work based on self-training using target pseudo-labels has shown promise, but on challenging shifts pseudo-labels may be highly unreliable, and using them for self-training may cause error accumulation and domain misalignment. We propose Selective Entropy Optimization via Committee Consistency (SENTRY), a UDA algorithm that judges the reliability of a target instance based on its predictive consistency under a committee of random image transformations. Our algorithm then selectively minimizes predictive entropy to increase confidence on highly consistent target instances, while maximizing predictive entropy to reduce confidence on highly inconsistent ones. In combination with pseudo-label based approximate target class balancing, our approach leads to significant improvements over the state-of-the-art on 27/31 domain shifts from standard UDA benchmarks as well as benchmarks designed to stress-test adaptation under label distribution shift.
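The selective objective described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: consistency is decided by a simple majority vote over the augmented views, entropy is computed on the original view, and the function names and toy probabilities are hypothetical. It is not the released SENTRY implementation.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each row of a probability matrix."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def selective_entropy_loss(probs_orig, probs_aug):
    """Signed entropy objective over a batch of target instances.

    probs_orig: (N, C) class probabilities on the original images.
    probs_aug:  (K, N, C) class probabilities on K augmented views (the committee).
    Returns (loss, consistent), where consistent is an (N,) boolean mask.
    """
    pred_orig = probs_orig.argmax(axis=-1)                   # (N,)
    pred_aug = probs_aug.argmax(axis=-1)                     # (K, N)
    votes = (pred_aug == pred_orig[None, :]).sum(axis=0)     # views agreeing with original
    consistent = votes > probs_aug.shape[0] / 2              # assumption: majority vote
    # Minimizing this loss lowers entropy on consistent instances (+H)
    # and raises entropy on inconsistent ones (-H).
    sign = np.where(consistent, 1.0, -1.0)
    return float(np.mean(sign * entropy(probs_orig))), consistent

# Toy batch: instance 0 is stable under augmentation, instance 1 is not.
probs_orig = np.array([[0.9, 0.1], [0.6, 0.4]])
probs_aug = np.array([
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.7, 0.3], [0.6, 0.4]],
])
loss, consistent = selective_entropy_loss(probs_orig, probs_aug)
```

In a training loop the same signed entropy would be computed on differentiable model outputs so that gradients push confidence up or down per instance.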

@inproceedings{2021arXivSENTRY, 
author = {Prabhu, Viraj and Khare, Shivam 
	and Kartik, Deeksha and 
	Hoffman, Judy},
title = {Selective Entropy Optimization via 
	Committee Consistency for 
	Unsupervised Domain Adaptation},
year = 2021,
booktitle = {International Conference on 
	Computer Vision (ICCV)}
}
RobustNav: Towards Benchmarking Robustness in Embodied Navigation
Prithvijit Chattopadhyay, Judy Hoffman, Roozbeh Mottaghi, Ani Kembhavi.
International Conference on Computer Vision (ICCV), 2021 (Oral Presentation)
pdf | abstract | bibtex | code | project page

As an attempt towards assessing the robustness of embodied navigation agents, we propose RobustNav, a framework to quantify the performance of embodied navigation agents when exposed to a wide variety of visual – affecting RGB inputs – and dynamics – affecting transition dynamics – corruptions. Most recent efforts in visual navigation have typically focused on generalizing to novel target environments with similar appearance and dynamics characteristics. With RobustNav, we find that some standard embodied navigation agents significantly underperform (or fail) in the presence of visual or dynamics corruptions. We systematically analyze the kind of idiosyncrasies that emerge in the behavior of such agents when operating under corruptions. Finally, for visual corruptions in RobustNav, we show that while standard techniques to improve robustness such as data-augmentation and self-supervised adaptation offer some zero-shot resistance and improvements in navigation performance, there is still a long way to go in terms of recovering lost performance relative to clean “non-corrupt” settings, warranting more research in this direction.

@inproceedings{2021RobustNav, 
author = {Chattopadhyay, Prithvijit and 
	Hoffman, Judy and Mottaghi, 
	Roozbeh and Kembhavi, Ani},
title = {RobustNav: Towards Benchmarking 
	Robustness in Embodied 
	Navigation},
year = 2021,
booktitle = {International Conference on 
	Computer Vision (ICCV)}
}
Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
Viraj Prabhu, Arjun Chandrasekaran, Kate Saenko, Judy Hoffman.
International Conference on Computer Vision (ICCV), 2021
pdf | abstract | bibtex | project page | video

Generalizing deep neural networks to new target domains is critical to their real-world utility. In practice, it may be feasible to get some target data labeled, but to be cost-effective it is desirable to select a maximally-informative subset via active learning (AL). We study the problem of AL under a domain shift, called Active Domain Adaptation (Active DA). We empirically demonstrate how existing AL approaches based solely on model uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), i) identifies target instances for labeling that are both uncertain under the model and diverse in feature space, and ii) leverages the available source and target data for adaptation by optimizing a semi-supervised adversarial entropy loss that is complementary to our active sampling objective. On standard image classification-based domain adaptation benchmarks, ADA-CLUE consistently outperforms competing active adaptation, active learning, and domain adaptation methods across domain shifts of varying severity.
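The sampling idea in this abstract, picking target instances that are both uncertain and diverse, can be sketched as entropy-weighted k-means over embeddings. Everything below is an assumption made for illustration: the weighting by predictive entropy, the random initialization, the nearest-to-centroid selection rule, and all names. It should not be read as the authors' implementation.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each row of a probability matrix."""
    return -np.sum(p * np.log(p + eps), axis=1)

def select_for_labeling(embeddings, probs, budget, n_iter=20, seed=0):
    """Pick up to `budget` instance indices via uncertainty-weighted k-means.

    embeddings: (N, D) float features of unlabeled target instances.
    probs:      (N, C) model class probabilities for the same instances.
    """
    rng = np.random.default_rng(seed)
    weights = entropy(probs) + 1e-8   # small floor keeps weights strictly positive
    init = rng.choice(len(embeddings), size=budget, replace=False)
    centroids = embeddings[init].astype(float)
    for _ in range(n_iter):
        # Squared distances from every instance to every centroid: (N, budget).
        d = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for k in range(budget):
            members = assign == k
            if members.any():
                w = weights[members]
                # Uncertain instances pull the centroid harder than confident ones.
                centroids[k] = (w[:, None] * embeddings[members]).sum(axis=0) / w.sum()
    d = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Label the instance closest to each centroid; duplicates are merged.
    return np.unique(d.argmin(axis=0))

# Toy data: two well-separated blobs with random class probabilities.
rng = np.random.default_rng(1)
emb = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
logits = rng.normal(size=(40, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
chosen = select_for_labeling(emb, probs, budget=2)
```

The abstract also describes a semi-supervised adversarial entropy loss used for adaptation after labeling; that part is not covered by this sketch.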

@inproceedings{2021CLUE, 
author = {Prabhu, Viraj and Chandrasekaran, 
	Arjun and Saenko, Kate and 
	Hoffman, Judy},
title = {Active Domain Adaptation via 
	Clustering 
	Uncertainty-weighted 
	Embeddings},
year = 2021,
booktitle = {International Conference on 
	Computer Vision (ICCV)}
}
Temporal Action Detection with Multi-level Supervision
Baifeng Shi, Qi Dai, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu.
International Conference on Computer Vision (ICCV), 2021
abstract | bibtex

Training temporal action detection in videos requires large amounts of labeled data, yet such annotation is expensive to collect. Incorporating unlabeled or weakly-labeled data to train action detection models could help reduce annotation cost. In this work, we first introduce the Semi-supervised Action Detection (SSAD) task with a mixture of labeled and unlabeled data and analyze different types of errors in the proposed SSAD baselines which are directly adapted from the semi-supervised classification literature. Identifying that the main source of error is action incompleteness (i.e., missing parts of actions), we alleviate it by designing an unsupervised foreground attention (UFA) module utilizing the conditional independence between foreground and background motion. Then we incorporate weakly-labeled data into SSAD and propose Omni-supervised Action Detection (OSAD) with three levels of supervision. To overcome the accompanying action-context confusion problem in OSAD baselines, an information bottleneck (IB) is designed to suppress the scene information in non-action frames while preserving the action information. We extensively benchmark against the baselines for SSAD and OSAD on our created data splits in THUMOS14 and ActivityNet1.2, and demonstrate the effectiveness of the proposed UFA and IB methods. Lastly, the benefit of our full OSAD-IB model under limited annotation budgets is shown by exploring the optimal annotation strategy for labeled, unlabeled and weakly-labeled data.

@inproceedings{2021temporalIccv, 
author = {Shi, Baifeng and Dai, Qi and 
	Hoffman, Judy and Saenko, Kate 
	and Darrell, Trevor and Xu, 
	Huijuan},
title = {Temporal Action Detection with 
	Multi-level Supervision},
year = 2021,
booktitle = {International Conference on 
	Computer Vision (ICCV)}
}
Representation Learning Through Latent Canonicalizations
Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2021
pdf | abstract | bibtex

We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision. Many prior approaches to this problem have focused on learning disentangled representations so that as individual factors vary in a new domain, only a portion of the representation need be updated. In this work, we seek the generalization power of disentangled representations, but relax the requirement of explicit latent disentanglement and instead encourage linearity of individual factors of variation by requiring them to be manipulable by learned linear transformations. We dub these transformations latent canonicalizers, as they aim to modify the value of a factor to a pre-determined (but arbitrary) canonical value (e.g., recoloring the image foreground to black). Assuming a source domain with access to meta-labels specifying the factors of variation within an image, we demonstrate experimentally that our method helps reduce the number of observations needed to generalize to a similar target domain when compared to a number of supervised baselines.

@inproceedings{2020WacvLatentCanon, 
author = {Litany, Or and Morcos, Ari and 
	Sridhar, Srinath and Guibas, 
	Leonidas and Hoffman, Judy},
title = {Representation Learning Through 
	Latent Canonicalizations},
year = 2021,
booktitle = {IEEE Winter Conference on 
	Applications of Computer 
	Vision (WACV)}
}
Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Yogesh Balaji, Tom Goldstein, Judy Hoffman.
arXiv, 2020
pdf | abstract | bibtex | code

Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to the unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with a uniform perturbation radius around every training sample. Samples close to the decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produces poor decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training, a technique that enforces sample-specific perturbation margins around every training sample. We show that using our approach, test accuracy on unperturbed samples improves with a marginal drop in robustness. Extensive experiments on CIFAR-10, CIFAR-100 and ImageNet datasets demonstrate the effectiveness of our proposed approach.

@inproceedings{2020arXivInstanceAdaptive, 
author = {Balaji, Yogesh and Goldstein, Tom 
	and Hoffman, Judy},
title = {Instance adaptive adversarial 
	training: Improved accuracy 
	tradeoffs in neural nets},
year = 2020,
booktitle = {arXiv}
}
Multiple-Source Adaptation Theory and Algorithms
Ningshan Zhang, Mehryar Mohri, Judy Hoffman.
Annals of Mathematics and Artificial Intelligence, 2020
pdf | bibtex

@article{2020amai, 
author = {Zhang, Ningshan and Mohri, Mehryar 
	and Hoffman, Judy},
title = {Multiple-Source Adaptation Theory 
	and Algorithms},
year = 2020,
journal = {Annals of Mathematics and 
	Artificial Intelligence}
}
Auxiliary Task Reweighting for Minimum-data Learning
Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu.
Neural Information Processing Systems (NeurIPS), 2020
abstract | bibtex | code | project page | video

Supervised learning requires a large amount of training data, limiting its application where labeled data is scarce. To compensate for data scarcity, one possible method is to utilize auxiliary tasks to provide additional supervision for the main task. Assigning and optimizing the importance weights for different auxiliary tasks remains a crucial and largely understudied research question. In this work, we propose a method to automatically reweight auxiliary tasks in order to reduce the data requirement on the main task. Specifically, we formulate the weighted likelihood function of auxiliary tasks as a surrogate prior for the main task. By adjusting the auxiliary task weights to minimize the divergence between the surrogate prior and the true prior of the main task, we obtain a more accurate prior estimation, achieving the goal of minimizing the required amount of training data for the main task and avoiding a costly grid search. In multiple experimental settings (e.g. semi-supervised learning, multi-label classification), we demonstrate that our algorithm can effectively utilize limited labeled data of the main task with the benefit of auxiliary tasks compared with previous task reweighting methods. We also show that under extreme cases with only a few extra examples (e.g. few-shot domain adaptation), our algorithm results in significant improvement over the baseline.

@inproceedings{2020NeurIPSAux, 
author = {Shi, Baifeng and Hoffman, Judy and 
	Saenko, Kate and Darrell, 
	Trevor and Xu, Huijuan},
title = {Auxiliary Task Reweighting for 
	Minimum-data Learning},
year = 2020,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents
Samyak Datta, Oleksandr Maksymets, Judy Hoffman, Stefan Lee, Dhruv Batra, Devi Parikh.
Conference on Robot Learning (CoRL), 2020
abstract | bibtex

Recent work has presented embodied agents that can navigate to point-goal targets in novel indoor environments with near-perfect accuracy. However, these agents are equipped with idealized sensors for localization and take deterministic actions. This setting is practically sterile by comparison to the dirty reality of noisy sensors and actuations in the real world -- wheels can slip, motion sensors have error, actuations can rebound. In this work, we take a step towards this noisy reality, developing point-goal navigation agents that rely on visual estimates of egomotion under noisy action dynamics. We find these agents outperform naive adaptions of current point-goal agents to this setting as well as those incorporating classic localization baselines. Further, our model conceptually divides learning agent dynamics or odometry (where am I?) from task-specific navigation policy (where do I want to go?). This enables a seamless adaption to changing dynamics (a different robot or floor type) by simply re-calibrating the visual odometry model -- circumventing the expense of re-training of the navigation policy. Our agent was the runner-up in the PointNav track of CVPR 2020 Habitat Challenge.

@inproceedings{2020CorlEgo, 
author = {Datta, Samyak and Maksymets, 
	Oleksandr and Hoffman, Judy 
	and Lee, Stefan and Batra, 
	Dhruv and Parikh, Devi},
title = {Integrating Egocentric 
	Localization for More 
	Realistic Point-Goal 
	Navigation Agents},
year = 2020,
booktitle = {Conference on Robot Learning 
	(CoRL)}
}
Masked Reconstruction based Self-Supervision for Human Activity Recognition
Harish Haresamudram, Apoorva Beedu, Varun Agrawal, Patrick L Grady, Irfan Essa, Judy Hoffman, Thomas Ploetz.
International Symposium on Wearable Computers (ISWC), 2020
pdf | bibtex

@inproceedings{2020ISWC, 
author = {Haresamudram, Harish and Beedu, 
	Apoorva and Agrawal, Varun and 
	Grady, Patrick L and Essa, 
	Irfan and Hoffman, Judy and 
	Ploetz, Thomas},
title = {Masked Reconstruction based 
	Self-Supervision for Human 
	Activity Recognition},
year = 2020,
booktitle = {International Symposium on 
	Wearable Computers (ISWC)}
}
Learning to Balance Specificity and Invariance for In and Out of Domain Generalization
Prithvijit Chattopadhyay, Yogesh Balaji, Judy Hoffman.
European Conference on Computer Vision (ECCV), 2020
pdf | abstract | bibtex | code | video

We introduce Domain-specific Masks for Generalization, a model for improving both in-domain and out-of-domain generalization performance. For domain generalization, the goal is to learn from a set of source domains to produce a single model that will best generalize to an unseen target domain. As such, many prior approaches focus on learning representations which persist across all source domains with the assumption that these domain agnostic representations will generalize well. However, often individual domains contain characteristics which are unique and when leveraged can significantly aid in-domain recognition performance. To produce a model which best generalizes to both seen and unseen domains, we propose learning domain specific masks. The masks are encouraged to learn a balance of domain-invariant and domain-specific features, thus enabling a model which can benefit from the predictive power of specialized features while retaining the universal applicability of domain-invariant features. We demonstrate competitive performance compared to naive baselines and state-of-the-art methods on both PACS and DomainNet.

@inproceedings{2020EccvDMG, 
author = {Chattopadhyay, Prithvijit and 
	Balaji, Yogesh and Hoffman, 
	Judy},
title = {Learning to Balance Specificity 
	and Invariance for In and Out 
	of Domain Generalization},
year = 2020,
booktitle = {European Conference on Computer 
	Vision (ECCV)}
}
TIDE: A General Toolbox for Identifying Object Detection Errors
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman.
European Conference on Computer Vision (ECCV), 2020 (Spotlight Presentation)
pdf | abstract | bibtex | code | project page | video

We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms. Importantly, our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system. Thus, our framework can be used as a drop-in replacement for the standard mAP computation while providing a comprehensive analysis of each model’s strengths and weaknesses. We segment errors into six types and, crucially, are the first to introduce a technique for measuring the contribution of each error in a way that isolates its effect on overall performance. We show that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models.

@inproceedings{2020EccvTIDE, 
author = {Bolya, Daniel and Foley, Sean and 
	Hays, James and Hoffman, Judy},
title = {TIDE: A General Toolbox for 
	Identifying Object Detection 
	Errors},
year = 2020,
booktitle = {European Conference on Computer 
	Vision (ECCV)}
}
Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses
Fu Lin, Rohit Mittapalli, Prithvijit Chattopadhyay, Daniel Bolya, Judy Hoffman.
Adversarial Robustness in the Real World (AROW), ECCV, 2020 (Best paper runner up)
pdf | abstract | bibtex

Convolutional Neural Networks (CNNs) have been shown to be vulnerable to adversarial examples, which are known to locate in subspaces close to where normal data lies but are not naturally occurring and have low probability. In this work, we investigate the potential effect defense techniques have on the geometry of the likelihood landscape - likelihood of the input images under the trained model. We first propose a way to visualize the likelihood landscape by leveraging an energy-based model interpretation of discriminative classifiers. Then we introduce a measure to quantify the flatness of the likelihood landscape. We observe that a subset of adversarial defense techniques results in a similar effect of flattening the likelihood landscape. We further explore directly regularizing towards a flat landscape for adversarial robustness.

@inproceedings{2020EccvWLikelihood, 
author = {Lin, Fu and Mittapalli, Rohit and 
	Chattopadhyay, Prithvijit and 
	Bolya, Daniel and Hoffman, 
	Judy},
title = {Likelihood Landscapes: A Unifying 
	Principle Behind Many 
	Adversarial Defenses},
year = 2020,
booktitle = {Adversarial Robustness in the Real 
	World (AROW), ECCV}
}
SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation
Daniel Gordon, Abhishek Kadian, Devi Parikh, Judy Hoffman, Dhruv Batra.
International Conference on Computer Vision (ICCV), 2019
pdf | abstract | bibtex | code

We propose SplitNet, a method for decoupling visual perception and policy learning. By incorporating auxiliary tasks and selective learning of portions of the model, we explicitly decompose the learning objectives for visual navigation into perceiving the world and acting on that perception. We show dramatic improvements over baseline models on transferring between simulators, an encouraging step towards Sim2Real. Additionally, SplitNet generalizes better to unseen environments from the same simulator and transfers faster and more effectively to novel embodied navigation tasks. Further, given only a small sample from a target domain, SplitNet can match the performance of traditional end-to-end pipelines which receive the entire dataset.

@inproceedings{2019iccvsplitnet, 
author = {Gordon, Daniel and Kadian, 
	Abhishek and Parikh, Devi and 
	Hoffman, Judy and Batra, Dhruv},
title = {SplitNet: Sim2Sim and Task2Task 
	Transfer for Embodied Visual 
	Navigation},
year = 2019,
booktitle = {International Conference on 
	Computer Vision (ICCV)}
}
Robust Learning with Jacobian Regularization
Judy Hoffman, Daniel A. Roberts, Sho Yaida.
Conference on the Mathematical Theory of Deep Learning (DeepMath), 2019
pdf | abstract | bibtex | code

Design of reliable systems must guarantee stability against input perturbations. In machine learning, such guarantee entails preventing overfitting and ensuring robustness of models against corruption of input data. In order to maximize stability, we analyze and develop a computationally efficient implementation of Jacobian regularization that increases classification margins of neural networks. The stabilizing effect of the Jacobian regularizer leads to significant improvements in robustness, as measured against both random and adversarial input perturbations, without severely degrading generalization properties on clean data.
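One way to make a Jacobian penalty cheap is to estimate the squared Frobenius norm of the input-output Jacobian from a few random projections instead of forming the full matrix. The sketch below illustrates that estimator on a toy linear model. The random-projection scheme, the function names, and the toy setup are assumptions made for illustration, not the paper's released code.

```python
import numpy as np

def jacobian_frobenius_sq_exact(J):
    """Exact squared Frobenius norm of a (C, D) Jacobian."""
    return float(np.sum(J ** 2))

def jacobian_frobenius_sq_estimate(J, n_proj, rng):
    """Monte Carlo estimate of ||J||_F^2 from random unit projections.

    For v uniform on the unit sphere in R^C, E[||v^T J||^2] = ||J||_F^2 / C,
    so C times the sample mean is an unbiased estimate. In a real network
    each v^T J would come from one backward pass rather than a stored J.
    """
    C = J.shape[0]
    v = rng.standard_normal((n_proj, C))
    v /= np.linalg.norm(v, axis=1, keepdims=True)   # normalize rows to unit length
    proj = v @ J                                    # row i holds v_i^T J
    return float(C * np.mean(np.sum(proj ** 2, axis=1)))

# Toy linear model f(x) = W x, whose Jacobian is W at every input.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 50))
exact = jacobian_frobenius_sq_exact(W)
approx = jacobian_frobenius_sq_estimate(W, n_proj=5000, rng=rng)
```

Adding a small multiple of this quantity to the training loss penalizes sensitivity of the outputs to input perturbations, which is the stabilizing effect the abstract describes.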

@inproceedings{2019DeepMathJacobian, 
author = {Hoffman, Judy and Roberts, Daniel 
	A. and Yaida, Sho},
title = {Robust Learning with Jacobian 
	Regularization},
year = 2019,
booktitle = {Conference on the Mathematical 
	Theory of Deep Learning 
	(DeepMath)}
}
Predictive Inequity in Object Detection
Benjamin Wilson, Judy Hoffman, Jamie Morgenstern.
Workshop on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR, 2019
pdf | abstract | bibtex | code
Press: Vox | Business Insider | The Guardian | NBC News

In this work, we investigate whether state-of-the-art object detection systems have equitable predictive performance on pedestrians with different skin tones. This work is motivated by many recent examples of ML and vision systems displaying higher error rates for certain demographic groups than others. We annotate an existing large-scale dataset which contains pedestrians, BDD100K, with Fitzpatrick skin tones in ranges [1-3] or [4-6]. We then provide an in-depth comparative analysis of performance between these two skin tone groupings, finding that neither time of day nor occlusion explains this behavior, suggesting this disparity is not merely the result of pedestrians in the 4-6 range appearing in more difficult scenes for detection. We investigate to what extent time of day, occlusion, and reweighting the supervised loss during training affect this predictive bias.

@inproceedings{2019FateCV, 
author = {Wilson, Benjamin and Hoffman, Judy 
	and Morgenstern, Jamie},
title = {Predictive Inequity in Object 
	Detection},
year = 2019,
booktitle = {Workshop on Fairness 
	Accountability Transparency 
	and Ethics in Computer Vision 
	at CVPR}
}
Algorithms and Theory for Multiple-Source Adaptation
Judy Hoffman, Mehryar Mohri, Ningshan Zhang.
Neural Information Processing Systems (NeurIPS), 2018
pdf | abstract | bibtex

We present a number of novel contributions to the multiple-source adaptation problem. We derive new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits.
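
For readers unfamiliar with the term, a distribution-weighted combination blends per-source predictors using weights proportional to how likely the input is under each source distribution. The sketch below shows the familiar regression-style form h_z(x) = sum_k z_k D_k(x) h_k(x) / sum_j z_j D_j(x). The Gaussian densities, constant predictors, and mixture weights are hypothetical, and the sketch does not implement the paper's normalized cross-entropy solution or its algorithm for choosing z.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; stands in for a source density D_k(x)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def distribution_weighted(x, z, densities, predictors):
    """h_z(x) = sum_k z_k D_k(x) h_k(x) / sum_j z_j D_j(x)."""
    weights = [zk * dk(x) for zk, dk in zip(z, densities)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x has zero density under every source")
    return sum(w * h(x) for w, h in zip(weights, predictors)) / total


# Two hypothetical sources centred at -2 and +2, each with its own predictor.
densities = [lambda x: gaussian_pdf(x, -2.0, 1.0),
             lambda x: gaussian_pdf(x, 2.0, 1.0)]
predictors = [lambda x: 0.0, lambda x: 1.0]
z = [0.5, 0.5]  # mixture weights on the simplex (assumed, not learned here)

near_first = distribution_weighted(-2.0, z, densities, predictors)
midpoint = distribution_weighted(0.0, z, densities, predictors)
near_second = distribution_weighted(2.0, z, densities, predictors)
```

Inputs that look like the first source are handled almost entirely by the first predictor, inputs near the second source by the second, and points in between receive a blend.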

@inproceedings{2018neurips_madap, 
author = {Hoffman, Judy and Mohri, Mehryar 
	and Zhang, Ningshan},
title = {Algorithms and Theory for 
	Multiple-Source Adaptation},
year = 2018,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
CyCADA: Cycle Consistent Adversarial Domain Adaptation
Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alyosha Efros, Trevor Darrell.
International Conference in Machine Learning (ICML), 2018
pdf | abstract | bibtex | code | slides

Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models have shown tremendous progress towards adapting to new environments by focusing either on discovering domain invariant representations or by mapping between unpaired image domains. While feature space methods are difficult to interpret and sometimes fail to capture pixel-level and low-level domain shifts, image space methods sometimes fail to incorporate high level semantic knowledge relevant for the end task. We propose a model which adapts between domains using both generative image space alignment and latent representation space alignment. Our approach, Cycle-Consistent Adversarial Domain Adaptation (CyCADA), guides transfer between domains according to a specific discriminatively trained task and avoids divergence by enforcing consistency of the relevant semantics before and after adaptation. We evaluate our method on a variety of visual recognition and prediction settings, including digit classification and semantic segmentation of road scenes, advancing state-of-the-art performance for unsupervised adaptation from synthetic to real world driving domains.

@inproceedings{2018icmlcycada, 
author = {Hoffman, Judy and Tzeng, Eric and 
	Park, Taesung and Zhu, Jun-Yan 
	and Isola, Phillip and Saenko, 
	Kate and Efros, Alyosha and 
	Darrell, Trevor},
title = {CyCADA: Cycle Consistent 
	Adversarial Domain Adaptation},
year = 2018,
booktitle = {International Conference in 
	Machine Learning (ICML)}
}
Adapting to Continuously Shifting Domains
Andreea Bobu, Eric Tzeng, Judy Hoffman, Trevor Darrell.
International Conference on Learning Representations (ICLR) Workshop Track, 2018
pdf | abstract | bibtex

Domain adaptation typically focuses on adapting a model from a single source domain to a target domain. However, in practice, this paradigm of adapting from one source to one target is limiting, as different aspects of the real world such as illumination and weather conditions vary continuously and cannot be effectively captured by two static domains. Approaches that attempt to tackle this problem by adapting from a single source to many different target domains simultaneously are consistently unable to learn across all domain shifts. Instead, we propose an adaptation method that exploits the continuity between gradually varying domains by adapting in sequence from the source to the most similar target domain. By incrementally adapting while simultaneously efficiently regularizing against prior examples, we obtain a single strong model capable of recognition within all observed domains.

@inproceedings{2018iclrwBobu, 
author = {Bobu, Andreea and Tzeng, Eric and 
	Hoffman, Judy and Darrell, 
	Trevor},
title = {Adapting to Continuously Shifting 
	Domains},
year = 2018,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR) Workshop Track}
}
Scaling Human-Object Interaction Recognition through Zero-Shot Learning
Liyue Shen, Serena Yeung, Judy Hoffman, Greg Mori, Li Fei-Fei.
IEEE Winter Conference on Applications in Computer Vision (WACV), 2018
pdf | abstract | bibtex

Recognizing human-object interactions (HOI) is an important part of distinguishing the rich variety of human action in the visual world. While recent progress has been made in improving HOI recognition in the fully supervised setting, the space of possible human-object interactions is large and it is impractical to obtain labeled training data for all interactions of interest. In this work, we tackle the challenge of scaling HOI recognition to the long tail of categories through a zero-shot learning approach. We introduce a factorized model for HOI detection that disentangles reasoning on verbs and objects, and at test time can therefore produce detections for novel verb-object pairs. We present experiments on the recently introduced large-scale HICO-DET dataset, and show that our model performs comparably to the state of the art in fully supervised HOI detection while simultaneously achieving effective zero-shot detection of new HOI categories.
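
One way to picture the factorization: if verbs and objects are scored separately, any verb-object pair can be scored at test time, including pairs never seen together in training. The sketch below combines the two scores with a simple product, which is an assumption for illustration and not necessarily the scoring rule used in the paper; the labels and probabilities are made up.

```python
def score_pairs(verb_probs, object_probs):
    """Score every verb-object pair as the product of its two factors."""
    return {
        (verb, obj): pv * po
        for verb, pv in verb_probs.items()
        for obj, po in object_probs.items()
    }


# Hypothetical per-detection outputs from separate verb and object heads.
verb_probs = {"ride": 0.7, "hold": 0.2, "feed": 0.1}
object_probs = {"horse": 0.6, "elephant": 0.4}

# Pairs assumed to have appeared in training; ("ride", "elephant") did not.
seen_pairs = {("ride", "horse"), ("feed", "elephant"), ("hold", "horse")}

scores = score_pairs(verb_probs, object_probs)
best_pair = max(scores, key=scores.get)

# Restrict to pairs absent from training to see the zero-shot candidates.
novel = {pair: s for pair, s in scores.items() if pair not in seen_pairs}
best_novel = max(novel, key=novel.get)
```

Because neither head depends on the other's label set, adding a new object or verb extends the space of scorable pairs without requiring labeled examples of each combination.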

@inproceedings{2018wacv_hico, 
author = {Shen, Liyue and Yeung, Serena and 
	Hoffman, Judy and Mori, Greg 
	and Fei-Fei, Li},
title = {Scaling Human-Object Interaction 
	Recognition through Zero-Shot 
	Learning},
year = 2018,
booktitle = {IEEE Winter Conference on 
	Applications in Computer 
	Vision (WACV)}
}
Label Efficient Learning of Transferable Representations across Domains and Tasks
Zelun Luo, Yuliang Zou, Judy Hoffman, Li Fei-Fei.
Neural Information Processing Systems (NeurIPS), 2017
pdf | abstract | bibtex | project page

We propose a framework that learns a representation transferable across different domains and tasks in a label efficient manner. Our approach battles domain shift with a domain adversarial loss, and generalizes the embedding to novel tasks using a metric learning-based approach. Our model is simultaneously optimized on labeled source data and unlabeled or sparsely labeled data in the target domain. Our method shows compelling results on novel classes within a new domain even when only a few labeled examples per class are available, outperforming the prevalent fine-tuning approach. In addition, we demonstrate the effectiveness of our framework on the transfer learning task from image object recognition to video action recognition.

@inproceedings{2017NipsLuo, 
author = {Luo, Zelun and Zou, Yuliang and 
	Hoffman, Judy and Fei-Fei, Li},
title = {Label Efficient Learning of 
	Transferable Representations 
	across Domains and Tasks},
year = 2017,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach
Timnit Gebru, Judy Hoffman, Li Fei-Fei.
International Conference in Computer Vision (ICCV), 2017
pdf | abstract | bibtex

While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild. These fully supervised models need additional annotated images to classify objects in every new scenario, a task that is infeasible. However, sources such as e-commerce websites and field guides provide annotated images for many classes. In this work, we study fine-grained domain adaptation as a step towards overcoming the dataset shift between easily acquired annotated images and the real world. Adaptation has not been studied in the fine-grained setting where annotations such as attributes could be used to increase performance. Our work uses an attribute based multi-task adaptation loss to increase accuracy from a baseline of 4.1% to 19.1% in the semi-supervised adaptation case. Prior domain adaptation works have been benchmarked on small datasets such as [46] with a total of 795 images for some domains, or simplistic datasets such as [41] consisting of digits. We perform experiments on a subset of a new challenging fine-grained dataset consisting of 1,095,021 images of 2,657 car categories drawn from e-commerce websites and Google Street View.

@inproceedings{2017iccvGebru, 
author = {Gebru, Timnit and Hoffman, Judy 
	and Fei-Fei, Li},
title = {Fine-grained Recognition in the 
	Wild: A Multi-Task Domain 
	Adaptation Approach},
year = 2017,
booktitle = {International Conference in 
	Computer Vision (ICCV)}
}
Inferring and Executing Programs for Visual Reasoning
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick.
International Conference in Computer Vision (ICCV), 2017 (Oral Presentation)
pdf | abstract | bibtex | code | project page

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer. Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.

@inproceedings{2017iccvJohnson, 
author = {Johnson, Justin and Hariharan, 
	Bharath and van der Maaten, 
	Laurens and Hoffman, Judy and 
	Fei-Fei, Li and Zitnick, C. 
	Lawrence and Girshick, Ross},
title = {Inferring and Executing Programs 
	for Visual Reasoning},
year = 2017,
booktitle = {International Conference in 
	Computer Vision (ICCV)}
}
Adversarial Discriminative Domain Adaptation
Eric Tzeng, Judy Hoffman, Trevor Darrell, Kate Saenko.
Computer Vision and Pattern Recognition (CVPR), 2017
pdf | abstract | bibtex | code

Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They also can improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. We propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new more difficult cross-modality object classification task.

@inproceedings{2017cvprAdda, 
author = {Tzeng, Eric and Hoffman, Judy and 
	Darrell, Trevor and Saenko, 
	Kate},
title = {Adversarial Discriminative Domain 
	Adaptation},
year = 2017,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
Clockwork Convnets for Video Semantic Segmentation
Evan Shelhamer*, Kate Rakelly*, Judy Hoffman*, Trevor Darrell.
Video Semantic Segmentation Workshop at European Conference in Computer Vision, 2016
pdf | abstract | bibtex

Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic content of a scene evolves more slowly, and 2) execution can be viewed as an aspect of architecture, yielding purpose-fit computation schedules for networks. We define a novel family of “clockwork” convnets driven by fixed or adaptive clock signals that schedule the processing of different layers at different update rates according to their semantic stability. We design a pipeline schedule to reduce latency for real-time recognition and a fixed-rate schedule to reduce overall computation. Finally, we extend clockwork scheduling to adaptive video processing by incorporating data-driven clocks that can be tuned on unlabeled video. The accuracy and efficiency of clockwork convnets are evaluated on the Youtube-Objects, NYUD, and Cityscapes video datasets.
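
The fixed-rate schedule can be pictured with a small sketch: each stage of a network has its own clock period, and a stage is recomputed only on frames where its clock fires; otherwise its cached output is reused. The class name, stage functions, and periods below are hypothetical stand-ins, not the paper's architecture.

```python
class ClockworkPipeline:
    """Run a chain of stages, refreshing stage i every periods[i] frames."""

    def __init__(self, stages, periods):
        if len(stages) != len(periods):
            raise ValueError("need one period per stage")
        self.stages = stages
        self.periods = periods
        self.cache = [None] * len(stages)   # last output of each stage
        self.updates = [0] * len(stages)    # how often each stage actually ran

    def step(self, frame_index, frame):
        value = frame
        for i, (stage, period) in enumerate(zip(self.stages, self.periods)):
            # Recompute when the stage's clock fires (or nothing is cached yet).
            if frame_index % period == 0 or self.cache[i] is None:
                self.cache[i] = stage(value)
                self.updates[i] += 1
            value = self.cache[i]
        return value


# Toy stages: a cheap "shallow" stage every frame, a "deep" stage every 4th.
pipeline = ClockworkPipeline(
    stages=[lambda x: x + 1, lambda x: x * 10],
    periods=[1, 4],
)
outputs = [pipeline.step(t, t) for t in range(8)]
```

Over eight frames the shallow stage runs eight times and the deep stage only twice, so the deep output is held constant between its clock ticks.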

@inproceedings{2016eccvw_clockwork, 
author = {Shelhamer*, Evan and Rakelly*, 
	Kate and Hoffman*, Judy and 
	Darrell, Trevor},
title = {Clockwork Convnets for Video 
	Semantic Segmentation},
year = 2016,
booktitle = {Video Semantic Segmentation 
	Workshop at European 
	Conference in Computer Vision}
}
Adapting deep visuomotor representations with weak pairwise constraints
Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell.
Workshop on Algorithmic Foundations in Robotics (WAFR), 2016
pdf | abstract | bibtex

Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large easy-to-obtain source dataset (e.g. synthetic images) to a target real-world domain, without requiring expensive manual data annotation of real world data before policy search. Supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation. However, they require manual alignment of such image pairs. Fully unsupervised adaptation methods rely on minimizing the discrepancy between the feature distributions across domains. We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains. Focusing on adapting from simulation to real world data using a PR2 robot, we evaluate our approach on a manipulation task and show that by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world.

@inproceedings{2016wafrTzeng, 
author = {Tzeng, Eric and Devin, Coline and 
	Hoffman, Judy and Finn, 
	Chelsea and Abbeel, Pieter and 
	Levine, Sergey and Saenko, 
	Kate and Darrell, Trevor},
title = {Adapting deep visuomotor 
	representations with weak 
	pairwise constraints},
year = 2016,
booktitle = {Workshop on Algorithmic 
	Foundations in Robotics (WAFR)}
}
Fine-To-Coarse Knowledge Transfer For Low-Res Image Classification
Xingchao Peng, Judy Hoffman, Stella Yu, Kate Saenko.
International Conference on Image Processing (ICIP), 2016
pdf | abstract | bibtex

We address the difficult problem of distinguishing fine-grained object categories in low resolution images. We propose a simple and effective deep learning approach that transfers fine-grained knowledge gained from high resolution training data to the coarse low-resolution test scenario. Such fine-to-coarse knowledge transfer has many real world applications, such as identifying objects in surveillance photos or satellite images where the image resolution at test time is very low but plenty of high resolution photos of similar objects are available. Our extensive experiments on two standard benchmark datasets containing fine-grained car models and bird species demonstrate that our approach can effectively transfer fine-detail knowledge to coarse-detail imagery.

@inproceedings{2016icipPeng, 
author = {Peng, Xingchao and Hoffman, Judy 
	and Yu, Stella and Saenko, 
	Kate},
title = {Fine-To-Coarse Knowledge Transfer 
	For Low-Res Image 
	Classification},
year = 2016,
booktitle = {International Conference on Image 
	Processing (ICIP)}
}
Cross-Modal Adaptation for RGB-D Detection
Judy Hoffman, Saurabh Gupta, Jian Leong, Sergio Guadarrama, Trevor Darrell.
International Conference in Robotics and Automation (ICRA), 2016
pdf | abstract | bibtex | slides

@inproceedings{2016icraHoffman, 
author = {Hoffman, Judy and Gupta, Saurabh 
	and Leong, Jian and 
	Guadarrama, Sergio and 
	Darrell, Trevor},
title = {Cross-Modal Adaptation for RGB-D 
	Detection},
year = 2016,
booktitle = {International Conference in 
	Robotics and Automation (ICRA)}
}
Cross Modal Distillation for Supervision Transfer
Saurabh Gupta, Judy Hoffman, Jitendra Malik.
Computer Vision and Pattern Recognition (CVPR), 2016
pdf | abstract | bibtex | code

In this work we propose a technique that transfers supervision between images from different modalities. We use learned representations from a large labeled modality as a supervisory signal for training representations for a new unlabeled paired modality. Our method enables learning of rich representations for unlabeled modalities and can be used as a pre-training procedure for new modalities with limited labeled data. We show experimental results where we transfer supervision from labeled RGB images to unlabeled depth and optical flow images and demonstrate large improvements for both these cross modal supervision transfers.

@inproceedings{2016cvprGupta, 
author = {Gupta, Saurabh and Hoffman, Judy 
	and Malik, Jitendra},
title = {Cross Modal Distillation for 
	Supervision Transfer},
year = 2016,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
Learning with Side Information through Modality Hallucination
Judy Hoffman, Saurabh Gupta, Trevor Darrell.
Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight Presentation)
pdf | abstract | bibtex | slides

We present a modality hallucination architecture for training an RGB object detection model which incorporates depth side information at training time. Our convolutional hallucination network learns a new and complementary RGB image representation which is taught to mimic convolutional mid-level features from a depth network. At test time images are processed jointly through the RGB and hallucination networks to produce improved detection performance. Thus, our method transfers information commonly extracted from depth training data to a network which can extract that information from the RGB counterpart. We present results on the standard NYUDv2 dataset and report improvement on the RGB detection task.

@inproceedings{2016cvprHoffman, 
author = {Hoffman, Judy and Gupta, Saurabh 
	and Darrell, Trevor},
title = {Learning with Side Information 
	through Modality Hallucination},
year = 2016,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
Quantification in-the-wild: data-sets and baselines
Oscar Beijbom, Judy Hoffman, Evan Yao, Trevor Darrell, Alberto Rodriguez-Ramirez, Manuel González-Rivero, Ove Hoegh-Guldberg.
Transfer and Multi-Task Learning: Trends and New Perspectives, Workshop at NeurIPS, 2015
pdf | bibtex

@inproceedings{2015nipswBeijbom, 
author = {Beijbom, Oscar and Hoffman, Judy 
	and Yao, Evan and Darrell, 
	Trevor and Rodriguez-Ramirez, 
	Alberto and Gonz{\'a}lez-Rivero, 
	Manuel and Hoegh-Guldberg, Ove},
title = {Quantification in-the-wild: 
	data-sets and baselines},
year = 2015,
booktitle = {Transfer and Multi-Task Learning: 
	Trends and New Perspectives, 
	Workshop at NeurIPS}
}
Spatial Semantic Regularisation for Large Scale Object Detection
Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell.
International Conference in Computer Vision (ICCV), 2015
pdf | abstract | bibtex

Large scale object detection with thousands of classes introduces the problem of many contradicting false positive detections, which have to be suppressed. Class-independent non-maximum suppression has traditionally been used for this step, but it does not scale well as the number of classes grows. Traditional non-maximum suppression does not consider label- and instance-level relationships, nor does it allow an exploitation of the spatial layout of detection proposals. We propose a new multi-class spatial semantic regularisation method based on affinity propagation clustering, which simultaneously optimises across all categories and all proposed locations in the image, to improve both the localisation and categorisation of selected detection proposals. Constraints are shared across the labels through the semantic WordNet hierarchy. Our approach proves to be especially useful in large scale settings with thousands of classes, where spatial and semantic interactions are very frequent and only weakly supervised detectors can be built due to a lack of bounding box annotations. Detection experiments are conducted on the ImageNet and COCO datasets, and in settings with thousands of detected categories. Our method provides a significant precision improvement by reducing false positives, while simultaneously improving the recall.
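
For context, the traditional baseline mentioned above, greedy non-maximum suppression, can be sketched in a few lines. This is the standard procedure the abstract contrasts against, not the proposed affinity-propagation method; the box format (x1, y1, x2, y2), the IoU threshold, and the function names are assumptions for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Each decision here looks only at pairwise overlap and score, which is the limitation the abstract points to: label relationships and the overall spatial layout of proposals play no role.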

@inproceedings{2015iccvMrowca, 
author = {Mrowca, Damian and Rohrbach, 
	Marcus and Hoffman, Judy and 
	Hu, Ronghang and Saenko, Kate 
	and Darrell, Trevor},
title = {Spatial Semantic Regularisation 
	for Large Scale Object 
	Detection},
year = 2015,
booktitle = {International Conference in 
	Computer Vision (ICCV)}
}
Simultaneous Deep Transfer Across Domains and Tasks
Eric Tzeng*, Judy Hoffman*, Trevor Darrell, Kate Saenko.
International Conference in Computer Vision (ICCV), 2015
pdf | abstract | bibtex | code

Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias. Fine-tuning deep models in a new domain can require a significant amount of labeled data, which for many applications is simply not available. We propose a new CNN architecture to exploit unlabeled and sparsely labeled target domain data. Our approach simultaneously optimizes for domain invariance to facilitate domain transfer and uses a soft label distribution matching loss to transfer information between tasks. Our proposed adaptation method offers empirical performance which exceeds previously published results on two standard benchmark visual domain adaptation tasks, evaluated across supervised and semi-supervised adaptation settings.

@inproceedings{2015iccvTzeng, 
author = {Tzeng*, Eric and Hoffman*, Judy 
	and Darrell, Trevor and 
	Saenko, Kate},
title = {Simultaneous Deep Transfer Across 
	Domains and Tasks},
year = 2015,
booktitle = {International Conference in 
	Computer Vision (ICCV)}
}
Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning
Judy Hoffman, Deepak Pathak, Trevor Darrell, Kate Saenko.
Computer Vision and Pattern Recognition (CVPR), 2015
pdf | abstract | bibtex

We develop methods for detector learning which exploit joint training over both weak and strong labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks. Previous methods for weak-label learning often learn detector models independently using latent variable optimization, but fail to share deep representation knowledge across classes and usually require strong initialization. Other previous methods transfer deep representations from domains with strong labels to those with only weak labels, but do not optimize over individual latent boxes, and thus may miss specific salient structures for a particular category. We propose a model that subsumes these previous approaches, and simultaneously trains a representation and detectors for categories with either weak or strong labels present. We provide a novel formulation of a joint multiple instance learning method that includes examples from classification-style data when available, and also performs domain transfer learning to improve the underlying detector representation. Our model outperforms known methods on ImageNet-200 detection with weak labels.

@inproceedings{2015cvprHoffman, 
author = {Hoffman, Judy and Pathak, Deepak 
	and Darrell, Trevor and 
	Saenko, Kate},
title = {Detector Discovery in the Wild: 
	Joint Multiple Instance and 
	Representation Learning},
year = 2015,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
LSDA: Large Scale Detection through Adaptation
Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko.
Neural Information Processing Systems (NeurIPS), 2014
pdf | abstract | bibtex | project page | slides

A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2fps for the 7.6K detector).

@inproceedings{2014nipsHoffman, 
author = {Hoffman, Judy and Guadarrama, 
	Sergio and Tzeng, Eric and Hu, 
	Ronghang and Donahue, Jeff and 
	Girshick, Ross and Darrell, 
	Trevor and Saenko, Kate},
title = {LSDA: Large Scale Detection 
	through Adaptation},
year = 2014,
booktitle = {Neural Information Processing 
	Systems (NeurIPS)}
}
Interactive Adaptation of Real-Time Object Detectors
Daniel Goehring, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell.
International Conference in Robotics and Automation (ICRA), 2014
pdf | bibtex | project page

@inproceedings{2014icraGoering, 
author = {Goehring, Daniel and Hoffman, Judy 
	and Rodner, Erik and Saenko, 
	Kate and Darrell, Trevor},
title = {Interactive Adaptation of 
	Real-Time Object Detectors},
year = 2014,
booktitle = {International Conference in 
	Robotics and Automation (ICRA)}
}
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell.
International Conference in Machine Learning (ICML), 2014
pdf | abstract | bibtex | code

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters, to enable vision researchers to conduct experimentation with deep representations across a range of visual concept learning paradigms.

@inproceedings{2014icmlDonahue, 
author = {Donahue, Jeff and Jia, Yangqing 
	and Vinyals, Oriol and 
	Hoffman, Judy and Zhang, Ning 
	and Tzeng, Eric and Darrell, 
	Trevor},
title = {DeCAF: A Deep Convolutional 
	Activation Feature for Generic 
	Visual Recognition},
year = 2014,
booktitle = {International Conference on 
	Machine Learning (ICML)}
}
Continuous Manifold Based Adaptation for Evolving Visual Domains
Judy Hoffman, Trevor Darrell, Kate Saenko.
Computer Vision and Pattern Recognition (CVPR), 2014
pdf | abstract | bibtex | project page | video

We pose the following question: what happens when test data not only differs from training data, but differs from it in a continually evolving way? The classic domain adaptation paradigm considers the world to be separated into stationary domains with clear boundaries between them. However, in many real-world applications, examples cannot be naturally separated into discrete domains, but arise from a continuously evolving underlying process. Examples include video with gradually changing lighting and spam email with evolving spammer tactics. We formulate a novel problem of adapting to such continuous domains, and present a solution based on smoothly varying embeddings. Recent work has shown the utility of considering discrete visual domains as fixed points embedded in a manifold of lower-dimensional subspaces. Adaptation can be achieved via transforms or kernels learned between such stationary source and target subspaces. We propose a method to consider non-stationary domains, which we refer to as Continuous Manifold Adaptation (CMA). We treat each target sample as potentially being drawn from a different subspace on the domain manifold, and present a novel technique for continuous transform-based adaptation. Our approach can learn to distinguish categories using training data collected at some point in the past, and continue to update its model of the categories for some time into the future, without receiving any additional labels. Experiments on two visual datasets demonstrate the value of our approach for several popular feature representations.

@inproceedings{2014cvprHoffman, 
author = {Hoffman, Judy and Darrell, Trevor 
	and Saenko, Kate},
title = {Continuous Manifold Based 
	Adaptation for Evolving Visual 
	Domains},
year = 2014,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
Asymmetric and Category Invariant Feature Transformations for Domain Adaptation
Judy Hoffman, Erik Rodner, Jeff Donahue, Brian Kulis, Kate Saenko.
International Journal of Computer Vision (IJCV), 2013
pdf | abstract | bibtex

We address the problem of visual domain adaptation for transferring object models from one dataset or visual domain to another. We introduce a unified flexible model for both supervised and semi-supervised learning that allows us to learn transformations between domains. Additionally, we present two instantiations of the model, one for general feature adaptation/alignment, and one specifically designed for classification. First, we show how to extend metric learning methods for domain adaptation, allowing for learning metrics independent of the domain shift and the final classifier used. Furthermore, we go beyond classical metric learning by extending the method to asymmetric, category independent transformations. Our framework can adapt features even when the target domain does not have any labeled examples for some categories, and when the target and source features have different dimensions. Finally, we develop a joint learning framework for adaptive classifiers, which outperforms competing methods in terms of multi-class accuracy and scalability. We demonstrate the ability of our approach to adapt object recognition models under a variety of situations, such as differing imaging conditions, feature types, and codebooks. The experiments show its strong performance compared to previous approaches and its applicability to large-scale scenarios.

@article{2013ijcvHoffman, 
author = {Hoffman, Judy and Rodner, Erik and 
	Donahue, Jeff and Kulis, Brian 
	and Saenko, Kate},
title = {Asymmetric and Category Invariant 
	Feature Transformations for 
	Domain Adaptation},
year = 2013,
journal = {International Journal of Computer 
	Vision (IJCV)}
}
Semi-Supervised Domain Adaptation with Instance Constraints
Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell.
Computer Vision and Pattern Recognition (CVPR), 2013
pdf | bibtex | poster

@inproceedings{2013cvprDonahue, 
author = {Donahue, Jeff and Hoffman, Judy 
	and Rodner, Erik and Saenko, 
	Kate and Darrell, Trevor},
title = {Semi-Supervised Domain Adaptation 
	with Instance Constraints},
year = 2013,
booktitle = {Computer Vision and Pattern 
	Recognition (CVPR)}
}
Efficient Learning of Domain-invariant Image Representations
Judy Hoffman, Erik Rodner, Jeff Donahue, Kate Saenko, Trevor Darrell.
International Conference on Learning Representations (ICLR), 2013 (Oral Presentation)
pdf | abstract | bibtex | code | slides

We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, and introduce an efficient cost function based on misclassification loss. Our method combines several features previously unavailable in a single algorithm: multi-class adaptation through representation learning, ability to map across heterogeneous feature spaces, and scalability to large datasets. We present experiments on several image datasets that demonstrate improved accuracy and computational advantages compared to previous approaches.

@inproceedings{2013iclrHoffman, 
author = {Hoffman, Judy and Rodner, Erik and 
	Donahue, Jeff and Saenko, Kate 
	and Darrell, Trevor},
title = {Efficient Learning of 
	Domain-invariant Image 
	Representations},
year = 2013,
booktitle = {International Conference on 
	Learning Representations 
	(ICLR)}
}
Discovering Latent Domains For Multisource Domain Adaptation
Judy Hoffman, Brian Kulis, Trevor Darrell, Kate Saenko.
European Conference on Computer Vision (ECCV), 2012
pdf | bibtex | code | slides | poster | video

@inproceedings{2012eccvHoffman, 
author = {Hoffman, Judy and Kulis, Brian and 
	Darrell, Trevor and Saenko, 
	Kate},
title = {Discovering Latent Domains For 
	Multisource Domain Adaptation},
year = 2012,
booktitle = {European Conference on Computer 
	Vision (ECCV)}
}
Weakly Supervised Learning of Object Segmentations from Web-Scale Video
Glen Hartmann, Matthias Grundmann, Judy Hoffman, David Tsai, Vivek Kwatra, Omid Madani, Sudheendra Vijayanarasimhan, Irfan Essa, James Rehg, Rahul Sukthankar.
eccvw-webscale, 2012 (Best Paper Award)
pdf | abstract | bibtex

@inproceedings{2012eccvwHartmann, 
author = {Hartmann, Glen and Grundmann, 
	Matthias and Hoffman, Judy and 
	Tsai, David and Kwatra, Vivek 
	and Madani, Omid and 
	Vijayanarasimhan, Sudheendra 
	and Essa, Irfan and Rehg, 
	James and Sukthankar, Rahul},
title = {Weakly Supervised Learning of 
	Object Segmentations from 
	Web-Scale Video},
year = 2012,
booktitle = {eccvw-webscale}
}
Domain Adaptation with Multiple Latent Domains
Judy Hoffman, Kate Saenko, Brian Kulis, Trevor Darrell.
Domain Adaptation Workshop at Neural Information Processing Systems (NeurIPS), 2011 (Best Student Paper Award)
bibtex

@inproceedings{2011nipswHoffman, 
author = {Hoffman, Judy and Saenko, Kate and 
	Kulis, Brian and Darrell, 
	Trevor},
title = {Domain Adaptation with Multiple 
	Latent Domains},
year = 2011,
booktitle = {Domain Adaptation Workshop at 
	Neural Information Processing 
	Systems (NeurIPS)}
}
EG-RRT: Environment-Guided Random Trees for Kinodynamic Motion Planning with Uncertainty and Obstacles
Leonard Jaillet, Judy Hoffman, Jur van den Berg, Pieter Abbeel, Josep M. Porta, Ken Goldberg.
International Conference on Intelligent Robots and Systems (IROS), 2011
pdf | abstract | bibtex

Existing sampling-based robot motion planning methods are often inefficient at finding trajectories for kinodynamic systems, especially in the presence of narrow passages between obstacles and uncertainty in control and sensing. To address this, we propose EG-RRT, an Environment-Guided variant of RRT designed for kinodynamic robot systems that combines elements from several prior approaches and may incorporate a cost model based on the LQG-MP framework to estimate the probability of collision under uncertainty in control and sensing. We compare the performance of EG-RRT with several prior approaches on challenging sample problems. Results suggest that EG-RRT offers significant improvements in performance.

@inproceedings{2011irosJaillet, 
author = {Jaillet, Leonard and Hoffman, Judy 
	and van den Berg, Jur and 
	Abbeel, Pieter and Porta, 
	Josep M. and Goldberg, Ken},
title = {EG-RRT: Environment-Guided Random 
	Trees for Kinodynamic Motion 
	Planning with Uncertainty and 
	Obstacles},
year = 2011,
booktitle = {International Conference on 
	Intelligent Robots and 
	Systems (IROS)}
}
Teaching
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2022 (Instructor)
CS 4476-A: Introduction to Computer Vision
Georgia Tech - Spring 2022 (Instructor)
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2021 (Instructor)
CS 4476: Introduction to Computer Vision
Georgia Tech - Spring 2021 (Instructor)
CS 4476/6476: Computer Vision
Georgia Tech - Spring 2020 (Instructor)
CS 8803 LS: Machine Learning with Limited Supervision
Georgia Tech - Fall 2019 (Instructor)
CS 188: Introduction to Artificial Intelligence
UC Berkeley - Spring 2013 (Teaching Assistant)
EE 20N: Signals and Systems
UC Berkeley - Fall 2009 (Teaching Assistant)
Selected Awards
Selected Service and Outreach

  • Mentoring and Outreach
    • Mentor to junior researchers at CVPR Main Conference 2021
    • Panelist at Woodward Academy High School Discussion on Bias in Machine Learning, 2021
    • African Masters in Machine Intelligence Research Mentor, 2020
    • Mentor at Women in Computer Vision (2018-2021)
    • Mentor at Doctoral Consortium, ICCV 2019
    • Mentor at Women in Machine Learning (2018)
  • Leadership
    • Co-founder and continued advisor for Women in Computer Vision (WiCV), 2015 - Present
      Provides support for junior researchers who identify as women to attend (travel support), present (exposure), and receive mentorship at top Computer Vision Conferences.
    • Co-organizer LVIS Workshop and Challenge at ICCV, 2021
    • Co-organizer Adversarial Robustness in the Real World Workshop at ICCV, 2021
    • Co-organizer Responsible Computer Vision Workshop at CVPR, 2021
    • Co-organizer Adversarial Machine Learning in Computer Vision Tutorial at CVPR, 2021
    • Co-organizer Learning from Limited and Imperfect Data Workshop at CVPR, 2021
    • Co-organizer Tutorial on Learning with Limited Labels at ICCV, 2019
    • Co-organizer TASK-CV Workshop, ICCV 2019
    • Co-organizer TASK-CV Workshop and Domain Adaptation Challenge, ECCV 2018
    • Co-organizer TASK-CV Workshop and Domain Adaptation Challenge, ICCV 2017
    • Co-organizer Transfer and Multi-task learning workshop at NeurIPS, 2015
  • Recent Editorial Service
    • Associate Editor T-PAMI, 2021 - Present
    • Associate Editor IJCV, 2020 - Present
    • Area Chair CVPR, 2019 - 2021
    • Area Chair NeurIPS, 2021
    • Area Chair ICLR, 2019-2020
    • Area Chair ICCV, 2019, 2021
    • Area Chair ICML, 2020
    • See CV for full reviewing list

Sponsors

My research is made possible by the generous support of the following organizations.