Viraj Prabhu

I am a research scientist at Salesforce AI, where I work on building reliable multimodal foundation models. I received my PhD in Computer Science from Georgia Tech in December 2023, where I was advised by Judy Hoffman. My PhD focused on developing reliable computer vision systems that can generalize and adapt to changing conditions.

I earned my Master's in CS from Georgia Tech in 2019 (and received the MS Research award), where I was advised by Devi Parikh and worked on developing visual conversational agents. During grad school, I interned at NVIDIA (with Sanja Fidler), Salesforce (with Nikhil Naik), and Curai (with Anitha Kannan). Before that, I had stints as a research assistant at Virginia Tech (with Dhruv Batra) and Adobe, and as a mentor for Google Summer of Code. I received my Bachelor's degree in Computer Science from BITS Pilani in 2015. In my free time, I enjoy reading, running, soccer, and playing the guitar.

News

[Oct '24] New preprints on training and evaluating multimodal foundation models out!
[Jan '24] Defended my Ph.D. thesis and started a new role as a Research Scientist at Salesforce AI!
[Aug '23-Mar '24] Invited talks on "Reliable Computer Vision" at Caltech, AWS Responsible AI, UC Berkeley, and CMU.

[Jun '22] Gave a tutorial on Human-Centered AI for Computer Vision at CVPR 2022.
[May '22] Co-organizing Learning from Limited and Imperfect Data at ECCV 2022.
[Oct '21] Recognized as an outstanding reviewer at NeurIPS 2021 and CVPR 2021.
[Jan '21] Head TA for Intro to Computer Vision, Spring 2021.
[May '19] Completed my Master's degree!
[Mar '19] Awarded the College of Computing's MS Research award (1 student annually).


Publications

Trust but Verify: Programmatic VLM Evaluation in the Wild

Responsibly Building the Next Generation of Multimodal Foundational Models, NeurIPS 2024

Viraj Prabhu, Senthil Purushwalkam, An Yan, Caiming Xiong, Ran Xu

Paper Data

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Emergent Visual Abilities and Limits of Foundation Models, ECCV 2024

Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles, Caiming Xiong, Ran Xu

Project page

We're Not Using Videos Effectively: An Updated Video Domain Adaptation Baseline

TMLR 2024

Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Judy Hoffman, Prithvijit Chattopadhyay, Viraj Prabhu

Paper Code

Translating Labels to Solve Annotation Mismatches Across Object Detection Datasets

ICLR 2024

Yuan-Hong Liao, David Acuna, Rafid Mahmood, James Lucas, Viraj Prabhu, Sanja Fidler

Paper

AUGCAL: Sim-to-Real Adaptation by Improving Uncertainty Calibration on Augmented Synthetic Images

ICLR 2024

Prithvijit Chattopadhyay, Bharat Goyal, Bogi Ecsedi, Viraj Prabhu, Judy Hoffman

Paper

LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

NeurIPS 2023

Viraj Prabhu, Sriram Yenamandra, Prithvijit Chattopadhyay, Judy Hoffman

Paper Code Project Page News

Bridging the Sim2Real gap with CARE: Supervised Detection Adaptation with Conditional Alignment and Reweighting

TMLR 2023

Viraj Prabhu, David Acuna, Yuan-Hong Liao, Rafid Mahmood, Marc T. Law, Judy Hoffman, Sanja Fidler, James Lucas

Paper

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Vision Tasks

NeurIPS 2023 (Datasets & Benchmarks)

Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Adrien Bardes, Mark Ibrahim, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein

Paper Code

FACTS: First Amplify Correlations and Then Slice to Discover Bias

ICCV 2023

Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman

Paper

ICON²: Reliably Benchmarking Inequity in Detection by Identifying and Controlling for Confounders

Safe and Secure Autonomous Driving, CVPR 2023

Sruthi Sudhakar, Viraj Prabhu, Olga Russakovsky, Judy Hoffman

Paper

Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

NeurIPS 2022

Viraj Prabhu*, Sriram Yenamandra*, Aaditya Singh, Judy Hoffman (* = equal contribution)

Paper News

Can domain adaptation make object recognition work for everyone?

Learning with Limited Labelled Data, CVPR 2022

Viraj Prabhu, Ramprasaath R. Selvaraju, Judy Hoffman, Nikhil Naik

Paper

AUGCO: Augmentation Consistency-guided Self-training for Source-free Domain Adaptive Segmentation

Computer Vision in the Wild, ECCV 2022 (spotlight)

Viraj Prabhu*, Shivam Khare*, Deeksha Kartik, Judy Hoffman (* = equal contribution)

Paper Video

UDIS: Unsupervised Discovery of Bias in Deep Visual Recognition Models

BMVC 2021

Arvind Krishnakumar, Viraj Prabhu, Sruthi Sudhakar, Judy Hoffman

Paper Code

Mitigating Bias in Visual Transformers via Targeted Alignment

BMVC 2021

Sruthi Sudhakar, Viraj Prabhu, Arvind Krishnakumar, Judy Hoffman

Paper

Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation

ICCV 2021

Viraj Prabhu, Shivam Khare, Deeksha Kartik, Judy Hoffman

Paper Project Page Code Video Slides Poster

Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings

ICCV 2021

Viraj Prabhu, Arjun Chandrasekaran, Kate Saenko, Judy Hoffman

Paper Project Page Code Video Slides Poster

Open Set Medical Diagnosis

Machine Learning for Health, NeurIPS 2019

Viraj Prabhu, Anitha Kannan, Geoffrey J. Tso, Namit Katariya, Manish Chablani, David Sontag, Xavier Amatriain

Paper

Few-shot Learning for Dermatological Disease Diagnosis

MLHC 2019 (spotlight)

Viraj Prabhu, Anitha Kannan, Murali Ravuri, Manish Chablani, David Sontag, Xavier Amatriain

Paper

Do Explanations make VQA Models more Predictable to a Human?

EMNLP 2018

Arjun Chandrasekaran*, Viraj Prabhu*, Deshraj Yadav*, Prithvijit Chattopadhyay*, Devi Parikh (* = equal contribution)

Paper

The Promise of Premise: Harnessing Question Premises in Visual Question Answering

EMNLP 2017

Aroma Mahendru*, Viraj Prabhu*, Akrit Mohapatra*, Dhruv Batra, Stefan Lee (* = equal contribution)

Paper Code Dataset

Evaluating Visual Conversational Agents via Cooperative Human-AI Games

HCOMP 2017

Prithvijit Chattopadhyay*, Deshraj Yadav*, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, Devi Parikh (* = equal contribution)

Paper Code

Projects

Fabrik: An Online Collaborative Neural Network Editor

Utsav Garg, Viraj Prabhu, Deshraj Yadav, Ram Ramrakhya, Harsh Agrawal, Dhruv Batra

Lead mentor on Fabrik, an open-source web platform to collaboratively build, visualize, and design neural networks in the browser.

Report Code

PyTorch implementation of Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Nirbhay Modhe, Viraj Prabhu, Michael Cogswell, Satwik Kottur, Abhishek Das, Stefan Lee, Devi Parikh, Dhruv Batra

Code

Adobe Captivate Prime

During my time as a software developer at Adobe (Aug '15-'16), I was responsible for the Captivate Prime Android app through two release cycles. I developed features for offline content playback, syncing, and UI.

Automated camera calibration

During a research internship at Tonbo Imaging (Spring '15), I designed and implemented an algorithm for automated camera calibration.

KeyframeCut

Developed KeyframeCut, a fast graphcut-based segmentation algorithm for real-time background substitution in video, which was tech-transferred to Adobe Presenter Video Express 11.

Demo Blog
