Vishvak Murahari

I am a third-year CS PhD student (focusing on Natural Language Processing and Machine Learning) at Princeton University, advised by Prof. Karthik Narasimhan. I am also a Student Researcher at Google Princeton, advised by Prof. Elad Hazan.

I earned my Master's from Georgia Tech, where I was fortunate to be advised by Prof. Devi Parikh and Abhishek Das and to work closely with Prof. Dhruv Batra. I also earned my Bachelor's in Computer Science (focus on AI and Devices) from Georgia Tech, where I was fortunate to be advised by Prof. Thomas Ploetz and to work closely with Prof. Aman Parnami.

I previously interned with the PRIOR team at AI2 (Summer 2020), where I was advised by Roozbeh Mottaghi and worked on problems in instruction following. I have also been fortunate to intern at Microsoft in Redmond (Summers 2017, 2018 and 2019), where I worked on improving the query reformulation algorithms for Outlook 365, designing recommendation systems for Xbox and developing low-latency systems to back a large-scale privacy dashboard for Windows 10 users.

In my spare time, you can catch me reading about geopolitics and history or find me on the tennis court.

Email  /  CV  /  Google Scholar  /  GitHub  /  Twitter


The problems that I work on lie at the intersection of Natural Language Processing, Machine Learning and Computer Vision. Some of my current research interests include:

  • Language Pretraining / RL Pretraining: Teaching agents to learn good representations from unsupervised data.
  • Grounded Language Learning: Teaching agents to talk about environment-specific concepts and entities.
  • Learning Language through Interaction: Teaching agents to talk through either self-play or interaction with language-based environments.

Representative papers are listed under Papers.

DataMUX: Data Multiplexing for Neural Networks
Vishvak Murahari, Carlos E. Jimenez, Runzhe Yang, Karthik Narasimhan
NeurIPS 2022
[Code] [Webpage]

We introduce data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation. DataMUX demonstrates that neural networks are capable of generating accurate predictions over mixtures of inputs, resulting in increased throughput with minimal extra memory requirements. Our approach uses two key components -- 1) a multiplexing layer that applies a fixed linear transformation to each input before combining them into a mixed representation of the same size as a single input, which is then processed by the base network, and 2) a demultiplexing layer that converts the base network's output back into independent representations before producing predictions for each input. We show the viability of DataMUX for different architectures (Transformers, and to a lesser extent MLPs and CNNs) across six different tasks spanning sentence classification, named entity recognition and image classification. For instance, DataMUX for Transformers can multiplex up to 20x/40x inputs, achieving an 11x/18x increase in throughput with minimal absolute performance drops of 2% and 4%, respectively, on MNLI, a natural language inference task.
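The two components described above can be illustrated with a small sketch. This is not the released DataMUX code: it assumes, purely for illustration, that each input slot's fixed linear transformation is an elementwise product with a random +/-1 vector and that the inputs are combined by averaging. The base network is skipped entirely, and `naive_demultiplex` is a hypothetical stand-in that simply re-applies each slot's transform, not the demultiplexing layer from the paper. All function names are made up for this example.

```python
import random


def make_transforms(num_inputs, dim, seed=0):
    """One fixed random +/-1 vector per input slot (a diagonal linear map).

    Assumption for this sketch only; the abstract just says "fixed linear
    transformation".
    """
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(dim)]
            for _ in range(num_inputs)]


def multiplex(inputs, transforms):
    """Transform each input with its slot's fixed map, then average.

    The result has the same size as a single input, whatever the number
    of inputs.
    """
    n = len(inputs)
    dim = len(inputs[0])
    mixed = [0.0] * dim
    for x, t in zip(inputs, transforms):
        for j in range(dim):
            mixed[j] += t[j] * x[j] / n
    return mixed


def naive_demultiplex(mixed, transforms):
    """Hypothetical stand-in: re-apply each slot's sign vector to the mix.

    Each output is that slot's input scaled by 1/n plus interference from
    the other slots, so this only recovers an input exactly when n == 1.
    """
    return [[t[j] * mixed[j] for j in range(len(mixed))] for t in transforms]


inputs = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5], [-1.0, 0.0, 1.0, 2.0]]
transforms = make_transforms(num_inputs=3, dim=4, seed=1)
mixed = multiplex(inputs, transforms)
print(len(mixed))                                  # one vector for three inputs
print(len(naive_demultiplex(mixed, transforms)))   # one representation per input
```

In this toy version the interference between slots is never removed, so it only shows the shapes involved and where the throughput gain comes from: the expensive base network would run once on `mixed` instead of once per input.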

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das
ECCV 2020
[Code] [Talk]

Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. Our best single model outperforms prior published work (including model ensembles) by more than 1% absolute on NDCG and MRR. Next, we find that additional finetuning using "dense" annotations in VisDial leads to even higher NDCG -- more than 10% over our base model -- but hurts MRR -- more than 17% below our base model! This highlights a trade-off between the two primary metrics -- NDCG and MRR -- which we find is due to dense annotations not correlating well with the original ground-truth answers to questions.
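To make the trade-off between the two metrics concrete, here is a minimal sketch using the textbook definitions of reciprocal rank and NDCG. The candidate answers and relevance scores are invented for illustration, and the exact evaluation protocol used for VisDial (for example, any cutoff on the ranked list) is not reproduced here.

```python
import math


def reciprocal_rank(ranking, gt_answer):
    """1 / (1-based position of the ground-truth answer in the ranking)."""
    return 1.0 / (ranking.index(gt_answer) + 1)


def ndcg(ranking, relevance):
    """Normalized discounted cumulative gain over the full ranked list."""
    dcg = sum(relevance[a] / math.log2(i + 2) for i, a in enumerate(ranking))
    ideal = sorted(relevance.values(), reverse=True)
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical candidate answers with made-up dense relevance scores.
# The original ground-truth answer "a" has low dense relevance.
relevance = {"a": 0.2, "b": 1.0, "c": 0.8, "d": 0.0}
gt_answer = "a"

ranked_by_dense = ["b", "c", "a", "d"]   # follows the dense annotations
ranked_gt_first = ["a", "b", "c", "d"]   # puts the ground truth on top

for name, ranking in [("dense", ranked_by_dense), ("gt-first", ranked_gt_first)]:
    print(name, round(ndcg(ranking, relevance), 3),
          round(reciprocal_rank(ranking, gt_answer), 3))
```

When the dense relevance scores and the ground-truth answer disagree, as in this toy example, the ranking that maximizes NDCG pushes the ground-truth answer down the list and lowers its reciprocal rank, and vice versa.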

Improving Generative Visual Dialog by Answering Diverse Questions
Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das
EMNLP 2019
[Code] [Poster]

While generative visual dialog models trained with self-talk-based RL perform better at the associated downstream task, they suffer from repeated interactions -- resulting in saturation in improvements as the number of rounds increases. To counter this, we devise a simple auxiliary objective that incentivizes Q-Bot to ask diverse questions, thus reducing repetitions and in turn enabling A-Bot to explore a larger state space during RL, i.e., be exposed to more visual concepts to talk about and more varied questions to answer.

On attention models in human activity recognition
Vishvak Murahari, Thomas Ploetz
ISWC 2018

Most approaches that model time-series data in human activity recognition based on body-worn sensing (HAR) use a fixed-size temporal context to represent different activities. This might, however, not be apt for sets of activities with individually varying durations. We introduce attention models into HAR research as a data-driven approach for exploring relevant temporal context. Attention models learn a set of weights over input data, which we leverage to weight the temporal context being considered to model each sensor reading. We also visualize the learned weights to better understand what constitutes relevant temporal context.
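As a rough illustration of weighting a temporal context with attention, the sketch below applies generic dot-product attention over a window of sensor readings. It is not the model from the paper: the scoring function, the `query` vector (which would be learned in practice) and the toy readings are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(window, query):
    """Weight each time step in the window and return (weights, context).

    window: list of feature vectors, one per time step.
    query:  hypothetical vector that scores how relevant each step is.
    """
    scores = [sum(q * h for q, h in zip(query, step)) for step in window]
    weights = softmax(scores)
    dim = len(window[0])
    context = [sum(w * step[j] for w, step in zip(weights, window))
               for j in range(dim)]
    return weights, context


# Toy window of four 3-axis accelerometer readings (made-up values).
window = [
    [0.0, 0.1, 0.0],
    [0.2, 0.1, 0.1],
    [2.0, 1.5, 0.5],   # a burst of movement
    [0.1, 0.0, 0.1],
]
query = [1.0, 1.0, 0.0]

weights, context = attend(window, query)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The weights sum to one across the window, so they can be plotted per time step to see which part of the temporal context the model relies on, which is the kind of visualization the abstract refers to.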

Teaching Assistant, Introduction to Robotics and Perception (CS 3630)

As a TA for CS 3630, I was part of one of the largest hands-on advanced robotics classes in the country, taken by close to 200 students. I advised students on robotic planning, control and localization, and collaborated with co-TAs to develop and improve two projects on robot localization. I also engaged with students in person through weekly office hours and online through Piazza.

Teaching Assistant, Introduction to AI (CS 3600)

Guided more than 300 students on AI projects and homework. Reinforced concepts ranging from probabilistic inference to neural networks, optimization and reinforcement learning. Contributed to course development and helped improve existing class projects. Held weekly office hours to engage with students.

(Design and CSS courtesy: Jon Barron and Amlaan Bhoi)