
Selected Projects


Multi-Modal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision–Language Models [EACL-2023]

We investigate the stereotypical bias present in several prominent pre-trained vision-language models, including CLIP, ALBEF, and ViLT. Although most investigations of bias in multimodal models have focused on gender and racial bias, other relevant minority groups, such as those defined by religion, nationality, sexual orientation, or disability, remain underexplored. This is mainly due to the lack of suitable benchmarks for such groups. We seek to address this gap by releasing a visual and textual bias benchmark called MMBias, consisting of around 3,800 images and phrases covering 14 population subgroups. Our results show that these models demonstrate meaningful bias favoring certain groups. Finally, we introduce a debiasing method designed specifically for such large pre-trained models that can be applied as a post-processing step to mitigate bias while preserving the model's accuracy.
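
As a rough illustration of how bias between two population subgroups can be quantified in a shared image-text embedding space, the sketch below computes a WEAT-style effect size from cosine similarities. It is a minimal sketch under stated assumptions, not necessarily the exact metric used in MMBias: the function names are hypothetical, and the toy 2-D vectors stand in for embeddings that would come from a model such as CLIP.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(x, pleasant, unpleasant):
    """Mean similarity of x to pleasant phrases minus mean similarity to unpleasant ones."""
    to_pleasant = mean([cosine(x, a) for a in pleasant])
    to_unpleasant = mean([cosine(x, b) for b in unpleasant])
    return to_pleasant - to_unpleasant


def effect_size(group_x, group_y, pleasant, unpleasant):
    """WEAT-style effect size: positive when group_x is closer to the pleasant phrases."""
    sx = [association(x, pleasant, unpleasant) for x in group_x]
    sy = [association(y, pleasant, unpleasant) for y in group_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


# Toy data (assumption): image embeddings for two subgroups, text embeddings for two phrase sets.
group_x = [[1.0, 0.1], [0.9, 0.2]]
group_y = [[0.1, 1.0], [0.2, 0.9]]
pleasant = [[1.0, 0.0]]
unpleasant = [[0.0, 1.0]]

score = effect_size(group_x, group_y, pleasant, unpleasant)
print("effect size favors group_x" if score > 0 else "effect size favors group_y")
```

A score near zero would indicate that neither subgroup is systematically closer to the pleasant or unpleasant phrases; swapping the two groups flips the sign.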




Scalable Sequential Object-Oriented Representation Learning (SCALOR) [ICLR-2020]

SCALOR (SCALable sequential Object-oriented Representation learning) is the first completely self-supervised generative model capable of simultaneously tracking tens of objects in realistic natural scenes with dynamic backgrounds. It is also capable of future-time sequence generation.

  • SCALOR is a fully unsupervised generative model.

  • SCALOR improves tracking scalability by two orders of magnitude compared to state-of-the-art models.

  • It is not only applicable to settings containing nearly a hundred objects, but is also more computationally efficient than SQAIR (which scales only to a few objects).

  • The propagation–discovery process is parallelized by introducing a propose–reject model, reducing the time complexity from O(N) to O(1).

  • SCALOR can model scenes with a complex dynamic background.

  • SCALOR is the first probabilistic model capable of handling natural images.
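
The propose–reject idea mentioned above can be sketched roughly as follows. This is a simplified illustration, not the SCALOR implementation: it assumes axis-aligned bounding boxes and a hypothetical overlap threshold. Each newly proposed object is tested only against the objects already propagated from the previous frame, never against other proposals, so all accept/reject decisions are independent and could be evaluated simultaneously.

```python
def overlap_ratio(box, other):
    """Fraction of `box` area covered by `other`; boxes are (x0, y0, x1, y1)."""
    width = min(box[2], other[2]) - max(box[0], other[0])
    height = min(box[3], other[3]) - max(box[1], other[1])
    if width <= 0 or height <= 0:
        return 0.0
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (width * height) / area


def propose_reject(proposals, propagated, threshold=0.5):
    """Keep proposals that do not overlap any propagated object by more than `threshold`.

    Each decision depends only on `propagated`, so the per-proposal tests are
    independent of one another (the loop here is sequential only because this
    is plain Python).
    """
    return [
        p for p in proposals
        if all(overlap_ratio(p, q) <= threshold for q in propagated)
    ]


# Toy frame (assumption): one tracked object and three candidate discoveries.
propagated = [(0, 0, 4, 4)]
proposals = [(1, 1, 3, 3), (6, 6, 8, 8), (3, 3, 7, 7)]
accepted = propose_reject(proposals, propagated)
print(accepted)  # [(6, 6, 8, 8), (3, 3, 7, 7)]
```

The first proposal lies entirely inside the tracked object and is rejected as a duplicate; the other two overlap it little or not at all and are accepted as new objects.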




Domain Authoring Assistant for Intelligent Virtual Agent [AAMAS-2019]

Intelligent virtual assistants are a part of our everyday lives. However, the process of bringing them to life and giving them characters is quite challenging. This is because a team of creative authors has to describe different aspects of the characters in natural language, and another team of software engineers then translates this description into computer code. This back-and-forth can be quite challenging and resource-demanding. In this paper, we introduce an authoring assistant tool that automates the code generation process from natural language descriptions of virtual characters.

  • Natural language understanding tool that uses semantic parsing and deep learning to translate a natural language description of a virtual character's world, behaviors, and abilities into planning code that can be directly used in a planner.

  • Automatically identifies potentially missing aspects of the character's personality not mentioned by the author, and iteratively makes suggestions to improve.

  • Identifies inconsistent information in a character description and suggests possible improvements.

  • Uses deep learning and knowledge base queries to suggest possible new aspects of the character and the story it appears in.

  • More than half of the users of our tool said they would use it on a daily basis.



Topic Spotting using Hierarchical Networks with Self-Attention [NAACL-2019]

The success of deep learning techniques has renewed interest in the development of dialogue systems. However, current systems struggle to hold consistent long-term conversations with users and fail to build rapport. Topic spotting, the task of automatically inferring the topic of a conversation, has been shown to help make a dialogue system more engaging and efficient. We propose a hierarchical model with self-attention for topic spotting. Experiments on the Switchboard corpus show the superior performance of our model over previously proposed techniques for topic spotting as well as other state-of-the-art baselines for text classification. Additionally, in contrast to offline processing of a dialogue, we also analyze the performance of our model in a more realistic online scenario where the topic is identified in real time as the dialogue progresses. Results show that our model is able to generalize even with limited information in the online setting.

  • Our model uses a deep Bi-LSTM network with hierarchical self-attention for the task of topic classification in dialogue systems containing a wide range of topics.

  • Our model is not only superior to the state of the art in the offline setting but also outperforms baselines in an online setting.

  • Our model is able to generalize better when data is scarce.
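
The hierarchical attention idea can be illustrated with a small sketch: word vectors are pooled into an utterance vector, and utterance vectors are pooled into a dialogue vector, each time with softmax attention weights. This is a simplified sketch under stated assumptions and not the published model: the Bi-LSTM encoders are omitted, the query vectors (`word_query`, `utt_query`) are hypothetical stand-ins for learned parameters, and the final topic classifier is left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`, with weights from dot products with `query`."""
    scores = [sum(a * b for a, b in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def encode_dialog(dialog, word_query, utt_query):
    """Two-level pooling: words -> utterance vectors -> one dialogue vector."""
    utterance_vecs = [attention_pool(words, word_query) for words in dialog]
    return attention_pool(utterance_vecs, utt_query)


# Toy dialogue (assumption): two utterances, each a list of 2-D word vectors.
dialog = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0]],
]
word_query = [1.0, 0.0]
utt_query = [0.0, 1.0]

# Online setting: re-encode the dialogue prefix after each new utterance.
for t in range(1, len(dialog) + 1):
    print(t, encode_dialog(dialog[:t], word_query, utt_query))
```

Encoding successive prefixes, as in the loop above, mirrors the online scenario in which a topic prediction is updated as each utterance arrives.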


Statistical Association Mapping of Population-Structured Genetic Data [IEEE/ACM Transactions on Computational Biology and Bioinformatics]

Genome-Wide Association Studies (GWAS) study statistical associations between specific regions of the DNA sequence and the causal factors underlying a specific disease or any other observable property of an organism. Traditional GWAS methods have critical drawbacks that make them inapplicable to real-world complex diseases. These include considering single DNA regions independently and assuming genetic homogeneity even though there might be ethnic genetic substructures in the data.

In this paper, we propose a novel Bayesian MCMC framework (Gibbs Sampling) for association mapping (mapping disease types to underlying genes) to address these limitations. Our model works in challenging scenarios where hidden ethnic structure is present in the data.


  • A novel Bayesian model based on Gibbs Sampling for association mapping in the presence of hidden population structures, where the population under study consists of numerous latent subpopulations with different genetic backgrounds. 

  • Our model not only identifies the latent population structure for each data point but is also able to identify hidden disease causal factors.

  • Our model outperforms state-of-the-art methods such as STRUCTURE and PLINK and is able to reach ~15% higher accuracy.
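
To give a feel for how Gibbs sampling can uncover latent population structure, the sketch below alternates between sampling per-subpopulation allele frequencies and sampling each individual's subpopulation label. It is a toy sketch under stated assumptions, not the model from the paper: it covers only the population-structure part (no disease association), uses binary alleles with a uniform Beta(1, 1) prior, and the function name and parameters are hypothetical.

```python
import math
import random


def gibbs_population_structure(genotypes, k, n_iter=200, seed=0):
    """Sample a subpopulation label in range(k) for each individual.

    genotypes: list of equal-length lists of 0/1 alleles, one list per individual.
    """
    rng = random.Random(seed)
    n_loci = len(genotypes[0])
    z = [rng.randrange(k) for _ in genotypes]  # random initial labels

    for _ in range(n_iter):
        # Step 1: sample allele frequencies per subpopulation from their Beta posterior.
        freqs = []
        for c in range(k):
            members = [g for g, label in zip(genotypes, z) if label == c]
            row = []
            for locus in range(n_loci):
                ones = sum(g[locus] for g in members)
                zeros = len(members) - ones
                f = rng.betavariate(1 + ones, 1 + zeros)
                row.append(min(max(f, 1e-9), 1 - 1e-9))  # keep log() finite
            freqs.append(row)

        # Step 2: sample each label given the current frequencies.
        for i, g in enumerate(genotypes):
            log_p = [
                sum(math.log(f if allele else 1 - f) for allele, f in zip(g, freqs[c]))
                for c in range(k)
            ]
            top = max(log_p)
            weights = [math.exp(lp - top) for lp in log_p]
            z[i] = rng.choices(range(k), weights=weights)[0]

    return z


# Toy data (assumption): two groups with opposite alleles at every locus.
genotypes = [[1] * 12 for _ in range(6)] + [[0] * 12 for _ in range(6)]
labels = gibbs_population_structure(genotypes, k=2)
print(labels)
```

With clearly separated groups like these, the sampled labels would be expected to split the individuals into the two groups, though which group receives label 0 or 1 is arbitrary.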


Crowd Behavior Modeling using Deep Neural Networks

Designed and implemented a deep-learning-based generative model for modeling crowd behavior on floor plans using VAEs and GANs.

A Novel Method for Fake News Classification

Designed and implemented a BiLSTM-based method for fake news classification using content, style, and online fact-checking.

Classifying Motor Movements from EEG Data Using Spiking Neural Networks

Developed a Spiking Neural Net architecture to classify hand/leg movements from EEG data.

This approach not only provides reasonable classification performance but is also biologically plausible.


Other Projects
