Atharva Kulkarni

Hello! I am a second-year CS PhD student at the University of Southern California, advised by Swabha Swayamdipta, and a member of the DILL Lab and the USC NLP Group.

My broad research focus is on theoretical & empirical understanding of language models. Specifically, I am interested in:

Building a principled understanding of when & why they work / fail.
Studying their learning dynamics, geometric properties, & structural constraints.
Exploring avenues for improving their reliability, safety, & trustworthiness.

A central part of my work is investigating the scientific underpinnings of language modeling fundamentals: pre-training, adaptation (fine-tuning & alignment), and post-training modification (model merging, pruning, & quantization). My goal is to understand why certain approaches work / fail in practice and how they can be improved.

Previously, I was a visiting PhD student at UC Berkeley – Simons Institute for the Theory of Computing for Spring 2025. I completed my Master's in Language Technologies (MLT) at Carnegie Mellon University – Language Technologies Institute, where I was fortunate to be advised by Barnabás Póczos & Graham Neubig. I have also interned at Apple with the Siri & Information Intelligence group (summers of 2023 & 2024).

Before coming to CMU, I was a Predoctoral Researcher at the Laboratory for Computational Social Systems (LCS2), IIT Delhi. Before that, I graduated with a Bachelor of Engineering in Computer Science from Savitribai Phule Pune University.

I’m eager to connect with my academic peers! If our research interests align (or diverge) in intriguing ways, I’d be delighted to explore potential collaborations or simply exchange ideas!

If you’d like to chat about navigating ML/NLP research or grad school applications, or want to collaborate on a research project, please visit the Outreach tab for more information.

News

May 2026	Started my summer research internship at Microsoft Research, Redmond with the Microsoft Teams Core Applied Science group.
Apr 2026	Disentangling Geometry, Performance, and Training in Language Models was accepted at ICML 2026 as a spotlight presentation! See you in Seoul 🇰🇷!
Aug 2025	Summer 2024 internship work at Apple Research on Hallucination Metrics Meta-Evaluation was accepted to EMNLP 2025 Findings 🇨🇳!

🕰️ all news ...

Selected Publications

ICML

Disentangling Geometry, Performance, and Training in Language Models

Atharva Kulkarni, Jacob Mitchell Springer, Arjun Subramonian, and Swabha Swayamdipta

In Forty-third International Conference on Machine Learning, 2026

TMLR

Multitask Learning Can Improve Worst-Group Outcomes

Atharva Kulkarni*, Lucio M. Dery*, Amrith Setlur, Aditi Raghunathan, Ameet Talwalkar, and Graham Neubig

Transactions on Machine Learning Research, Feb 2024

In order to create machine learning systems that serve a variety of users well, it is vital to not only achieve high average performance but also ensure equitable outcomes across diverse groups. However, most machine learning methods are designed to improve a model’s average performance on a chosen end task without consideration for their impact on worst-group error. Multitask learning (MTL) is one such widely used technique. In this paper, we seek not only to understand the impact of MTL on worst-group accuracy but also to explore its potential as a tool to address the challenge of group-wise fairness. We primarily consider the standard setting of fine-tuning a pre-trained model, where, following recent work (Gururangan et al., 2020; Dery et al., 2023), we multitask the end task with the pre-training objective constructed from the end task data itself. In settings with few or no group annotations, we find that multitasking often, but not consistently, achieves better worst-group accuracy than Just-Train-Twice (JTT; Liu et al., 2021) – a representative distributionally robust optimization (DRO) method. Leveraging insights from synthetic data experiments, we propose to modify standard MTL by regularizing the joint multitask representation space. We run a large number of fine-tuning experiments across computer vision and natural language processing datasets and find that our regularized MTL approach consistently outperforms JTT on both average and worst-group outcomes. Our official code can be found here: https://github.com/athrvkk/MTL-group-robustness.