hey, i'm atrey desai.

I am a third-year undergraduate student studying computer science and linguistics, with a minor in Korean studies, at the University of Maryland.

I am fortunate to be advised by Professors Rachel Rudinger and Jordan Boyd-Graber.

Language models are increasingly capable, but our methods for measuring and building that capability lag behind. I work on evaluation and data pipelines for reliable NLP, namely:

1. Benchmark validity and the limits of what our evaluations actually measure

2. Human-AI collaboration in synthetic data creation and annotation

3. Evaluation for systems that reason and perceive in the world

[IP] = in progress

Quick, Create a Distractor! Evaluating LLM Distractors for Multiple-Choice Benchmarks

Atrey Desai, Nishant Balepur, Rachel Rudinger

Under review at ACL ARR, 2026
pdf
Washington, DC
Last updated May 29, 2026