Dhruv Madhwal ☕️

Graduate Student at ASU

CoRAL

Professional Summary

I’m Dhruv Madhwal, a CS grad student at Arizona State University who likes building software at the intersection of ML, NLP, and reliable AI. My work focuses on hallucination detection, multi-hop question answering, LLM/VLM evaluation, and compositional reasoning in vision-language models, including research published at ACL. I enjoy turning research ideas into working software: agents, retrieval pipelines, ML applications, backend services, and tools people can actually use. I also spend time on the systems side: event-driven architectures, distributed data platforms, scalable pipelines, and production-oriented engineering.

Education

MS Computer Science

2024-08-22
2026-08-01

Arizona State University

MS Physics

2017-08-01
2022-05-31

Birla Institute of Technology and Science, Goa Campus

BE Electronics

2017-08-01
2022-05-31

Birla Institute of Technology and Science, Goa Campus

Interests

Large Language Models, Agentic AI, Computer Vision, NLP, Software Engineering, Distributed Systems

Projects

I enjoy making things. Here is a selection of projects I have worked on over the years.

Dishcovery

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures.

Multi-Hop Reasoning Agent

PyTorch is a Python package that provides tensor computation (like NumPy) with strong GPU acceleration.

scikit-learn

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.

📚 My Research

I’m a thesis student at Arizona State University working on reliable AI for language models. My thesis studies how to detect hallucinations and knowledge gaps in LLMs for multi-hop question answering, with the goal of making models better at knowing when they should abstain.

I have also worked on CLIP-style vision-language models and compositional reasoning, and have built NL-to-SQL benchmarks for privacy-sensitive domains such as healthcare, law, and criminal justice.

Please reach out if you’re interested in collaborating on reliable AI, LLM/VLM evaluation, or applied ML systems.