Dhruv Madhwal ☕️

Graduate Student

About Me

I’m a grad student at ASU’s CoRAL Lab, building agentic AI and retrieval systems. I’ve worked on ML and data science projects at Samsung, Carelon, and several fast-paced startups, tackling problems across domains like healthcare analytics, recommendation systems, and consumer-focused technologies. Before ASU, I earned an MSc in Physics and BE in Electronics from BITS Pilani Goa, blending analytical rigor with technical expertise. I’m interested in both foundational AI research and its practical applications, and I’m actively looking for opportunities that bridge the two.

Download CV
Interests
  • Information Retrieval
  • AI Agents
  • Large Language Models
  • Artificial Intelligence
  • Deep Learning
Education
  • MS Computer Science

    Arizona State University

  • MSc Physics

    Birla Institute of Technology and Science, Goa Campus

  • BE Electronics and Instrumentation

    Birla Institute of Technology and Science, Goa Campus

📚 My Research

Hi! I’m a Graduate Researcher at the CoRAL Lab at ASU. My work focuses on information retrieval, agentic LLM architectures, and large-scale information synchronization.

Multi‑Hop Reasoning Agent — CoRAL Lab, ASU Feb 2025–Present
Building a model‑agnostic multi‑hop QA agent for open‑ and closed‑book settings, integrating RAG, question decomposition, inference‑time scaling, and self‑verification. A custom LLM‑as‑Judge replaces EM/ROUGE and off‑the‑shelf LLM graders, directly assessing factual grounding, logical consistency, and chain coherence, and revealing where current benchmarks under-measure reasoning quality (a minimal grading call is sketched after the list below).
  • Stack: LangGraph, AutoGen, LangChain
  • Techniques: RAG, question decomposition, inference‑time scaling, self‑verification loops
  • Datasets: FanOutQA, MuSiQue, FRAMES, QUEST
  • Goal: beat SOTA while exposing where current benchmarks fall short
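A minimal sketch of the LLM-as-Judge call, assuming a generic `call_llm` chat-completion client; the prompt and rubric fields here are illustrative placeholders, not the exact grader used in the project:

```python
import json
from typing import Callable

JUDGE_PROMPT = """You are grading a multi-hop QA answer.
Question: {question}
Retrieved evidence: {evidence}
Reasoning chain: {chain}
Final answer: {answer}

Rate each dimension from 1 (poor) to 5 (excellent) and reply with JSON only:
{{"factual_grounding": 1-5, "logical_consistency": 1-5, "chain_coherence": 1-5}}"""


def judge_answer(call_llm: Callable[[str], str], question: str,
                 evidence: str, chain: str, answer: str) -> dict:
    """Rubric-based grading in place of EM/ROUGE string matching."""
    reply = call_llm(JUDGE_PROMPT.format(
        question=question, evidence=evidence, chain=chain, answer=answer))
    return json.loads(reply)  # assumes the grader complies with the JSON spec
```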

InfoboxIQ: Text-to-Infobox Synchronization (Wikipedia) — CoRAL Lab, ASU Feb 2025–Present
A multi-stage LLM pipeline for information synchronization (text‑to‑table): given a Wikipedia article and its infobox template, it produces an updated, evidence‑grounded infobox. We also introduce an evaluation suite that verifies the synchronized table is faithful, complete, and non‑hallucinatory (a deliberately collapsed sketch of the flow follows the list below).
  • Pipeline (six stages): preprocessing, key/property breakdown, QA‑SRL extraction, KG triple generation, KG merge and conflict resolution, infobox creation.
  • Dataset: ~90K article–infobox pairs across ~40 Wikipedia categories with manually annotated key schemas.
  • Evaluation: per‑key accuracy, coverage/completeness, hallucination rate, overall text‑to‑table sync quality.
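To make the data flow concrete, here is a deliberately collapsed sketch that folds the staged pipeline into a single per-key extraction loop; `llm` and the prompt are placeholders, and the real system runs QA-SRL extraction, KG triple generation, and conflict resolution between these steps:

```python
from typing import Callable

def fill_infobox(article: str, template_keys: list[str],
                 llm: Callable[[str], str]) -> dict[str, str]:
    """Collapse the staged pipeline into one per-key extraction pass."""
    infobox: dict[str, str] = {}
    for key in template_keys:                     # key/property breakdown
        prompt = (f"Using only the article below, give the value for the "
                  f"infobox key '{key}', or UNKNOWN if it is not stated.\n\n"
                  f"{article}")
        value = llm(prompt).strip()               # evidence-grounded extraction
        if value.upper() != "UNKNOWN":
            infobox[key] = value                  # omit keys with no evidence
    return infobox
```
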
🛠️ Technical Skills

Programming Languages & Frameworks: Python, C/C++, MATLAB, Flask, FastAPI

Machine Learning: PyTorch, TensorFlow, Keras, scikit-learn, Transformers, Hugging Face, OpenCV, pandas, NumPy

LLM/Agent & RAG Stack: LangChain/LangGraph, AutoGen, ChromaDB, Pinecone, FAISS

Data Engineering & Databases: Kafka, Spark, Airflow, MySQL, Postgres, MongoDB

Cloud & DevOps: AWS, Docker, Git, CI/CD, MLflow

🎯 Selected Projects

🧠 Machine Unlearning in Small Language Models

Teaching small LMs (~3–4B params) to forget specific facts without retraining from scratch while preserving general ability. Used lightweight procedures such as gradient-ascent updates and random-label fine-tuning to make compact LMs “forget on demand” with minimal collateral damage. The models also support quantized inference to run efficiently on commodity GPUs.
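A minimal sketch of one gradient-ascent unlearning step, assuming a Hugging Face causal LM; real runs interleave retain-set batches and track forget/retain metrics rather than ascending blindly:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def unlearn_step(fact_text: str) -> float:
    """Maximize (rather than minimize) the LM loss on a forget-set fact."""
    batch = tok(fact_text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    loss = -out.loss                  # negate the loss: gradient ascent
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return out.loss.item()            # rising loss means the fact is fading
```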

Key Features:

  • Targeted forgetting: Gradient Ascent unlearning and Random-Labeling + SFT procedures.
  • Model preservation: Retain general QA performance while removing specific facts.
  • Robust eval suite: Automated BLEU / ROUGE-L / BERTS scores, per-fact unlearn/retain tagging, ablations (label similarity), and spot manual validation.
  • Quantization & efficiency: FP16 plus 4-bit / 8-bit inference and LoRA/PEFT adapters to reduce VRAM and speed up experimentation (loading sketch below).
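A hedged sketch of the 4-bit + LoRA setup from the last bullet; the rank, target modules, and model choice are illustrative defaults, not the project's exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-mini-instruct", quantization_config=bnb)
model = prepare_model_for_kbit_training(model)       # cast/freeze for k-bit
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["qkv_proj"])
model = get_peft_model(model, lora)                  # train adapters only
model.print_trainable_parameters()
```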

Technologies: PyTorch, Hugging Face Transformers, LoRA/PEFT, bitsandbytes (4-/8-bit), Small LMs (Llama-3.2-3B-Instruct, Phi-3.5-mini-instruct, Nemotron-Mini-4B-Instruct)


🔍 InQuery ML: SQL-Native ID Image Fraud Detection

Built an end-to-end ID-image fraud detector with a lightweight CNN, achieving ~92% accuracy on held-out data. The model is exposed inside SQL via a PostgreSQL PL/Python UDF, so analysts can score images for fraud using only SQL—no Python or separate service calls required.
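A hedged sketch of the UDF pattern (the function name matches the query shown below, but the model path, labels, and preprocessing are illustrative); `SD` is PL/Python's per-session cache, so the CNN loads once per connection rather than per row:

```sql
CREATE OR REPLACE FUNCTION predict_fraud(image_b64 text)
RETURNS TABLE(label text, confidence real) AS $$
    import base64, io
    import torch
    from PIL import Image
    from torchvision import transforms

    -- cache the model across calls within this session
    if "model" not in SD:
        SD["model"] = torch.jit.load("/models/fraud_cnn.pt").eval()

    img = Image.open(io.BytesIO(base64.b64decode(image_b64))).convert("RGB")
    x = transforms.ToTensor()(img).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(SD["model"](x), dim=1)[0]
    conf, idx = torch.max(probs, dim=0)
    return [(["genuine", "fraud"][idx.item()], conf.item())]
$$ LANGUAGE plpython3u;
```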

Key Features:

  • SQL-only workflow: Fraud scoring happens in a query (SELECT label, confidence FROM predict_fraud(image_b64)), enabling analysts to operationalize ML without leaving SQL
  • Postgres-native inference: PL/Python UDF returns (class, confidence) for in-database predictions and auditability at the DB layer
  • TorchServe integration: Custom handler (base64 → tensor → prediction → JSON) for portable, production-style serving (sketched after this list)
  • Analyst-ready queries: Views and filters to flag newly issued IDs predicted fraudulent and identify repeat submitters
  • Performance benchmarking: Documented trade-offs—row-wise UDF calls are simple but slower; batched inference is preferred for high volume
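A hedged sketch of the handler shape (class name, labels, and request parsing are illustrative; TorchServe's `BaseHandler` supplies the surrounding `handle`/`inference` plumbing):

```python
import base64, io
import torch
from PIL import Image
from torchvision import transforms
from ts.torch_handler.base_handler import BaseHandler

class FraudHandler(BaseHandler):
    """base64 image in -> (label, confidence) JSON out."""

    def preprocess(self, data):
        # TorchServe hands the handler a list of request dicts.
        payload = data[0].get("body") or data[0].get("data")
        img = Image.open(io.BytesIO(base64.b64decode(payload))).convert("RGB")
        return transforms.ToTensor()(img).unsqueeze(0)

    def postprocess(self, output):
        probs = torch.softmax(output, dim=1)[0]
        conf, idx = torch.max(probs, dim=0)
        return [{"label": ["genuine", "fraud"][idx.item()],
                 "confidence": round(conf.item(), 4)}]
```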

Technologies: PyTorch, TorchServe, PostgreSQL (PL/Python), Computer Vision


📡 Edge-to-Cloud Face Recognition on AWS

Real-time face recognition for a camera stream using AWS IoT. Faces are detected at the edge with MTCNN (via a Greengrass component), only cropped faces are sent to the cloud, and FaceNet (in Lambda) returns identity + confidence. This reduces bandwidth and keeps raw frames local (a minimal edge loop is sketched below).
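A minimal sketch of the edge loop, assuming facenet-pytorch's MTCNN and an MQTT connection from the AWS IoT Device SDK v2; the topic name and payload shape are illustrative:

```python
import base64, json
import cv2
from awscrt import mqtt
from facenet_pytorch import MTCNN

detector = MTCNN(keep_all=True)          # face detection stays on the device

def publish_face_crops(frame, connection: mqtt.Connection,
                       topic: str = "faces/crops") -> None:
    boxes, _ = detector.detect(frame[:, :, ::-1])      # BGR -> RGB
    if boxes is None:
        return                                         # no faces this frame
    for x1, y1, x2, y2 in boxes.astype(int):
        ok, jpg = cv2.imencode(".jpg", frame[y1:y2, x1:x2])
        if ok:
            connection.publish(                        # only the crop leaves
                topic=topic,
                payload=json.dumps(
                    {"face_jpg_b64": base64.b64encode(jpg.tobytes()).decode()}),
                qos=mqtt.QoS.AT_LEAST_ONCE)            # delivery guarantee
```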

Key Features:

  • Edge detection, cloud recognition: Low-latency loop that keeps heavy vision local and runs identity matching in the cloud
  • Privacy & efficiency: Only face crops leave the device, raw video never leaves the edge
  • Reliable messaging: IoT messaging with delivery guarantees and request/response correlation
  • Operational visibility: Metrics and logs for end-to-end health checks and troubleshooting

Technologies: AWS IoT Core, Greengrass v2, AWS Lambda, Amazon SQS, CloudWatch, MTCNN, FaceNet, Python

Check out my work experience!