Tejas Garg
Building ML Systems That Work Beyond Demos
Pre-final year CSE (AI/ML) • Available from June 2026
Third-year CS undergrad focused on building ML systems that work beyond demos. I care about bridging the gap between novel ML research and production-ready systems.
Recent work: reproduced a diffusion classifier from scratch and built custom XAI for it, created a real-time PPE monitoring pipeline with temporal filtering, and experimented with GRPO to teach LLMs explicit reasoning. I like projects where the engineering is as hard as the ML.
B.Tech in Computer Science & Engineering
Specialization in AI & Machine Learning
Indian Institute of Information Technology, Nagpur
Expected 2027
Explainable AI • LLM Pipelines & Evaluation • Reinforcement Learning for Reasoning • Diffusion Models • Real-time Computer Vision • Production ML Systems • Agentic Systems • RAG / Hybrid Retrieval
Reproduced a diffusion-based diabetic retinopathy classifier from scratch, reaching 84.1% accuracy on APTOS 2019, then built a dedicated XAI layer for timestep-aware interpretation. Implemented six explainers, including trajectory analysis, conditional attribution, faithfulness checks, and counterfactuals.
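For context, the diffusion-as-classifier idea scores each candidate label by how well the class-conditional denoiser predicts the injected noise, and picks the label with the lowest expected error. A minimal sketch, where `err_fn` is a stand-in for the trained model's per-class ε-prediction MSE (the real project's loop over timesteps and noise samples is more involved):

```python
def diffusion_classify(err_fn, x, classes, timesteps):
    """Diffusion classifier sketch: average the class-conditional
    denoising error over sampled timesteps and return the class whose
    conditioning explains the input best (lowest mean error).
    err_fn(x, t, c) is assumed to estimate ||eps - eps_theta(x_t, t, c)||^2."""
    scores = {c: sum(err_fn(x, t, c) for t in timesteps) / len(timesteps)
              for c in classes}
    return min(scores, key=scores.get)
```

The timestep average matters because a single timestep's error is a noisy estimate; averaging over several trades compute for a more stable class score.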
Multi-agent research assistant that guides you from a vague topic to curated papers and a structured survey. Uses LangGraph subgraphs for discovery, analysis, and survey generation with human-in-the-loop checkpoints. Orchestrates Semantic Scholar, arXiv, and Firecrawl APIs.
Full-stack interview prep platform for OS, DBMS, and CN with SM-2 spaced repetition, prerequisite-aware learning paths, RAG-backed chat with citation-driven retrieval, quiz sessions, and PostgreSQL-backed progress tracking.
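The scheduling core here is the classic SM-2 algorithm: each review grades recall quality 0–5, failed recall resets the repetition count, and successful recall grows the interval by an easiness factor floored at 1.3. A minimal sketch of one SM-2 update (field names are illustrative, not the platform's actual schema):

```python
def sm2_review(quality, reps, ef, interval):
    """One SM-2 update. quality: recall grade 0-5; reps: consecutive
    successful reviews; ef: easiness factor; interval: days until next
    review. Returns the updated (reps, ef, interval)."""
    if quality < 3:
        return 0, ef, 1  # failed recall: restart repetitions, review tomorrow
    # adjust easiness by response quality, never below the 1.3 floor
    ef = max(1.3, ef + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    reps += 1
    if reps == 1:
        interval = 1
    elif reps == 2:
        interval = 6
    else:
        interval = round(interval * ef)  # geometric interval growth
    return reps, ef, interval
```

A perfect first review thus schedules the card at 1 day, then 6 days, then roughly `interval * ef` thereafter.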
Event-driven video processing system that turns noisy frame-level PPE detections into stable violation events. Built with YOLOv8/v11, SAM3, FastAPI, and Next.js. Handles occlusion, limited GPU, and long-running streams using EMA fusion and hysteresis thresholds.
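The temporal-filtering idea can be sketched in a few lines: per-frame detector confidences are smoothed with an exponential moving average, and a violation event opens and closes at two different thresholds (hysteresis), so a single dropped or spurious frame cannot flip the state. The parameter values below are illustrative defaults, not the system's tuned ones:

```python
class ViolationFilter:
    """EMA-smoothed confidence with hysteresis thresholds. An event
    opens only when the smoothed score crosses on_th and closes only
    when it falls below off_th, suppressing frame-level flicker."""

    def __init__(self, alpha=0.3, on_th=0.6, off_th=0.3):
        self.alpha, self.on_th, self.off_th = alpha, on_th, off_th
        self.ema = 0.0
        self.active = False

    def update(self, conf):
        # conf: per-frame detection confidence (0.0 when nothing detected)
        self.ema = self.alpha * conf + (1 - self.alpha) * self.ema
        if not self.active and self.ema >= self.on_th:
            self.active = True   # open a violation event
        elif self.active and self.ema < self.off_th:
            self.active = False  # close the event
        return self.active
```

Because `on_th > off_th`, the filter needs sustained evidence to raise an event and sustained absence to clear it, which is what turns noisy detections into stable events.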
Experimented with Group Relative Policy Optimization to induce explicit reasoning in Mistral-7B via XML-structured traces. Warmed up with SFT on GSM8K, then applied RL. Improved final test accuracy from 41.2% to 52.5%. Key finding: evaluation consistency matters more than raw scores.
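The shape of this setup can be sketched as reward functions over XML traces plus GRPO's group-relative advantage. The tag names and reward weights below are illustrative assumptions, not the exact ones used in the experiment:

```python
import re

def format_reward(completion: str) -> float:
    """Reward emitting the explicit XML trace structure exactly."""
    pattern = r"^<reasoning>\n.*?\n</reasoning>\n<answer>\n.*?\n</answer>$"
    return 1.0 if re.match(pattern, completion.strip(), re.DOTALL) else 0.0

def correctness_reward(completion: str, gold: str) -> float:
    """Reward the extracted answer matching the GSM8K gold label."""
    m = re.search(r"<answer>\n?(.*?)\n?</answer>", completion, re.DOTALL)
    return 2.0 if m and m.group(1).strip() == gold.strip() else 0.0

def group_advantages(rewards):
    """GRPO's core move: standardize rewards within a group of samples
    for the same prompt, so no learned value network is needed."""
    mu = sum(rewards) / len(rewards)
    sd = (sum((r - mu) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mu) / (sd + 1e-8) for r in rewards]
```

Separating the format reward from the correctness reward is what lets the SFT warm-up hand off cleanly to RL: the model first learns the trace shape, then RL sharpens the content.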
Multi-stage LLM pipeline for reviewable SQL generation: it decomposes user intent into reasoning steps, generates SQL, then validates and auto-corrects, reaching 78% execution accuracy on Spider. Includes prompt injection safeguards.
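The validate-and-correct stage can be sketched with SQLite's query planner: asking the database to EXPLAIN a statement checks syntax and schema references without executing anything, and the error message can be fed back into a correction prompt. A minimal sketch (the actual pipeline's target database and retry logic may differ):

```python
import sqlite3

def validate_sql(sql: str, conn: sqlite3.Connection):
    """Static check of generated SQL: ask SQLite to plan the query
    without running it. Returns (ok, error_message); the error text is
    what a correction prompt would include for the next LLM attempt."""
    try:
        conn.execute(f"EXPLAIN {sql}")
        return True, None
    except sqlite3.Error as e:
        return False, str(e)
```

Validating before execution is also part of the safety story: a query that references a hallucinated column fails at the planning stage with a precise error, instead of silently returning wrong rows.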