Experience
Applied Research Scientist, Thomson Reuters Lab, India
Period: August 2024 - Current
Legal AI Reasoning and Model Enhancement
- Pioneered a legal synthetic data generation pipeline for Process Reward Models (PRMs), creating domain-specific training datasets that improved legal reasoning capabilities on LegalBench benchmark tasks.
- Evaluated different Test-Time Scaling paradigms for legal reasoning and inference-time compute optimization.
- Architected an IRAC (Issue-Rule-Application-Conclusion) Knowledge Graph framework using Thomson Reuters’ Westlaw corpus and court case data, generating high-quality preference datasets that improved legal reasoning alignment in fine-tuned LLMs.
AI-Powered Legal Document Update System
- Built an end-to-end LLM pipeline that updates Word documents using XML parsing and contextual mapping to identify and edit paragraphs, achieving a 70%+ success rate while preserving document formatting.
- Engineered a human-in-the-loop validation interface with reasoning chains and alert-point mapping, resulting in a 60% reduction in manual processing time while maintaining legal accuracy through strategic human oversight and audit trails.
DocEvolver
- Created an MVP for a “Cursor for Word” style extension for updating and understanding MS Word files for lawyer-editors.
Search-and-Replace Agentic System
- Architected a multi-agent AI system for automated Word editing with a comprehensive validation pipeline (schema enforcement, content integrity, audit logging), achieving 98% accuracy and a 65% reduction in manual content revision.
- Constructed error-resolving agents with function calling and multi-turn reasoning to fix XML issues, implementing few-shot learning and self-healing mechanisms. The system was adopted across teams and processes hundreds of documents monthly, with sub-8-second processing time per section.
Additional Tools
- Truth Social Monitor: Created a monitoring system for Trump’s Truth Social posts with sub-3-second latency, generating automated alerts for Reuters journalists.
- Page Flipper: Revived the Page Monitor extension for website tracking, eliminating Visual Ping subscriptions for the team.
- These tools gave Reuters a critical competitive advantage.
Research Intern, Microsoft Research India
Period: January 2024 - July 2024
Programming with Representations (PwR)
- Led backend development for Microsoft’s PwR Studio platform, focusing on the Natural Language to Domain Specific Language (NL2DSL) translation system using GPT-3.5 and GPT-4.
- Developed a symbolic translation pipeline that generates finite state machines expressed in a custom DSL, achieving an 85% reduction in hallucinations.
- Formulated rubrics and evaluation loops with error correction over DSL, increasing valid DSL generation from 65% to 95%.
Jugalbandi-Studio-Engine
- Architected a Python-based platform that converts DSL into scalable finite-state-machine-based chatbot applications, reducing development time by 80%.
- The platform was featured in Satya Nadella’s keynote talks. Represented Microsoft Research in the pilot project, enabling 15+ non-technical organizations to build AI-powered conversational bots.
Jugalbandi (JB) Manager
- Established a chatbot management platform supporting WhatsApp, Telegram, and Web channels with multilingual text and voice capabilities.
- Integrated Bhashini Speech models with Azure service failover mechanisms, enabling 70% faster deployment of new chatbots.
Open-sourced Work
- PwR-NL2DSL: Natural Language to DSL conversion.
- PwR-Studio: Studio environment for Programming with Representations.
- Jugalbandi Studio: Open-source chatbot framework.
- Jugalbandi Manager: Chatbot management platform.
- The complete system was picked up by Bhashini to support chatbots across government initiatives.
Mentors: Sriram Rajamani, B. Ashok, Akash Lal, Sameer Segal
Research Intern, AI Institute, University of South Carolina
Period: December 2022 - April 2024
- Master's thesis on Knowledge-Enabled Multimodal Ingredient Substitution. Built a knowledge graph incorporating 27K ingredients and 40K substitution pairs, enabling precise ingredient recommendations using multimodal and constraint-based search.
- Developed an LLM-based query module for the ingredient substitution knowledge graph and submitted this work to AAAI-25.
- Resources: GitHub Repository; Dataset on Kaggle; Dataset used in UC Irvine + Stanford Health Hackathon 2024
- Formulated cross-modal recipe retrieval and developed cooking action recognition for recipe analysis, achieving 95% recall and leading to Cook-Gen at IEEE SMC 2023.
- Mentors: Revathy Venkataramanan, Dr. Amit Sheth
Visiting Researcher, Societal Computing at Saarland University (SIC)
Period: May 2023 - August 2023
- Project: Time and multispectral domain satellite image super-resolution.
- Worked on satellite image super-resolution using temporal and multispectral information.
- Leveraged high temporal frequency low-resolution data for wildlife tracking and improved disaster analysis through GAN and diffusion approaches.
- Technologies: GANs, Diffusion Models, Computer Vision, Remote Sensing, PyTorch
- Mentors: Ingmar Weber, Ferda Ofli
Research Intern, University of Maryland, Baltimore (US)
Period: October 2022 - April 2023
- Project: Personalized AI Assistant, funded under a HealthCareNLP grant.
- Developed personalized response generation models using reward scaling over BART and T5.
- Improved NUBIA score by 10%; the work was accepted as K-PERM at the AAAI Spring Symposium 2024.
- Focused on knowledge and persona-aware loss scaling for better response generation.
- Technologies: NLP, Information Retrieval, Large Language Models, Conversational Models, Question Answering, Generative AI
- Mentor: Manas Gaur
AI Intern, EdgeNeural.ai, Pune, India
Period: June 2022 - August 2022
- Project: Accelerated inference with model optimization through quantization and CPU/GPU customization.
- Developed training and optimization pipelines for automatic model training and hosting.
- Technologies: OCR, Object Detection (YOLO, SSD), TensorRT, GPU Optimization, OpenVINO, Docker, AWS, PyTorch, TensorFlow
- Collaborators: Sarvesh Devi, Chidhambararajan, Dhanraj
Research Intern, Video Analytics Lab, IISc Bangalore
Period: May 2022 - August 2022
- Implemented StyleGAN-based architectures for disentangled video interpretation across domains.
- Improved image/video generation and analysis workflows using GAN-based modeling.
- Technologies: GANs, Recurrent Neural Networks, PyTorch, TensorFlow, Python
- Mentor: Rishubh Parihar
Research Intern, Visual Learning and Intelligence Lab, IIT Hyderabad
Period: November 2021 - April 2022
- Researched medical image processing with Prof. Dr. C. Krishna Mohan.
- Developed a novel architecture for improved classification of low-quality images and unbalanced datasets.
- Published SFFNet for panoramic dental X-ray segmentation at IEEE APSCON 2023.
- Technologies: Medical Imaging, Healthcare, Deep Learning, TensorFlow, PyTorch
- Mentor: R Sai Chandra Teja
- Collaborators: Dhruv Makhwana, Rohit Pawar
Computer Vision Engineer, AI Mage (WETHEKOO)
Period: March 2021 - April 2021
- Developed a fashion tagging engine using deep learning.
- Optimized and deployed computer vision models on edge devices to improve product/user outcomes.
- Technologies: TensorFlow, Computer Vision, Siamese Neural Networks, Tagging Engine, Segmentation
Software Engineer, Rhizicube Technologies
Period: June 2021 - September 2021
- Oversaw server and REST API development and database design for a consumer data platform using Golang (Gin).
- Built a real-time streaming data pipeline using Apache Kafka.
- Built a LinkedIn scraper and a generalized organization website crawler using Selenium and Beautiful Soup.
- Technologies: Back-end Web Development, Relational Databases, Kafka, MySQL, Data Scraping, Go, Database Design, Selenium, Python
- Collaborators: Udit Sarin, Yash Goyal
Publications and Papers
- Transformers Remember First, Forget Last: Dual-Process Interference in LLMs - arXiv 2025
- K-PERM: Personalized Response Generation Using Dynamic Knowledge Retrieval and Persona-Adaptive Queries - AAAI Spring Symposium 2024
- Multimodal Ingredient Substitution Knowledge Graph (MISKG) - Dataset Release 2024
- Cook-Gen: Robust Generative Modeling of Cooking Actions from Recipes - IEEE SMC 2023
- Spatial Field Fusion Network (SFFNet) for Panoramic Dental X-ray Segmentation - IEEE APSCON 2023