Initiative Overview
Bridging the gap between cutting-edge AI and India's specific needs.
Vision & Mission
BharatGen creates Generative AI solutions relevant to India's diverse culture and languages. We leverage state-of-the-art Computer Vision and NLP to address practical challenges unique to the Indian context.
Core Focus
We focus on scaling these technologies to many Indian languages and dialects and on tackling problems in agriculture, education, and digital heritage. Our goals include democratizing AI, fostering digital inclusivity, and building responsible AI systems.
Project Showcase
Highlighting key research thrusts and developments.
Agribot
Conversational AI for personalized farming advice. Built on 1.3 lakh+ (130,000+) entries, it achieves 97.5% accuracy and 91.4% personalization.
Indian Dialect Processing
Created parallel corpora for 14 dialects (14k+ sentences). Achieved high dialect classification F1 (>0.90).
Educational AI
Developing models such as CASSA and ScoreCLIQ for MCQ difficulty estimation. Also exploring automatic grading of handwritten cloze questions.
Advanced OCR Solutions
Tools for challenging OCR: MultiFOLD (cluttered documents), TEPALM (manuscripts), EroPT (eroded text), and handwriting analysis.
Hindi Language Tools
Created a Hindi slang dataset (6k sentences) with a slang detection F1 above 0.92. Developing spoken error correction models.
LLM Reasoning & Understanding
Assessing LLM capabilities on reasoning under ambiguity (20.8% accuracy) and on understanding language learning challenges (24% overlap).
Publications & Achievements
Disseminating research findings through leading international conferences (Selected 2025 Acceptances).
AIED 2025 (CORE A)
- CASSA: Context-Aware Self-Attention for MCQ Difficulty
- ScoreCLIQ: Dynamic LLM Framework for Item Difficulty
- Automatic Grading of Handwritten Cloze Questions
- GISA: Gradual Information Selection Attention
COLING 2025
- TEEMIL: MCQ Difficulty Estimation in Indic Languages
MBCC 2025
- Assessing LLM Metacognitive Ability
- FOCUS: Low-resolution OCR Confirmation
- EAGER: LLMs Guessing Bilingual Translation
- CLEAR: Comparing LLM/Human Proverb Reworking
- Evaluating Inferential Reasoning Under Ambiguity
- LLM Understanding of Language Learning Challenges
Meet the Team
Driven by dedicated researchers at IIT Mandi.
Dr. Rohit Saluja
Principal Investigator
Asst. Professor, SCEE
Manikandan R.
Collaborator
AI Lead
Thoughtworks India
Core Research Team & Associates
Future Directions
Continuing to innovate and expand the horizons of AI for India.
The BharatGen initiative plans to further enhance its impact through several key directions:
- Fair & Robust Handwriting Recognition: Implementing two-stage recognition (masking + correction).
- Contextual Language Correction: Building advanced Hindi error correction tools using LLMs for real-world learner scenarios.
- Enhanced OCR: Improving robustness for degraded historical documents and complex scripts.
- Dialect & Low-Resource NLP: Expanding the dialect corpus and developing better processing techniques.
- Cognitively-Inspired AI: Integrating cognitive theories and adaptive learning to improve LLM reasoning.