DEV Community

Moth
India Built a Government AI That Speaks 22 Languages. Silicon Valley's Speaks Two.

BharatGen just launched Param 2, a 17-billion-parameter foundation model that understands all 22 of India's constitutionally recognized languages. It was built by nine universities led by IIT Bombay, funded by $145 million in public money, and designed for a country where 780 million people don't speak English.

Silicon Valley's frontier models handle English and Mandarin fluently. Everything else gets a footnote.

The Model

Param 2 uses a mixture-of-experts architecture — the same design behind DeepSeek and Qwen — but optimized for linguistic diversity rather than raw benchmark scores. It covers languages from Hindi (600 million speakers) to Manipuri (1.8 million), exploiting phonetic similarities between scripts to share learned representations across language families.
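BharatGen hasn't published Param 2's routing internals, but the core mixture-of-experts mechanic — a learned gate scores the experts for each token and only the top few actually run — can be sketched in a few lines. Everything here (dimensions, expert count, top-k of 2) is illustrative, not Param 2's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(token, experts, gate_w, k=2):
    """Route a token embedding to its top-k experts and combine
    their outputs, weighted by the (renormalized) gate scores."""
    logits = gate_w @ token               # one gating score per expert
    top = np.argsort(logits)[-k:]         # indices of the k best experts
    weights = softmax(logits[top])        # renormalize over the chosen k
    # Only the selected experts execute, so per-token compute scales
    # with k rather than the total expert count.
    return sum(w * (experts[i] @ token) for i, w in zip(top, weights))

d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
out = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
```

The point of the design for a multilingual model: related languages can be steered toward shared experts while only a fraction of the total parameters fire on any given query, which is what keeps inference cheap relative to a dense model of the same size.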

The model handles speech recognition, text generation, and text-to-speech across all 22 languages. BharatGen CEO Rishi Bal put the philosophy bluntly: "Most real-world use cases don't need trillion-parameter AI."

He's not wrong. A farmer in Bihar checking crop prices doesn't need GPT-5. He needs a model that understands Maithili.

The Money

The Indian government's Department of Science and Technology and the India AI Mission funded development with 1,235 crore rupees — roughly $145 million. That's less than what OpenAI spends on compute in a single quarter.

This happened alongside the AI Impact Summit in New Delhi, where 88 nations adopted the "New Delhi Declaration on AI Impact." The declaration is organized around seven pillars the organizers called "Chakras," covering everything from democratizing AI resources to energy-efficient infrastructure. Adani pledged $100 billion for renewable-powered AI data centers by 2035. Reliance committed $110 billion. Combined: $210 billion in Indian AI infrastructure investment.

For context, that's roughly what Amazon alone plans to spend on AI in 2026. Except India is splitting it across an entire country's grid, universities, and sovereign compute capacity.

Why This Matters

The default assumption in AI is that capability requires scale, scale requires capital, and capital lives in San Francisco. Param 2 challenges each link in that chain.

It's not a frontier model. It won't win benchmarks against Claude or Gemini. But it does something no frontier model currently does well: serve a population where the median user doesn't type in English and can't afford a $20/month subscription.

India is the second-largest market for both ChatGPT and Claude. Those companies serve Indian users in English. Param 2 serves them in the language they think in.

The mixture-of-experts design keeps inference costs low enough for government deployment — health clinics, agricultural extension services, legal aid — where per-query cost matters more than per-query brilliance.

The Geopolitical Play

India hosted the AI Impact Summit with 100+ countries and 20+ heads of state. The New Delhi Declaration was signed by 88 nations. The subtext: AI governance doesn't have to be written in Washington or Brussels.

Three weeks earlier, the UN General Assembly voted 117-2 to create an independent AI scientific panel. The United States was one of the two "no" votes. Now India is building an alternative governance framework that emphasizes access over restriction — and backing it with $210 billion in concrete infrastructure commitments.

The US approach to AI governance is "move fast and regulate later." The EU approach is "regulate everything now." India's approach is different: build sovereign capacity first, then govern from a position of capability rather than dependency.

BharatGen is what that looks like in practice. A publicly funded model, built by public universities, for public use. Not open-source in the Silicon Valley sense — open in the sense that any Indian government agency can deploy it without paying rent to Anthropic or Google.

The Gap Nobody Talks About

There are approximately 7,000 languages spoken on Earth. Frontier AI models handle maybe 30 well. The remaining 6,970 languages represent billions of people who are functionally locked out of the AI economy.

Param 2 covers 22 of those languages. It's a small fraction of the global total, but it's 22 more than any single Silicon Valley model prioritizes at the same depth.

The question isn't whether Param 2 can compete with GPT-5 on English-language coding benchmarks. The question is whether the next billion AI users will interact with models that were built for them — or models that tolerate them as an afterthought.

India just bet $145 million that the answer matters.


If you work with AI prompts professionally, check out my prompt engineering toolkit on Polar.sh — structured templates for getting better results from any model.
