I spend my weeks watching AI news for one reason: I automate workflows for Indian businesses, and the ground shifts under my feet every Monday. Most of what trended on X this week isn't useful. A few things are. Here's the signal, filtered for developers in India who ship real work.
1. Agent frameworks are consolidating — pick one and move on
The "which agent framework" debate is finally cooling. Three winners keep showing up in production code reviews: the Claude Agent SDK for operator-style tasks, LangGraph for stateful pipelines, and plain function calling on top of OpenAI / Anthropic / Gemini APIs for the 80% of jobs that don't need a framework at all.
My take: if you're building internal ops tooling (GST reconciliation, vendor onboarding, invoice triage), skip the framework. A 60-line Python script with tool definitions will usually get you to production about six months sooner than most "agent platforms". Here's the shape I use:
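```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Tool definitions the model is allowed to call
TOOLS = [
    {
        "name": "fetch_gst_status",
        "description": "Fetch GST filing status for a GSTIN",
        "input_schema": {
            "type": "object",
            "properties": {"gstin": {"type": "string"}},
            "required": ["gstin"],
        },
    }
]


def run(user_msg):
    messages = [{"role": "user", "content": user_msg}]
    while True:
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        if resp.stop_reason != "tool_use":
            # Finished (or hit max_tokens): return the text blocks and stop looping
            return "".join(b.text for b in resp.content if b.type == "text")
        # handle tool_use blocks, append tool_result, loop
        # Placeholder so the sketch fails loudly instead of re-sending the same request forever
        raise NotImplementedError("tool_use handling goes here")
```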
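The elided step in the loop (handle `tool_use` blocks, append `tool_result`) could look roughly like the sketch below. The `fetch_gst_status` stub, the `HANDLERS` dict, and the `build_tool_results` helper are hypothetical names added for illustration and are not part of the original script. The block fields (`type`, `id`, `name`, `input`) and the `tool_result` shape follow the Anthropic Messages API.

```python
import json
from types import SimpleNamespace


def fetch_gst_status(gstin):
    # Hypothetical stub: replace with a real lookup against your GST data source.
    return {"gstin": gstin, "status": "unknown"}


# Maps tool names (as declared in TOOLS) to local Python callables.
HANDLERS = {"fetch_gst_status": fetch_gst_status}


def build_tool_results(content_blocks, handlers=HANDLERS):
    """Run every tool_use block and return tool_result blocks for the next user turn."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks need no handling
        handler = handlers.get(block.name)
        if handler is None:
            # Report unknown tools back to the model instead of crashing.
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Unknown tool: {block.name}",
                "is_error": True,
            })
            continue
        output = handler(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(output),
        })
    return results


# Offline check with a stand-in for an API response block (dummy values).
fake_block = SimpleNamespace(
    type="tool_use", id="toolu_demo", name="fetch_gst_status",
    input={"gstin": "DUMMY-GSTIN"},
)
demo = build_tool_results([fake_block])

# Inside run(), when resp.stop_reason == "tool_use":
#     messages.append({"role": "assistant", "content": resp.content})
#     messages.append({"role": "user", "content": build_tool_results(resp.content)})
```

Keeping the dispatch in a plain dict means adding a tool is one schema entry plus one function, which is most of what a framework would give you here.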
Start here. Add complexity only when you can name the specific failure you're fixing.
2. Long context is cheap enough to abuse
Token prices for 200K+ context windows dropped again this week. For Indian teams dealing with PDF-heavy workflows — RERA filings, Income Tax orders, CBDT circulars — this changes the math. You no longer need RAG for a 300-page document. You paste it in, you ask the question, you're done — for roughly ₹3–5 per query on Sonnet-class models.
Does this kill RAG? No. For a 50,000-document corpus it's still essential. But for the common "make sense of this one fat PDF" case, vector databases have been over-engineered for two years.
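The per-query cost of the "paste the whole PDF" approach is simple arithmetic, and it's worth running with your own numbers before you commit. Here's a minimal sketch. Every input below is a placeholder I picked for round arithmetic, not a quoted price, and it counts input tokens only.

```python
def cost_per_query_inr(pages, tokens_per_page, usd_per_million_input_tokens, inr_per_usd):
    """Input-side cost in rupees of sending a whole document in one request."""
    tokens = pages * tokens_per_page
    usd = tokens / 1_000_000 * usd_per_million_input_tokens
    return tokens, usd * inr_per_usd


# Placeholder inputs, NOT real prices: substitute your provider's current rate card.
tokens, inr = cost_per_query_inr(
    pages=300,
    tokens_per_page=500,               # assumption: varies with layout and language
    usd_per_million_input_tokens=1.0,  # placeholder rate
    inr_per_usd=80.0,                  # placeholder exchange rate
)
print(f"{tokens} tokens, about ₹{inr:.2f} per query (input side only)")
```

Output tokens and any caching discounts are left out, so treat the result as a rough bound rather than an invoice.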
3. Voice agents stopped being demos
Real production deployments I saw this week: a Pune-based clinic chain running outbound appointment reminders in Marathi and Hindi on a voice stack that costs under ₹2/minute. An edtech in Bangalore is using real-time voice for spoken English practice — latency under 400ms, good enough that kids don't game it.
The stack that shows up most often: Deepgram or Sarvam for ASR (Sarvam wins on Indic languages by a wide margin), a small LLM for routing, ElevenLabs or Bhashini for TTS. If you've been waiting for "voice AI is ready" — it's ready. Not for every use case, but the ones that work, work well.
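The ASR, routing LLM, TTS chain can be wired as three swappable callables with a latency budget. The sketch below uses no vendor SDKs: the stage functions are stubs, the signatures are my assumption, and the 400 ms default just borrows the figure mentioned above. Real deployments usually stream between stages, so this sequential version is only for reasoning about the budget.

```python
import time
from typing import Callable


def run_voice_turn(audio: bytes,
                   asr: Callable[[bytes], str],
                   route: Callable[[str], str],
                   tts: Callable[[str], bytes],
                   budget_ms: float = 400.0):
    """Run ASR -> routing LLM -> TTS once and report per-stage latency in ms."""
    timings = {}

    start = time.perf_counter()
    transcript = asr(audio)
    timings["asr"] = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    reply_text = route(transcript)
    timings["route"] = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    reply_audio = tts(reply_text)
    timings["tts"] = (time.perf_counter() - start) * 1000

    total_ms = sum(timings.values())
    return reply_audio, timings, total_ms <= budget_ms


# Stub stages for a dry run; swap in real vendor clients behind the same signatures.
def fake_asr(audio):
    return "appointment kab hai"


def fake_route(text):
    return f"Reply to: {text}"


def fake_tts(text):
    return text.encode("utf-8")


audio_out, timings, within_budget = run_voice_turn(b"\x00", fake_asr, fake_route, fake_tts)
```

Logging the per-stage timings is what tells you whether to swap the ASR vendor or shrink the routing model when a turn goes over budget.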
4. Coding agents moved from autocomplete to delegation
The shift I watched this week in team Slacks: developers stopped saying "Copilot suggested" and started saying "I gave the agent the ticket." The mental model is different. You hand over a scoped task with acceptance criteria, you go make chai, you come back to a diff.
This isn't free. It works when:
- The task has clear input/output contracts
- Tests exist or can be written first
- Code review discipline stays tight
It falls apart when teams treat agent output as trusted. I've seen two production incidents this quarter caused by unreviewed agent-written DB migrations. Review the diff. Always.
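One cheap guard against the unreviewed-migration problem is a CI check that flags migration files in a diff so they can't merge without a human sign-off. This is a rough sketch: the path patterns are assumptions about a typical repo layout, and the `needs_human_review` helper is a hypothetical name.

```python
import re

# Assumed path conventions for migrations; adjust to your repo layout.
MIGRATION_PATTERNS = [
    re.compile(r"(^|/)migrations/"),
    re.compile(r"(^|/)alembic/versions/"),
    re.compile(r"\.sql$"),
]


def needs_human_review(changed_paths):
    """Return the changed files that look like DB migrations."""
    return [
        path for path in changed_paths
        if any(pattern.search(path) for pattern in MIGRATION_PATTERNS)
    ]


# Example input: in CI this list would come from your diff tooling.
changed = ["app/views.py", "app/migrations/0007_add_index.py", "db/cleanup.sql"]
flagged = needs_human_review(changed)
print(flagged)
```

In a pipeline you would fail the job when `flagged` is non-empty and the change has no explicit approval from a human reviewer.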
5. India's public AI stack shipped real things
The DPI + AI crossover keeps quietly shipping. Bhashini added better code-mixed Hinglish handling. ONDC's search layer gained semantic understanding for product descriptions. UPI's fraud detection got a visible accuracy bump — small merchants I talk to noticed fewer false declines this month.
If you build for Bharat, plug into these layers instead of building parallel ones. The cost math on a startup trying to replicate Indic NLP from scratch doesn't work.
6. Evals are the new moat
Every serious AI team I talked to this week was investing in evals, not prompts. The pattern: write 50–200 real examples with expected outputs, run your pipeline, score with a judge model, track drift over deploys.
Minimum viable eval in Python:
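```python
import json

import anthropic

client = anthropic.Anthropic()

# evals.json: a list of {"input": ..., "expected": ...} objects
with open("evals.json") as f:
    cases = json.load(f)


def run_your_pipeline(user_input):
    # Placeholder: call the system under test here and return its output as a string.
    raise NotImplementedError


def judge(expected, got):
    """Ask a small judge model whether the output matches the expected answer."""
    prompt = (
        f"Expected:\n{expected}\n\n"
        f"Got:\n{got}\n\n"
        "Is this correct? Reply PASS or FAIL with a one-line reason."
    )
    r = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=100,
        messages=[{"role": "user", "content": prompt}],
    )
    return r.content[0].text.strip()


passed = 0
for c in cases:
    got = run_your_pipeline(c["input"])
    verdict = judge(c["expected"], got)
    # Case-insensitive so "Pass" or "pass" from the judge still counts
    if verdict.upper().startswith("PASS"):
        passed += 1
    else:
        print(f"FAIL: {c['input'][:60]} -> {verdict}")

print(f"{passed}/{len(cases)} passed")
```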
Run this before every deploy. You'll catch 80% of regressions that would otherwise ship.
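The "track drift over deploys" part of the pattern can be as small as appending each run's pass rate to a file and comparing it with the previous run. Here's a minimal sketch. The JSONL layout, the `record_run` name, and the 5-point `max_drop` threshold are all assumptions of mine, not a standard.

```python
import json
from pathlib import Path


def record_run(history_path, deploy_id, passed, total, max_drop=0.05):
    """Append this run's pass rate to a JSONL file and compare with the previous run."""
    path = Path(history_path)
    previous = None
    if path.exists():
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if lines:
            previous = json.loads(lines[-1])["pass_rate"]

    pass_rate = passed / total if total else 0.0
    with path.open("a") as f:
        f.write(json.dumps({"deploy": deploy_id, "pass_rate": pass_rate}) + "\n")

    # Flag a regression only when the drop exceeds the allowed margin.
    regressed = previous is not None and (previous - pass_rate) > max_drop
    return pass_rate, previous, regressed
```

Call it with the `passed` count and `len(cases)` at the end of an eval run, and block the deploy when `regressed` comes back true.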
7. The "AI doing accounting" story is real, and boring
I'll close on the topic I spend most of my time on. Last week I helped a mid-sized D2C brand cut their monthly reconciliation from 14 person-days to under 2. Nothing glamorous — a Python pipeline that pulls Razorpay settlements, tallies against Shopify orders, flags mismatches, and writes a Tally-compatible CSV. Savings: roughly ₹85,000/month in ops cost.
This is where AI is quietly winning in India. Not chatbots. Not "AI-powered" products. Just ordinary automation where one of the steps happens to call an LLM because regex would have taken three weeks to write.
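The core of a reconciliation pipeline like the one described above is a keyed comparison between two exports. The sketch below is a simplified illustration, not the client's pipeline. The field names (`order_id`, `amount`) and the sample rows are made up, it compares gross amounts only (real settlements are usually net of fees), and the output is a generic mismatch CSV rather than any specific Tally import format.

```python
import csv
import io
from decimal import Decimal


def reconcile(orders, settlements):
    """Match orders to settlements by order ID and list every mismatch."""
    settled = {s["order_id"]: Decimal(s["amount"]) for s in settlements}
    mismatches = []
    for order in orders:
        expected = Decimal(order["amount"])
        got = settled.pop(order["order_id"], None)
        if got is None:
            mismatches.append((order["order_id"], "missing_settlement", expected, None))
        elif got != expected:
            mismatches.append((order["order_id"], "amount_mismatch", expected, got))
    # Anything left over was settled without a matching order.
    for order_id, got in settled.items():
        mismatches.append((order_id, "missing_order", None, got))
    return mismatches


def to_csv(mismatches):
    """Render mismatches as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["order_id", "issue", "order_amount", "settled_amount"])
    for row in mismatches:
        writer.writerow(["" if value is None else str(value) for value in row])
    return buf.getvalue()


# Made-up sample data for illustration only.
orders = [
    {"order_id": "A1", "amount": "499.00"},
    {"order_id": "A2", "amount": "1200.00"},
]
settlements = [
    {"order_id": "A1", "amount": "499.00"},
    {"order_id": "A2", "amount": "1176.00"},
    {"order_id": "A3", "amount": "50.00"},
]
report = to_csv(reconcile(orders, settlements))
print(report)
```

`Decimal` is used instead of floats so that paise-level differences are real mismatches and not rounding noise. The LLM step, where one is needed, typically sits on the flagged rows only.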
If you're a developer looking for paid work in 2026, this is the niche. Every SME in Tier 1 and Tier 2 cities has reconciliation, vendor onboarding, compliance filing, or customer support workflows that are one weekend of focused work away from a 70% time reduction. The businesses don't know it. You can tell them.
What I'm watching next week
- Pricing from the newest Indic-language foundation models
- Whether updated DPDP rules change what data can flow through foreign-hosted LLMs
- Real benchmarks (not vibes) on code-gen for Python-heavy automation
That's the signal. Everything else this week was noise — keep your head down and ship.
I'm Archit Mittal — I automate chaos for businesses. Follow me for daily automation content.