Praveen Kumar

Enterprise AI Enablement: The Five Gaps Between Your Pilot and Production

Most technology leaders in Indian BFSI approved the budgets, sat through the demos, and watched the pilots deliver. The numbers were good. The room was pleased.

And somewhere between that room and production, the work stopped moving.

Not because the technology failed. Not because the team lost interest. But because five things were never scoped during the pilot, and all five arrived as surprises after it succeeded.

The data was never ready for production

The pilot ran on clean data. Someone spent weeks preparing it - pulling records, fixing gaps, making it consistent.

Production means live data from a core banking system that has been running for years. Duplicate records. Missing fields. Customer profiles split across systems that were never meant to talk to each other.

Reconciling that data was always going to take time. It just was not on the plan.

That work is still sitting in someone's backlog right now.
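
Sizing that backlog item does not take a project. A quick probe of the raw extracts shows the scale; here is a minimal sketch in Python with pandas, where the file names and column names are hypothetical stand-ins for whatever your systems actually export:

```python
import pandas as pd

# Hypothetical extracts from two systems that were never meant to talk
# to each other; file and column names are illustrative only.
core = pd.read_csv("core_banking_customers.csv")
crm = pd.read_csv("crm_customers.csv")

# 1. Duplicate records inside the core extract.
dup_rate = core.duplicated(subset=["customer_id"]).mean()

# 2. Missing fields the production model will actually see.
missing = core[["pan", "dob", "mobile"]].isna().mean()

# 3. Profiles split across systems: CRM customers with no core match at all.
unmatched = (~crm["customer_id"].isin(core["customer_id"])).mean()

print(f"duplicate customer_ids in core: {dup_rate:.1%}")
print(f"missing-field rates in core:\n{missing.to_string()}")
print(f"CRM profiles with no core banking match: {unmatched:.1%}")
```

Three numbers - a duplicate rate, missing-field rates, an unmatched-profile rate - turn "the data needs work" into something a plan can actually hold.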

The model has not cleared model risk management

Any model that touches a credit decision, a collections workflow, or a fraud flag needs to clear model risk management (MRM) before it goes anywhere near a real customer.

When the pilot arrives at the MRM review, three things are usually missing.

No way to see if the model in production is behaving the way it did in the pilot. No record of which version is running and when it was last updated. And nothing stopping the model from producing the wrong output when it encounters something it was not trained for.

These are not new requirements. Every regulated institution knows they are coming. They just never get scoped during the pilot because the pilot is focused on proving the model works, not preparing it for production.

The MRM review comes back with gaps. The team that built the pilot has moved on. Nobody owns the list.

That list is sitting in someone's inbox right now.
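
Clearing even one item off that list does not need to wait for a platform decision. Here is a minimal sketch of the second item - a record of which version is running and when it changed - as a hypothetical file-based registry in Python, standing in for whatever registry your platform team eventually runs:

```python
import datetime
import hashlib
import json

def record_model_version(model_path: str, training_data_ref: str,
                         registry_path: str = "model_registry.jsonl") -> str:
    """Append an auditable record of which artefact went live and when.

    A hypothetical file-based stand-in for a real model registry;
    what matters is that the record exists before go-live, not where it lives.
    """
    with open(model_path, "rb") as f:
        artefact_hash = hashlib.sha256(f.read()).hexdigest()
    entry = {
        "deployed_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "artefact_sha256": artefact_hash,
        "training_data_ref": training_data_ref,
    }
    with open(registry_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return artefact_hash
```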

The model is live but nobody can see what it is doing

Some teams get past MRM. The model goes to production. And then a quieter problem starts.

The model was validated on data from six months ago. Production data has shifted. Customer behaviour has changed. But nobody has a clear view of whether the outputs are still accurate.

No dashboard. No alerts. No record of what the model did when a customer complained.

This is the problem that surfaces at the worst time - when an examiner asks, or when something goes wrong, and nobody in the room can explain what happened.

The teams that avoided this built three things before go-live: a way to watch the model in production, a log of every version and every change, and clear boundaries on what the model is allowed to do. Before something went wrong. Not after.
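
The first of those three can start as a weekly stability check on the model's inputs. A minimal sketch in Python with numpy; the feature name and the thresholds in the docstring are the standard credit-scoring rule of thumb, not a prescription:

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index: validation-window inputs vs. live inputs.

    Common rule of thumb in credit scoring: below 0.1 is stable, 0.1 to 0.25
    is worth investigating, above 0.25 means the model is scoring a different
    population than the one it was validated on.
    """
    # Bin edges come from the validation data; dedupe guards against ties.
    edges = np.unique(np.quantile(expected, np.linspace(0, 1, bins + 1)))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range live values
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

# Hypothetical usage: the 'monthly_income' feature, validation set vs. last week.
# if psi(validation_income, live_income) > 0.25: page someone, not a ticket queue.
```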

The integration queue is longer than the plan shows

The pilot sat cleanly outside the real systems. Production means connecting to the core banking platform, the CRM, the collections workflow, and the reporting layer.

Each connection is a separate project sitting behind everything else IT is already committed to.

Picture week nine of a twelve-week plan. The team flags that one connection needs a change request. The change request queue is committed through the end of the quarter. The go-live date the MD has been told is now sitting in a queue nobody is managing.

That conversation is already happening in some organisations right now.

Nobody owns the full path

The pilot was built by the digital team. Production belongs to the business unit. Technology is owned by IT. Compliance sits with risk.

When something breaks, there is no single person whose job it is to fix it. It gets escalated. Then parked. Then it becomes the status line that has read "progressing" for two quarters.

Each team is reporting accurately on the piece they own. Read together, the updates describe a relay race where the baton has been on the ground for four months. Nobody is lying. Nobody has the full picture.

The vendor conversation your team is not having

At some point recently a vendor showed your team an agentic workflow. The demo was good. Someone in the room said this is where we need to go.

The question nobody answered: are we actually ready for it?

An agent makes a series of decisions across a single customer interaction. That is genuinely more capable than the automation you already run. It also sits differently inside RBI's model risk framework, which was built around models that make one traceable decision at a time.

If something goes wrong in an agentic collections workflow, what does the audit trail look like? That is the question an examiner will ask.
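
One way to make that question concrete before the vendor answers it: sketch what a single entry in that trail would have to contain. A minimal illustration in Python; the schema is hypothetical, and the point is the fields, not the storage:

```python
import datetime
import json
import uuid

def log_agent_step(interaction_id: str, step: int, action: str,
                   inputs: dict, decision: str, model_version: str,
                   trail_path: str = "agent_audit_trail.jsonl") -> None:
    """Append one agent decision to an append-only trail.

    Hypothetical schema: the test is whether every step of a multi-step
    workflow can be reconstructed per customer interaction, after the fact.
    """
    entry = {
        "event_id": str(uuid.uuid4()),
        "interaction_id": interaction_id,
        "step": step,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "inputs": inputs,
        "decision": decision,
        "model_version": model_version,
    }
    with open(trail_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

If the workflow cannot emit something like this for every step of every interaction, that is the readiness answer.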

Good vendors will welcome this conversation. Before any pilot agreement, one question is worth asking: can we map the governance requirements and the data readiness gap together, in writing, before we start?

You have an AI inventory. Not an AI program.

When the board asks for a view of the full AI program, the answer in most Indian BFSI organisations is a slide from every business unit.

That is not a program view. That is an inventory count.

Pick any two AI initiatives running right now. Are the teams sharing infrastructure? Did they start from the same governance assumptions? Do they know enough about each other to avoid duplicating work?

In most organisations the answer is no. Every initiative approved separately, budgeted separately, reported separately. The board sees twelve things running. Not a direction.

Adding a thirteenth does not make the program stronger. It makes the inventory more expensive.

The organisations that have moved from inventory to program have one thing in common. One named person - not a committee, a person - with a view across all the initiatives, making active decisions about what runs next and what stops. The board hears one coherent update from one owner.

That person does not need to sit at the Chief level. But they need to exist.

The question that shows you where you actually are

Put every active AI initiative on one page. Pilots, proofs of concept, live deployments, vendor commitments.

Does that page tell a coherent story? Or does it tell you things have been approved without anyone watching the whole?

If it is the second, that is not a failure of the individual teams. It is a structural gap. And it is the right question to bring into the room before the people above ask it first.

What the session covers

In 45 minutes we will work through the five gaps, what is causing them, and what a realistic path forward looks like for each one. Every registered attendee gets a checklist they can take into their next team meeting.

Most teams that read this far already know which gap they are sitting in. That is where we start.

Register for the webinar
