DEV Community: Rishabh Jain

40% of AI-Agent Projects Will Be Dead by 2027. Here's Which Side of the Line You're On.

Rishabh Jain — Tue, 23 Jun 2026 07:00:27 +0000

Gartner expects more than 40% of agentic AI projects to be cancelled by the end of 2027. Not delayed. Not rescoped. Cancelled - the budget pulled, the pilot quietly shelved, the vendor not renewed. If you are building or buying an AI agent right now, that single number is the most useful thing anyone will tell you all year, because it splits the market cleanly in two. There is the 40% that will spend money and have nothing in production to show for it. And there is the rest that will quietly compound an advantage while their competitors write off the loss.

This article is about which side of that line you end up on - and, more importantly, how the decision is mostly made before a single line of code is written.

The reflex when a stat like this lands is to assume the technology is not ready. That is the wrong lesson, and it is an expensive one. The agents that survive and the agents that get cancelled often run on the same models, the same frameworks, the same cloud. What separates them is almost never capability. It is scope, ownership, data, economics, and governance - the unglamorous decisions that happen in the first three weeks and quietly determine the outcome months later.

Why 40% Get Cancelled: The Five Real Reasons

When a project gets killed, the post-mortem rarely says "the model could not do it." It says some version of the five failures below. Read them as a checklist of what to avoid, not a eulogy.

1. The use case never had a number attached

The single most common killer. A project gets greenlit on a vibe - "we should have an AI agent for support" - and nobody writes down the one metric it must move or the baseline it is measured against. Six months later, finance asks what the agent actually did, and the honest answer is "it does... things." With no agreed number, there is no way to prove value, and anything you cannot prove gets cut in the next budget cycle. Survivors define success in writing on day one: first-response time from nine hours to under ten minutes, or four hours of manual triage a week eliminated. A target you can point at is a project that survives a finance review.

2. Escalating cost with no ceiling

Agents are not a fixed-cost piece of software. They call models on every step, and a multi-step agent can fire dozens of model calls to finish one task. Run that across thousands of tasks a day and the bill behaves nothing like a SaaS subscription. Teams that never modelled the per-task cost, never set a budget cap, and never instrumented spend get a surprise invoice and a CFO who shuts the project down on principle. The fix is boring and decisive: know your cost per completed task before you scale, set hard spend limits, and treat token budget as a first-class metric alongside accuracy.
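
To make "hard spend limits" and "cost per completed task" concrete, here is a minimal sketch of a spend tracker that refuses a model call once a cap would be breached. The class name, the per-1k-token prices, and the cap are all hypothetical illustrations, not figures from any real provider; plug in your own contract prices.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a model call would push spend past the hard cap."""


@dataclass
class SpendTracker:
    cap_usd: float              # hard spend limit for the period (hypothetical)
    spent_usd: float = 0.0
    completed_tasks: int = 0

    def charge(self, input_tokens: int, output_tokens: int,
               usd_per_1k_input: float, usd_per_1k_output: float) -> float:
        """Record one model call; refuse it if it would breach the cap."""
        cost = ((input_tokens / 1000) * usd_per_1k_input
                + (output_tokens / 1000) * usd_per_1k_output)
        if self.spent_usd + cost > self.cap_usd:
            raise BudgetExceeded(f"cap of ${self.cap_usd:.2f} would be exceeded")
        self.spent_usd += cost
        return cost

    def task_done(self) -> None:
        """Call once per task the agent actually completes."""
        self.completed_tasks += 1

    def cost_per_completed_task(self) -> float:
        if self.completed_tasks == 0:
            raise ValueError("no completed tasks yet")
        return self.spent_usd / self.completed_tasks


# One task that needed 12 model calls, at made-up prices.
tracker = SpendTracker(cap_usd=50.0)
for _ in range(12):
    tracker.charge(2000, 500, usd_per_1k_input=0.01, usd_per_1k_output=0.03)
tracker.task_done()
print(round(tracker.cost_per_completed_task(), 2))  # about 0.42 at these prices
```

The point of raising an exception rather than logging a warning is that the cap is enforced in code, not discovered on the invoice.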

3. Skipping straight to full autonomy

The most expensive demo in the world is the fully autonomous agent that nobody asked to be fully autonomous. A buyer reads a headline, asks for an agent that runs end to end with no human in the loop, and gets something that looks magical in a controlled demo and then takes a wrong action on a live system in week three. Now trust is gone, and a project loses its sponsor the moment it does something embarrassing in production. Agents that survive earn autonomy in stages: assist first, then supervised, then - only once the audit trail and the trust exist - a longer leash.

4. Dirty data and missing integrations

An agent is only as good as the systems it reaches into. If your CRM is half-empty, your records contradict each other, and the data lives in five tools that do not talk to each other, the agent will confidently act on garbage. A large share of failures here are not technical-AI problems at all - they are plumbing problems that were never scoped. The unsexy truth is that most of the real work in a successful agent project is data cleanup and integration, not prompt engineering.

5. No owner, no governance, no kill switch

"Whose job is this agent?" If the answer is unclear, the project is already dying. Agents that take real actions need an accountable owner, an audit log of every action, a way to review outcomes, and a switch to pause it instantly when something goes wrong. Without that, the first incident - one wrong email blast, one mis-updated batch of records - becomes the last incident, because leadership pulls the plug rather than risk a second one.

The pattern underneath all five:

None of these are model failures. They are decisions about scope, money, control, and ownership that were skipped at the start and came due at the worst possible time. That is good news - it means survival is mostly within your control.
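
For the fifth failure in particular, the controls are small enough to sketch. The following is one possible shape, assuming every real action the agent takes is routed through a single wrapper; the class and method names are hypothetical, and a production version would persist the log rather than keep it in memory.

```python
import threading
from datetime import datetime, timezone
from typing import Any, Callable


class AgentPaused(RuntimeError):
    """Raised when an action is attempted while the kill switch is engaged."""


class GovernedActions:
    def __init__(self, owner: str) -> None:
        self.owner = owner                   # the named, accountable person
        self._paused = threading.Event()     # the kill switch
        self.audit_log: list[dict[str, Any]] = []

    def pause(self, reason: str) -> None:
        self._paused.set()
        self._record("PAUSE", {"reason": reason}, "ok")

    def resume(self) -> None:
        self._paused.clear()
        self._record("RESUME", {}, "ok")

    def run(self, name: str, params: dict[str, Any],
            action: Callable[..., Any]) -> Any:
        """Execute an action only if not paused; log every attempt."""
        if self._paused.is_set():
            self._record(name, params, "blocked")
            raise AgentPaused(f"{name} blocked: agent is paused")
        try:
            result = action(**params)
        except Exception as exc:
            self._record(name, params, f"error: {exc}")
            raise
        self._record(name, params, "ok")
        return result

    def _record(self, name: str, params: dict[str, Any], outcome: str) -> None:
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "owner": self.owner,
            "action": name,
            "params": params,
            "outcome": outcome,
        })
```

Blocked and failed attempts are logged as well as successful ones, so the review after an incident sees what the agent tried to do, not only what it did.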

The Other 60%: What Surviving Agent Projects Do Differently

Flip every failure above and you get the survivor's playbook. The projects still running in 2028 will not be the ones with the most ambitious vision. They will be the ones that were disciplined about the boring parts.

They start narrow. One painful, repetitive, high-volume workflow with clear rules - not "transform the business." A tightly scoped agent that reliably handles ticket triage beats a sprawling one that half-handles everything.
They attach a number before they start. One metric, today's baseline, and a date to judge it. The project is accountable from week one, so it can always answer "what did this earn us?"
They model the economics up front. Cost per completed task, a spend cap, and an honest projection of what happens when volume doubles. No surprise invoices, no principle-based cancellations.
They climb the autonomy ladder. Assist, then supervised, then autonomous within tight guardrails. Each stage builds the trust and the audit trail the next one needs.
They treat data and integration as the main work. They budget for the cleanup and the plumbing, because they know that is where agents actually fail.
They build governance in from the start. A named owner, full logging, reversible actions, and a kill switch - so the first mistake is visible and recoverable, not fatal.
They run in parallel first. The agent runs alongside the human process, not instead of it, until the numbers earn the handoff. Trust is bought with evidence, not promised in a demo.
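
The autonomy ladder from the list above can be expressed as a single gate in front of every action. This is a sketch under assumptions: the three level names follow the article, while the function name, the return strings, and the allow-list guardrail are hypothetical choices.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    ASSIST = 1       # agent drafts; a human performs the action
    SUPERVISED = 2   # agent acts only after a human approves
    AUTONOMOUS = 3   # agent acts alone, within guardrails


def decide(level: Autonomy, action: str, allowed_actions: set[str],
           human_approved: bool = False) -> str:
    """Return 'refuse', 'draft_only', 'await_approval', or 'execute'."""
    if action not in allowed_actions:
        return "refuse"              # the guardrail applies at every level
    if level == Autonomy.ASSIST:
        return "draft_only"
    if level == Autonomy.SUPERVISED:
        return "execute" if human_approved else "await_approval"
    return "execute"
```

Promoting an agent then means changing one configuration value after the audit trail supports it, rather than rewriting the agent.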

The honest reframe:

"40% will be cancelled" is not a warning that AI agents do not work. It is a warning that AI agents are easy to start badly. The technology is the cheap part now. The discipline around it is the expensive part - and it is exactly what separates the 60% from the 40%.

A 7-Point Self-Audit: Are You in the 40% or the 60%?

Run your current or planned agent project through these. Every "no" is a crack the 40% fell through. If you cannot answer yes to most of them, you are not ready to scale yet - and that is a far cheaper thing to learn now than after the budget is spent.

☐ Is the scope one specific workflow - not a vague "AI transformation"?
☐ Can you state the single metric this agent must move, and today's baseline?
☐ Do you know the cost per completed task, and have you set a spend cap?
☐ Are you starting at "assist" or "supervised," not full autonomy?
☐ Is the data the agent relies on clean enough, and are the integrations actually built?
☐ Is there a named owner, an audit log, and a kill switch?
☐ Will it run in parallel with the human process before it replaces any of it?

What This Means For You

If you are a founder or an operator weighing an AI agent, the takeaway is not "wait and see." The companies that win the next two years are deciding now - they are just deciding carefully. The cost of being in the 40% is not only the wasted budget; it is the time your competitor spent compounding a real advantage while you reset. And the cost of doing it right is mostly discipline, not dollars.

The good news hiding inside Gartner's number is that the failure modes are known, predictable, and almost entirely avoidable. You do not need a research lab. You need a tightly scoped first use case, a metric, an honest cost model, a human in the loop, clean data, and an owner with a kill switch. Get those right and you are already most of the way into the 60%.

Make Sure Your Agent Is in the 60%

Bring us one workflow you are thinking of handing to an AI agent. We will pressure-test it against the exact failure modes above, model the real cost per task, and tell you honestly whether it is ready to build - with a fixed, written estimate and no obligation.


About Shanti Infosoft

Shanti Infosoft LLP is a CMMI Level 5 software engineering company that builds custom web and app products, AI integration, and agentic workflows for businesses without an in-house AI team. We build for the 60%: a named team of senior engineers, fixed-scope written estimates before any work begins, clean data and integration work scoped honestly, and full source-code and IP ownership handed to you. We will tell you where an agent already wins, where a human still needs to gate the decision, and what it will realistically cost - before you spend a rupee on the build.

Frequently Asked Questions

Does "40% of agentic AI projects cancelled by 2027" mean the technology does not work?

No. Gartner's projection is about projects being cancelled, not about agents being incapable. Most cancellations trace back to scope, cost control, data quality, and governance - decisions made by the buyer, not limits of the model. Well-scoped agents are already delivering real results today.

What is the single biggest reason AI agent projects get cancelled?

No measurable goal. When a project launches without one specific metric to move and a baseline to measure against, it cannot prove its value, and unprovable projects get cut in the next budget review. Define success in writing before you build.

How do I keep an AI agent project from becoming part of the 40%?

Start narrow with one workflow, attach a single success metric and baseline, model your cost per completed task and cap spend, begin in assist or supervised mode rather than full autonomy, invest in clean data and real integrations, and put a named owner, audit log, and kill switch in place. Run it in parallel with the human process until the numbers earn the handoff.

Why do AI agents cost more than regular software?

Agents call AI models on every step of a task, and a multi-step agent can make many calls to finish one job. At scale the cost behaves like metered usage, not a flat subscription. Knowing your cost per completed task and setting a spend cap before scaling is what prevents the surprise invoice that gets projects cancelled.

Should I wait until the technology matures before starting?

Waiting carries its own cost - competitors who start carefully now compound an advantage while you reset later. The smarter move is to start small and disciplined: one well-scoped workflow, measured and governed, that you can expand once it proves itself. You learn what works on a small bet instead of a large one.

When NOT to Use an AI Agent: A Decision Rubric Before You Build

Rishabh Jain — Mon, 22 Jun 2026 06:30:05 +0000

The most valuable thing we tell some clients is "you do not need an agent for this." It is not what they expect from a company that builds AI agents, but it is usually the cheaper, more reliable answer - and saying it has earned us more trust than any demo. An AI agent is a powerful, flexible, and genuinely expensive tool. Reached for reflexively, it turns a problem a script would solve in an afternoon into a system you have to monitor forever.

So here is the rubric we actually use - the questions we ask before we agree an agent is the right call. If a problem fails enough of these, the honest recommendation is to build something simpler.

1. Is the workflow fixed, or does it genuinely vary?

If the steps are the same every time - take this input, validate it, transform it, write it there - that is not an agent. That is a workflow, and plain deterministic code will run it faster, cheaper, and with a reliability an agent cannot match. Agents earn their cost when the path genuinely varies: when the right next step depends on understanding messy, open-ended input that you cannot enumerate in advance. "Route this email to the right team based on its content" might warrant intelligence. "Move this file when it arrives" never does.

2. Can you tolerate a wrong answer sometimes?

Agents are probabilistic. They will, occasionally, be wrong in ways you did not predict. For drafting a reply a human reviews, that is fine. For calculating a tax figure, posting a ledger entry, or anything that must be exactly right every single time, "usually correct" is a non-starter. If the task has one correct answer and the cost of a wrong one is high, you want deterministic logic - possibly with AI assisting a human, but not an agent acting on its own.

3. Does the task actually need language understanding?

This is the question that cuts the most projects. Agents are extraordinary at language - understanding intent, summarizing, reasoning over unstructured text. If your problem is fundamentally about structured data, math, lookups, or rules, you are paying for a capability you do not use and inheriting unpredictability you do not want. A regular expression, a database query, or a rules engine is the right tool, and on that turf it will outperform the agent every time.

4. Is the volume high and the value per task low?

Every agent call costs money and time - a model invocation, often several, plus latency. For a low-value action repeated millions of times, those costs dominate fast and the economics stop working. For high-value, lower-volume tasks - the complex support case, the nuanced document review - the cost per task is easily justified. Run the simple math before you build. The cost structure of an agent is very different from a function call, and at scale the difference is the whole business case.
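
The "simple math" is short enough to write down. The sketch below compares the value of a task with what the agent costs to complete it; every number in the two examples is invented for illustration, and the flat per-call price is a simplifying assumption (real pricing usually depends on token counts).

```python
def agent_cost_per_task(model_calls: int, usd_per_call: float) -> float:
    """Cost of one task, assuming a flat average price per model call."""
    return model_calls * usd_per_call


def monthly_margin(tasks_per_month: int, value_per_task_usd: float,
                   model_calls: int, usd_per_call: float) -> float:
    """Value created minus model spend over a month; negative means a loss."""
    cost = agent_cost_per_task(model_calls, usd_per_call)
    return tasks_per_month * (value_per_task_usd - cost)


# Low-value action at very high volume: the model spend dominates.
low_value = monthly_margin(5_000_000, value_per_task_usd=0.002,
                           model_calls=3, usd_per_call=0.004)

# Higher-value task at lower volume: the cost per task is easily absorbed.
high_value = monthly_margin(2_000, value_per_task_usd=6.00,
                            model_calls=15, usd_per_call=0.01)

print(round(low_value), round(high_value))  # roughly -50000 and 11700
```

In the first case each task costs six times what it is worth, so more volume only deepens the loss; in the second, fifteen calls per task barely dent the value.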

5. Can you live with the latency?

Agents think, and thinking takes seconds - more if the agent makes several model calls and tool calls in sequence. For a thoughtful response to a customer, fine. For something that must happen in milliseconds, or inside a tight loop, an agent is simply the wrong shape. Match the tool to the timing the task actually requires.

6. Will you maintain it like the living system it is?

This is the one people forget. An agent is not a feature you ship and walk away from. Models change. Your data changes. New edge cases surface in production. An agent needs ongoing evaluation, monitoring, and tuning - real operational commitment. Deterministic code, once correct, tends to stay correct. If no one will own the agent after launch, do not build the agent; you will end up with an unmonitored, drifting system making decisions nobody is watching.

The honest pattern

The best architectures we ship are usually hybrids. Deterministic code handles the structured, high-stakes, high-volume, must-be-exact parts. The agent handles the genuinely open-ended, language-heavy, judgment-shaped slice - the part where flexibility is worth its cost. The skill is not "use AI for everything." It is drawing that line in the right place: giving the predictable work to predictable systems, and reserving the agent for where intelligence actually pays for itself.

If you take one thing from this: the question is never "can an agent do this?" - a capable model can attempt almost anything. The question is "is an agent the right tool for this, given the cost, the risk, the volume, and the consequences of being wrong?" Ask that honestly and you will build fewer agents, ship more reliable systems, and spend your AI budget where it genuinely earns its keep. That is the advice we give our own clients, even when it means a smaller project. It is also why they come back.

About Shanti Infosoft: Shanti Infosoft is a CMMI Level 5 AI development company that has delivered 700+ projects across 16+ industries. We help teams move from AI ideas to dependable, production-grade software - shantiinfosoft.com | AI consulting services.

If you are not sure whether your problem actually needs an agent, we will tell you honestly - sometimes the right answer is a simpler build. Talk to our team.

Rishabh Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

The AI Agent Exit Plan: How to Avoid Getting Locked In Before You Sign

Rishabh Jain — Mon, 22 Jun 2026 03:30:04 +0000

Everyone plans the launch of an AI agent. Almost no one plans the exit. Yet the moment that decides whether you are trapped with a vendor is not the renewal a year from now -- it is the contract you sign today, before a single agent goes live.

At Shanti Infosoft we build agents and we integrate other people's, so we see lock-in from both sides. And we will say the unglamorous thing out loud: the time to negotiate your freedom to leave is when the vendor wants your business most, which is right now, at signing. Once you are dependent and the renewal is in front of you, your leverage is gone. An exit plan is not pessimism about a partnership. It is the thing that keeps the partnership honest.

Lock-in is rarely a clause. It is an accumulation.

People imagine vendor lock-in as some scary line in a contract. Usually it is quieter than that. It builds up from small dependencies: your agent's logic lives only in their platform, your data and conversation history sit in their system in their format, your integrations are wired their way, and your team has learned their tools and nobody else's. None of these feels like a trap on day one. Together, a year later, they are the reason "we should probably look at alternatives" never goes anywhere.

The danger of accumulated lock-in is that it is invisible until you try to leave. By then, switching means rebuilding logic from scratch, extracting data that does not want to be extracted, and retraining everyone. So the renewal gets signed, not because the vendor is the best choice, but because leaving is too painful. That is a bad reason to keep paying anyone.

The four questions to ask before you sign

You do not need a legal degree to protect yourself. You need to ask four things while you still have leverage.

First: can I get my data out, and in what format? You want a clear answer that your data -- records, history, the things the agent has produced and learned from -- is exportable in a usable, standard format, not locked in a proprietary blob you would have to reverse-engineer. "You can export to PDF" is not a real answer.

Second: who owns what the agent produces and the configuration behind it? Make sure the outputs are yours, and understand how much of the agent's setup is portable versus permanently theirs. The more of the value that walks out the door with you, the freer you are.

Third: what does leaving actually look like? Ask them to describe the off-boarding. A confident vendor can tell you how a customer leaves cleanly. A vendor who gets uncomfortable at the question has just told you something important.

Fourth: how standard is the underlying approach? An agent built on widely-used, open foundations is far easier to move or rebuild elsewhere than one built on a closed, one-of-a-kind stack. Standard is portable. Bespoke-and-secret is sticky by design.

Avoid lock-in by design, not by distrust

Designing for an exit does not mean treating your vendor as the enemy. It means a few sensible habits from the start. Keep your own copy of the important data and the business rules, in your own hands, so the logic of how your business works is not held hostage. Favour standard interfaces over proprietary ones where you have the choice. And know, roughly, what it would take to move -- not because you plan to, but because knowing keeps everyone honest, including you.

The paradox is that planning your exit usually makes you stay -- happily. When leaving is genuinely possible, you renew because the vendor earned it, and the relationship is healthier for both sides. The vendor who is comfortable with that is usually the one worth working with.

A short test before you commit

So before you sign for any agent platform, run the simple version: if this vendor doubled their price or let the product slide in a year, could we leave without rebuilding everything from zero? If the answer is a confident yes, sign with confidence. If the answer is no, that is not a reason to abandon the deal -- it is a reason to negotiate the exit terms now, while they are still listening.

The launch will get all the attention. Spend a little of it on the exit. It is the cheapest insurance in the whole project, and you can only buy it before you sign.

If you are evaluating an agent platform and want a clear-eyed read on how portable it really is, that is exactly the kind of thing we help clients pressure-test before they commit. A short look now can save a painful renewal later.

If you want to avoid vendor lock-in before you commit, we are happy to review the contract and architecture choices that keep your options open. Talk to our team.

Sagar Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

The Retail AI Agent That Knows When to Stop: Handoffs, Returns, and Inventory Truth

Rishabh Jain — Sun, 21 Jun 2026 06:30:43 +0000

A retail client showed us a transcript they were proud of. A shopper asked about a jacket, the agent recommended it warmly, confirmed it was in stock, and promised next-day delivery. One problem: the jacket had sold out an hour earlier, and the store did not offer next-day delivery to that region. The agent had been confident, helpful, and completely wrong - which in retail is worse than being unhelpful, because now you have a broken promise and a refund.

Retail and e-commerce are some of the best places to deploy an agent and some of the easiest places to deploy one badly. The difference comes down to two things the hype never mentions: grounding the agent in the truth about your inventory, and teaching it when to stop and hand off to a human. Here is what actually works.

Inventory truth is the whole foundation

An agent that recommends products, answers "is this in stock?", or promises delivery is only as trustworthy as its connection to live data. The failure above happens when the agent answers from a snapshot - yesterday's catalog export, a cached price, a stock figure that was true this morning. In retail, "true this morning" is false by lunch.

The fix: the agent must read inventory, pricing, and availability live, at the moment of the answer, from the system of record - not from a copy embedded in its training or a stale cache. If the real-time check is slow, cache for a few minutes at most and state the limitation honestly ("showing availability as of a few minutes ago"). Never let the agent invent or assume stock. A "let me check" that is accurate beats an instant answer that is wrong.
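
One way to sketch that rule: a lookup that always prefers the system of record, tolerates a short cache, and reports how old its answer is so the agent can say so. The `fetch_live` callable stands in for whatever your inventory system exposes; it, the class name, and the two-minute default are assumptions for illustration.

```python
import time
from typing import Callable


class StockLookup:
    def __init__(self, fetch_live: Callable[[str], int],
                 max_age_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_live = fetch_live      # hypothetical call to the system of record
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache: dict[str, tuple[int, float]] = {}

    def check(self, sku: str) -> tuple[int, float]:
        """Return (quantity, age_in_seconds) for a SKU."""
        now = self._clock()
        cached = self._cache.get(sku)
        if cached is not None and now - cached[1] <= self._max_age:
            return cached[0], now - cached[1]
        quantity = self._fetch_live(sku)   # errors propagate: never guess stock
        self._cache[sku] = (quantity, now)
        return quantity, 0.0


def availability_message(quantity: int, age_seconds: float) -> str:
    """Phrase the answer so any staleness is disclosed to the shopper."""
    status = "in stock" if quantity > 0 else "out of stock"
    if age_seconds < 1:
        return f"Currently {status}."
    minutes = max(1, round(age_seconds / 60))
    return f"Showing availability as of about {minutes} min ago: {status}."
```

If the live fetch fails, the exception surfaces to the caller instead of falling back to an old figure, which is the code-level version of "let me check".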

Recommendations are useful; promises are dangerous

Draw a clear line between what an agent can suggest and what it can commit. Suggesting products, comparing options, explaining the difference between two models - low risk, genuinely helpful, plays to the model's strengths. Committing to a price, a delivery date, a discount, or a refund amount - those are promises the business must honor, and a hallucinated promise is a real liability. We let the agent recommend freely and route every commitment through a verified action: the price comes from the pricing system, the delivery estimate from the logistics system, the discount from a rules engine. The agent presents; the systems guarantee.

Returns and refunds: the highest-stakes conversation

Returns are where retail agents earn trust or destroy it, because money and policy collide with an often-frustrated customer. An agent can absolutely handle the front of this - explain the return policy, check whether an order is eligible, generate a label for a clear-cut case. But it should know its limits cold. The policy edge cases, the "I know it is past 30 days but here is why" appeals, the high-value disputes - those are exactly where the agent should hand off, not improvise a goodwill exception that sets a precedent no human approved. Encode the clear rules; escalate the judgment calls.
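
"Encode the clear rules; escalate the judgment calls" might look like the sketch below. The 30-day window echoes the example in the paragraph above; the high-value threshold, the field names, and the outcome labels are hypothetical, and a real policy would carry more conditions.

```python
from dataclasses import dataclass

RETURN_WINDOW_DAYS = 30        # taken from the "past 30 days" example
HIGH_VALUE_THRESHOLD = 500.0   # hypothetical cut-off for "high-value" orders


@dataclass
class ReturnRequest:
    days_since_delivery: int
    order_value: float
    customer_is_appealing: bool   # e.g. "I know it is late, but here is why"


def route_return(req: ReturnRequest) -> str:
    """Decide what the agent may do with a return request."""
    # Judgment calls go to a person before any rule is applied.
    if req.customer_is_appealing or req.order_value >= HIGH_VALUE_THRESHOLD:
        return "hand_off_to_human"
    # Clear-cut case: inside the window, ordinary value.
    if req.days_since_delivery <= RETURN_WINDOW_DAYS:
        return "issue_return_label"
    # Outside the window with no appeal: explain, never improvise an exception.
    return "explain_policy_and_offer_human"
```

There is deliberately no branch in which the agent grants a goodwill exception; that outcome only exists on the human side of the handoff.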

The skill that separates good retail agents: knowing when to stop

The most underrated capability in a customer-facing agent is recognizing when it is out of its depth and handing off gracefully. A frustrated customer, an unusual request, a question the agent cannot answer confidently, anything touching a complaint or a refund dispute - these should trigger a clean handoff to a person, with the full conversation context passed along so the customer never has to repeat themselves. An agent that hands off well feels like good service. An agent that stubbornly tries to resolve everything itself feels like a wall, and it is the number-one reason customers come away hating "the bot." Design the handoff as carefully as you design the conversation.
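
A handoff designed this way has two parts: a trigger and a context package. The sketch below assumes some upstream component already labels each conversation with a topic, a sentiment, and a confidence score; those labels, the 0.7 threshold, and all names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

ESCALATION_TOPICS = {"complaint", "refund_dispute"}


@dataclass
class Turn:
    speaker: str   # "customer" or "agent"
    text: str


@dataclass
class Conversation:
    customer_id: str
    turns: list[Turn] = field(default_factory=list)


def handoff_reason(topic: str, sentiment: str, answer_confidence: float,
                   min_confidence: float = 0.7) -> Optional[str]:
    """Return why the agent should stop, or None if it may continue."""
    if topic in ESCALATION_TOPICS:
        return f"topic:{topic}"
    if sentiment == "frustrated":
        return "customer_frustrated"
    if answer_confidence < min_confidence:
        return "low_confidence"
    return None


def build_handoff(conv: Conversation, reason: str) -> dict:
    """Package the full context so the customer never has to repeat themselves."""
    return {
        "customer_id": conv.customer_id,
        "reason": reason,
        "transcript": [f"{t.speaker}: {t.text}" for t in conv.turns],
    }
```

The receiving person gets the reason and the whole transcript in one payload, which is what makes the handoff feel like service rather than a restart.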

Set the tone before something goes wrong

A retail agent represents your brand in every message. Two practical guards we always put in. First, a defined voice and clear boundaries on what it will and will not discuss, so it stays on-brand and does not get baited into off-topic or risky territory. Second, transparency: let customers know they are talking to an AI assistant and make reaching a human easy. Customers are remarkably forgiving of an AI that is upfront and helpful, and remarkably unforgiving of one that pretends to be a person and then fumbles.

Where the value really is

Done right, a retail agent handles the high volume of routine questions - where is my order, what is your return policy, which of these two fits my need - instantly, at any hour, freeing your human team for the conversations that actually need a human. That is real value: faster service, lower cost, happier customers. But it rests entirely on the unglamorous foundations - live inventory truth, verified commitments, encoded policy, and a graceful handoff. Get those right and the agent is an asset. Skip them and you have automated the act of breaking promises at scale.

If you run retail or e-commerce, we can build a customer-facing agent that knows when to act, when to hand off, and when to trust your inventory data. Talk to our team.


Your AI Agent Is Only as Good as the Process You Point It At

Rishabh Jain — Sun, 21 Jun 2026 03:30:38 +0000

When an AI agent underperforms, the first instinct is to blame the model. Pick a smarter one, tweak the prompt, wait for the next release. Most of the time that is treating the symptom. The real problem is sitting underneath the agent: the process you pointed it at was never clean enough to automate.

At Shanti Infosoft we build agents for a living, and the single biggest predictor of whether one succeeds is not which model we use. It is the quality of the process and the data feeding it. An agent does not fix a broken process. It runs the broken process faster, and now the mess has momentum. Get this right and a modest model performs brilliantly. Get it wrong and the best model on the market will still disappoint you.

Agents amplify the process, for better or worse

Think of an agent as a very fast, very literal new hire who never asks clarifying questions unless you build that in. If the process it inherits is clear, consistent and well-documented, that new hire is a superstar. If the process is "ask Priya, she just knows," or "it depends," or three undocumented exceptions everyone carries in their heads, the agent has no chance. It cannot absorb tribal knowledge it was never given.

This is why two companies can deploy the same kind of agent and get opposite results. The difference is rarely the technology. It is that one of them had a process tight enough to hand over and the other had a habit dressed up as a process.

Your data is the other half of the problem

The second place agents quietly fail is data. An agent is only as good as what it can read. If your customer records are half-empty, your product information lives in four places that disagree, or your past tickets were never tagged consistently, the agent inherits all of that confusion. It will answer confidently from bad inputs, which is worse than not answering at all.

Before automating, ask a blunt question: if a sharp new employee had only the data the agent will have, could they do this job well? If the honest answer is no, the agent will not do better. It does not have intuition to fill the gaps. It has exactly what you give it.

The test before you automate

So before pointing an agent at a workflow, we run a simple readiness check with clients. Can you write the process down clearly enough that a capable stranger could follow it? Are the rules actually consistent, or do they quietly bend depending on who is doing the task? Is the data the agent needs reasonably complete and trustworthy? And are the exceptions known and documented, rather than living in someone's memory?

If a workflow passes, it is a strong candidate, and automation will feel almost easy. If it fails, you have just learned something more valuable than which model to buy: you have found the work to do first.

Fixing the process is not wasted time

Here is the part that surprises people. The effort you spend cleaning up a process before automating is not a tax on the AI project. It is often the most valuable part of it. Writing the process down clearly, settling the inconsistent rules, tidying the data -- that work pays off whether or not you ever deploy an agent, because it makes the workflow better for the humans too.

We have had clients discover, halfway through this clean-up, that the process was so tangled it was costing them more than they realised even before AI. The agent project became the reason they finally fixed something they had tolerated for years. The automation was almost a bonus on top of a process that was now simply better.

Start with the process, not the tool

The temptation is always to lead with the technology -- pick the platform, choose the model, then look for somewhere to apply it. Reverse it. Start with a workflow that is genuinely ready: high-volume, rule-stable, well-documented, with decent data. Point a perfectly ordinary agent at that, and it will outperform a cutting-edge agent aimed at a mess every time.

A clean process with an average agent beats a brilliant agent on a chaotic one. That is the whole lesson, and it is the opposite of how most AI projects get planned.

If you are eyeing a workflow for automation and are not sure it is ready, that readiness check is exactly where we like to start with clients -- often before any talk of models or platforms. It is the cheapest way to make sure the agent you build actually works.

If an agent is underperforming, we can help you map and tidy the process underneath it first, so automation lands on solid ground. Talk to our team.

Related reading: Your AI Demo Works. That's the Problem

Sagar Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

AI Agents on the Shop Floor: Where They Help in Manufacturing and Where Determinism Must Win

Rishabh Jain — Sat, 20 Jun 2026 06:30:32 +0000

A plant manager asked us a fair question: "Everyone keeps telling me to put AI on the factory floor. Where does it actually belong, and where will it get someone hurt or scrap a batch?" That is exactly the right framing. On a shop floor, the line between "AI agent" and "control system" is not a style choice - it is a safety and quality boundary. Cross it carelessly and the costs are physical, not just embarrassing.

We have done enough manufacturing work to have a clear answer. AI agents have a real, valuable place in a plant. It is just not the place the hype suggests. Here is how we draw the line.

The hard rule: determinism controls the machine

Anything that physically moves, heats, cuts, doses, or stops a machine must be governed by deterministic control - the PLCs, the safety interlocks, the SCADA logic that does the same thing every single time. A probabilistic language model has no business issuing a torque command or overriding a safety stop. Not because the model is bad, but because "usually correct" is the wrong standard when the failure mode is a damaged press or an injured operator. The control layer stays deterministic. Full stop.

So where do agents earn their keep? Around the machine, not inside it

The value of an agent on the floor is in the information layer - the messy, language-heavy, human-facing work that surrounds production:

Answering "why did line 3 stop?" An agent that can read maintenance logs, recent alarms, and the manual, then explain the likely cause in plain language, saves a technician twenty minutes of hunting. It informs a human; it does not touch the line.
Triaging maintenance. Fed sensor trends and service history, an agent can flag "bearing on pump 7 is trending toward the pattern that preceded the last two failures" and draft a work order. A planner approves it. The agent surfaces; the human decides.
Making tribal knowledge searchable. Decades of fixes, quirks, and "the trick with machine 12" live in retiring people's heads and scattered PDFs. An agent over that corpus turns a thirty-year veteran's knowledge into something a new hire can query at 2 a.m.
Handling the paperwork. Shift reports, compliance documentation, quality records - an agent can draft and structure them from the raw data, and a supervisor signs off. Pure time-back, zero physical risk.

Vision and quality: assist the inspector, gate the automation

AI vision for defect detection is genuinely useful and increasingly mature. But mind the same boundary. Using a model to flag suspect parts for a human inspector is low-risk and high-value. Letting a model autonomously reject or accept parts, or adjust process parameters in response, is a different risk class - a misread now means good parts scrapped or a defect shipped. Start with assist. Earn the autonomy with measured accuracy over time, and keep a deterministic rule as the backstop on anything safety- or spec-critical.

Plan for the floor being a hostile environment for software

The shop floor is not a data center. Connectivity drops. Machines speak old industrial protocols. Latency matters when a human is waiting at a station. An agent that needs a fast round-trip to a cloud model will frustrate everyone the moment the network hiccups. Design for it: cache aggressively, degrade gracefully, and make sure that when the agent is unavailable the floor keeps running exactly as it did before - because the agent was never in the critical path to begin with. That last point is the tell of a well-designed system.
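As a rough sketch of "cache aggressively, degrade gracefully": the wrapper below serves fresh answers from a short-lived cache, falls back to a stale answer when the network call fails, and reports "unavailable" rather than blocking. The class name, the `fetch` callable and the five-minute TTL are illustrative assumptions, not part of any particular product.

```python
import time


class CachedAnswers:
    """Wraps a remote lookup so a network hiccup never blocks the operator."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable that reaches the remote model
        self._ttl = ttl_seconds  # assumed freshness window
        self._clock = clock
        self._cache = {}         # question -> (answer, stored_at)

    def ask(self, question):
        now = self._clock()
        cached = self._cache.get(question)
        if cached and now - cached[1] < self._ttl:
            return {"answer": cached[0], "source": "cache"}
        try:
            answer = self._fetch(question)
        except ConnectionError:
            if cached:
                # Stale but useful: label it so the operator knows its age is unknown.
                return {"answer": cached[0], "source": "stale_cache"}
            # Nothing to offer: say so plainly and let the floor carry on as before.
            return {"answer": None, "source": "unavailable"}
        self._cache[question] = (answer, now)
        return {"answer": answer, "source": "live"}
```

The "unavailable" branch is the important one: the caller gets an honest answer immediately, and nothing on the line was waiting on the agent.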

A grounded example

One client wanted "AI to run the line." What they actually needed - and what we built - was an agent that sat beside the line. It ingested machine alarms and maintenance history and answered operators' questions in plain language: what an obscure fault code meant, what fixed it last time, which spare to grab. Downtime dropped because diagnosis got faster, not because anything autonomous touched a machine. The control system stayed exactly as deterministic as it had always been. The agent made the people faster; the machines stayed safe.

The right question

On the factory floor, do not ask "what can AI control?" Ask "what decisions and information are slow, manual, and language-heavy - and which of those can an agent speed up while a human and a deterministic system stay firmly in charge of the metal?" Answer that honestly and you get real, durable value. Ignore the boundary and you get a very expensive lesson in why control systems are deterministic in the first place.

If you are weighing where AI agents help on the shop floor and where determinism must win, we can help you draw that line for your operation. Talk to our team.

Rishabh Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

Who Owns Your AI Agent After It Ships? The Orphaned-Agent Problem

Rishabh Jain — Sat, 20 Jun 2026 03:30:03 +0000

Here is a question that quietly decides whether your AI agent succeeds, and almost no one asks it before launch: after this goes live, whose job is it? Not who built it. Who owns it on a random Tuesday three months from now when it starts behaving oddly.

At Shanti Infosoft we see the same failure again and again. A perfectly good agent ships, works for a while, and then slowly degrades -- not because the technology failed, but because nobody owned it. It became an orphan. Everyone assumed someone else was watching. This is the most preventable way for an agent to fail, and it costs nothing but a decision made before deployment.

Software you install can be ignored. An agent cannot.

We are used to software that, once installed, mostly runs itself. An AI agent is not that. It acts in the world, it touches systems that change, it meets inputs nobody anticipated, and it can be confidently wrong in ways a spreadsheet never is. That means it needs a person the way a process needs a manager -- not constant attention, but a clear owner who notices when something is off and has the authority to act.

Treat an agent like fire-and-forget software and it will drift. The model has not changed, but the world around it has: a tool it depends on got upgraded, a policy changed, the kind of requests coming in shifted. Without an owner watching, those small drifts accumulate into a system that is quietly doing the wrong thing, and you find out from a customer instead of from your own team.

The handoff is where ownership disappears

The orphaning almost always happens at the handoff. The builder -- a vendor, or an internal team -- finishes the work and moves on. The agent lands somewhere between IT, the operations team and the people who actually use its output. Each group has a reasonable story for why it is not really theirs. IT says it is a business process. Operations says it is a technical system. The users say they just consume what it produces.

So it belongs to no one. And an agent that belongs to no one gets no monitoring, no tuning, no decision about when to expand or pull it back. It coasts on the trust it earned in week one until that trust runs out.

What an owner actually does

Ownership here is not a full-time job for a single agent. It is a defined, modest responsibility held by one named person. That person watches the one metric that proves the agent is working. They review a small sample of its output regularly enough to catch drift early. They are the first call when it misbehaves, and they have the authority to pause it or narrow its scope without convening a committee. They decide when it has earned more autonomy or a wider remit. And they own the relationship with whoever maintains it, internal or external.

Crucially, the owner should sit close to the work the agent does, not in a distant technical team. The person who feels the pain when the agent gets it wrong is the person most motivated to keep it right. A support-team lead owning the support agent will catch problems that an IT ticket queue never would.

Name the owner before you build, not after

The fix is almost embarrassingly simple, which is why it is so often skipped. Before the project starts, write down one name: this person owns the agent in production. Make it part of the plan, not an afterthought once the builders have left. If you cannot name an owner before you build, that is a signal worth listening to -- it usually means the agent does not have a real home in the organisation, and an agent without a home is an agent that will be orphaned.

This also reframes the build itself. When there is a named owner from the start, the people building it know who they are handing the keys to, and they build for that handoff: clear monitoring, understandable logs, a sane way to pause and adjust. Build for an orphan and you get an orphan. Build for an owner and you get something maintainable.

The cheapest reliability you will ever buy

Of all the things that make an agent dependable over time, assigning an owner is the cheapest. It costs a decision and a slice of one person's attention. Skipping it costs you the slow, invisible decay that ends with a system nobody trusts and nobody quite knows how to fix.

So before your next agent ships, ask the Tuesday question. If you have a confident answer, you are most of the way to an agent that lasts. If you do not, that is the gap to close first.

If you are deploying agents and are not sure how to structure ownership so they stay healthy after launch, that is something we help clients get right from the start. It is a small step that quietly prevents most of the problems.

If an agent you launched is quietly drifting with no clear owner, we can help you put the ownership and monitoring in place to keep it trustworthy. Talk to our team.

Sagar Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

What Your AI Agent Quietly Leaks: A Data-Privacy Checklist for Fintech and Healthtech

Rishabh Jain — Fri, 19 Jun 2026 06:30:42 +0000

Here is a question that stops most AI projects cold once someone in compliance asks it: "When a customer types their account number into your agent, where does it go?" In a lot of hastily built systems, the honest answer is "into a third-party model's logs, into our own debug traces, into a vector database, and into three other places nobody mapped." For a fintech or healthtech business, that is not a bug. It is a reportable incident waiting to happen.

Agents are unusually good at leaking data because they touch so much of it - user inputs, retrieved documents, tool outputs, conversation history - and they pass it all around in plain text. If you handle regulated data under India's DPDP Act, GDPR, HIPAA or your sector's equivalent, you cannot treat privacy as a feature to add later. Here is the checklist we run before any agent goes live.

1. Know exactly what leaves your boundary

The moment you call a hosted model API, data crosses from your environment into someone else's. Map that flow explicitly. What fields are in the prompt? Does the provider retain inputs? For how long, and for what purpose? Many providers offer zero-retention or no-training tiers for exactly this reason - use them, and get it in the contract. If the data is too sensitive to leave at all, that is a real architectural signal: you may need a model that runs inside your own environment. Decide this deliberately, not by accident.

2. Redact before you send, not after

The cheapest privacy win is to never send the sensitive value in the first place. Before a prompt leaves your system, strip or tokenize what the model does not actually need - account numbers, full names, national IDs, medical record numbers. The agent can reason about "the customer's checking account" without ever seeing the account number itself. Where it needs to act on the real value, your tool layer holds it and substitutes a token in the text the model sees. The model orchestrates; it does not need to memorize the secrets.
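One way to picture the token substitution is the sketch below. It assumes account-like values can be found with a simple digit-run pattern, which is a deliberate simplification; `TokenVault`, the pattern and the token format are all hypothetical names chosen for illustration.

```python
import re
import secrets

# Assumption: runs of 9-18 digits stand in for account-like numbers.
ACCOUNT_RE = re.compile(r"\b\d{9,18}\b")


class TokenVault:
    """Keeps real values inside your boundary; the model only sees tokens."""

    def __init__(self):
        self._by_token = {}
        self._by_value = {}

    def tokenize(self, text: str) -> str:
        """Replace each sensitive value with a stable, meaningless token."""

        def swap(match: re.Match) -> str:
            value = match.group(0)
            if value not in self._by_value:
                token = f"<ACCT_{secrets.token_hex(4)}>"
                self._by_value[value] = token
                self._by_token[token] = value
            return self._by_value[value]

        return ACCOUNT_RE.sub(swap, text)

    def resolve(self, token: str) -> str:
        """Called by the tool layer when it must act on the real value."""
        return self._by_token[token]
```

The prompt that leaves your environment is the output of `tokenize`; only your own tool code ever calls `resolve`. A real deployment would need patterns for each identifier type it handles and somewhere safer than process memory to keep the mapping.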

3. Your logs are a data store - govern them like one

This is the leak almost everyone misses. To debug an agent you log prompts, responses, and tool calls. Those logs are now full of customer data, sitting in a system that often has looser access controls than your production database and a much longer retention period. We have seen more privacy exposure come from verbose debug logs than from the model calls themselves. Redact at the logging layer, restrict who can read agent logs, and set an aggressive retention policy. A log you deleted cannot be breached.
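"Redact at the logging layer" can be done with a filter from Python's standard `logging` module, as in this minimal sketch. The two patterns and the placeholder strings are assumptions for illustration; a regulated system would need patterns for its own identifiers.

```python
import logging
import re

# Hypothetical patterns; extend these for the identifiers you actually handle.
SENSITIVE = [
    (re.compile(r"\b\d{9,18}\b"), "[REDACTED_NUMBER]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),
]


class RedactingFilter(logging.Filter):
    """Scrubs sensitive values from a record before any handler writes it."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merge args into the final string
        for pattern, placeholder in SENSITIVE:
            message = pattern.sub(placeholder, message)
        record.msg = message
        record.args = None  # args were already merged above
        return True  # keep the record, now redacted


def build_agent_logger(name: str = "agent") -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addFilter(RedactingFilter())
    return logger
```

One caveat with this design: a filter attached to a logger does not see records that propagate up from child loggers. If several loggers share one output, attaching the filter to the handler is the safer choice.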

4. Treat the vector database as regulated data

If you embed customer documents for retrieval, those embeddings and the source text live in a vector store - which is, for compliance purposes, a copy of the original data. It needs the same encryption, the same access controls, and crucially the same deletion path. When a customer exercises their right to be forgotten, can you actually remove their data from the vector index, not just the primary database? If you cannot answer yes, you have a gap a regulator will find.

5. Enforce permissions at the data layer, not in the prompt

A tempting shortcut is to tell the agent "only show this user their own records." Do not rely on that. Instructions in a prompt are guidance, not a security boundary, and they can be talked around. Real access control lives in your tools and your data layer: the query the agent triggers is scoped to that authenticated user's permissions, server-side, every time. The agent should be structurally incapable of fetching data the user is not entitled to - not merely instructed to avoid it.
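A small sketch of what "structurally incapable" can mean in code: the tool the agent receives takes no user identifier at all, because the identifier is bound from the authenticated session on the server. The `records` table, its columns and the function names are hypothetical.

```python
import sqlite3


def fetch_records(conn: sqlite3.Connection, authenticated_user_id: int) -> list:
    """Run the query scoped to one user; the id never comes from model output."""
    rows = conn.execute(
        "SELECT id, summary FROM records WHERE owner_id = ? ORDER BY id",
        (authenticated_user_id,),
    )
    return rows.fetchall()


def make_tool_for_session(conn: sqlite3.Connection, session_user_id: int):
    """Return the zero-argument tool handed to the agent for this session."""
    # The model can decide *whether* to call the tool, not *whose* data it reads.
    return lambda: fetch_records(conn, session_user_id)
```

Because the tool accepts no arguments, no prompt can persuade it to query another user's rows.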

6. Watch the inputs, too

Privacy is not only about what leaks out; it is also about what users put in. People paste things into chat boxes they should not - full card numbers, a colleague's health details, credentials. A regulated agent should detect and handle sensitive input: refuse it, mask it, or route it appropriately, and certainly never echo it back or store it raw. Assume your input box will be used in ways you did not intend, because it will be.
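For card numbers specifically, input screening can lean on the Luhn checksum that payment cards use, which keeps false positives down compared with masking every long digit run. The sketch below is one illustrative approach; the candidate pattern and the placeholder text are assumptions, and other kinds of sensitive input (health details, credentials) need their own detection.

```python
import re

# Candidate: 13-19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def mask_cards(text: str) -> str:
    """Replace likely card numbers before the text is stored or echoed."""

    def replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group(0))
        return "[CARD REMOVED]" if luhn_valid(digits) else match.group(0)

    return CARD_CANDIDATE.sub(replace, text)
```

Running `mask_cards` on user input before it reaches the prompt, the logs, or the conversation store covers the "never echo it back or store it raw" part for this one data type.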

Privacy is a design constraint, not a disclaimer

In regulated industries, the projects that ship are the ones that treated data protection as an architectural requirement from day one - mapped the flows, minimized what they sent, governed the logs, and enforced access where it actually counts. The projects that stall are the ones that built an impressive demo and then discovered, in a compliance review, that they could not say where the data went. You cannot retrofit trust. Build the agent so that the honest answer to "where does the customer's data go?" is short, complete, and comfortable to say out loud.

If you operate in fintech or healthtech, we can review what your agent touches and close the data-privacy gaps before they become a compliance problem. Talk to our team.

Rishabh Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

Where to Put the Human in an AI Agent (It Is a Business Decision, Not a Technical One)

Rishabh Jain — Fri, 19 Jun 2026 03:30:38 +0000

The first question people ask about an AI agent is "can it run on its own?" It is the wrong first question. The one that actually determines whether the project succeeds is quieter: where does the human sit, and what do they approve?

At Shanti Infosoft we treat this as a business decision, not a technical setting. The model does not care where the human checkpoint goes. Your customers, your risk, and your team do. Put the human in the wrong place and you either drown them in pointless approvals or hand a machine authority it should not have. Put them in the right place and the same agent becomes both safe and genuinely useful. Here is how we think it through with clients.

Full autonomy is a destination, not a starting line

The instinct to make an agent fully autonomous on day one is understandable and almost always wrong. Early on you do not yet know where it fails, and trust has not been earned. The fastest, safest wins come from a "draft and approve" pattern: the agent does the work, a person clicks approve, and you capture most of the time savings while keeping control.

The bonus is that those approvals are data. Every time a human accepts or corrects the agent, you learn exactly how reliable it is on real inputs. After a few weeks you are no longer guessing about autonomy -- you have evidence. Autonomy then becomes something you grant deliberately, in the areas where the agent has proven itself, rather than a leap you take on hope.

The real question: how reversible is the action?

The single most useful lens for placing the human is reversibility. Sort what the agent does into three buckets.

Easily reversible actions -- drafting a reply that a person will still read, tagging a ticket, summarising a document, suggesting a next step -- can run with little or no approval. If it is wrong, you notice and fix it cheaply. Slowing these down with mandatory sign-off just wastes the time you were trying to save.

Hard-to-reverse actions -- sending money, emailing a customer something you cannot unsend, changing a record other systems depend on, making a promise on your behalf -- belong behind a human checkpoint, at least until trust is deep. The cost of one bad action here dwarfs the cost of an approval click.

Irreversible or high-stakes actions -- anything legal, financial or reputational that you genuinely cannot walk back -- should stay human-decided for a long time, with the agent preparing the option rather than pulling the trigger.

Place the gate by consequence, not by how clever the agent looks. The goal is to spend your human attention where a mistake is expensive and to get out of the way everywhere else.
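The three buckets translate naturally into a small lookup, sketched below under assumed action names. Which action lands in which bucket is a business judgment; the mapping here is purely illustrative.

```python
from enum import Enum


class Reversibility(Enum):
    EASY = "easy"                  # cheap to notice and fix
    HARD = "hard"                  # costly to undo
    IRREVERSIBLE = "irreversible"  # cannot be walked back


# Hypothetical actions and classifications, for illustration only.
ACTION_CLASS = {
    "tag_ticket": Reversibility.EASY,
    "summarise_document": Reversibility.EASY,
    "send_payment": Reversibility.HARD,
    "email_customer": Reversibility.HARD,
    "sign_contract": Reversibility.IRREVERSIBLE,
}

GATES = {
    Reversibility.EASY: "run_automatically",
    Reversibility.HARD: "require_human_approval",
    Reversibility.IRREVERSIBLE: "human_decides_agent_prepares",
}


def gate_for(action: str) -> str:
    """Pick the checkpoint by consequence; unknown actions get the strictest gate."""
    level = ACTION_CLASS.get(action, Reversibility.IRREVERSIBLE)
    return GATES[level]
```

Defaulting unclassified actions to the strictest bucket means a newly added tool cannot run unattended until someone has consciously decided it should.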

Approval fatigue is a real failure mode

There is a failure that looks like caution but is actually dangerous: making humans approve everything. When people are asked to rubber-stamp hundreds of low-stakes actions, they stop reading. Approval becomes a reflex, and the one risky item in the stream sails through because it looked like the other ninety-nine.

So fewer, meaningful checkpoints beat many trivial ones. If your reviewers are clicking approve without thinking, you have not added safety -- you have added theatre, and trained your team to ignore the very gate you built. Reserve approvals for the actions where a human genuinely changes the outcome.

Design the gate, and design what the human sees

A good checkpoint is not just a yes/no button. It is the right information at the moment of decision: what the agent is about to do, why, and what it is basing it on, so the reviewer can judge in seconds rather than re-investigating from scratch. A checkpoint that forces the human to redo the agent's work has saved nobody any time. The craft is making approval fast and informed, so the human stays a real check rather than a bottleneck.

Move the line as trust grows

The placement is not permanent. As the agent proves itself on a category of work, move the human checkpoint outward -- from approving every action, to spot-checking a sample, to simply being notified. The right end state for many workflows is not zero humans; it is humans watching the edges and the exceptions while the routine flows on its own.

That progression -- approve all, then sample, then notify -- is the whole game. It lets you start safe, earn autonomy with evidence, and end up with an agent that is both trusted and genuinely hands-off where it has earned the right to be.

If you are weighing how much rope to give an agent, the answer is rarely "all" or "none." It is a thoughtful placement of one or two checkpoints by consequence. That is a design conversation we are always glad to have before anything goes live.

If you are deciding how much autonomy to give an agent, we can help you place the human-approval gate where it protects the business without slowing it down. Talk to our team.

Sagar Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

When Your Agent Calls the Wrong Tool: Making Function-Calling Reliable Enough to Ship

Rishabh Jain — Thu, 18 Jun 2026 06:30:18 +0000

The first time we put an agent in front of real tools, it did something instructive. Asked to "refund the customer's last order," it called cancel_subscription instead of issue_refund. Both tools existed. Both were plausibly related to an unhappy customer. The model picked the wrong one and executed it with complete confidence.

This is the part of agent engineering that the demos skip. Letting a model generate text is easy. Letting it take actions in your systems - call functions, hit APIs, change state - is where reliability either exists or it does not. Here is how we make function-calling trustworthy enough to put in production.

Fewer tools, sharper names

The more tools you hand an agent, the more chances it has to choose wrong. We have watched accuracy fall off a cliff once a single agent has twenty-plus tools with overlapping purposes. Two fixes: keep the tool set small and distinct, and name and describe each tool so precisely that confusion is hard. refund_order with a description that says "issues a monetary refund for a completed order; does NOT cancel subscriptions" beats a vague handle_order every time. The description is not documentation - it is the instruction the model actually reads when deciding.

Validate arguments before you trust them

Even when the agent picks the right tool, it can pass nonsense: a refund amount larger than the order, a date in the wrong format, a customer ID that does not exist, a negative quantity. The model is generating plausible-looking arguments, not verified ones. So every tool we expose validates its inputs hard, on the server side, before doing anything - type checks, range checks, existence checks, business-rule checks. An invalid call returns a clear error the agent can read and correct, rather than corrupting data. Treat agent-supplied arguments exactly as you would treat input from an anonymous user on the internet: never trusted.
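A sketch of those four layers of checking for a refund tool follows. The in-memory `ORDERS` store and the `ToolResult` shape are assumptions standing in for a real database and a real tool protocol.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

# Hypothetical order store standing in for a real database.
ORDERS = {"4471": {"total": Decimal("120.00"), "refunded": Decimal("0.00")}}


@dataclass
class ToolResult:
    ok: bool
    message: str  # written so the agent can read it and correct itself


def validate_refund(order_id, amount) -> ToolResult:
    # Type check: the model may send a string, a float, or nonsense.
    try:
        value = Decimal(str(amount))
    except InvalidOperation:
        value = None
    if value is None or not value.is_finite():
        return ToolResult(False, f"refund rejected: amount {amount!r} is not a number")

    # Existence check: the order must be real.
    order = ORDERS.get(str(order_id))
    if order is None:
        return ToolResult(False, f"refund rejected: order {order_id} does not exist")

    # Range check: refunds are strictly positive.
    if value <= 0:
        return ToolResult(False, "refund rejected: amount must be greater than zero")

    # Business-rule check: never refund more than remains on the order.
    remaining = order["total"] - order["refunded"]
    if value > remaining:
        return ToolResult(
            False, f"refund rejected: {value} exceeds refundable balance {remaining}"
        )
    return ToolResult(True, f"refund of {value} on order {order_id} is valid")
```

Each rejection names the specific problem, which gives the agent something concrete to correct on its next attempt.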

Separate "read" tools from "write" tools - and gate the writes

Reading data is low-risk; changing data is not. We split tools into those two classes and treat them very differently. Read tools an agent can call freely. Write tools - anything that moves money, sends a message, deletes a record, changes an order - go through extra gates: stricter validation, rate limits, and for the highest-stakes actions, a human approval step. The agent prepares the action; a person confirms it. As trust in a specific workflow grows, you can loosen the gate. You do not start there.

Make actions idempotent and reversible where you can

Agents retry. A tool call times out, the agent assumes failure and calls again - but the first call actually went through. Now you have refunded twice. The defense is an idempotency key on every state-changing tool: a deterministic identifier the server checks so a repeated call returns the original result instead of acting again. And wherever the business allows, prefer reversible actions - a "draft" or "pending" state a human can release - over irreversible ones the agent commits instantly.
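The retry scenario above can be sketched in a few lines. `RefundService` and its in-memory result store are hypothetical, and the key here is derived from the request arguments purely for illustration.

```python
import hashlib
import json


class RefundService:
    """Toy server-side handler that refuses to act twice on the same key."""

    def __init__(self):
        self._results = {}       # idempotency key -> stored result
        self.refunds_issued = 0  # counts real side effects

    @staticmethod
    def idempotency_key(order_id: str, amount: str) -> str:
        # Deterministic: the same logical request always yields the same key.
        payload = json.dumps(
            {"op": "refund", "order": order_id, "amount": amount}, sort_keys=True
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def refund(self, order_id: str, amount: str, key: str = "") -> dict:
        key = key or self.idempotency_key(order_id, amount)
        if key in self._results:
            return self._results[key]  # replay: return the original outcome
        self.refunds_issued += 1       # the one and only side effect
        result = {"status": "refunded", "order": order_id, "amount": amount}
        self._results[key] = result
        return result
```

A key built only from the arguments would also collapse two legitimate, identical refunds into one. In practice the key usually includes something that identifies the originating task or request, and the stored results live in durable storage, not process memory.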

Give the agent honest errors, not silence

When a tool fails, what you return matters enormously. Return a generic "error" and the agent flails - retries blindly, or invents a success message to the user. Return a specific, readable message - "refund failed: order 4471 is already fully refunded" - and a capable model will reason about it correctly, explain it to the user, or choose a different path. We treat tool error messages as a first-class part of the design, written for a reader who has to decide what to do next.

Log every call, because you will need the trail

When an agent does something surprising in production, the only way to understand it is to see exactly which tools it called, with which arguments, in what order, and what came back. We log every tool invocation as a structured record. This is not optional - it is the difference between "we fixed it in an hour" and "we have no idea what happened." It is also what lets you build the evaluation set that catches the next regression before it ships.
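A structured record here just means one machine-readable entry per call, with enough fields to replay the sequence later. The field names below are an assumed shape for illustration, and the `sink` list stands in for whatever log store is actually used.

```python
import json
import time


def log_tool_call(
    sink: list, run_id: str, step: int, tool: str, arguments: dict, result: dict
) -> dict:
    """Append one JSON line describing a single tool invocation."""
    record = {
        "run_id": run_id,        # groups every call made in one agent run
        "step": step,            # preserves the order of calls
        "tool": tool,
        "arguments": arguments,
        "result": result,
        "timestamp": time.time(),
    }
    sink.append(json.dumps(record, sort_keys=True))
    return record
```

Filtering those lines by `run_id` and sorting by `step` reconstructs which tools were called, with which arguments, in what order, and what came back. Since these records contain arguments and results, they deserve the same redaction and retention care as any other agent log.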

Test the actions, not just the chat

Most teams test whether the agent says the right thing. Far fewer test whether it does the right thing. We build a suite of scenarios - "customer asks for a refund on an already-refunded order," "user requests a cancellation but means a pause" - and assert on the actual tool calls the agent makes, not the words it produces. That is where the real bugs hide, and it is the only way to ship action-taking agents with a straight face.
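The shape of such a test is sketched below. `stub_agent` is a stand-in for a real model-driven agent loop, so the routing inside it is hypothetical. The part worth copying is the recording test double and the assertion on tool names rather than on reply text.

```python
class RecordingTools:
    """Test double that records tool calls instead of executing them."""

    def __init__(self):
        self.calls = []

    def call(self, name: str, **arguments) -> dict:
        self.calls.append((name, arguments))
        return {"ok": True}


def stub_agent(message: str, tools: RecordingTools) -> str:
    # Stand-in for the real agent; a real test would invoke the actual loop.
    if "refund" in message.lower():
        tools.call("issue_refund", order_id="4471")
        return "Your refund is on its way."
    tools.call("cancel_subscription")
    return "Your subscription is cancelled."


def test_refund_request_calls_refund_tool():
    tools = RecordingTools()
    stub_agent("Please refund the customer's last order", tools)
    names = [name for name, _ in tools.calls]
    # Assert on what the agent did, not on what it said.
    assert names == ["issue_refund"]
    assert "cancel_subscription" not in names
```

With a real agent behind the same interface, this is the test that would catch a cancellation issued in place of a refund.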

The shift from a chatbot to an agent is the shift from generating words to taking actions, and actions have consequences. Reliable function-calling is not one trick - it is a stack of small disciplines: a tight tool set, hard validation, gated writes, idempotency, honest errors, and tests that check behavior. Put them in place and an agent becomes genuinely useful. Skip them and you have shipped an unpredictable hand on your production systems.

If your agent calls the wrong tool often enough to make you nervous, we can harden its function-calling layer until it is reliable enough to ship. Talk to our team.

Rishabh Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

The One Number That Tells You an AI Agent Is Actually Working

Rishabh Jain — Thu, 18 Jun 2026 03:30:08 +0000

Ask most teams how their AI agent is doing and you get a dashboard: messages handled, tokens used, average response time, an accuracy score. All of it green. None of it answers the only question that matters -- is this thing actually paying off?

At Shanti Infosoft we have learned to be suspicious of busy dashboards. A wall of green metrics is comforting and frequently misleading. An agent can be fast, available and confident while quietly creating work, annoying customers, or solving a problem you did not really have. The discipline is not measuring more. It is choosing the one number that, if it moves the right way, means the agent is genuinely working -- and ignoring the rest.

Vanity metrics feel like progress and prove nothing

Start by naming the numbers that look impressive but do not decide anything. Volume is the classic trap: "the agent handled 4,000 requests this month" tells you it was busy, not that it was useful. Uptime and speed matter only as table stakes; a fast wrong answer is still wrong. Even a raw accuracy score can mislead, because it averages away the failures that actually hurt and counts easy cases you never needed help with.

These metrics are not useless. They are diagnostics for when something breaks. But none of them, on its own, is evidence of value, and treating them as the scoreboard is how teams keep a useless agent alive because it "looks healthy."

The one number is the one tied to the reason you built it

Every agent is built to move a specific business outcome. The metric that matters is that outcome, measured before and after, with nothing else dressed up to distract you.

If you built a support agent to free up your team, the number is hours your people got back, or first-response time on the cases that used to wait. If you built it to capture more leads, the number is qualified leads that turned into conversations. If you built it to close the books faster, the number is days to month-end close. If you built it to deflect repetitive tickets, the number is the share of tickets resolved without a human -- and held to a quality bar, not just closed.

Notice what these have in common. Each one is a thing your business already cared about before AI entered the room. That is the test of a real metric: it would matter even if the agent did not exist.

Pick it before you launch, not after

The trap is choosing the metric after the agent is live, because by then you will be tempted to pick whatever number happens to look good. Decide up front: this is the one number we are trying to move, this is what it is today, and this is the threshold that means it is working. Write it down before launch. It turns a vague "the AI is helping, I think" into a clear yes or no.

This also protects you from the most expensive outcome in automation -- the agent that runs for a year because nobody could prove it was not helping. If you set the number on day one, you get an honest answer by week six.

Guard the number against gaming

One caution: a single metric can be gamed, by the agent or by good intentions. A ticket-deflection target can be hit by closing tickets that should have escalated. A response-time target can be hit by sending fast, useless replies. So pair your one number with a single quality guardrail -- a small sample of outputs reviewed by a human, or a customer-satisfaction check on the cases the agent touched. Not a second dashboard. One guardrail, to make sure the headline number is honest.

What this looks like in practice

The healthiest agent reviews we run fit on a sticky note. One line: the outcome metric, where it started, where it is now. One line: the quality guardrail, still holding or not. That is it. If the outcome moved and quality held, the agent is working and you can widen its scope with confidence. If the outcome did not move, no amount of green on the volume chart should save it.

Fewer numbers, chosen honestly, beat a dashboard that makes everyone feel good and decides nothing.

If you are not sure which single number your agent should be moving -- or you suspect your current dashboard is hiding the answer -- that is exactly the kind of thing we help clients pin down. It is usually a short conversation with a very clarifying result.

If your agent dashboards are not telling you whether it is paying off, we can help you define the one metric that actually proves business value. Talk to our team.

Sagar Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.

Wrapping an AI Agent Around a 20-Year-Old ERP: A Field Guide to Legacy Integration

Rishabh Jain — Wed, 17 Jun 2026 06:30:22 +0000

Every impressive AI agent demo runs against a clean, modern API. Every real enterprise runs on something else: a 20-year-old ERP, a mainframe that only speaks a fixed-width file format, a system whose vendor went out of business a decade ago and whose "integration layer" is a nightly CSV dropped on an FTP server. The gap between the demo and the deployment is almost always this gap.

We do a lot of this work - putting an agent in front of systems that were never designed to be talked to by software, let alone by a probabilistic language model. Here is what we have learned about making it work without breaking the thing the business actually runs on.

Wrap, do not replace

The first temptation is to "modernize" the legacy system so the agent has something nice to call. Resist it. That ERP processes payroll, or inventory, or claims. Touching its internals is a multi-year, high-risk project, and the agent will be obsolete before you finish. Instead, build a thin integration layer around it: a small service that exposes a few clean, well-defined endpoints and translates them into whatever the legacy system understands - a SOAP call, a stored procedure, a screen-scrape, a file drop. The agent talks to your clean layer. Your layer absorbs the ugliness.

The agent should never touch the system of record directly

This is the single most important rule. An agent will, eventually, do something you did not predict - call the same action twice, pass a malformed value, retry a request that already succeeded. If that lands directly on the system that holds the truth about money or stock or patient records, you have a real incident. So we always insert a boundary: the agent proposes an action, the integration layer validates it, and only validated, well-formed, idempotent operations reach the core. The legacy system gets clean input or no input.

Make every write idempotent

Old systems rarely forgive a double-submit. Run the same "create invoice" twice and you get two invoices. Because agents retry - on timeouts, on ambiguous responses, on a step that looked like it failed but did not - you must assume duplicate calls will happen. We give every operation an idempotency key: a deterministic identifier the layer checks before it acts. Seen this key already? Return the previous result instead of doing it again. This one pattern prevents a whole category of expensive, embarrassing errors.

Translate the data model, do not leak it

Legacy systems carry decades of accumulated quirks: status codes like "07" that mean "approved-pending-review," date formats that assume a two-digit year, a customer record split across three tables for reasons lost to history. Do not make the agent reason about any of that. Your integration layer should translate the raw legacy data into clean, self-describing concepts - "status: approved, review_required: true" - before the agent ever sees it. The model is good at language and intent. It is bad at remembering that "07" is special. Keep the tribal knowledge in code, not in the prompt.
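Here is a small Python sketch of that translation step. Only the "07" mapping comes from the example above. The other status codes, the `STAT_CD` and `ORD_DT` field names, the YYMMDD date layout, and the pivot year of 70 are assumptions made for illustration.

```python
from datetime import date

# Only "07" is taken from the text; the other codes are hypothetical.
STATUS_CODES = {
    "07": ("approved", True),
    "08": ("approved", False),
    "02": ("rejected", False),
}


def parse_legacy_date(yymmdd: str, pivot: int = 70) -> date:
    """Expand a two-digit year: values at or above the pivot are read as 19xx."""
    yy, mm, dd = int(yymmdd[0:2]), int(yymmdd[2:4]), int(yymmdd[4:6])
    century = 1900 if yy >= pivot else 2000
    return date(century + yy, mm, dd)


def translate_order(raw: dict) -> dict:
    """Turn coded legacy fields into self-describing ones before the agent sees them."""
    code = raw["STAT_CD"]
    if code not in STATUS_CODES:
        # Fail loudly rather than let the model guess what an unknown code means.
        raise ValueError(f"unknown legacy status code: {code!r}")
    status, review_required = STATUS_CODES[code]
    return {
        "status": status,
        "review_required": review_required,
        "order_date": parse_legacy_date(raw["ORD_DT"]).isoformat(),
    }


clean = translate_order({"STAT_CD": "07", "ORD_DT": "240315"})
print(clean)
```

The lookup table is the tribal knowledge, kept in code where it can be reviewed and tested instead of being restated in a prompt.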

Plan for the system being slow, flaky, or asleep

That nightly batch window. The maintenance reboot. The query that takes 40 seconds because it scans a table with no useful index. Modern API assumptions - fast, always-on, returns JSON - do not hold here. Build for it: generous timeouts, retries with backoff, a queue for operations that cannot complete in real time, and a clear, honest response to the user when the core system is simply unavailable. An agent that says "I have queued your request and the system will process it tonight" is far better than one that hangs or hallucinates a confirmation.
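One way to combine retries, backoff, and a queue is sketched below in Python. The `LegacyUnavailable` exception, the attempt count, the delays, and the in-memory `pending` queue are assumptions. A real deployment would use a durable queue and would retry only operations that are safe to repeat.

```python
import time
from collections import deque


class LegacyUnavailable(Exception):
    """Raised by the integration layer when the core system cannot be reached."""


def call_with_backoff(fn, attempts: int = 3, base_delay: float = 0.5, sleep=time.sleep):
    """Retry fn with exponentially growing pauses; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except LegacyUnavailable:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


pending = deque()  # in-memory stand-in for a durable queue


def submit(operation: dict, fn, sleep=time.sleep) -> dict:
    """Try now; if the core stays down, queue the work and say so honestly."""
    try:
        return {"state": "done", "result": call_with_backoff(fn, sleep=sleep)}
    except LegacyUnavailable:
        pending.append(operation)
        return {
            "state": "queued",
            "message": "The core system is unavailable. Your request is queued "
                       "and will be processed when it is back.",
        }
```

The `queued` state gives the agent something true to tell the user, which is what keeps it from inventing a confirmation.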

A grounded example

A manufacturing client wanted an agent that could answer "when will order 4471 ship?" The data lived in an ERP from the early 2000s, reachable only through a clunky internal API that returned a wall of coded fields. We did not touch the ERP. We built a read-only service that called it, translated the coded fields into plain status concepts, cached results for a few minutes to spare the old system, and exposed one clean endpoint. The agent queried that. Total footprint on the ERP: minimal. Risk to production: near zero. Value to the floor staff who used to phone the planning office: immediate.
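The caching part of a read-only service like this can be sketched as follows in Python. It is not the client's actual implementation: the class name, the 300-second lifetime standing in for "a few minutes", and the shape of the returned status are all assumptions.

```python
import time


class CachedOrderStatus:
    """Read-only lookup that shields the old system from repeated identical queries."""

    def __init__(self, fetch, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._fetch = fetch      # function that actually calls the legacy API
        self._ttl = ttl_seconds  # assumed value for "a few minutes"
        self._clock = clock
        self._cache = {}         # order_id -> (expires_at, value)

    def ship_status(self, order_id: str) -> dict:
        now = self._clock()
        hit = self._cache.get(order_id)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: the ERP is not called
        value = self._fetch(order_id)
        self._cache[order_id] = (now + self._ttl, value)
        return value
```

Passing the clock in as a parameter makes the expiry behaviour easy to test without waiting. The trade-off is that an answer can be up to one cache lifetime out of date, which suits a shipping estimate but not a balance or a stock reservation.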

The real skill is boundaries, not models

Connecting an agent to a legacy system is rarely an AI problem. It is a systems-engineering problem - idempotency, validation, translation, graceful degradation - with an agent on one end. The teams that succeed treat the legacy system with respect, wrap it carefully, and never let the unpredictable part of the stack touch the irreplaceable part. That discipline is the whole game.

If you need an AI agent to work with a legacy ERP or CRM rather than around it, we can map the safest integration path for your stack. Talk to our team.

Rishabh Jain is a Director at Shanti Infosoft, where the team builds AI agents and automation for real business operations.