DEV Community: Arleen Kaur

Every Technology Creates Abundance While Quietly Creating a New Scarcity

Arleen Kaur — Thu, 30 Jul 2026 17:29:22 +0000

Lately I've been questioning the assumption sitting underneath most automation conversations: that removing a task removes the work behind it. Every system I've watched scale tells me the opposite. You automate something, and the work doesn't disappear. It moves somewhere you weren't looking.

The Part That Doesn't Disappear

When a task gets automated, what actually gets removed is the visible, repeatable part of it, the part that fit inside a checklist. What stays behind is everything that checklist was quietly absorbing: judgment calls, exceptions, edge cases that never showed up often enough to write down.

Before automation, a person handled those without anyone noticing, because the decision and the execution were fused into one job. Automation splits them apart. Execution goes to the system. Judgment stays with a person, except now that person has to find it, name it, and handle it deliberately instead of by habit.

This isn't specific to AI. Any system that compresses a manual process into an automated one leaves the same residue behind it:

A build pipeline replaces a person clicking deploy, but someone still has to own what happens when it fails at 2am.
A recommendation engine replaces a merchandiser choosing products, but someone still has to decide what the engine optimizes for, and catch it when it optimizes for the wrong thing.

The task disappears. The decision behind the task does not. Zoom out far enough and this is just conservation of complexity: you can relocate it, but you can't delete it.

The Decision Surface

I've started thinking about this as the decision surface of a task: the set of judgment calls a task requires that never get written down, because a human absorbed them silently while doing the work. Automating a task doesn't shrink that surface. It exposes it.

Once exposed, someone has to own it explicitly, because the system won't own it for them. This matters more with agentic AI than with any earlier wave of automation, and the adoption curve backs that up. McKinsey's State of AI 2025 survey found that 62% of organizations are already experimenting with agents that plan and execute multi-step workflows on their own, not just generate text.

An agent that plans and executes doesn't make a single decision. It makes a chain of them, closer to a call stack than a single function call. Ask an agent to handle a customer refund, and it's making a sequence:

Read the request and infer what the customer actually wants.
Classify the situation against a policy that may not cover it cleanly.
Decide whether the case needs to escalate to a human.
Draft a response that matches tone, policy, and precedent at once.

Each frame in that stack is a place where the agent's judgment can diverge from what you actually wanted, and every unresolved frame gets pushed up to a human. A person supervising ten agents isn't supervising ten tasks. They're supervising a decision network with more nodes than the process ever had when a human ran it end to end. We've tracked a similar expansion in what happens to middle management and span of control when AI takes over coordination: the layer absorbing that extra network is rarely the layer companies planned for.
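To make the "chain of decisions" concrete, here is a minimal Python sketch of the refund example. The policy limits, field names, and the handle_refund function are all hypothetical, invented for illustration; a real agent would infer intent with a model rather than read it from a field. What it shows is that every step is a separate place to diverge, and that any unresolved step lands on a human.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy limits; the article does not specify any.
AUTO_APPROVE_LIMIT = 50.00
REFUND_WINDOW_DAYS = 30


@dataclass
class RefundRequest:
    amount: float
    days_since_purchase: int
    reason: Optional[str]  # None when the customer's intent could not be inferred


def handle_refund(req: RefundRequest) -> tuple[str, list[str]]:
    """Walk the decision chain and return (outcome, trail of decisions made)."""
    trail: list[str] = []

    # Step 1: infer what the customer actually wants.
    if req.reason is None:
        trail.append("intent: unclear")
        return "escalate", trail
    trail.append(f"intent: {req.reason}")

    # Step 2: classify the situation against policy.
    in_window = req.days_since_purchase <= REFUND_WINDOW_DAYS
    trail.append(f"policy: {'in window' if in_window else 'outside window'}")

    # Step 3: decide whether the case needs a human.
    if not in_window or req.amount > AUTO_APPROVE_LIMIT:
        trail.append("escalation: needs human")
        return "escalate", trail
    trail.append("escalation: not needed")

    # Step 4: draft the response (stubbed out here).
    trail.append("response: drafted")
    return "auto_refund", trail
```

The returned trail is the point of the sketch: it is the decision surface written down, one entry per judgment call, so a supervisor can see which step produced the escalation.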

This is why deploying agentic systems at scale tends to increase headcount demand rather than shrink it, at least in the operational layer. The organizations that get surprised by this made a sizing error: they budgeted headcount against the old task instead of the new decision surface the agent revealed. It's the same mismatch showing up industry-wide: 99% of companies are running AI transformation, and 84% haven't restructured a single job to match it.

A team that used to need five people to write content now needs two people to write it and three people to audit five hundred variants a week for consistency, because output volume scaled faster than judgment did. The capital is already moving this direction: PwC found that 88% of senior executives plan to increase AI-related budgets over the next year, citing agentic adoption as the driver, not headcount reduction.

The deeper mistake sits one level up, in what gets optimized:

Proxy: fewer people per task.
Target: decision throughput, the rate at which an organization can resolve the judgment calls a system surfaces, reliably, without every case routing to the same three senior people who become the bottleneck.

You can hit the proxy and miss the target completely. Most companies that cut headcount right after an automation rollout do exactly that. They shrink the group meant to absorb the decision surface at the exact moment that surface got bigger. McKinsey estimates generative AI carries $2.6 to $4.4 trillion in annual value across 63 use cases, and none of that value gets captured without people configuring, auditing, and correcting the systems that produce it.

Beyond Automation

Once you see it this way, the pattern stops being about AI specifically. It's about any system that increases the volume or speed of output faster than it increases the organization's capacity for judgment.

Scaling infrastructure doesn't remove the need for people who understand failure modes; it multiplies the places failure can occur. Scaling a sales motion with better tooling doesn't remove the need for judgment about which deals matter; it increases the number of deals where that judgment has to get applied.

The technology is never the whole system. It's the part of the system that runs without asking permission. Everything it can't decide on its own becomes a job, and being AI-ready is really a staffing question in disguise: who owns the decision surface a deployment is about to expose, decided before launch instead of discovered during an incident.

The macro data lines up with this. The World Economic Forum projects that 22% of formal jobs will be structurally affected by 2030, but also forecasts net growth of 78 million jobs over the same period, mostly in roles that configure, audit, and interpret systems that didn't need configuring or auditing before. That's the decision surface showing up as a labor statistic.

There's a second-order effect worth naming too. Research published in Science found that access to ChatGPT raised job satisfaction by roughly 0.50 standard deviations among knowledge workers, a large effect for a single tool. Automation that surfaces the interesting judgment calls and absorbs the repetitive execution doesn't just create work. It tends to create better work.

The companies that get squeezed by automation aren't the ones that adopted too much of it. They're the ones that treated adoption as the finish line instead of the starting condition for a new layer of work. Every system that runs on its own eventually needs someone deciding what "on its own" is allowed to mean. That job doesn't show up on the org chart until the first time it's missing.

Originally published on the Linksoft Technologies blog.

Your Company Runs on Business Hours. AI Doesn't Have To.

Arleen Kaur — Tue, 28 Jul 2026 13:15:27 +0000

Most organizations are still built around an eight-hour window. Decisions wait for Monday. Customer requests queue overnight. Approvals sit in inboxes until someone logs back in. That's not a technology gap; it's a structural one. And it's the gap that AI workflow automation is starting to close.

The shift isn't subtle. Enterprises that once deployed AI to help individuals work faster are now using it to redesign when and how work happens at all. Instead of accelerating the existing model, they're replacing its most fundamental constraint: the assumption that execution requires human presence.

That's a different conversation than "AI tools for productivity." It's a conversation about operating models.

Why Business Hours Have Always Been an Artificial Limit

Think about what actually stops work after 6pm. It's not complexity. Most of the decisions that queue overnight are routine: a customer support ticket that needs a policy check, a compliance flag that needs routing, a purchase order that needs three fields validated. None of it requires judgment. It just requires someone to be there.

AI agents can be there. They don't sleep, they don't context-switch, and they don't lose the thread of a workflow at the end of a shift. Wired into the right architecture, they can handle continuous execution across customer support, finance operations, cybersecurity monitoring, and software delivery, without a human in the loop for every step.

The counterintuitive part? This doesn't reduce the importance of human judgment. It concentrates it. When routine execution runs continuously, people stop spending their days on queue management and start spending them on the decisions that actually need them.

Enterprise AI Adoption Is Moving Past the Productivity Layer

The 2026 Microsoft Work Trend Index, surveying 20,000 AI users across 10 countries, found something that should reframe how leadership teams think about their AI programs. Organizational factors like culture, manager support, and talent practices account for 67% of AI's reported impact. Individual mindset and behavior? Just 32%.

Read that carefully. The biggest lever isn't the model you deploy or the tool you license. It's whether your operating model is designed to let AI actually do something.

Most aren't. The typical enterprise AI rollout follows a familiar arc: identify a productivity use case, deploy a copilot, measure time saved per user, repeat. That approach captures real value, but it's incremental. It leaves the operating model intact and adds AI on top. What it doesn't do is redesign workflows for continuous execution.

The companies pulling ahead aren't deploying AI into existing workflows. They're redesigning workflows around AI's ability to operate continuously, learn from each cycle, and escalate to humans only when genuine judgment is needed.

What Continuous AI Operation Actually Looks Like

The use cases aren't hypothetical. They're live, and the performance gaps they're creating are measurable.

In commercial insurance, AI-first operating models are compressing underwriting cycle times from weeks down to hours in some deployments, a compression that only becomes possible when document classification, risk scoring, and policy matching run in parallel rather than waiting for a human handoff at each stage.

In cybersecurity, threat detection workflows that once required an analyst to notice an anomaly, open a ticket, and escalate now run continuously. The analyst's role shifts from detection to response, a higher-leverage position by any measure.

In software engineering, CI/CD pipelines increasingly incorporate AI agents that don't just run tests but interpret failure patterns, suggest fixes, and route issues with context. Engineers spend less time in the queue and more time on architecture.

The pattern across all of these is the same: AI handles continuous execution; humans handle judgment, governance, and complex exceptions. That's the hybrid operating model, and it scales in ways that shift-based human execution never could.

Worth flagging before you get too excited about "continuous": an agent that never stops working also never stops calling the model, and that has a cost curve of its own. Linksoft broke down why agentic workflows are the real driver behind rising AI bills, even as per-token prices keep falling, so it's worth budgeting for before you flip continuous execution on at scale.

The Risk That Keeps Derailing Enterprise AI Adoption

Here's the finding that should give leadership teams pause: Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value, or inadequate risk controls.

That's not a failure of AI. It's a failure of implementation strategy.

The projects that get canceled share a pattern: they were deployed into workflows that weren't redesigned to support them. An AI agent dropped into a broken handoff process doesn't fix the handoff; it automates the breakage. And when the costs accumulate without the value materializing, the project dies.

The organizations that avoid this outcome design for it from the start. They map the workflow first, identify where continuous execution is genuinely valuable, and build governance structures that keep humans accountable for outcomes even when AI handles execution. Risk controls aren't a constraint on AI adoption; they're what makes it sustainable.

Part of that mapping exercise is deciding who actually builds the thing. Not every continuous workflow belongs in-house, and not every one belongs outsourced either. Linksoft laid out a decision framework for exactly that call: plumbing workflows are usually safe to hand to a specialist, but anything that touches your core product is worth the slower, messier path of building it yourself.

Redesigning Operating Models for Continuous Execution

Roughly 75% of technology executives acknowledge their current enterprise operating models will need to change within the next 12 to 18 months to successfully scale AI, according to Deloitte's 2026 Global Technology Leadership Study. The question is what that change actually looks like in practice.

It starts with a different design question. Instead of asking "which tasks can AI automate," ask "which workflows can run continuously if we remove the human-present requirement." That reframe surfaces a different set of opportunities and a different implementation roadmap.

The workflows that move first are usually the ones with high volume, low variance, and clear escalation criteria: customer support triage, invoice processing, compliance monitoring, incident routing. None of these require human judgment on every instance; they require human judgment on the exceptions. Build the exception criteria first, then let AI handle the rest continuously.
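One way to read "build the exception criteria first" is to write the criteria down as named rules before any automation runs. The Python sketch below does that for invoice processing. The rule names, the 10,000 limit, and the invoice fields are assumptions made up for illustration, not a recommended policy.

```python
from typing import Callable

# An invoice is modeled as a plain dict, for example:
# {"amount": 120.0, "vendor_known": True, "po_matched": True}
Invoice = dict

# Hypothetical exception criteria, stated explicitly and up front.
EXCEPTION_RULES: list[tuple[str, Callable[[Invoice], bool]]] = [
    ("amount over limit", lambda inv: inv["amount"] > 10_000),
    ("unknown vendor", lambda inv: not inv["vendor_known"]),
    ("no matching purchase order", lambda inv: not inv["po_matched"]),
]


def route(invoice: Invoice) -> tuple[str, list[str]]:
    """Return the route plus the names of every exception rule that fired."""
    reasons = [name for name, rule in EXCEPTION_RULES if rule(invoice)]
    return ("human_review" if reasons else "auto_process"), reasons
```

Keeping the rules in one named list means the escalation criteria can be reviewed and changed by the people accountable for them, without touching the routing logic.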

What emerges isn't just a faster version of the old model. It's an organization that learns from every cycle. Each interaction that runs through an AI workflow generates signal about failure patterns, edge cases, customer behavior, and process bottlenecks. Over time, that signal compounds. The organization doesn't just execute faster; it gets better at execution with each iteration.

That's the actual competitive advantage. Not the AI model you chose, but the operational flywheel you built around it.

Where This Is Happening First

The sectors where continuous AI operation has taken hold first share a common trait: high transaction volume, significant cost per human decision, and clear regulatory stakes that make governance structures non-negotiable.

Financial services workflows (loan processing, fraud detection, regulatory reporting) were early movers because the cost of a missed cycle is quantifiable and the volume makes human-at-every-step economically indefensible. Healthcare systems followed, with AI handling diagnostic triage, documentation, and prior authorization routing so clinicians could focus on the decisions that actually require clinical judgment.

Manufacturing has moved into predictive maintenance and quality control: workflows that used to depend on scheduled human inspections now run continuously, catching failure signals before they become downtime.

The lesson isn't sector-specific. It's structural: the organizations redesigning workflows around continuous execution are building advantages that compound, while those treating AI as a productivity layer are capturing efficiency that plateaus.

What This Does to Business Models, Not Just Workflows

The deeper shift is what continuous operation does to business model design. When execution is no longer constrained by business hours or headcount, the economics of serving customers at scale change.

A professional services firm that automates its delivery workflow can take on more clients without proportional headcount growth. An e-commerce operation that runs customer support and order management continuously can serve global markets without building regional teams. A SaaS company that routes incidents through an AI layer can maintain enterprise SLAs with a smaller ops team, and reinvest the margin into product development.

These aren't edge cases. They're the natural endpoint of designing for continuous execution rather than shift-based capacity. The organizations that get there first don't just run more efficiently; they can price differently, serve differently, and grow differently.

FAQ

What is AI workflow automation and how does it differ from traditional automation?

Traditional automation executes fixed, rule-based sequences: it does exactly what it's programmed to do and fails when conditions vary. AI workflow automation uses large language models and agentic systems to handle variability, interpret context, and make routing decisions dynamically. The practical difference: traditional automation breaks on edge cases; AI workflow automation escalates them.

What are the most common AI automation examples in enterprise operations?

Customer support triage, invoice processing, compliance monitoring, incident routing, document classification, and CI/CD pipeline management are the most mature deployments. In each case, AI handles continuous execution of high-volume, lower-variance tasks while humans handle exceptions and governance.

Why do so many enterprise AI adoption projects fail to deliver value?

The most common failure mode isn't a bad model choice; it's deploying AI into workflows that weren't redesigned to support continuous execution. When AI is added on top of a broken process, it automates the dysfunction. Start with workflow redesign, then add AI, and build governance structures that define escalation criteria clearly before you go live.

How does continuous AI operation actually change a business model, not just a workflow?

It works by removing the headcount-to-output constraint. When execution runs continuously without requiring proportional human presence, organizations can serve more customers, enter more markets, and maintain tighter SLAs without the corresponding cost growth. The business model changes because the cost structure of delivery changes; that's what makes it durable, not just efficient.

The Shift Is Already Underway

The enterprises building continuous operating models right now aren't doing it because the technology just became available. They're doing it because the competitive cost of not doing it is becoming visible. Faster response times, tighter SLAs, better customer experience, lower operational cost per transaction: these are no longer aspirational outcomes. They're the baseline competitors are setting.

The window to design for this proactively is narrowing. Organizations that treat AI workflow automation as a productivity layer will get productivity gains. Organizations that redesign their operating models around continuous execution will get something harder to replicate: compounding operational advantage.

Originally published on Linksoft Technologies.

B2B SaaS Pricing Is Breaking. Here's What Replaces the Per-Seat Model

Arleen Kaur — Fri, 26 Jun 2026 15:09:42 +0000

The per-seat model had a good run. For over a decade, SaaS companies priced by headcount, built dashboards around monthly active users, and called expansion revenue the metric that mattered. Then AI agents arrived, and a single agent can now do the work of a team.

If you're a SaaS founder, RevOps leader, or product manager trying to figure out your pricing architecture for 2026, the honest answer is: the old playbook doesn't hold.

Why Seat-Based Pricing Is Cracking

Here's the tension nobody wants to say out loud: AI reduces the number of humans a company needs to operate software. According to McKinsey, 40% of IT buyers now cite seat reduction as their primary lever for cutting software spend, a direct consequence of agentic AI handling workflows that once required a licensed user per task. That's not a pricing objection. It's a structural demand signal.

Think about what this looks like at a mid-sized company. A procurement team of eight used to mean eight seats in your contract management platform. Now, with AI agents handling intake, vendor matching, and first-draft approvals, that same team operates with three humans and two agents. Under a seat-based model where only the humans hold seats, the vendor just lost roughly 62% of that contract value (five of eight seats) and did nothing wrong. The software got more useful, not less.
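The arithmetic behind that figure is short enough to check directly. This sketch assumes a flat per-seat price and that the two agents do not occupy licensed seats; the function name is made up for illustration.

```python
def seat_revenue_loss(seats_before: int, seats_after: int) -> float:
    """Fraction of per-seat contract value lost when the seat count shrinks.

    Assumes a flat price per seat, so the price itself cancels out.
    """
    return (seats_before - seats_after) / seats_before


loss = seat_revenue_loss(seats_before=8, seats_after=3)
print(f"{loss:.1%}")  # 62.5%
```

Because the per-seat price cancels out, the loss depends only on how many humans still log in, which is exactly why the seat is the wrong unit to charge for.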

The flaw isn't in the product. It's in the pricing unit. Seats measure presence, not value. And when AI decouples productivity from headcount, presence stops being the right thing to charge for.

The Shift Everyone Is Already Making

The market has moved. Maxio's 2025 Pricing Trends Report, based on data from 316 SaaS companies, found that 67% now use some form of usage-based pricing, up sharply from 52% in 2022. That's not a trend. That's a majority position achieved in three years.

But the real finding is subtler. Pure usage-based models -- no subscription floor, just consumption -- actually underperform. Companies running usage-only models reported a median growth rate of 13%. Pure subscription companies hit 20%. The outperformer? Hybrid. Subscription base plus usage overlay clocked in at 21% median growth. That gap looks small on paper. Compounded over three years, it isn't.
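To see what compounding does to those medians, here is a quick Python sketch. It assumes each growth rate holds constant for three years and starts every model from the same base of 100. Both are simplifications made for illustration, not something the Maxio data claims.

```python
# Median growth rates quoted above.
MEDIAN_GROWTH = {"usage_only": 0.13, "subscription": 0.20, "hybrid": 0.21}


def compound(rate: float, years: int, start: float = 100.0) -> float:
    """Value after compounding a constant annual growth rate."""
    return start * (1 + rate) ** years


for model, rate in MEDIAN_GROWTH.items():
    print(f"{model}: {compound(rate, 3):.1f}")
# usage_only: 144.3
# subscription: 172.8
# hybrid: 177.2
```

Under those assumptions, the eight-point annual spread between usage-only and hybrid grows to roughly 33 points of revenue on a base of 100 after three years.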

The reason hybrid wins makes practical sense. A subscription floor gives finance teams at the buyer's end something predictable to budget. The usage component captures genuine expansion as customers deploy more agents, run more queries, or process more transactions. You're giving procurement a committed baseline and letting value naturally drive the rest upward.

How AI-Native Companies Are Pricing Differently

There's a striking divergence emerging between legacy SaaS vendors and AI-native builders. McKinsey's data shows 68% of incumbent software providers still lean on flat-fee pricing, while 40% of AI-native companies have already defaulted to activity-based or consumption models. That's not just a tactical difference; it reflects a fundamentally different theory of how value gets delivered.

Flat-fee pricing assumes the value of software is relatively constant once deployed. AI-native products don't work that way. A coding assistant that ships ten features a week delivers more value than one a developer opens twice a month. Charging both the same amount underprices the heavy user and hands the light one an unjustifiable bill.

The AI-native approach charges for outcomes, throughput, or consumption: API calls, documents processed, agents run, decisions made. It aligns incentive structures. The vendor wins when the customer uses more, which happens when the product actually works. It's almost obvious in hindsight.

If you're building agentic software specifically -- the kind that automates complex multi-step workflows -- it's worth understanding what separates AI agent architectures from each other before you finalize how you price access to those capabilities.

The Enterprise Complication

McKinsey's State of AI in 2025 found that 88% of organizations now use AI in at least one business function, up from 78% the year before. More telling: 62% are actively experimenting with AI agents. That's not a pilot cohort anymore; that's a majority of enterprises touching agentic infrastructure.

Here's the complication this creates for pricing teams: the value surface of your product has expanded dramatically, but your contracts still reflect the old scope. When a customer deploys AI agents that interact with your platform 24/7, your original pricing model wasn't built for that workload. You either capture that value through consumption pricing -- or you watch it leak.

The companies struggling most right now aren't the ones without AI features. They're the ones with great AI features and a pricing model that can't monetize them. That mismatch is a revenue problem disguised as a product problem.

What Hybrid Pricing Actually Looks Like

Abstract strategy is easy to nod at. Let's get specific. A B2B document intelligence platform might structure pricing as: a $2,000/month platform fee covering core access, storage, and up to 10,000 document pages processed, then a per-page rate above that threshold. The customer knows their baseline cost. The vendor captures expansion as usage climbs. Neither party is surprised.
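The structure described above can be written as a few lines of Python. The $2,000 platform fee and the 10,000 included pages come from the example; the $0.05 per-page overage rate is a hypothetical number chosen only so the sketch runs, since the example does not give one.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("2000")      # monthly floor, from the example above
INCLUDED_PAGES = 10_000             # pages covered by the floor, from the example
OVERAGE_PER_PAGE = Decimal("0.05")  # hypothetical rate; not given in the example


def monthly_invoice(pages_processed: int) -> Decimal:
    """Subscription floor plus a per-page charge above the included volume."""
    overage_pages = max(0, pages_processed - INCLUDED_PAGES)
    return PLATFORM_FEE + overage_pages * OVERAGE_PER_PAGE


print(monthly_invoice(8_000))   # 2000.00
print(monthly_invoice(25_000))  # 2750.00
```

A customer under the threshold pays exactly the floor, and one who processes 25,000 pages pays for 15,000 pages of overage on top of it. Decimal is used instead of float so invoice amounts do not pick up rounding noise.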

Stripe is the best-known example of this mechanic done right. Their model charges a percentage of transactions processed -- pure consumption -- but wraps enterprise customers in negotiated volume tiers that create predictability. The value metric (money moved) is impossible to argue with. It scales with customer success by definition.

For SaaS products where the value metric is less obvious, the work is in identifying what customers actually pay for, not what they buy. A project management tool charges for seats, but customers pay for delivered projects. Find the variable that tracks with those delivered projects and you've found your consumption metric.

The Counterintuitive Risk of Getting Too Clever

There's a trap worth naming here. Not every product should sprint toward consumption pricing. Complex, multi-variable billing creates friction at the point of sale. Enterprise procurement teams don't like unpredictable invoices regardless of the upside rationale. And opaque metering -- charging for API calls the buyer can't easily audit -- destroys trust faster than any pricing model restores it.

The best pricing models in 2026 aren't the most sophisticated ones. They're the clearest. If your customers can't explain your pricing in two sentences, that's not their failure; it's yours. Complexity is often a sign that the pricing strategy was designed around what's measurable rather than what's meaningful to the buyer.

This is where many AI-native startups overengineer. They build fifteen variables into their pricing model because the infrastructure makes it technically possible, not because it makes the sale easier or the relationship stronger. Hybrid works because it's legible: here's your floor, here's the variable, here's how you track it.

Building a Pricing Strategy That Survives Agent-Led Workflows

A few principles hold regardless of your product category.

Audit your current value metric against agentic usage. If AI agents interact with your platform without a human present, does your current pricing model capture that activity? If not, you're already leaving revenue on the table, and the gap will widen as agent adoption accelerates.

Model the hybrid floor carefully. The subscription component isn't just revenue protection; it's the number your customer's finance team will anchor to. Price it too high and you're fighting procurement on day one. Price it too low and your consumption component looks like a bait-and-switch the first time usage spikes.

Invest in consumption transparency. Real-time usage dashboards, in-app spend tracking, alert thresholds: these aren't nice-to-haves; they're trust infrastructure. The SaaS vendors winning enterprise deals in 2026 are the ones that actively help buyers understand and control their spend. It sounds counterintuitive. It isn't.
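As one illustration of what an alert threshold might look like in practice, here is a small Python sketch that reports which budget thresholds a customer crossed since the last check. The 50%, 80%, and 100% levels and the function name are assumptions chosen for the example.

```python
def newly_crossed(previous_spend: float, current_spend: float, budget: float,
                  thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[float]:
    """Return the thresholds (as fractions of budget) crossed since the last check.

    Comparing against the previous reading means each alert fires once,
    not on every check after the threshold has been passed.
    """
    before = previous_spend / budget
    after = current_spend / budget
    return [t for t in thresholds if before < t <= after]


print(newly_crossed(450, 850, budget=1000))  # [0.5, 0.8]
print(newly_crossed(850, 900, budget=1000))  # []
```

The design choice worth noting is that the buyer sets the budget and sees the same thresholds the vendor does, which is what makes the metering auditable rather than opaque.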

The Pricing Model Is Now a Competitive Moat

Here's the conclusion most pricing strategy conversations avoid: your monetization model is now a product decision, not a finance one. The companies that figure out how to charge for AI-delivered value (not AI access, but AI outcomes) will build a compounding advantage over the ones still selling seats to shrinking teams.

The shift from seat-based to hybrid and consumption pricing isn't a pricing team's project. It's a company-level strategic bet. If you're building or scaling a SaaS product in 2026 and your pricing model looks the same as it did in 2022, that's worth a hard look.

FAQ

How is AI changing B2B SaaS pricing models right now?

AI decouples productivity from headcount, which breaks seat-based pricing. As AI agents replace human users for many software tasks, buyers are actively reducing seat counts. High-growth companies have shifted toward usage-based or hybrid pricing tied to consumption, outputs, or transactions: metrics that scale with value delivered regardless of how many humans are present.

What's the difference between hybrid and usage-only pricing models for SaaS?

Usage-only means customers pay purely on consumption with no minimum commitment. Hybrid adds a subscription floor below the variable component. Maxio's 2025 data shows hybrid models outperform both alternatives: pure usage companies grew at 13% median, pure subscription at 20%, and hybrid at 21%. The subscription base gives buyers budget predictability; the usage layer captures expansion as customers get more value.

What are the best SaaS monetization strategies in 2026 for AI-native products?

For AI-native products, the strongest approach is identifying a high-signal consumption metric tied to genuine value delivery (documents processed, decisions made, agent-hours run) and building a hybrid model around it. Pair a clean subscription floor with a transparent usage layer, invest in in-app spend visibility, and negotiate enterprise volume tiers rather than flat commitments. The goal is a pricing model your champion can defend to procurement in under three minutes.

How do I decide between usage-based pricing and seat-based pricing for my SaaS?

Ask one question: does your product deliver more value when more humans are logged in, or when more work gets done? If the answer is the latter -- which it is for most AI-augmented tools -- a seat model systematically underprices heavy users and overprices light ones. If your buyers still require budget predictability, a hybrid model solves both problems without sacrificing either.

Where should I start with AI pricing strategy for my SaaS company?

Start by mapping what your best customers actually get out of your product -- the specific output or outcome that makes them renew without a conversation. That's your value metric. Then check whether your current pricing scales with that metric. If it doesn't, you have a monetization gap. From there, model a hybrid structure: a floor covering your infrastructure cost, a variable tier tied to your value metric, and a volume discount curve for enterprise buyers who want commitment discounts in exchange for minimum spend.

Originally published on Linksoft Technologies.

Your Customer Acquisition Strategy Is a Founding Decision, Not a Tactic

Arleen Kaur — Thu, 25 Jun 2026 15:34:03 +0000

Most founders treat customer acquisition like a growth lever they can adjust later. It isn't. The customers you pursue in your first 18 months write the rules your company operates under for years. Get this right and everything else -- product, pricing, hiring, sales motion -- snaps into alignment. Get it wrong and you'll spend years optimizing a go-to-market machine built for the wrong buyer.

That's not a warning. It's a structural reality worth understanding before you make your first sales call.

Your Go-To-Market Strategy Is a Mirror

There's a reason "you become what you measure" survives every offsite. The same logic applies to customer segments. Whoever you sell to defines how you sell, which shapes your team's skills, which narrows your future options. It's a feedback loop that tightens with every quarter.

Consider two SaaS startups solving the same workflow problem. One targets 10-person agencies. The other chases enterprise procurement teams. Within 18 months, they're practically different companies -- different contract lengths, different sales cycles, different product roadmaps, different burn profiles. The seed-stage decision about who to acquire as a customer has compounded into an entirely different business.

B2B buyers have accelerated this divergence. They now use an average of 10 channels to complete a purchase, double the number from five years prior, according to McKinsey's B2B Pulse research. That explosion of touchpoints means your acquisition strategy has to account not just for who you're targeting, but where they expect to be reached and how they expect to transact.

The Hidden Cost of the Wrong Early Customer

Most founders understand that enterprise deals carry long sales cycles. What they underestimate is the organizational debt those deals generate.
You hire a VP of Sales with a Salesforce background. You build a demo environment. You create a security questionnaire response library. None of that scales to a product-led or self-serve motion later without a painful rebuild.
The reverse is equally punishing. A startup that opens with a freemium-to-SMB funnel and then tries to move upmarket discovers its product lacks the audit logs, SSO, and admin controls enterprise buyers require on day one of their evaluation. You're not just selling differently. You're rebuilding the product.
Approximately 89% of software companies are integrating AI into their products to stay competitive, according to Bain's Technology Report. That density of competition makes customer positioning more critical than it's ever been. When every competitor can gesture at an AI feature, from AI agents handling customer workflows to automated decision-making layers, the question isn't what you've built -- it's who you've built it for, and whether your acquisition strategy reflects that clarity.

Product-Led vs. Enterprise Go-To-Market: Two Different Operating Philosophies

These aren't just different sales tactics. They touch every function in the company.

Product-led growth front-loads value delivery. The product itself does the selling -- through a free tier, a trial, or a usage-based entry point. Acquisition cost is low. Conversion is gradual. The growth engine is viral loops, in-product upgrade triggers, and activation rate optimization. Your customer success team is small early on; product analytics is your biggest investment.

Enterprise go-to-market flips that model entirely. Value is communicated before it's experienced -- through demos, pilots, security reviews, and stakeholder presentations. Your acquisition cost is high but your contract value justifies it. The growth engine is relationship density, referrals within enterprise accounts, and expansion revenue from existing logos. Your customer success team is large and quota-carrying from the start.

What makes this genuinely difficult is that the metrics look similar in the early innings. Both models can produce $500K ARR by month 12. The organizational DNA required to scale each one diverges sharply after that.

McKinsey's research found that remote sales representatives operating in hybrid models can reach four times as many accounts and generate up to 50% more revenue than traditional field-heavy approaches. That's compelling -- but only for founders whose customer segment actually responds to a digital-first motion.

How Early Customers Shape Your Trajectory

There's a pattern that repeats across failed go-to-market pivots: the founding team optimized for the customer that was easiest to close, not the one that was most strategically aligned with their product vision.

An agency owner who moves fast and doesn't require procurement approval is an easy first customer. A Director of Operations at a 2,000-person manufacturer is not. But if your product solves an enterprise operations problem, that agency win is a distraction. It generates revenue, sure. It also generates a support burden, a set of feature requests that don't generalize, and a false confidence signal that you've found product-market fit.

The non-obvious principle here: the right early customer isn't the one most willing to buy. It's the one whose feedback most accurately predicts what the market will eventually require.

Enterprise buyers -- despite their slower cycles -- often surface the structural requirements (security, compliance, integrations) that become table stakes for the whole market within 24 months. 71% of B2B buyers are willing to spend more than $50,000 in a single transaction via remote or self-service models, per McKinsey's hybrid sales research. That figure matters because it dismantles the assumption that high-value enterprise deals require expensive in-person sales motions.

Choosing the Right Acquisition Strategy for Your Stage

No single framework applies universally. But there are three structural questions that cut through the noise.

First: where does your product deliver its clearest value fastest? If a user can reach an "aha moment" in under 10 minutes without a sales conversation, product-led is likely the right motion -- and your acquisition strategy should center on removing friction from that path. If your product requires configuration, integration, or change management before it delivers value, you need a human in the loop from day one.

Second: what does your target buyer's decision-making process actually look like? A startup targeting individual contributors at mid-market companies faces a fundamentally different acquisition challenge than one targeting C-suite buyers at 500-person enterprises. The former converts through bottom-up adoption; the latter requires top-down sponsorship. Misreading this dynamic is the most common reason a technically excellent product fails to scale.

Third: what growth rate does your capital structure demand? Product-led strategy can produce extraordinary unit economics at scale, but it compounds slowly in year one. An enterprise strategy can produce large early contracts but burns capital on sales headcount before the revenue arrives. Understanding which startup growth stage you're actually in changes how you approach this choice entirely -- the decisions that unlock the opening stage are structurally different from the ones that determine whether you survive the midgame.

The Bain Technology Report found that AI tools can accelerate roughly 20% of knowledge worker tasks without sacrificing quality -- which changes the capacity math for early sales teams significantly. A two-person sales team with the right AI workflow coverage can operate with the reach of a five-person team. That changes the viable runway window for an enterprise motion.

The Case for Starting Deliberately Narrow

The counterintuitive play -- the one most growth advisors won't recommend -- is to start deliberately smaller than your market allows.

Targeting a hyper-specific segment (say, Series A SaaS companies with a sales team of 5 to 15) gives you something most startups lack in year one: a reference class. Your product gets compared to other tools these specific buyers use. Your pricing gets benchmarked against their existing stack. Your onboarding gets stress-tested by a buyer who has seen every competitor's onboarding. That specificity generates sharper feedback, faster iteration cycles, and -- critically -- a clearer story when you expand.

Stripe started with developers. Slack started with small teams. Both could have targeted larger buyers earlier. Neither did. The discipline to stay narrow until the motion is proven is what separated their growth trajectories from the cohort of startups that tried to serve everyone and converted no one well.

The team composition question follows directly from this. Once you've validated a narrow segment and start expanding, the instinct is often to hire full-time senior leaders. But there's a credible alternative worth weighing: fractional executives who deliver senior GTM expertise at a fraction of the headcount cost, particularly during segment transitions where the motion is still being refined.

FAQ: Customer Acquisition Strategy for Startups

How do early customers shape startup growth?

Early customers define your product roadmap, sales motion, team structure, and pricing model. Whichever segment you optimize for first generates the feedback loops -- and the organizational muscle memory -- that compound into your company's identity. A startup that acquires enterprise customers early builds compliance infrastructure, relationship-heavy sales, and long contract cycles. One that acquires SMBs builds self-serve tooling, low-touch success, and monthly churn management. These are different companies, even with identical founding visions.

What's the difference between product-led and enterprise go-to-market strategy?

Product-led growth lets the product do the selling -- users discover value before speaking to sales, and conversion happens through in-product triggers. Enterprise GTM inverts this: value is communicated through human conversations, demos, and pilots before the product is even configured. The operational difference is profound: PLG prioritizes product analytics and activation rates; enterprise GTM prioritizes AE headcount, deal velocity, and expansion revenue from existing accounts.

How do customer segments influence startup success?

The wrong segment creates a gravitational pull toward features, processes, and pricing that don't generalize. Serving agency clients when your product is built for enterprise ops teams generates revenue but misaligns the entire company. The right segment accelerates product-market fit by generating feedback that mirrors what the broader market will eventually demand -- even if that customer was harder to close initially.

What is the best go-to-market strategy for an early-stage SaaS startup?

There's no universal answer, but there is a reliable diagnostic. If your product delivers value in a single session without onboarding, start with PLG. If it requires setup, integration, or stakeholder buy-in to work, start with an enterprise motion -- and hire your first sales rep before you think you need one. The costliest GTM mistake is letting traction in the wrong segment convince you that you've found product-market fit.

When should a startup switch from product-led to enterprise go-to-market?

The signal isn't a revenue threshold -- it's buyer behavior. When inbound leads start arriving from enterprise procurement teams, security reviews, or legal, your product has outgrown a pure PLG motion. The practical trigger is usually the first deal that stalls because you lack a security questionnaire, an MSA template, or an SSO integration. That's the market telling you to build an enterprise layer -- not a reason to abandon PLG, but a signal to add enterprise infrastructure on top of it.

The GTM Strategy Is a Founding Decision

The startups that scale fastest aren't the ones with the cleverest acquisition tactics. They're the ones that chose the right segment early and built every system -- product, pricing, sales, success -- to serve that segment exceptionally well. Tactics are easy to change. The organizational DNA that forms around your first 50 customers is not.

Originally published on Linksoft Technologies.

AI Agent Types Explained: A Practical Guide for Business Decisions

Arleen Kaur — Thu, 25 Jun 2026 15:11:45 +0000

Choosing the wrong AI agent architecture doesn't just underperform — it creates failure modes that are harder to diagnose than whatever you were trying to automate in the first place. So before you commit to a direction, here's a clear breakdown of every major agent type, what distinguishes each one, and how to pick the right fit for your context.

Why This Decision Actually Matters

The adoption curve has been steep. ChatGPT hit 100 million monthly active users in two months, at the time the fastest-growing consumer app on record. And approximately 89% of software companies are now building AI into their products to stay competitive.

What that stat hides is how many are building the wrong type of agent for the job. A conversational agent where you need deterministic automation. A single-agent design where the workflow demands coordination. A reactive system when the use case needs planning.

Each mismatch carries a real cost: wasted engineering time, poor outcomes, and compounding technical debt. Let's fix that.

Type 1: Simple Reflex Agents

The simplest architecture in the stack. A reflex agent works on a condition-action rule: if X, then Y. No memory. No planning. No world model.

Think of an email routing system that sends support tickets to the right queue based on keyword matching. Or a fraud alert that flags transactions above a threshold. Fast, predictable, auditable — and exactly right when the rules are well-defined and the environment is stable.

Where they break: anywhere there's nuance or variability. The rules need manual updating every time the world changes. There's no mechanism for adaptation.
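A minimal sketch of a reflex agent, using the two examples above. The keywords, queue names, and threshold are hypothetical; the point is only that the output depends on the current input and a fixed rule table.

```python
# Condition-action rules: (keyword, queue). First match wins; no memory, no planning.
RULES = [
    ("refund", "billing"),
    ("invoice", "billing"),
    ("password", "account-access"),
    ("crash", "technical"),
]
DEFAULT_QUEUE = "general"


def route_ticket(subject: str) -> str:
    """Return the queue for a ticket based only on the current input."""
    text = subject.lower()
    for keyword, queue in RULES:
        if keyword in text:
            return queue
    return DEFAULT_QUEUE


def flag_transaction(amount: float, threshold: float = 10_000.0) -> bool:
    """Threshold rule: flag anything above a fixed amount."""
    return amount > threshold
```

The brittleness shows up quickly: `route_ticket("I want my money back")` falls through to the default queue, because no rule anticipated that phrasing.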

Type 2: Model-Based Reflex Agents

The next step up adds internal state — a persistent representation of the world that updates between inputs.

A practical example: a customer service system that remembers what the user said three messages back and uses that context to interpret the current message. Recommendation engines work the same way. They're not just responding to what you clicked; they're tracking a model of your behavior.

Still rule-driven at its core, though. More robust than a pure reflex agent, but it still struggles when inputs fall outside what the internal model was designed to handle.
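A small sketch of the difference, assuming a hypothetical support agent whose only internal state is the topic of the conversation so far. The class name, keywords, and return strings are illustrative.

```python
from typing import Optional


class ContextualSupportAgent:
    """Reflex rules plus internal state that persists between messages."""

    def __init__(self) -> None:
        self.current_topic: Optional[str] = None  # the agent's tiny "world model"

    def handle(self, message: str) -> str:
        text = message.lower()
        # Update internal state when the message names a topic.
        if "invoice" in text or "charge" in text:
            self.current_topic = "billing"
        elif "login" in text or "password" in text:
            self.current_topic = "account-access"

        # The rules consult the state, not just the current input.
        if self.current_topic is None:
            return "clarify"
        return f"route:{self.current_topic}"
```

The same message ("It still doesn't work") produces a different action depending on what came before it, which a pure reflex agent cannot do. Anything the state was not designed to track is still invisible to the rules.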

Type 3: Goal-Based Agents

Here's where planning enters the picture. Goal-based agents reason about future states and select actions based on whether they move toward a desired outcome.

Navigation systems are the clearest example — the agent knows where it is, knows where it needs to go, and evaluates routes based on which one gets there most efficiently. In a business context, this might be a supply chain agent that knows current stock levels, knows the target threshold, and selects purchasing actions to hit that target within lead time constraints.

If one route is blocked, it finds another. If the goal shifts, it reorients. That flexibility is what separates goal-based agents from reflex systems — and why they're increasingly used in logistics, scheduling, and resource allocation.
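The replanning behaviour can be illustrated with a breadth-first search over a toy network. This is a sketch under simple assumptions (unweighted, directed edges, a hypothetical `NETWORK`); real planners use richer cost models.

```python
from collections import deque


def plan_route(graph, start, goal, blocked=frozenset()):
    """Breadth-first search for the fewest-hop path from start to goal.

    `blocked` holds (from, to) edges that are currently unavailable.
    Returns the path as a list of nodes, or None if the goal is unreachable.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour in visited or (node, neighbour) in blocked:
                continue
            visited.add(neighbour)
            queue.append(path + [neighbour])
    return None


# Hypothetical delivery network: node -> reachable neighbours.
NETWORK = {
    "depot": ["hub_a", "hub_b"],
    "hub_a": ["customer"],
    "hub_b": ["hub_c"],
    "hub_c": ["customer"],
}
```

Calling `plan_route(NETWORK, "depot", "customer")` picks the short path through `hub_a`; passing `blocked={("hub_a", "customer")}` makes the same call return the longer path through `hub_b` and `hub_c`, and changing `goal` reorients the plan without touching any rules.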

Type 4: Utility-Based Agents

Goal-based agents ask: did I achieve the goal? Utility-based agents ask: how well did I achieve it, compared to every other option?

That distinction matters in real-world systems where multiple goals compete. A ride-sharing dispatch doesn't just want to get drivers to passengers — it wants to minimise wait time, maximise driver utilisation, account for surge pricing, and balance fairness across the network simultaneously. A utility function lets the agent trade these off against each other in a principled way.

Use these when outcomes have degrees, when there are competing objectives, or when the cost of a suboptimal decision is high. Computationally heavier, but in complex operational domains that overhead pays for itself.
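A utility function can be as simple as a weighted sum. The sketch below assumes hypothetical attributes and weights for a dispatch decision; choosing the weights is the real design work and is not something the code settles.

```python
# Hypothetical weights expressing how the operator trades objectives off:
# waiting is penalised, dispatching a long-idle driver and earning fare are rewarded.
WEIGHTS = {"wait_minutes": -1.0, "driver_idle_minutes": 0.5, "fare": 0.2}


def utility(option: dict) -> float:
    """Weighted sum of an option's attributes; higher is better."""
    return sum(WEIGHTS[name] * option[name] for name in WEIGHTS)


def choose_driver(options: list) -> dict:
    """Pick the assignment with the highest utility, not merely a feasible one."""
    return max(options, key=utility)


candidates = [
    {"driver": "d1", "wait_minutes": 3, "driver_idle_minutes": 2, "fare": 20},
    {"driver": "d2", "wait_minutes": 5, "driver_idle_minutes": 12, "fare": 20},
    {"driver": "d3", "wait_minutes": 9, "driver_idle_minutes": 15, "fare": 20},
]
best = choose_driver(candidates)
```

All three candidates satisfy the goal of getting a driver to the passenger. With these weights the agent prefers `d2`, accepting a slightly longer wait in exchange for using a driver who has been idle longer.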

Type 5: Learning Agents

This is the category generating most of the conversation right now — and for good reason. Learning agents improve their own performance over time based on feedback. They're not static rule-executors. They adapt.

The architecture has four components:

A performance element that takes actions.
A critic that evaluates those actions.
A learning element that adjusts based on the critic's feedback.
A problem generator that proposes experiments to expand knowledge.
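The four components above can be sketched in a toy agent that learns which of two reply strategies resolves more tickets. Everything here is a simplified assumption: the action names, the simulated `environment`, and the fixed exploration schedule stand in for what would be a trained model and real feedback.

```python
class LearningAgent:
    """Toy learning agent: picks between actions and learns their average reward."""

    def __init__(self, actions, explore_every=4):
        self.estimates = {a: 0.0 for a in actions}  # learned value per action
        self.counts = {a: 0 for a in actions}
        self.explore_every = explore_every
        self.steps = 0

    def problem_generator(self):
        """Propose an experiment: try the least-tried action."""
        return min(self.counts, key=self.counts.get)

    def performance_element(self):
        """Take an action: usually the best-known one, periodically an experiment."""
        self.steps += 1
        if self.steps % self.explore_every == 0 or not any(self.counts.values()):
            return self.problem_generator()
        return max(self.estimates, key=self.estimates.get)

    @staticmethod
    def critic(outcome):
        """Score the outcome against a fixed performance standard."""
        return 1.0 if outcome == "resolved" else 0.0

    def learning_element(self, action, reward):
        """Update the running average reward for the action that was taken."""
        self.counts[action] += 1
        n = self.counts[action]
        self.estimates[action] += (reward - self.estimates[action]) / n


def environment(action):
    """Simulated world (an assumption): only a follow-up question resolves tickets."""
    return "resolved" if action == "ask_followup" else "unresolved"


agent = LearningAgent(["template_reply", "ask_followup"])
for _ in range(12):
    action = agent.performance_element()
    agent.learning_element(action, agent.critic(environment(action)))
```

The agent starts with no preference, discovers through a scheduled experiment that `ask_followup` performs better, and shifts most of its actions there while still testing the alternative occasionally.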

Large language models, recommendation systems, and most modern computer vision applications fall here. Research from Bain found that AI tools can help companies accelerate approximately 20% of worker tasks without compromising quality — and most of that uplift comes from learning agents, not static automation.

The trade-off is opacity. Auditing a learning agent's decision logic isn't as straightforward as reading a reflex agent's rule set. In regulated industries, that creates compliance challenges you'll need to design around explicitly.

Type 6: Deterministic Automation vs. AI Agents

Worth drawing this line clearly before going further, because vendor conversations blur it constantly.

Deterministic automation (RPA, traditional workflow tools) follows a fixed script. Same input, same output, every time. Change the input format and it breaks — that brittleness is a known limitation.

AI agents are built to handle variability. They reason about inputs rather than pattern-matching against a template. When something unexpected appears, they adapt.

Neither is universally better. Stable, structured, high-volume processes are often better served by deterministic automation. Variable, judgment-intensive, context-dependent workflows are where AI agents earn their keep. Start with the environment, not the technology. Explore how AI and automation reshape human work.

Type 7: Conversational and LLM-Based Agents

The agent type most people encounter directly. LLM-based conversational agents generate responses based on a deep representation of language, context, and intent — not explicit rules. They produce outputs probabilistically.

Business applications are broad: customer-facing support that handles complex queries without rigid decision trees, internal knowledge assistants that synthesise across large document sets, sales tools that draft personalised outreach based on CRM context.

What makes them different from earlier conversational systems is their ability to handle genuinely novel inputs. But the operational implication is significant: they require ongoing evaluation, not one-time configuration. Organisations that treat LLM-based agents as set-and-forget systems produce measurably worse outcomes than those who build deliberate human oversight into the deployment.

Type 8: Multi-Agent AI Systems

The most architecturally sophisticated category, and the one generating the most enterprise excitement right now.

Multi-agent systems deploy networks of specialised agents that coordinate to complete tasks too complex for any single agent to handle reliably.

A practical example: a market intelligence workflow where one agent scrapes and summarises competitor activity, a second cross-references that against internal sales signals, a third drafts strategic recommendations, and an orchestrator manages sequencing and resolves conflicts. No single agent has the full picture. Together, they produce output that would require significant analyst time to replicate manually.
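The shape of that workflow can be sketched with plain functions standing in for the agents. In a real system each specialist would likely wrap a model call; the function names, the keyword matching, and the input format here are all hypothetical.

```python
def competitor_agent(raw_items):
    """Specialist 1: summarise competitor activity (here: just keep headlines)."""
    return [item["headline"] for item in raw_items]


def sales_signal_agent(headlines, lost_deal_reasons):
    """Specialist 2: keep headlines that match a reason deals were lost."""
    return [
        h for h in headlines
        if any(reason in h.lower() for reason in lost_deal_reasons)
    ]


def recommendation_agent(relevant):
    """Specialist 3: draft one recommendation per relevant finding."""
    return [f"Review response to: {h}" for h in relevant]


def orchestrator(raw_items, lost_deal_reasons):
    """Sequence the specialists and handle the empty-result case."""
    headlines = competitor_agent(raw_items)
    relevant = sales_signal_agent(headlines, lost_deal_reasons)
    if not relevant:
        return ["No competitor activity overlaps with current sales signals."]
    return recommendation_agent(relevant)


items = [
    {"headline": "Rival launches SSO add-on"},
    {"headline": "Rival opens new office"},
]
report = orchestrator(items, lost_deal_reasons=["sso"])
```

Each specialist sees only its own input, and the orchestrator is the only component that knows the sequence. That separation is also where the orchestration overhead comes from.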

McKinsey research on B2B purchasing shows buyers now use an average of 10 channels during their purchasing journey — exactly the kind of multi-source, high-complexity environment where multi-agent systems outperform single-agent alternatives.

When to use them: when the task has distinct sub-components that benefit from specialisation, when parallelisation meaningfully reduces time-to-output, or when the workflow is too complex for a single context window to hold reliably.

When not to: when the problem doesn't justify the orchestration overhead. Multi-agent systems aren't the default. They carry real complexity costs, and the most common failure point isn't the architecture itself; it's the routing logic between agents breaking down under real conditions. Here's why AI fails at scale and what enterprise teams consistently miss.

Architecture and Workflow: The Fit That Actually Matters

McKinsey found that remote sales representatives working in well-designed hybrid models can reach four times as many accounts and generate up to 50% more revenue than traditional models.

The operative phrase is well-designed. The architecture alone doesn't produce the outcome. The fit between architecture, workflow, and human process does.

Organisations that get measurable results start with the workflow problem and work backward to the agent architecture that solves it. The ones that don't start with the technology and try to find a problem for it.

FAQ

What's the simplest way to understand the different agent types?

Simple reflex agents follow fixed rules. Model-based agents track context over time. Goal-based agents plan toward outcomes. Utility-based agents optimise across competing objectives. Learning agents improve through feedback. Multi-agent systems coordinate networks of specialised agents. Each fits a different problem type — the right choice depends on how variable your environment is, how much judgment the task requires, and how much operational complexity your team can manage.

How do you choose the right agent architecture?

Start with the task, not the technology. Structured, stable inputs? Deterministic automation or reflex agents are likely sufficient. Variability, context, or natural language? LLM-based or goal-based agents fit better. Distinct parallel workflow components? Multi-agent coordination becomes worth the overhead. The single biggest mistake is picking the most sophisticated option available instead of the most appropriate one.

What's the real difference between deterministic automation and AI agents?

Deterministic automation runs a fixed script — same input, same output, always. It breaks when the input changes. AI agents reason about inputs and handle variability. They're not more reliable in stable environments, they're more resilient in variable ones. It's a question of environmental stability, not a general capability ranking.

When does a business actually need multi-agent systems?

When the task genuinely has distinct sub-components that benefit from specialisation, when parallelisation matters for speed, or when no single agent can hold the full workflow context reliably. These aren't the default choice — they carry significant orchestration overhead and require more sophisticated oversight. Deploy them when the complexity of the problem justifies the complexity of the architecture.

What are the real business gains from AI workflow automation?

Speed, scale, and consistency. Tasks that took hours execute in minutes. Processes that couldn't scale due to headcount constraints become viable. Quality becomes more consistent. The secondary gain — often underestimated — is that automation surfaces decision points that were previously invisible inside manual processes, which creates opportunities to improve the underlying workflow, not just accelerate it.

This post is part of Linksoft Technologies' ongoing series on practical AI architecture. Linksoft is a software development services company helping teams build and deploy production AI systems. Learn more at linksft.com.

Why Your AI Agents Are Failing: The Routing Problem Nobody Is Solving

Arleen Kaur — Tue, 16 Jun 2026 12:46:29 +0000

AI Disclosure: This post was written with AI assistance and has been reviewed and approved for publication by the Linksoft Technologies team.

Everyone's racing to deploy AI agents. Speed creates the illusion of progress, but it doesn't guarantee advantage. The real cost shows up later — in how the system behaves under load.

Three things are true at once. Almost every enterprise is running AI. Most say cost efficiency is a top priority. And almost none have built the AI agent architecture layer that would actually deliver it. That's the defining infrastructure gap of this moment.

The conversation in most strategy decks is still stuck in the wrong place: which model to pick, which vendor to trust, build or buy. Surface-level. Symptom-chasing. Completely missing the structural problem underneath.

Companies running AI at real scale aren't running better models. They're running better systems around models. That's the difference most teams still miss, and it usually shows up in the budget later.

The Instinct That's Costing You

When organizations get serious about AI, the instinct makes sense. Use the most capable model available. It reasons best, handles ambiguity best, writes best. So you build your first agent on GPT-4 or Claude Opus or whatever tops the benchmark table, and it works. Impressively, even.

Then you try to scale it. That's where the math gets uncomfortable.

Large frontier models are built for complexity. But most tasks in any real-world AI pipeline aren't complex. They're repetitive, narrow, and structurally simple. When you route everything through a hundred-billion-parameter model, you're paying for capability you don't need, latency you don't want, and token counts that scale linearly with volume.

Google Research's work on Switch Transformers documented up to 7x gains in pre-training speed with the same compute by routing each input to a small part of the model, which suggests the efficiency gains from routing aren't theoretical. The question is whether your orchestration layer is built to capture them.

Sequoia Capital's analysis points to a $500B annual revenue gap where infrastructure investment dramatically exceeds realized returns. Getting model routing wrong isn't just an efficiency concern. At scale, it turns into a margin problem.

The Architecture Is the Problem

The default approach produces a flat pipeline: one input, one large model, one output, repeat. No routing. No complexity awareness. Every task treated identically regardless of what it needs.

In a proof of concept, this works fine. At scale, the cost problem stops being abstract, and by then the architecture is already too embedded to change easily.

The pilot looks fine. Production is where things start to break, and that's the trap most scaling teams walk into.

What Is Model Routing in AI and Why Does It Matter?

Model routing is the orchestration layer that decides which AI model handles which task. It sends complex, ambiguous requests to large frontier models and simple, repetitive ones to smaller, faster, cheaper models.

Without it, every task gets routed to the same model regardless of what it actually needs. You pay frontier-model prices for work a fraction of the cost could handle equally well.

At scale, that's not an efficiency gap. It's a margin problem. Model routing is what closes it by matching compute to complexity the same way a hospital matches patient complexity to the right tier of care, rather than routing every case to the senior specialist.
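A minimal sketch of the idea, comparing a flat pipeline with a routed one. The model names, per-token prices, and the set of "simple" task types are made up for illustration; a production router would classify tasks with something more robust than a lookup table.

```python
# Hypothetical model names and per-1K-token prices; substitute your own.
MODELS = {
    "small": {"price_per_1k_tokens": 0.0005},
    "frontier": {"price_per_1k_tokens": 0.03},
}
SIMPLE_TASK_TYPES = {"format", "extract", "lookup", "classify"}


def route(task_type: str) -> str:
    """Send known-simple task types to the small model, everything else up."""
    return "small" if task_type in SIMPLE_TASK_TYPES else "frontier"


def cost(tasks, router) -> float:
    """Total cost of a batch of (task_type, tokens) pairs under a routing policy."""
    return sum(
        tokens / 1000 * MODELS[router(task_type)]["price_per_1k_tokens"]
        for task_type, tokens in tasks
    )


# A batch where 90% of the work is structurally simple.
batch = [("extract", 1000)] * 90 + [("negotiate_contract", 1000)] * 10
flat_cost = cost(batch, lambda _task_type: "frontier")  # everything to one model
routed_cost = cost(batch, route)
```

Under these made-up prices the flat policy costs about 3.00 for the batch and the routed policy about 0.35. The ratio depends entirely on the task mix and the price gap, which is why the mix is worth measuring before building anything.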

What the Fix Actually Looks Like

Think of it like triage in a hospital. You don't route every patient with a minor injury to your most senior specialist. You have a system that matches people to the right level of care, reserving specialist time for cases where their expertise is genuinely irreplaceable.

Your large model's compute is the specialist's time. The orchestration layer is the triage system. Without it, you have queues, waste, and costs that don't hold at scale.

"The key isn't just about choosing the cheapest option, but about finding the right recipe of tools and services that aligns with your workload patterns."
-- Google Cloud

How to Design Efficient AI Agent Architectures for Enterprises

Efficient enterprise AI agent architecture is built in tiers:

Tier 1 (Lightweight model): Handles narrow, high-volume, structurally simple tasks
Tier 2 (Mid-tier model): Handles moderate reasoning and mixed-complexity requests
Tier 3 (Frontier model): Reserved for genuinely complex or high-stakes cases only

Each tier has defined cost, latency, and quality thresholds. On top of this sits an observability layer that tracks which tasks are going where, at what cost, and with what outcomes, so routing decisions can be continuously calibrated rather than set once and forgotten.
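One way to make the thresholds and the observability layer concrete is a small config plus a log that counts threshold breaches per tier. The `Tier` fields, the numeric limits, and the `RoutingLog` class are assumptions for illustration, not a reference design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_cost_per_task: float  # budget ceiling for one task, in dollars
    max_latency_ms: int       # latency ceiling
    min_quality: float        # minimum acceptable quality score, 0..1


# Hypothetical thresholds; real values come from your own workload data.
TIERS = {
    1: Tier("lightweight", 0.001, 300, 0.90),
    2: Tier("mid-tier", 0.01, 1500, 0.90),
    3: Tier("frontier", 0.10, 8000, 0.95),
}


class RoutingLog:
    """Minimal observability: record each routed task, summarise per tier."""

    def __init__(self):
        self.records = defaultdict(list)

    def record(self, tier: int, cost: float, latency_ms: int, quality: float):
        self.records[tier].append((cost, latency_ms, quality))

    def breaches(self, tier: int) -> int:
        """Count tasks that violated any of the tier's thresholds."""
        limits = TIERS[tier]
        return sum(
            1 for cost, latency_ms, quality in self.records[tier]
            if cost > limits.max_cost_per_task
            or latency_ms > limits.max_latency_ms
            or quality < limits.min_quality
        )


log = RoutingLog()
log.record(1, cost=0.0005, latency_ms=120, quality=0.95)  # within limits
log.record(1, cost=0.0005, latency_ms=450, quality=0.95)  # too slow
log.record(1, cost=0.0005, latency_ms=100, quality=0.80)  # quality too low
```

A rising breach count in a tier is the calibration signal: either the tasks being sent there are harder than the tier can handle, or the thresholds no longer reflect the workload.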

The organizations that reduce AI agent orchestration costs at scale aren't running better models. They're running better systems around models, with architecture that matches spend to need at every step.

Why Most Teams Haven't Built This Yet

There are really two reasons, and neither has anything to do with a lack of skill.

Reason 1: The early pain isn't visible.
When you're running a proof of concept, the cost difference between a large model and a small one feels abstract. It only becomes obvious at scale, when the budget impact is undeniable and the system is already too embedded to change easily.

Reason 2: Tiered orchestration is genuinely harder to build.
A single model pointed at a task is simple. An orchestration layer that correctly classifies tasks, routes them, handles edge cases, and maintains consistency across multiple models is a serious systems problem. It's the kind that takes six to eighteen months to build properly.

The Agent Reality Check

Let's be direct: the hype cycle has significantly outpaced the deployment reality. Most of what organizations have built and called "agents" are, on close inspection, sophisticated chatbots with tool access bolted on. They fail in three specific, predictable ways, and all three are architectural problems, not model quality problems.

This is precisely why now is the right moment to pivot. The infrastructure, including Kubernetes, LangGraph, sandboxed execution environments, and proper observability tooling, exists and is maturing. Companies that start building now will be early-to-mid players, not laggards doing emergency re-architecture two years from now.

NVIDIA defines agentic systems as "autonomous, long-running agents that reason, plan and act across complex, multi-step workflows," a definition that highlights how far most current implementations still have to go. This isn't a reason to pull back but a signal to treat this like a real systems problem.

What You Should Actually Be Tracking

Tracking the right metrics requires an AI oversight framework that connects routing decisions to business outcomes, not just benchmark scores.

Most AI business cases get approved on model performance benchmarks, which is the wrong number to optimize for. The real cost, including container orchestration, workflow state management, sandboxed execution, observability tooling, and routing model maintenance, rarely makes it into the same deck. So the ROI gap isn't surprising. The real cost was never fully accounted for in the first place.

McKinsey estimates generative AI could add $2.6T to $4.4T annually to the global economy, with total productivity impact reaching $7.9T. The cost of getting system design wrong will scale right alongside the opportunity, not independently of it.

Three metrics worth tracking instead of benchmark scores:

Cost per automated task: should decline as volume grows. Flat or rising cost signals wrong-tier routing.
Routing accuracy rate: target above 92% of tasks correctly classified by complexity. Mis-routing routine tasks to frontier models is where budget leaks.
Escalation override rate: target below 8% of auto-routed decisions manually corrected. A high rate signals the routing model needs recalibration, not more reviewers.
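The three metrics above reduce to simple ratios. The sketch below assumes you have a reviewed sample in which each task carries both the tier it was routed to and the tier a reviewer judged correct; the function names are illustrative.

```python
def cost_per_task(total_cost: float, tasks_completed: int) -> float:
    """Average spend per automated task over a reporting period."""
    return total_cost / tasks_completed


def routing_accuracy(routed_tiers, correct_tiers) -> float:
    """Share of tasks whose assigned tier matches the tier judged correct."""
    matches = sum(1 for got, want in zip(routed_tiers, correct_tiers) if got == want)
    return matches / len(correct_tiers)


def escalation_override_rate(auto_routed: int, manually_corrected: int) -> float:
    """Share of auto-routed decisions that a person had to correct."""
    return manually_corrected / auto_routed


def needs_recalibration(accuracy: float, override_rate: float) -> bool:
    """Apply the targets above: accuracy above 92%, overrides below 8%."""
    return accuracy <= 0.92 or override_rate >= 0.08
```

Tracking `cost_per_task` across periods, rather than as a single number, is what shows whether it declines as volume grows.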

Q&A: What Engineering and Architecture Teams Actually Ask

What's the difference between model routing and prompt routing?
Prompt routing selects between different prompts or instructions for the same model. Model routing selects between different models entirely based on task complexity. The distinction matters at scale: prompt routing doesn't reduce compute costs because you're still running the same model. Model routing does, by matching task complexity to appropriately sized infrastructure.

How do you classify task complexity reliably enough to route it?
Start with a lightweight classification model, often a fine-tuned smaller model trained on your own task distribution. The classification step itself costs almost nothing relative to the savings from correct routing. Track misroutes (tasks sent to the wrong tier) the same way you'd track model errors: as a calibration signal, not a failure.

What happens when a task is misclassified and routed to the wrong tier?
A task routed down (sent to a smaller model than it needs) produces a lower-quality output, detectable via output scoring or human review flags. A task routed up (sent to a larger model than needed) just costs more than necessary. Build fallback logic: if the lower-tier model's confidence score falls below a threshold, escalate automatically.
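The fallback logic can be sketched as follows. It assumes each model is a callable returning an output and a confidence score between 0 and 1. How that score is produced (output scoring, a separate verifier, or something else) is left open, and the stub models and the 0.8 threshold are purely illustrative.

```python
def answer_with_fallback(task, small_model, frontier_model, threshold=0.8):
    """Try the small model first; escalate when its confidence is too low.

    Each model is a callable returning (output, confidence in 0..1).
    Returns (output, tier_used).
    """
    output, confidence = small_model(task)
    if confidence >= threshold:
        return output, "small"
    output, _ = frontier_model(task)
    return output, "frontier"


# Stub models for illustration only: the small one is unsure about long tasks.
def stub_small(task):
    return f"small:{task}", 0.95 if len(task) < 20 else 0.4


def stub_frontier(task):
    return f"frontier:{task}", 0.99
```

With these stubs, `answer_with_fallback("format date", stub_small, stub_frontier)` stays on the small tier, while a longer, more open-ended task is escalated. Logging which tier was used gives the misroute data the calibration step depends on.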

Does tiered routing work for LLM-based agents, or just classification tasks?
It works for both. For agents, the routing decision happens at the task-dispatch layer before any tool calls are made. Simple deterministic sub-tasks like formatting, extraction, and lookup go to lightweight models. Multi-step reasoning chains or ambiguous open-ended tasks go to frontier models. The orchestration layer manages the handoff.

How long does it realistically take to build a proper routing layer?
Six to eighteen months for a production-grade system, depending on the number of task types, the variance in your data distribution, and how mature your observability infrastructure is. The first version is always simpler. The hard part is continuous calibration: keeping routing decisions accurate as your task mix shifts over time.

Three Verdicts, One Principle

01 Single-model stacks are not production architectures.
Routing every task to the same frontier model has no cost-efficiency mechanism, no complexity awareness, and no path to economic viability at scale. Without an AI oversight framework to govern routing decisions, better models only delay the budget problem. They don't solve it.

02 Routing is required and it can't be an afterthought.
Bolted on after the fact, tiered orchestration requires re-architecting systems already embedded in production. The organizations building it now are the ones who won't be explaining budget overruns to their CFO eighteen months from now.

03 The infrastructure is where the advantage actually sits.
Kubernetes, LangGraph, sandboxed execution, observability tooling, feedback-integrated recalibration. These aren't operational add-ons. The organizations with structural AI advantages aren't running the most powerful models. They're the ones who figured out that the game is about using the right model for each task and built the systems to make that happen.

"Enterprises that build intelligent orchestration into their AI systems early will run dramatically more automations per dollar of cloud spend. The competitive advantage in agentic AI is not a better model. It is a better system."

That's not an AI strategy. It's a systems design strategy, applied to AI. And that distinction is where most of the real value is going to be created.

Everything else works right up until it hits a budget ceiling.

About the Author:
Arleen Kaur writes about enterprise AI, system architecture, and the gap between AI pilots and production systems at Linksoft Technologies, a custom software development company.

Sources referenced:
Sequoia Capital -- $500B AI infrastructure revenue gap analysis
McKinsey -- Generative AI economic impact ($2.6T to $4.4T annually)
NVIDIA -- Agentic AI system definition

Human in the Loop AI as a Production Requirement: Why Control Architecture Determines Enterprise AI Success

Arleen Kaur — Mon, 15 Jun 2026 18:10:16 +0000

AI Disclosure: This post was written with AI assistance and has been reviewed and approved for publication by the Linksoft Technologies team.

88% of enterprises are running AI. Only 4% are generating meaningful returns. The gap isn't the model; it's everything built around it.

Here's a number that should make every engineering leader uncomfortable:

95% of enterprise AI pilots deliver zero measurable ROI. Not low ROI. Not disappointing ROI. Zero.
McKinsey Global AI Survey, 2025

Read that again. Not in under-resourced startups. Across the enterprise, across industries, after years of investment and board-level attention.

And the conversation in most strategy decks stays exactly where it's been for three years: better models, faster inference, which LLM to pick, whether to build or buy. Symptom-chasing. Completely missing the structural problem underneath.

The companies generating returns aren't running better models. They're running better systems, and that starts before the output layer. The routing problem is where most architectures break first, long before human oversight even becomes relevant.

The Adoption Numbers Tell a Story Nobody Wants to Read

The gap between 88% adoption and 4% meaningful returns isn't a model quality problem. GPT-4, Claude, Gemini: these are not the bottleneck.

The bottleneck is organizational design: how the AI is deployed, what governs it, and what happens when it gets something wrong.

The dominant failure pattern, documented consistently across McKinsey's 2025 State of AI report and the Partnership on AI's Enterprise Landscape research, is this: organizations insert AI into existing workflows without redesigning those workflows first. AI inherits broken processes and accelerates them. Garbage in, faster garbage out.

55% of high-performing AI organizations redesign workflows around AI before deploying. Among the broader population, that figure is 20%. That 35-point gap in process redesign explains most of the performance differential.

Why the Architecture Is the Problem, Not the Algorithm

AI models are probabilistic systems. They output confidence scores that measure certainty, not correctness. A model can be 94% confident and completely wrong, not because it's a bad model but because the input falls outside its training distribution. And here's what makes this dangerous in production: the model has no mechanism to know this.
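
One way to make the gap between certainty and correctness visible is to compare stated confidence with observed accuracy on labeled outcomes. The sketch below uses expected calibration error, a common calibration metric chosen here for illustration. The function name and the ten-bin default are assumptions.

```python
def calibration_gap(confidences, correct, n_bins=10):
    """Expected calibration error: the weighted mean, over confidence bins,
    of |observed accuracy - average stated confidence|."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need equal-length, non-empty inputs")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    gap = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        gap += (len(members) / total) * abs(accuracy - avg_conf)
    return gap
```

A model that reports 0.94 on four cases and is right on only one of them has a gap of about 0.69: high certainty, low correctness.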

The error propagates downstream, silently, until something breaks visibly.

In enterprise environments, three things compound this that simply don't exist in a controlled pilot: data that changes constantly, decisions that can't be reversed, and legacy infrastructure never designed for AI.

The standard autonomous architecture is:

Input → Model → Output → Action

No monitoring. No feedback. No correction layer.

In a controlled pilot, this works. In live production with financial and legal consequences, it fails: not immediately, but inevitably.

64% of organizations stall at the scaling stage because of infrastructure debt a clean pilot environment never exposed.
The pilot succeeded. The production environment is not the pilot.

What Is Human-in-the-Loop and Why It's Not Enough on Its Own

Human-in-the-loop (HITL) places a human reviewer between an AI's output and the action it triggers. It creates an intervention point and satisfies regulatory mandates like EU AI Act Article 14, which requires human oversight for high-risk AI in employment, credit, healthcare, and critical infrastructure.
It's structurally necessary. But at production scale, HITL as currently implemented fails in three specific, predictable ways.

Failure 1 — Automation bias

Review interfaces present cases structured around the model's interpretation. Reviewers are evaluating a pre-framed answer, not the situation itself. Research is consistent: humans default to confirming AI outputs rather than questioning their premise. HITL looks like independent oversight. Functionally, it's a rubber stamp at velocity.

Failure 2 — Volume collapse

Human attention doesn't scale with decision throughput. As queues grow, reviewers apply faster heuristics to clear them, effectively re-automating the decisions HITL was supposed to oversee. No amount of reviewer training changes this. It's an architectural constraint, not a personnel problem.

Failure 3 — The feedback loop nobody owns

A consistent 30% override rate on a specific case type means the model is wrong in that domain with high regularity. The correct response is structural: recalibrate the threshold, retrain the model, redesign the rule. The observed response, almost universally, is to absorb the overhead and move on. The feedback loop exists in the architecture; it just doesn't operate in practice.
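
Making that loop operate can start small. The sketch below groups reviewer decisions by case type and flags any type at or above a 30% override rate. The input shape (pairs of case type and override flag) and the function names are assumptions.

```python
from collections import defaultdict


def override_rates_by_case_type(reviews):
    """reviews: iterable of (case_type, overridden) pairs."""
    totals = defaultdict(int)
    overrides = defaultdict(int)
    for case_type, overridden in reviews:
        totals[case_type] += 1
        overrides[case_type] += int(overridden)
    return {case: overrides[case] / totals[case] for case in totals}


def needs_recalibration(rates, limit=0.30):
    """Case types whose override rate signals a structural problem."""
    return sorted(case for case, rate in rates.items() if rate >= limit)
```

The output is a work list for whoever owns thresholds and retraining, which is the ownership this failure mode lacks.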

What a Closed-Loop Control System Actually Looks Like

The organizations generating real AI returns have built something structurally different. Whether they've named it this way or not, they've built closed-loop control systems: architectures where uncertainty is managed rather than ignored, and where the system improves continuously from its own operational data.

Here's what that architecture looks like in practice:

1. Input & Confidence Scoring
Raw data enters. The model produces output and a calibrated confidence score. Uncertainty is highest here, and the system acknowledges this rather than suppressing it.

2. Decision Routing by Confidence + Risk Tier
High confidence + low risk → Auto-execute
Medium confidence or moderate risk → Human review
Low confidence or high risk → Hold / escalate

3. Bounded, Auditable Action
Every decision is executed with defined ownership. Confidence score, routing decision, and reviewer action are all logged, not just the outcome.

4. Outcome Tracking + Feedback Loop
Human corrections flow into retraining pipelines. Override patterns trigger threshold recalibration, not queue management.

5. Drift Detection
Performance monitored continuously. Detected degradation triggers automatic adjustment before it causes outcome failure. The loop closes.

This isn't theoretical. It's the architecture every organization generating meaningful AI returns has built — most just haven't named it as a design principle.
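
Steps 2 and 3 above can be sketched roughly as follows. The thresholds (0.90 and 0.60), the enum names, and the audit fields are illustrative assumptions, not values taken from any specific deployment. The hold/escalate rule is checked first so that high risk always wins over high confidence.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


class Route(Enum):
    AUTO_EXECUTE = "auto_execute"
    HUMAN_REVIEW = "human_review"
    ESCALATE = "escalate"


def route(confidence: float, risk: Risk, high: float = 0.90, low: float = 0.60) -> Route:
    """Route one decision by confidence and risk tier."""
    if confidence < low or risk is Risk.HIGH:
        return Route.ESCALATE          # low confidence or high risk
    if confidence >= high and risk is Risk.LOW:
        return Route.AUTO_EXECUTE      # high confidence and low risk
    return Route.HUMAN_REVIEW          # everything in between


def audit_record(case_id: str, confidence: float, risk: Risk, owner: str) -> dict:
    """Log the inputs to the routing decision, not just its outcome."""
    return {
        "case_id": case_id,
        "confidence": confidence,
        "risk": risk.value,
        "route": route(confidence, risk).value,
        "owner": owner,
        "reviewer_action": None,  # filled in after review, if any
    }
```

Keeping `high` and `low` as parameters rather than constants is what lets step 4 recalibrate them from override patterns.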

AI Use Case Risk Profiles: Controls Scale With Consequence

Not every AI decision carries the same risk. The control requirements should match the consequence level, not the model's confidence alone.

Fraud Detection: What the Two Architectures Actually Produce

Abstract architecture becomes concrete when you trace it through a real use case. Fraud detection exposes every failure mode at once.

In the standard pipeline deployment: a transaction is scored. High score triggers auto-block. Low score passes. No monitoring. No outcome tracking. No feedback.

Within weeks, two things happen: false positives accumulate silently, and fraudsters adapt to patterns the model wasn't trained on — novel attack vectors get low confidence scores and pass through undetected.

Both failures are architectural. A better model delays them. The same failures recur.

In a closed-loop system, novel attack vectors get flagged for human review based on low confidence, not auto-blocked or auto-passed. Override rates by fraud type feed back into threshold calibration. The model gets smarter because the system does.

The same logic applies to credit decisioning, insurance triage, HR screening: anywhere AI handles high volume with variable exception rates.

What Architecture Is Needed to Scale AI Across an Enterprise

Scaling AI beyond a single use case requires four architectural layers most organizations lack:

A shared data and integration platform to avoid rebuilding pipelines for every new use case
Standardised confidence thresholding and routing logic, configurable per use case rather than hardcoded
An MLOps layer with model versioning, drift monitoring, and automated retraining triggers
An audit and governance layer that logs decisions with full context, not just outcomes

Without these, every AI initiative stays one-off. With them, each deployment compounds the previous investment.
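
For the drift-monitoring piece of the MLOps layer, a minimal sketch is a rolling accuracy window compared against a baseline. The class name, window size, and tolerance below are assumptions. Production systems often monitor input distributions as well, which this does not cover.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when rolling accuracy falls too far below a baseline."""

    def __init__(self, baseline_accuracy: float, window: int = 200, tolerance: float = 0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off automatically

    def record(self, was_correct: bool) -> bool:
        """Record one verified outcome; return True if drift is detected."""
        self.outcomes.append(was_correct)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough data for a stable estimate yet
        rolling = sum(self.outcomes) / len(self.outcomes)
        return (self.baseline - rolling) > self.tolerance
```

A `True` return is the hook for an automated retraining trigger or a temporary tightening of auto-execute thresholds.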

The Real Cost Is Never in the Deck That Gets Approved

Most AI business cases get approved on model performance, which is the wrong number to optimize for.

The real cost (infrastructure overhaul, compute, drift monitoring, retraining pipelines, and people who actually understand what they're reviewing) rarely makes it into the same deck. So the ROI gap isn't surprising. The investment was undercounted from the start.

Then there's the people problem.

60% of organizations say AI literacy is their biggest scaling barrier. The humans assigned to oversee AI decisions often can't tell when something has gone wrong. Oversight exists on paper. In practice, it has no teeth.

Three metrics worth tracking instead of accuracy:

Q&A: What Engineers and Leaders Actually Ask

What's the difference between human-in-the-loop and human-on-the-loop?
HITL places a human between the model output and the action: they must approve before anything executes. Human-on-the-loop means the system acts autonomously but a human monitors and can intervene. HITL gives stronger control; human-on-the-loop scales better but requires reliable drift detection to catch errors before they compound.

How do you set confidence thresholds without ground truth data?
Start with domain expert judgment for initial tiers, then calibrate empirically. Track override rates per confidence band: if reviewers override 40% of "high confidence" decisions in a specific case type, the threshold is miscalibrated for that domain. Use those rates as recalibration signals, not anecdotes.
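
One possible way to turn override rates into a threshold is to walk down from the top confidence band and stop at the first band reviewers override too often. This is a sketch under stated assumptions: ten equal-width bands, a 5% tolerance, and a hypothetical `suggest_auto_threshold` helper.

```python
def suggest_auto_threshold(decisions, n_bands=10, max_override_rate=0.05):
    """decisions: list of (confidence, overridden) pairs from reviewed cases.

    Returns the lowest confidence floor such that every band at or above it
    has an override rate within tolerance, or None if even the top band fails.
    """
    totals = [0] * n_bands
    overrides = [0] * n_bands
    for confidence, overridden in decisions:
        idx = min(int(confidence * n_bands), n_bands - 1)
        totals[idx] += 1
        overrides[idx] += int(overridden)

    threshold = None
    for idx in range(n_bands - 1, -1, -1):
        if totals[idx] == 0:
            break  # no evidence for this band; do not extend auto-execution
        if overrides[idx] / totals[idx] > max_override_rate:
            break  # reviewers disagree too often here
        threshold = idx / n_bands
    return threshold
```

Running this per case type rather than globally keeps one miscalibrated domain from dragging down thresholds everywhere else.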

At what volume does HITL break down?
There's no universal number — it depends on decision complexity, reviewer expertise, and queue management. The signal to watch is reviewer throughput under load: when average review time drops sharply as queues grow, reviewers are heuristically clearing cases rather than genuinely evaluating them. That's the architectural ceiling.
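
That signal can be checked with very little tooling. The sketch below compares mean review time under light and heavy queue load. The sample shape, the depth cutoff, and the 50% drop ratio are assumptions to be tuned per team.

```python
from statistics import mean


def review_time_collapse(samples, depth_cutoff, drop_ratio=0.5):
    """samples: list of (queue_depth, review_seconds) per reviewed case.

    Returns True when mean review time under heavy load falls below
    drop_ratio times the mean under light load.
    """
    light = [secs for depth, secs in samples if depth < depth_cutoff]
    heavy = [secs for depth, secs in samples if depth >= depth_cutoff]
    if not light or not heavy:
        return False  # cannot compare without both conditions observed
    return mean(heavy) < drop_ratio * mean(light)
```

A `True` here suggests routing fewer cases to review, for example by recalibrating thresholds, rather than adding reviewers.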

Does closed-loop control require a full MLOps platform?
No. You can start with lightweight instrumentation: log confidence scores and outcomes, track override rates manually, and run threshold reviews quarterly. The architecture matters more than the tooling. A spreadsheet tracking overrides by case type is more valuable than a sophisticated platform that nobody queries.
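
As an example of how light that instrumentation can be, the sketch below appends one CSV row per decision using only the standard library. The column names and helper functions are assumptions; any spreadsheet can read the output.

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "case_type", "confidence", "route", "overridden", "outcome"]


def open_log(stream):
    """Wrap a text stream in a CSV writer and emit the header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    return writer


def log_decision(writer, case_type, confidence, route, overridden, outcome):
    """Append one decision with enough context to compute override rates later."""
    writer.writerow({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_type": case_type,
        "confidence": confidence,
        "route": route,
        "overridden": overridden,
        "outcome": outcome,
    })


# In-memory demo; for a real file, open it with newline="" as the csv docs advise.
buffer = io.StringIO()
log = open_log(buffer)
log_decision(log, "refund", 0.62, "human_review", True, "approved")
```

Filtering this file by `case_type` and averaging `overridden` gives the quarterly threshold review its input.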

How does EU AI Act Article 14 map to this architecture?
Article 14 mandates human oversight capability for high-risk AI systems: the ability to understand, monitor, and intervene in AI outputs. A closed-loop system with tiered routing and full decision logging satisfies this structurally. A HITL layer bolted onto an autonomous pipeline satisfies it formally but often not functionally, because the override signals aren't acted on.

Three Verdicts

01 — Autonomous AI is not a production architecture.
The failure is structural, not algorithmic. A model operating without thresholding, routing, monitoring, and feedback has no mechanism for self-correction. Better models delay the failure. They don't prevent it.

02 — Human-in-the-loop is required, but it can't be the endpoint.
HITL provides accountability and an intervention point before errors propagate. At scale, it fails under automation bias, volume pressure, and the absence of feedback integration. Treating it as a permanent solution builds systems constrained by human bandwidth — not systems that improve.

03 — Closed-loop control is the engineering requirement.
Confidence thresholding, risk-tiered routing, structured escalation, continuous monitoring, feedback-integrated retraining, and drift detection. These are not operational add-ons. They are the product.

Every organization that has generated meaningful AI returns has, in practice, built this. Most haven't recognized it as the design principle it is. The ones who do are the 4%.

Everything else is a pilot waiting to fail.

About the Author:
Arleen Kaur writes about enterprise AI, system architecture, and the gap between AI pilots and production systems at Linksoft Technologies, a custom software development company.

Sources referenced:
McKinsey Global AI Survey, 2025
Partnership on AI — Enterprise Landscape Research
EU AI Act, Article 14 (Human Oversight Requirements)
Princeton / Georgia Tech GEO Study — Aggarwal et al., ACM KDD 2024

Related reading on linksft.com: