Every AI vendor has a number. "$80,000 saved per employee per year." "10x faster." "2,000 hours returned to the business." The numbers are large, compelling, and almost always wrong — not because companies are lying, but because they're measuring the wrong things.
I've spent enough time working with AI in business contexts to have a clear picture of what the ROI conversation usually gets wrong. Let me break it down the way I actually think about it.
The Standard ROI Story (And Why It's Incomplete)
The typical vendor claim goes like this: AI does Task X. Task X used to take Y hours. Therefore you've saved Y hours × hourly rate = $Z.
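As arithmetic, the vendor math is a one-liner. Here's a minimal sketch in Python; every number is a hypothetical placeholder, not a benchmark:

```python
# The vendor pitch as arithmetic. All numbers are hypothetical placeholders.
hours_saved_per_week = 10   # "AI makes Task X 10 hours faster"
hourly_rate = 75            # fully loaded cost per hour
weeks_per_year = 48

headline_savings = hours_saved_per_week * hourly_rate * weeks_per_year
print(f"Headline savings: ${headline_savings:,} per employee per year")
# Headline savings: $36,000 per employee per year
```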
This math is real but incomplete in three ways:
1. Time saved ≠ cost saved. If AI shaves 20% off the time your marketing team spends on copy drafts, you haven't necessarily cut costs — you've freed up capacity. That's valuable, but it's only valuable if the team uses that capacity for something higher-return. If it disappears into slightly longer meetings, you've improved morale at best.
2. Time saved ≠ quality kept. This is the one almost nobody measures. AI speeds up output. It doesn't always preserve output quality at that speed. I've seen teams celebrate a 3x output increase from AI-assisted copy, then spend weeks quietly untangling why conversion rates dropped. Speed without quality isn't an improvement.
3. The hidden costs don't show up in the headline. Prompt tuning. Output review. Occasional corrections. The cognitive cost of keeping humans oriented in a workflow that AI is partially running. The real total cost of using AI seriously is higher than the license fee, and it's rarely included in the ROI pitch.
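To see how much those gaps move the number, here's the same calculation with them folded back in. Again a sketch with hypothetical figures; the point is the shape of the math, not the specific values:

```python
# The headline number from above, minus what the pitch leaves out.
# All figures are hypothetical; substitute your own measurements.
headline_savings = 36_000
hourly_rate = 75
weeks_per_year = 48

# Gap 1: capacity only counts if it's redirected to higher-return work.
redirected_fraction = 0.5        # honest estimate, not a hope

# Gap 3: the hidden costs of using AI seriously.
review_hours_per_week = 3.0      # humans checking AI output
correction_hours_per_week = 1.5  # fixing what review catches
hidden_labor = (review_hours_per_week + correction_hours_per_week) \
    * hourly_rate * weeks_per_year
license_and_tooling = 2_400      # seats, prompt tuning, integration upkeep

net_savings = (headline_savings * redirected_fraction
               - hidden_labor - license_and_tooling)
print(f"Net savings: ${net_savings:,.0f} per employee per year")
# Net savings: $-600 per employee per year
```

With these particular placeholders the net is slightly negative, which is exactly the scenario the headline number hides. (Gap 2, the quality delta, doesn't fit in a savings formula at all; it gets its own dimension below.)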
The Framework I Actually Use
I think about AI ROI across three dimensions. Not a formula — more of a forcing function for honest evaluation.
1. Time Saved
The real question here isn't "how much time does the task take now vs. before?" It's: what happens with the time that's freed?
A team that uses AI to process customer feedback 5x faster is doing something valuable — but only if the freed time goes into actually responding to that feedback, not more reporting. Ask explicitly: what is the recaptured time flowing toward? If you can't answer that, the time savings are theoretical.
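One low-effort way to get that answer is to have the team log where recaptured hours actually went for a week, then total it up. A minimal sketch with made-up entries:

```python
from collections import Counter

# One week of "where did the freed time go?" entries.
# Categories and hours are illustrative, not real data.
reallocation_log = [
    ("responding to feedback", 4.0),
    ("more reporting", 3.0),
    ("longer meetings", 2.0),
    ("responding to feedback", 2.5),
]

totals = Counter()
for activity, hours in reallocation_log:
    totals[activity] += hours

for activity, hours in totals.most_common():
    print(f"{activity}: {hours:.1f}h")
# If the top line isn't higher-return work, the savings are theoretical.
```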
2. Quality Delta
For every AI-assisted workflow, track a quality metric before and after. This sounds obvious, and almost nobody does it.
- Writing tasks: track engagement, completion rates, replies — whatever matters for that content.
- Decision-support tasks: track decision outcomes over time. Are you making better calls? More confident ones?
- Research tasks: track accuracy on sampled outputs. In what percentage of AI-generated summaries would a careful check turn up something wrong?
Quality can go up (AI helps produce more polished first drafts, catches errors humans miss), stay flat (AI speeds up work without changing its substance), or go down (AI introduces confident-sounding errors that don't get caught). You need to know which one you're experiencing.
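You don't need heavy tooling to find out; a pass/fail spot check on sampled outputs before and after is enough. A minimal sketch, assuming each sample has been hand-reviewed and marked acceptable or not:

```python
def quality_delta(before: list[bool], after: list[bool],
                  tolerance: float = 0.05) -> str:
    """Compare pass rates on hand-reviewed samples. True = acceptable.
    Sample sizes and the tolerance threshold are judgment calls."""
    rate_before = sum(before) / len(before)
    rate_after = sum(after) / len(after)
    delta = rate_after - rate_before
    if delta > tolerance:
        return f"quality up ({rate_before:.0%} -> {rate_after:.0%})"
    if delta < -tolerance:
        return f"quality DOWN ({rate_before:.0%} -> {rate_after:.0%})"
    return f"quality flat (~{rate_after:.0%})"

# Hypothetical spot check: 10 pre-AI outputs vs. 10 AI-assisted ones.
pre_ai  = [True] * 9 + [False]        # 90% acceptable
with_ai = [True] * 7 + [False] * 3    # 70% acceptable
print(quality_delta(pre_ai, with_ai))  # quality DOWN (90% -> 70%)
```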
3. Trust Built or Eroded
This is the long game, and most organizations are ignoring it.
Every time AI produces something wrong and it gets caught before causing harm: trust goes up slightly, and the workflow gets better.
Every time AI produces something wrong and it doesn't get caught: trust erodes — sometimes slowly, sometimes catastrophically when the error surfaces later.
If you're deploying AI without clear human review points for high-stakes outputs, you're making a bet that the trust-erosion track doesn't activate. Some organizations will win that bet. Many won't.
The organizations that will do best with AI long-term are building cultures where it's normal and expected to verify AI outputs — not because the tool is bad, but because that's how you build a system that can be trusted at scale.
A Self-Assessment You Can Actually Do
Here's a quick way to evaluate any AI workflow you're considering or already running.
For each AI-assisted task in your work, answer:
- What was the output quality before AI? (Be honest about what "good" actually looked like.)
- What is the output quality now? (Spot-check 10 outputs. Don't trust vibes.)
- Where does the saved time actually go? (Track this for one week.)
- What are the failure modes? (What does it look like when this goes wrong, and how often does that happen?)
- Who reviews AI outputs before they become consequential? (If the answer is "nobody," that's your biggest risk.)
Most people skip questions 4 and 5. Those are the ones that cost you.
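If it helps to make this concrete, the five questions map naturally onto one record per workflow that you fill in and revisit. A sketch with hypothetical values, where questions 4 and 5 are exactly the fields that flag risk:

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowAudit:
    """One record per AI-assisted task: the five answers, written down."""
    task: str
    quality_before: str                   # Q1: the honest baseline
    quality_now: str                      # Q2: from a 10-output spot check
    saved_time_goes_to: str               # Q3: tracked for one week
    failure_modes: list[str] = field(default_factory=list)  # Q4
    reviewer: str = "nobody"              # Q5: "nobody" = biggest risk

    def red_flags(self) -> list[str]:
        flags = []
        if not self.failure_modes:
            flags.append("failure modes never examined (question 4)")
        if self.reviewer == "nobody":
            flags.append("no human review point (question 5)")
        return flags

# A hypothetical workflow, filled in honestly.
audit = WorkflowAudit(
    task="first-pass support replies",
    quality_before="85% usable without edits",
    quality_now="80% usable; tone drifts on edge cases",
    saved_time_goes_to="not yet tracked",
)
print(audit.red_flags())
# ['failure modes never examined (question 4)',
#  'no human review point (question 5)']
```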
What Good ROI Actually Looks Like
The teams I've seen genuinely benefit from AI share a few traits:
They use AI for high-volume, lower-stakes tasks first. Email drafts. Research summaries. First-pass document review. Routine data processing. These have short feedback loops — errors are caught quickly and the cost of getting one wrong is low.
They measure before they automate. They know what the baseline looks like, so they can actually compare.
They add review steps, not remove them. At least initially. AI in the middle of a workflow with a human at the output end is significantly more reliable than AI at the output end with nothing after it (sketched in code below).
They think in compounding returns, not one-time savings. The ROI of good AI integration isn't a one-time efficiency jump — it's a gradually improving system where the humans and AI get better at working together. That takes time and looks slow at first.
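The review-step point is simple to express as structure: AI drafts in the middle, a human approves at the end, and nothing consequential ships unreviewed. A minimal sketch; the function names are hypothetical stand-ins for your own tooling, not a real API:

```python
def draft_with_ai(task: str) -> str:
    # Hypothetical stand-in for whatever model or tool drafts the output.
    return f"[AI draft for: {task}]"

def human_review(draft: str) -> bool:
    # Placeholder for an actual person reading the draft.
    return len(draft) > 0

def run_with_review_gate(task: str) -> str | None:
    """AI in the middle of the workflow, a human at the output end."""
    draft = draft_with_ai(task)
    if human_review(draft):   # the step you keep, at least initially
        return draft          # approved: ship it
    return None               # rejected: caught before causing harm

print(run_with_review_gate("weekly customer email"))
```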
The Honest Benchmark
Here's a simple framing I'd give any team evaluating AI:
If your AI use is making your work faster, roughly as accurate, and you're redirecting the saved time to something valuable — that's genuinely good ROI.
If your AI use is making your work faster but subtly less accurate, with no review step in place — you're borrowing against future trust.
If your AI use is mostly serving as demos and proof-of-concept projects that haven't changed any real workflows — your ROI is zero, and the investment is in theater.
Be honest about which of those describes you right now.
Next: what the latest AI release actually means for the people in the room who aren't engineers.