DEV Community

Joao Romao
Joao Romao

Posted on

We tracked 200K AI requests. Here's where the money actually goes

Six months ago I posted here about CostLens — a tool to reduce OpenAI costs. Since then, we've completely rebuilt it based on one question a VP asked our team:

"How much faster are we delivering with AI? What's the number?"

Nobody could answer. We had cost dashboards, but no way to connect spend to output. So we built that.

What CostLens does now

It's not just cost tracking anymore. It's three things:

  1. Cost attribution — not "you spent $4,200 on OpenAI" but "code review costs $340/mo, customer support costs $89/mo, and classification costs $12/mo." Per-feature, per-developer.

  2. Smart routing — simple prompts automatically go to cheaper models. No code changes. We're seeing 30-40% savings on teams that were sending everything to GPT-5.4.

  3. Productivity tracking via MCP — works inside Claude Code, Kiro, and Cursor. Tracks sessions, commits, PRs, and time-to-ship. Generates weekly reports your VP can forward to justify the AI budget.

CostLens weekly engineering report showing AI cost per commit ($4.36), per PR ($12.80), session trends, and deliverables — the numbers your VP needs to justify AI spend.

The setup is still one line

import { CostLens } from 'costlens';
const costlens = new CostLens({ apiKey: 'cl_...' });
const openai = costlens.wrapOpenAI(new OpenAI());
// Done. Every request is tracked and optimized.
Enter fullscreen mode Exit fullscreen mode

For coding agents (Claude Code, Kiro, Cursor):

{
  "mcpServers": {
    "costlens": {
      "command": "npx",
      "args": ["-y", "@costlens/mcp-server"],
      "env": { "COSTLENS_MCP_KEY": "your_key" }
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

What we learned from tracking 200K requests

  • 52% of requests were simple tasks running on expensive models (classification, extraction, yes/no questions)
  • Smart routing cut our own bill from $4,100 to $2,337/month
  • The "code review" feature was 3x more expensive than we thought
  • Sessions with no output (abandoned/looping) accounted for 15% of spend

Kill switch

One thing we added after a background agent burned $50 in a weekend: a Slack kill switch. When spend velocity spikes, you get a message with a "Pause" button. One click stops that specific agent session. Auto-
resumes after 30 min.

Free for individuals

The SDK and MCP server are free. Team features (reports, routing, budgets) start at $99/month.

Would love feedback — especially from teams spending $1K+/month on AI. What would make you try this?

Top comments (0)