DEV Community

kavyarani7


We Were Paying 3.75x More Than Necessary on Every AI API Call — Here's How We Found It

Our Anthropic bill was higher than expected, and nobody on the team knew exactly why. So we built a scanner and ran it on our own codebase. Here is the first thing it found.

What We Found

server/services/divergence-detector.js was using claude-sonnet-4-6 with max_tokens=150 to generate 2-sentence explanations. Every night. On every divergence found.

const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 150,
  messages: [{ role: 'user', content: prompt }]
});

Sonnet costs $15 per million output tokens. Haiku costs $4 per million. For a 2-sentence output there is zero quality difference. At $15 versus $4, we were paying 3.75x more on every single call, and nobody noticed.
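To make the arithmetic concrete, here is a minimal sketch that uses only the two output prices quoted above and the max_tokens value from the snippet. The names `PRICE_PER_M_OUTPUT` and `maxOutputCost` are made up for illustration, and the per-call figure is an upper bound that assumes the response uses the full 150 tokens.

```javascript
// Output-token prices quoted in this post, in USD per million tokens.
const PRICE_PER_M_OUTPUT = { sonnet: 15, haiku: 4 };

// Upper bound on the output cost of one call:
// assumes the response uses the whole max_tokens budget.
function maxOutputCost(model, maxTokens) {
  return (PRICE_PER_M_OUTPUT[model] * maxTokens) / 1_000_000;
}

// The price ratio does not depend on how many tokens are generated.
const priceRatio = PRICE_PER_M_OUTPUT.sonnet / PRICE_PER_M_OUTPUT.haiku;

console.log('Sonnet, max per call:', maxOutputCost('sonnet', 150));
console.log('Haiku, max per call:', maxOutputCost('haiku', 150));
console.log('Price ratio:', priceRatio); // 3.75
```

The cost of a single call is tiny either way. The point is that the ratio holds on every call, so it scales with however many divergences the nightly job finds.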

What We Built

A GitHub Action that catches this automatically on every PR — before it merges.

[Screenshot: GitHub PR comment showing AI Architecture Scan cost delta table. Main branch baseline is $157.81 per month. This PR increases cost to $165.74 per month, a delta of plus $7.93 or 5 percent shown in red. Warnings stay at 1, Info increases by 1 to 6, Recommendations increase by 1 to 8, Duplicates increase from 1 to 2. New recommendation flags server/routes/watchlist.js for prompt caching opportunity.]

[Screenshot: GitHub PR comment showing AI Architecture Scan results with cost delta table comparing main branch at $157.81 per month versus this PR, with 1 warning for expensive model misuse in divergence-detector.js.]

Works with Anthropic, OpenAI, Gemini, Bedrock, and LangChain. JS and TS supported. Zero dependencies.
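For a feel of what an "expensive model misuse" check could look like, here is a rough sketch. It is not the action's actual implementation: the function name `findExpensiveShortCalls`, the 200-token cutoff, the 200-character lookahead, and the list of "larger" model names are all assumptions made for illustration. It simply looks for a `model:` field naming a larger model with a small `max_tokens` value shortly after it.

```javascript
// Hypothetical heuristic, not the scanner's real logic:
// flag calls that pair a larger model with a small output budget.
const SHORT_OUTPUT_LIMIT = 200;      // assumed cutoff for "short" outputs
const LARGE_MODEL = /sonnet|opus/;   // assumed list of larger models

function findExpensiveShortCalls(source) {
  const findings = [];
  const modelPattern = /model:\s*['"]([^'"]+)['"]/g;
  for (const match of source.matchAll(modelPattern)) {
    const model = match[1];
    // Look for max_tokens within the next 200 characters after the model field.
    const nearby = source.slice(match.index, match.index + 200);
    const maxTokens = /max_tokens:\s*(\d+)/.exec(nearby);
    if (!maxTokens) continue;
    const budget = Number(maxTokens[1]);
    if (LARGE_MODEL.test(model) && budget <= SHORT_OUTPUT_LIMIT) {
      // 1-based line number of the model field within the source text.
      const line = source.slice(0, match.index).split('\n').length;
      findings.push({ line, model, maxTokens: budget });
    }
  }
  return findings;
}

// The call from divergence-detector.js shown earlier in this post.
const sample = `const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 150,
  messages: [{ role: 'user', content: prompt }]
});`;

console.log(findExpensiveShortCalls(sample));
```

Run against the snippet above, this reports one finding on line 2. A regex pass like this is deliberately crude: it misses model names held in variables or config, which is where a real scanner would need to parse the code properly.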

Add It in 2 Minutes

- uses: kavyarani7/ai-arch-scanner@v1
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    threshold: '500'

🔗 GitHub Marketplace
🔗 Repo


What's the most expensive AI pattern you've found in your codebase? Drop it in the comments.
