We Were Paying 3.75x More Than Necessary on Every AI API Call — Here's How We Found It

#ai #github #devops #javascript

Our Anthropic bill was higher than expected. Nobody on the team knew exactly why. So we built a scanner and ran it on our own codebase. First thing it found:

What We Found

server/services/divergence-detector.js was using claude-sonnet-4-6 with max_tokens=150 to generate 2-sentence explanations. Every night. On every divergence found.

const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 150,
  messages: [{ role: 'user', content: prompt }]
});

Sonnet costs $15 per million output tokens. Haiku costs $4 per million. For a 2-sentence output we saw no quality difference that mattered. That is $15 / $4 = 3.75x the price on every single call, and nobody noticed.
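The arithmetic behind the 3.75x figure can be sketched in a few lines. This is only an illustration using the two output prices quoted above. The names `PRICE_PER_M_OUTPUT` and `outputCost` are made up for this example, input-token cost is ignored, and the worst case assumes every response uses the full 150-token cap.

```javascript
// USD per million output tokens, as quoted in this post (assumed current).
const PRICE_PER_M_OUTPUT = { sonnet: 15, haiku: 4 };

// Cost in USD of generating `outputTokens` tokens on a given tier.
function outputCost(tier, outputTokens) {
  return (PRICE_PER_M_OUTPUT[tier] * outputTokens) / 1_000_000;
}

// Worst case for the call above: the response fills max_tokens (150).
const tokensPerCall = 150;
const sonnetCost = outputCost('sonnet', tokensPerCall);
const haikuCost = outputCost('haiku', tokensPerCall);

// The ratio does not depend on the token count, only on the two prices.
const ratio = sonnetCost / haikuCost;
console.log(ratio.toFixed(2)); // 3.75
```

Each call costs a fraction of a cent either way, so the gap only shows up once the job runs every night across every divergence.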

What We Built

A GitHub Action that catches this automatically on every PR — before it merges.

Works with Anthropic, OpenAI, Gemini, Bedrock, and LangChain. JS and TS supported. Zero dependencies.
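To give a feel for the kind of check involved, here is a minimal sketch of a regex-based detector for the pattern we found. It is not the action's actual implementation. The function name `findSmallOutputPremiumCalls`, the `PREMIUM_MODEL` list, and the `maxTokensLimit` parameter are all hypothetical, and the sketch only handles Anthropic-style calls where `model` appears shortly before `max_tokens`.

```javascript
// Hypothetical list of model-name fragments treated as "premium" tiers.
const PREMIUM_MODEL = /sonnet|opus/i;

// Scan source text for `model: '...'` followed closely by `max_tokens: N`,
// and flag premium models whose output cap is at or below `maxTokensLimit`.
function findSmallOutputPremiumCalls(source, maxTokensLimit = 500) {
  const pattern = /model:\s*['"]([^'"]+)['"][\s\S]{0,200}?max_tokens:\s*(\d+)/g;
  const findings = [];
  for (const match of source.matchAll(pattern)) {
    const [, model, maxTokensText] = match;
    const maxTokens = Number(maxTokensText);
    if (PREMIUM_MODEL.test(model) && maxTokens <= maxTokensLimit) {
      // 1-based line number of the `model:` property.
      const line = source.slice(0, match.index).split('\n').length;
      findings.push({ line, model, maxTokens });
    }
  }
  return findings;
}

// The call from divergence-detector.js, as plain text.
const sample = [
  "const response = await client.messages.create({",
  "  model: 'claude-sonnet-4-6',",
  "  max_tokens: 150,",
  "  messages: [{ role: 'user', content: prompt }]",
  "});",
].join('\n');

console.log(findSmallOutputPremiumCalls(sample));
// [ { line: 2, model: 'claude-sonnet-4-6', maxTokens: 150 } ]
```

A text-level check like this is cheap and needs no dependencies, but it will miss calls where the model name comes from a variable or config file.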

Add It in 2 Minutes

- uses: kavyarani7/ai-arch-scanner@v1
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    threshold: '500'

🔗 GitHub Marketplace
🔗 Repo

What's the most expensive AI pattern you've found in your codebase? Drop it in the comments.
