
Vikrant Shukla

I built a local proxy to track exact LLM API costs per project

The problem was simple: I run a small software studio that builds
client projects, leaning heavily on Claude. Every project ended with
the same awkward conversation: "what did the AI actually cost?"

Token estimates drift from the real bill, and no tool attributed
costs to individual projects. I was either undercharging or
handwaving, neither of which builds client trust.

So I built Halton Meter.

What it does

Halton Meter is a local mitmproxy-based daemon that intercepts
outbound LLM API traffic, attributes each request to a project,
computes exact cost from published pricing, and writes everything
to a local SQLite database. Nothing about how you call the API changes.

pipx install halton-meter
halton-meter init --apps

Three processes come up:

  • Edge listener on 127.0.0.1:8081
  • Proxy interceptor on 127.0.0.1:8090
  • Loopback API on 127.0.0.1:8765

Every LLM API call you make — via the Anthropic SDK, OpenAI SDK,
raw HTTP, whatever — gets intercepted, tagged, costed, and logged.
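To make the interception step concrete, here is a minimal sketch of what a mitmproxy-style addon looks like. The `response(self, flow)` hook follows mitmproxy's addon convention, but everything else here — the host list, the class name, the fields pulled from the response body — is my illustration, not Halton Meter's actual code:

```python
import json

# Hypothetical sketch of a mitmproxy-style addon. The host list and
# record format are illustrative; the real daemon does much more.
LLM_HOSTS = {"api.anthropic.com", "api.openai.com",
             "generativelanguage.googleapis.com"}

class LLMLogger:
    def __init__(self):
        self.records = []

    def response(self, flow):
        # mitmproxy calls this hook once per completed HTTP exchange.
        if flow.request.pretty_host not in LLM_HOSTS:
            return
        body = json.loads(flow.response.text)
        usage = body.get("usage", {})
        self.records.append({
            "host": flow.request.pretty_host,
            "model": body.get("model"),
            "input_tokens": usage.get("input_tokens", 0),
            "output_tokens": usage.get("output_tokens", 0),
        })
```

Because the hook sees the decrypted response body, it can read the provider's own token counts rather than estimating them client-side — which is exactly why the real numbers match the bill.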

Why a proxy and not an SDK wrapper?

SDK wrappers only catch calls you make directly. They miss ChatGPT,
Gemini Code Assist, and anything going through a tool you don't
control. A proxy captures everything on the wire without touching
your code.

Project attribution

The daemon attributes each request using a three-step chain.

Claude Code sessions, scripts, notebooks, and direct SDK calls all
get attributed correctly with zero code changes.
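The post doesn't spell out the three steps, but as a rough mental model, a fallback chain of this shape would cover those cases. Every name here is hypothetical — the `HALTON_PROJECT` env var, the `.halton-project` marker file, and the `"unattributed"` bucket are my stand-ins, not the tool's real mechanism:

```python
import os
from pathlib import Path

# Hypothetical attribution chain. The env var name, marker file name,
# and fallback label are guesses for illustration only.
def attribute(cwd: str, env: dict) -> str:
    # Step 1: an explicit environment variable wins outright.
    if env.get("HALTON_PROJECT"):
        return env["HALTON_PROJECT"]
    # Step 2: walk up from the working directory looking for a marker file.
    for parent in [Path(cwd), *Path(cwd).parents]:
        marker = parent / ".halton-project"
        if marker.is_file():
            return marker.read_text().strip()
    # Step 3: fall back to a catch-all bucket.
    return "unattributed"
```

The key property is that each step degrades gracefully: explicit configuration beats inference, and nothing is ever silently dropped.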

Terminal report

halton-meter report

Breakdown by project, model, and date. Numbers come directly from
the provider's published pricing — no estimates, no hidden margins.
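The arithmetic behind those numbers is simple: token counts times the per-million-token rate from the provider's price sheet. The prices in this sketch are placeholders, not anyone's current pricing:

```python
# Per-million-token prices in USD. PLACEHOLDER numbers for illustration;
# the daemon would load the provider's actual published price sheet.
PRICES = {
    "example-model": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Exact cost of one request from provider-reported token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

Since the token counts come from the provider's own response metadata, the only moving part is keeping the price table current.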

What's supported

Six adapters cover four providers: Claude, OpenAI, Gemini, and Grok.
Both direct API surfaces and OAuth surfaces (ChatGPT, Gemini Code
Assist) are intercepted.

Local only

No cloud. No tracking. API keys never leave your machine. Everything
lives in ~/.halton-meter/db.sqlite. The bundled dashboard is open
source (Apache 2.0) and runs locally.
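Because the store is plain SQLite, you can query it yourself with nothing but the standard library. The table and column names below are assumptions for illustration (check the docs for the real schema); the pattern is just stdlib `sqlite3` pointed at `~/.halton-meter/db.sqlite`:

```python
import sqlite3

# Hypothetical schema: a `requests` table with project and cost columns.
# Swap in the real table/column names from the Halton Meter docs.
def cost_by_project(db_path: str) -> dict:
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT project, SUM(cost_usd) FROM requests "
            "GROUP BY project ORDER BY 2 DESC"
        ).fetchall()
    return dict(rows)
```

Keeping the data in a local, openly queryable format means your billing records outlive the tool: any SQL client can produce the same per-project breakdown.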


Docs and full architecture at haltonmeter.com.

Happy to answer questions on the proxy architecture, the attribution
chain, or the cost calculation in the comments.
