Cheap AI tokens need per-key usage tracking

#saas #devtools

Cheap AI tokens help, but they do not solve the operator's problem on their own.

Once a team routes requests across GPT, Claude, and Gemini, through official routes and discounted pools, with retries and fallbacks layered on top, the question that matters is less the headline price and more attribution: what each request actually used, and where.

For an API-heavy product, I want every request to answer a few basic questions:

Which API key or project made the request?
Which model route handled it?
Did the request use an official/direct balance or a lower-cost balance?
Did it retry or fall back to another route?
How much did a longer job consume as a complete run, not just as isolated calls?

This is the workflow Tokens Forge is built around.
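
To make those questions concrete, here is a minimal sketch of a per-request usage record. The class name, field names, and the "official"/"discounted" labels are my own assumptions for illustration; this is not Tokens Forge's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsageRecord:
    """One row per upstream call. All field names are hypothetical."""

    api_key_id: str               # which API key or project made the request
    route: str                    # which model route handled it
    balance: str                  # "official" or "discounted" (assumed labels)
    attempt: int                  # 1 for the first try, higher for retries
    fallback_from: Optional[str]  # route that failed before this one, if any
    run_id: Optional[str]         # groups calls that belong to one longer job
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    @property
    def was_retry_or_fallback(self) -> bool:
        return self.attempt > 1 or self.fallback_from is not None


# Example values are made up for illustration.
record = UsageRecord(
    api_key_id="project-a",
    route="claude-direct",
    balance="official",
    attempt=1,
    fallback_from="gpt-discounted",
    run_id="report-1",
    input_tokens=1200,
    output_tokens=400,
)
```

If every call emits a record like this, each of the five questions becomes a filter or a group-by rather than a guess.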

Tokens Forge provides lower-cost token access for mainstream AI models, while keeping the accounting layer visible: project/API-key usage tracking, official Credit and RMB wallet balances, route visibility, fallback traces, and per-run accounting for heavier AI Researcher reports.

That last part matters because AI research workflows can be much heavier than a quick chat completion. A trading research report, for example, may pull market context, run multiple analysis passes, and generate a final report. If the user only sees a generic token charge later, the product feels unpredictable.
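
Per-run accounting can be as simple as tagging each call with a run ID and rolling the calls up afterwards. The sketch below assumes a hypothetical `Call` tuple and step names, and the token counts are invented for the example; it shows the idea, not how any particular product implements it.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Call(NamedTuple):
    """A single model call tagged with the run it belongs to (hypothetical shape)."""

    run_id: str
    step: str
    tokens: int


def summarize_runs(calls: Iterable[Call]) -> Dict[str, Dict[str, int]]:
    """Roll isolated calls up into per-run call counts and token totals."""
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: {"calls": 0, "tokens": 0})
    for call in calls:
        totals[call.run_id]["calls"] += 1
        totals[call.run_id]["tokens"] += call.tokens
    return dict(totals)


# Made-up numbers: one research report made of several passes, plus a quick chat.
calls = [
    Call("report-1", "market_context", 1200),
    Call("report-1", "analysis_pass", 3400),
    Call("report-1", "analysis_pass", 2900),
    Call("report-1", "final_report", 1800),
    Call("chat-7", "completion", 300),
]
summary = summarize_runs(calls)
```

With these example values, the report shows up as one run of four calls and 9,300 tokens instead of four unrelated charges, which is the difference between a predictable bill and a surprising one.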

The practical setup I prefer is simple: