Ho costruito uno strumento che riduce i costi Anthropic del 67% — e trova gli sprechi prima che tu li paghi

Quantum Horizon — Wed, 24 Jun 2026 08:27:19 +0000

Stavo costruendo applicazioni AI sulle API di Anthropic e continuavo ad avere lo stesso problema: i costi erano più alti del previsto e non capivo dove andavano i token.
La maggior parte dei tool di monitoring ti dice quanto hai già speso. Io volevo qualcosa che mi dicesse cosa stavo per sprecare — prima di inviare la richiesta.
Ho costruito token-saver.

Cosa fa

Static Analyzer — analizza il codice Python prima che giri tsave scan chatbot.py Trova pattern come chiamate API dentro loop, system prompt non cachati, documenti interi passati ad ogni richiesta, modelli costosi usati per task semplici. Senza API key. Funziona come un linter ma per i token.
Conteggio token reale — usa l'API ufficiale Anthropic count_tokens, non tiktoken di OpenAI (che sottostima i token di Claude del 15-20%)
Compressore semantico — non tronca semplicemente. Valuta ogni messaggio in base al task corrente e rimuove solo l'irrilevante. Risultato: 67% di riduzione token su conversazioni reali.
Tracking utilizzo — ogni chiamata tracciata, proiezioni mensili incluse.

Benchmark reali
Su 1.000 richieste al giorno con Sonnet 4.6: risparmio stimato $200-$400 al mese.

Gira completamente in locale. I tuoi prompt non passano da nessun server esterno — vanno solo ad Anthropic. Per applicazioni in sanità e security, questo non è un dettaglio.
70 test verdi. Licenza MIT. Made in Italy.
👉 github.com/remo12262/token-saver

📌 TOPICS DA AGGIUNGERE SU GITHUB
Vai su github.com/remo12262/token-saver → ingranaggio ⚙️ accanto a "About" → aggiungi questi:

I built a tool that cuts Anthropic API costs by 67% and it finds the waste before you spend

Quantum Horizon — Wed, 24 Jun 2026 08:25:31 +0000

I was building AI apps on top of Anthropic's API and kept hitting the same problem: costs were higher than expected, and I had no idea where the waste was coming from.
Most monitoring tools tell you what you already spent. I wanted something that tells you what you are about to waste before the request is sent.
So I built token-saver.

What it does
Four things, in order of when they help you:

Static Analyzer — scans your Python source code before you run it tsave scan chatbot.py It finds patterns like API calls inside loops, uncached system prompts, full documents passed on every request, expensive models used for simple tasks. No API key needed. It reads your code like a linter reads style.
Token Counter + Cost Estimator — uses the official Anthropic count_tokens API, not tiktoken (which undercounts Claude tokens by 15-20%)
Semantic Compressor — doesn't just truncate. Scores each message by relevance to the current task, keeps the recent context intact, summarizes the rest. Result: 67% token reduction on real conversations.
Usage Tracking — every call tracked, monthly projections included.

Real benchmark
ScenarioBeforeAfterReductionMulti-turn chatbot (50 turns)12,400 tokens4,100 tokens66.9%RAG pipeline18,200 tokens5,600 tokens69.2%Batch classifier8,500 tokens2,800 tokens67.1%
At 1,000 requests/day on Sonnet 4.6, that is roughly $200-$400/month saved.

DEV Community: Quantum Horizon

Ho costruito uno strumento che riduce i costi Anthropic del 67% — e trova gli sprechi prima che tu li paghi

I built a tool that cuts Anthropic API costs by 67% and it finds the waste before you spend