Caching is criminally underused for LLM calls. So many teams are re-sending identical or near-identical prompts and paying for it every time. The other big one is context window bloat - stuffing way more into the prompt than necessary because it feels safer. At $2k/month the gains from optimizing are real. Is your tool available publicly or still internal?
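To make the exact-duplicate case concrete, here is a minimal sketch of hash-keyed response caching. `call_llm` is a hypothetical stand-in for whatever client function actually hits the provider; nothing here is specific to any one SDK.

```python
import hashlib
import json

# Simple in-memory cache keyed on a hash of (model, prompt).
# A real deployment would want a persistent store (Redis, SQLite) and a TTL.
_cache: dict[str, str] = {}

def cache_key(model: str, prompt: str) -> str:
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model: str, prompt: str, call_llm) -> str:
    # `call_llm` is a placeholder for the function that actually calls the API.
    key = cache_key(model, prompt)
    if key in _cache:
        return _cache[key]               # hit: zero API cost
    response = call_llm(model, prompt)   # miss: pay once, remember the answer
    _cache[key] = response
    return response
```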
Totally agree on both — caching especially feels like something everyone knows they should do but never prioritizes until the bill hurts. The near-duplicate problem is sneaky too: exact duplicates are easy to cache, but prompts that are 95% the same with a different user name or timestamp still hit the API fresh every time.
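One common way to catch those near-duplicates is to normalize the variable fields before hashing, so the 95%-identical prompts collapse onto one cache key. A rough sketch with made-up patterns; this is only safe for fields that don't change the answer you actually want:

```python
import re

# Illustrative normalization so near-duplicate prompts share a cache key.
# These patterns are invented for the example; real prompts need their own
# rules, and stripping a field is only safe when it doesn't affect the output.
TIMESTAMP = re.compile(r"\b\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}(?::\d{2})?\b")
USER_FIELD = re.compile(r"(?i)\buser:\s*\w+")

def normalize(prompt: str) -> str:
    prompt = TIMESTAMP.sub("<TIMESTAMP>", prompt)
    prompt = USER_FIELD.sub("user: <USER>", prompt)
    return prompt

a = "Summarize today's alerts for user: alice, generated 2024-05-01 09:30"
b = "Summarize today's alerts for user: bob, generated 2024-05-02 14:05"
assert normalize(a) == normalize(b)  # both now map to one cache entry
```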
Yeah it's public, just pushed it last week — pip install llm-spend-profiler, repo at github.com/BuildWithAbid/llm-cost-profiler. Still early but it detects the main patterns: duplicate calls, retry waste, context bloat, and model downgrade opportunities. Would love to know what it finds on your codebase if you try it.
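For a sense of what duplicate-call detection involves, here is a hypothetical sketch of that analysis run over logged requests. This is not llm-spend-profiler's actual API, just the general shape of the check:

```python
from collections import Counter

# Hypothetical illustration of a duplicate-call check over logged requests;
# not llm-spend-profiler's actual API, only the shape of the analysis.
def duplicate_call_report(calls: list[dict]) -> None:
    counts = Counter((c["model"], c["prompt"]) for c in calls)
    costs: dict[tuple, float] = {}
    for c in calls:
        key = (c["model"], c["prompt"])
        costs[key] = costs.get(key, 0.0) + c["cost_usd"]
    for key, n in counts.items():
        if n > 1:
            # Every call after the first is spend a cache would have absorbed.
            wasted = costs[key] * (n - 1) / n
            print(f"{n}x identical calls to {key[0]}: ~${wasted:.2f} avoidable")

duplicate_call_report([
    {"model": "gpt-4o", "prompt": "classify: spam?", "cost_usd": 0.02},
    {"model": "gpt-4o", "prompt": "classify: spam?", "cost_usd": 0.02},
    {"model": "gpt-4o", "prompt": "summarize doc", "cost_usd": 0.05},
])
# -> 2x identical calls to gpt-4o: ~$0.02 avoidable
```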