DEV Community

Cam
I shipped a prompt that silently exploded our API bill — so I built a linter for prompts

A few weeks ago one of my prompts failed in production.

Nothing crashed. No errors were thrown.

But overnight, our API bill spiked because the prompt started generating extremely long responses.

At first I assumed it was a model change or config issue. But after digging in, the real problem was simpler:

We had no way to validate prompts before they ran.

We lint code.

We test code.

But most teams don’t analyze prompts.

So I built a small CLI tool called CostGuardAI.

It analyzes prompts before they run and flags structural risks like:

  • prompt injection / jailbreak surface
  • instruction ambiguity
  • conflicting directives
  • unconstrained outputs (hallucination risk)
  • token explosion / context misuse

The idea is simple: treat prompts like code and run static analysis on them.
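To make "static analysis on prompts" concrete, here is a minimal sketch of the kind of heuristic checks such a linter could run. These rules are illustrative only; they are not CostGuardAI's actual implementation, and `lintPrompt` is a hypothetical function name:

```javascript
// Hypothetical prompt-linting heuristics (not CostGuardAI's real rules).
// Each check returns a finding string when a risky structure is detected.
function lintPrompt(prompt) {
  const findings = [];

  // Unconstrained output: no length or format guidance means the model
  // decides how long the response is, a common source of token blowups.
  if (!/\b(limit|at most|no more than|concise|brief|\d+\s*(words|sentences|tokens|bullets?))\b/i.test(prompt)) {
    findings.push("unconstrained-output: no length or format constraint found");
  }

  // Conflicting directives: asking for brevity and exhaustiveness at once.
  if (/\b(brief|concise|short)\b/i.test(prompt) &&
      /\b(detailed|exhaustive|comprehensive)\b/i.test(prompt)) {
    findings.push("conflicting-directives: brevity and exhaustiveness both requested");
  }

  // Injection surface: untrusted input interpolated without any delimiter
  // (here assuming a {{user_input}} template placeholder for illustration).
  if (/\{\{\s*user_input\s*\}\}/i.test(prompt) && !/```|"""|<input>/.test(prompt)) {
    findings.push("injection-surface: user input interpolated without delimiters");
  }

  return findings;
}

// Example: a prompt with no output constraint gets flagged.
console.log(lintPrompt("Summarize this document."));
```

Real linters layer many more rules than this, but the shape is the same: pattern checks over the prompt text, run before any tokens are ever billed.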

Example

npm install -g @camj78/costguardai
costguardai analyze my-prompt.txt

It outputs a CostGuardAI Safety Score (0–100, higher = safer) and highlights what’s driving the risk.

The goal isn’t to predict exact model behavior — that’s not possible statically.

It’s closer to a linter: catching prompt structures that tend to break in production.

For teams deploying LLM features, this helps catch issues before they reach users.


It’s still early, but I’m curious how others here are handling prompt validation.

Are you testing prompts, reviewing them manually, or just shipping and monitoring?
