DEV Community

Cam
I shipped a prompt that silently exploded our API bill — so I built a linter for prompts

A few weeks ago one of my prompts failed in production.

Nothing crashed. No errors were thrown.

But overnight, our API bill spiked because the prompt started generating extremely long responses.

At first I assumed it was a model change or config issue. But after digging in, the real problem was simpler:

We had no way to validate prompts before they ran.

We lint code.

We test code.

But most teams don’t analyze prompts.

So I built a small CLI tool called CostGuardAI.

It analyzes prompts before they run and flags structural risks like:

  • prompt injection / jailbreak surface
  • instruction ambiguity
  • conflicting directives
  • unconstrained outputs (hallucination risk)
  • token explosion / context misuse

The idea is simple: treat prompts like code and run static analysis on them.
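To make "static analysis on prompts" concrete, here is a minimal sketch of the kind of heuristic checks such a linter could run. These rules are illustrative only; they are not CostGuardAI's actual implementation, and `lintPrompt` is a hypothetical function name:

```javascript
// Hypothetical prompt-linting heuristics (not CostGuardAI's real rules).
// Each check returns a finding string when a risky structure is detected.
function lintPrompt(prompt) {
  const findings = [];

  // Unconstrained output: no length or format guidance means the model
  // decides how long the response is, a common source of token blowups.
  if (!/\b(limit|at most|no more than|concise|brief|\d+\s*(words|sentences|tokens|bullets?))\b/i.test(prompt)) {
    findings.push("unconstrained-output: no length or format constraint found");
  }

  // Conflicting directives: asking for brevity and exhaustiveness at once.
  if (/\b(brief|concise|short)\b/i.test(prompt) &&
      /\b(detailed|exhaustive|comprehensive)\b/i.test(prompt)) {
    findings.push("conflicting-directives: brevity and exhaustiveness both requested");
  }

  // Injection surface: untrusted input interpolated without any delimiter
  // (here assuming a {{user_input}} template placeholder for illustration).
  if (/\{\{\s*user_input\s*\}\}/i.test(prompt) && !/```|"""|<input>/.test(prompt)) {
    findings.push("injection-surface: user input interpolated without delimiters");
  }

  return findings;
}

// Example: a prompt with no output constraint gets flagged.
console.log(lintPrompt("Summarize this document."));
```

Real linters layer many more rules than this, but the shape is the same: pattern checks over the prompt text, run before any tokens are ever billed.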

Example

npm install -g @camj78/costguardai
costguardai analyze my-prompt.txt

It outputs a CostGuardAI Safety Score (0–100, higher = safer) and highlights what’s driving the risk.

The goal isn’t to predict exact model behavior — that’s not possible statically.

It’s closer to a linter: catching prompt structures that tend to break in production.

For teams deploying LLM features, this helps catch issues before they reach users.


It’s still early, but I’m curious how others here are handling prompt validation.

Are you testing prompts, reviewing them manually, or just shipping and monitoring?
