DEV Community

Weezy
Weezy

Posted on

I built a CLI to catch prompt injection & LLM jailbreaks. Want to try it out?

Prompt injection and jailbreaks aren’t theoretical anymore. They’re in the wild.

Most developers building with LLMs aren’t scanning their logs for red flags like:

• DAN-style jailbreaks
• System prompt leaks
• Hardcoded instructions being bypassed
• Prompt formatting attacks

I’ve been working on a CLI called PromptShield that detects these risks automatically from your logs and files. Think of it as ESLint or Semgrep, but for LLM safety.

What it does:

• Scans .json, .ndjson, .txt logs of prompts/responses
• Detects jailbreaks, prompt injections, prompt leakage, and policy violations
• Filters by severity and category (e.g., --category security, --fail-on=high)
• Outputs clean markdown, JSON, or terminal reports for CI/CD use

I’m looking for 3–5 devs building with LLMs to test it before I publish

It’s not live on npm yet, but fully working.

If you:

• Are building with OpenAI, Claude, Mistral, etc.
• Have logs or prompt templates you want to scan
• Want to validate you’re not shipping unsafe behavior

Drop a comment. I’ll send you private access.

Top comments (0)