DEV Community

Edvisage Global
Edvisage Global

Posted on

I Built a Prompt Injection Detection API From Real Honeypot Data — Now It's on RapidAPI

A few weeks ago I deployed a honeypot on my server — a fake SKILL.md file sitting on port 8888, designed to attract attackers probing AI agent configurations.

It worked. Real requests started hitting it. Prompt injection attempts. Credential probing. SSRF probes targeting internal metadata endpoints. Code injection patterns.

I'd been logging and classifying them manually for research and content. Then I thought — why not wrap the classifier as an API?

That's what Vigil is.

What It Does

Submit any text payload via POST request. Get back a JSON response with a risk score from 0 to 10, the primary attack type detected, all attack categories found, and an indicator count. No LLM involved. No latency. No per-token cost. Pure pattern matching against real attack signatures.

Six attack categories detected:

  • Prompt injection — jailbreaks, instruction overrides, system prompt probing
  • Code injection — eval, exec, subprocess abuse
  • Path traversal — directory climbing, sensitive file access attempts
  • SSRF — metadata endpoint probing, internal network scanning
  • Credential probing — API key fishing, token extraction attempts
  • XSS — cross-site scripting patterns

Why I Built It This Way

Most threat detection tools in the AI space are LLM-based — they send your payload to another model to evaluate it. That introduces latency, cost per call, and a dependency on another AI system to protect your AI system.

Vigil uses pattern matching against a curated signature library built from real honeypot captures. It runs in milliseconds. It costs nothing per call on my end. And it doesn't require trusting a second LLM with your potentially malicious payload.

Who It's For

  • Developers building AI agent pipelines who need input validation before tool execution
  • Security middleware for LLM-powered applications
  • Audit logging systems that need to flag suspicious inputs
  • Anyone who wants a fast, cheap sanity check on user-submitted text before it reaches an agent

Example Response

{
  "risk_score": 3.0,
  "risk_level": "medium",
  "primary_attack_type": "prompt_injection",
  "attack_types_detected": ["prompt_injection"],
  "indicator_count": 2,
  "clean": false,
  "analyzed_at": "2026-04-28T04:35:35Z"
}
Enter fullscreen mode Exit fullscreen mode

It's Live on RapidAPI Now

Three tiers:

Plan Price Calls
Free $0 1/month
Pay Per Use $0.05/call Unlimited
Basic $9/month 500/month

👉 Vigil Threat Classifier on RapidAPI

The honeypot is still running. The signature library will keep growing.

Top comments (0)