DEV Community: vorsken

What a policy gate catches in AI-generated code, and what slips through

vorsken — Sun, 07 Jun 2026 12:44:03 +0000

I maintain an open-source GitHub Action called vorsken. It does one thing: scan the diff on a pull request with Semgrep, apply a fixed policy, and return BLOCK, FLAG, or PASS. No dashboard, no model that drifts over time. Rules at ERROR/HIGH/CRITICAL severity block the merge, WARNING/MEDIUM flag it, the rest pass. Same diff, same verdict.
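
The severity mapping described above is small enough to sketch in a few lines. This is an illustration of the rule as stated, not vorsken's source; the function name `verdict` and the list-of-severities input are assumptions made for the example.

```python
# Illustrative sketch only; not vorsken's actual implementation.
BLOCK_SEVERITIES = {"ERROR", "HIGH", "CRITICAL"}
FLAG_SEVERITIES = {"WARNING", "MEDIUM"}


def verdict(severities):
    """Map the severities of all findings on a diff to a single verdict."""
    found = {s.upper() for s in severities}
    if found & BLOCK_SEVERITIES:
        return "BLOCK"  # any blocking finding wins
    if found & FLAG_SEVERITIES:
        return "FLAG"
    return "PASS"  # no rule at a blocking or flagging severity matched
```

Because the function has no state and no randomness, the same list of findings always yields the same verdict, which is the property the paragraph above is describing.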

The usual pitch for a tool like this is that it catches the SQL injection your AI assistant wrote. I wanted to see what it actually catches against real assistant output, so I generated 28 functions and ran them through.

The test

Seven backend tasks: a FastAPI upload endpoint, a URL-fetch helper, JWT auth, a SQL filter, an ImageMagick subprocess call, a LangChain file agent, and a LangChain RAG pipeline. I generated each one four times, with ChatGPT (GPT-5.5 Instant), Claude Code (Opus 4.8), Claude Code plus the security-guidance plugin, and Cursor (Composer 2.5). Single-shot, neutral prompt, no security hints. Then I scanned all 28 with the same ruleset.

I'm reporting which rule fired on which file, not whether some model thinks the code is safe. That part you can reproduce.

Task	ChatGPT	Claude Code	Claude Code + plugin	Cursor	Verdict
file upload	—	—	—	—	4 PASS
url fetch (SSRF)	ssrf	ssrf	ssrf	—	3 FLAG / 1 PASS (Cursor)
jwt auth	api8	api8	—	—	2 BLOCK / 2 PASS
sql filter	—	—	—	—	4 PASS
imagemagick	—	—	—	—	4 PASS
fs agent	—	overperm	—	—	1 BLOCK / 3 PASS
rag	dangerous	dangerous	dangerous	dangerous	4 BLOCK

7 BLOCK, 3 FLAG, 18 PASS across 28 functions.

The basics were fine

SQL filter, ImageMagick, file upload: clean on every tool. The SQL was parameterized, the subprocess calls passed argument lists instead of shell strings, the uploads weren't doing anything reckless. If you still expect current models to spray SQL injection across a straightforward CRUD task, they don't. On conventional work they get it right.

Two of the blocks are soft. The JWT api8 hits landed on a SECRET_KEY = "CHANGE_ME" placeholder, which you can read as a false positive or as a gate doing its job. The other two configs passed that task: the plugin removed the secret while generating, and Cursor read it from an environment variable. The SSRF flag I'll come back to.

The two findings worth talking about were both in framework code, and they are two different kinds of problem.

Finding 1: an agent with the run of the filesystem

The file-agent task uses LangChain's FileManagementToolkit. Pass it a root_dir and a short selected_tools list and it's pinned to one directory with the operations you chose. Leave those out and it gets the whole filesystem and every operation, delete included.

Three of the four configs scoped it. Claude Code didn't, and the gate's overpermissioned-agent-tool rule blocked it. That is one tool out of four, so it is not evidence that "agentic code is dangerous," and I won't pitch it that way. But the scoped version costs one extra argument, and the unscoped one is what you get by default. That asymmetry is the reason to gate it.
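
To make the scoped and unscoped shapes concrete, here is a rough stdlib-only sketch of the kind of check involved. It is not the gate's overpermissioned-agent-tool rule (that one is a Semgrep rule in the repo); the helper name `unscoped_toolkit_lines`, the example directory, and the choice to require both arguments are assumptions for illustration.

```python
import ast

# Hypothetical inputs showing the two shapes of call described above.
SCOPED = '''
toolkit = FileManagementToolkit(
    root_dir="/srv/agent-workspace",
    selected_tools=["read_file", "list_directory"],
)
'''
UNSCOPED = "toolkit = FileManagementToolkit()"

REQUIRED_KWARGS = {"root_dir", "selected_tools"}


def unscoped_toolkit_lines(source):
    """Return line numbers of FileManagementToolkit calls missing a scoping kwarg."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "FileManagementToolkit":
            continue
        # **kwargs shows up with arg=None; this sketch treats it as unscoped.
        passed = {kw.arg for kw in node.keywords if kw.arg is not None}
        if not REQUIRED_KWARGS <= passed:
            hits.append(node.lineno)
    return hits


print(unscoped_toolkit_lines(SCOPED))    # []
print(unscoped_toolkit_lines(UNSCOPED))  # [1]
```

Like any purely syntactic check, this only sees how the call is written. It cannot tell whether the directory passed as root_dir is itself a sensible place to pin the agent.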

Finding 2: the dangerous flag you can't avoid

The RAG task loads a local FAISS index. All four configs wrote allow_dangerous_deserialization=True, and all four got blocked.

This is different from the agent case. The flag isn't a mistake. FAISS won't load a local index without it, and the deserialization really is unsafe, because it's pickle underneath. The gate can't tell whether that index is your own build artifact or something an attacker dropped in the directory. So it stops at the merge and forces someone to answer that question: keep it because the index is trusted, or move to a format that isn't a code-execution path. The gate doesn't make the call. It makes you make it, in the open.
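
For anyone who hasn't seen why pickle counts as a code-execution path, a minimal stdlib demonstration follows. It has nothing to do with FAISS or LangChain internals; the `Payload` class is hypothetical, and the harmless arithmetic expression stands in for whatever an attacker would actually run.

```python
import pickle


class Payload:
    # pickle calls __reduce__ when saving; the (callable, args) pair it
    # returns is what gets invoked when the bytes are loaded again.
    def __reduce__(self):
        return (eval, ("6 * 7",))  # harmless stand-in for attacker code


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs eval("6 * 7") during load

print(type(result).__name__, result)  # int 42
```

The loader never gets a Payload back. It gets whatever the embedded callable returned, and the call has already happened by the time loads returns. That is why the trust question is about where the file came from, not about what the loading code looks like.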

Where it misses

Now the SSRF flag. Three configs used requests, and the ssrf-via-requests rule flagged them. Cursor used httpx, which that rule doesn't cover, so it passed. The Cursor code isn't safer; it sets follow_redirects=True on an unvalidated URL, the same exposure as the others. The rule just has a hole. A pass from this gate means no rule matched, which is not the same as safe.

The upload task is similar: there's no path-traversal rule yet, so that PASS is partly the gate not checking. And when SSRF does fire, it's a blunt syntactic flag rather than a precise one. These are the limits of a pure-syntax gate, and they're written down in the repo.

That's the trade a gate like this makes. It isn't clever and doesn't try to be. It runs on every PR, and it doesn't care which tool wrote the code or whether a linter was running at the time. The plugin config fixed that hardcoded secret before the gate ever saw it, which is fine, but the plugin isn't on every repo or every machine, and it leaves no record. In-session tools are the first pass. The merge gate is the part that's always there and the same for everyone.

What I'd take from it

The models are not bad at security on direct tasks. The problems showed up one layer up: in framework defaults, and in a trust decision the code can't make on its own. Those are the things worth blocking deterministically at the merge, whoever or whatever wrote the diff.

vorsken is MIT-licensed and the rules are in the repo, so you can run the same scan on your own output.

→ vorsken on GitHub

I Built a Gate That Blocks Vulnerable AI-Generated Code Before It Merges

vorsken — Mon, 11 May 2026 13:23:46 +0000

A PR came in last week. All checks passed. Looked fine.

Hardcoded API key on line 11. SSRF vector in the request
handler. Command injection from a Copilot suggestion.

Nothing stopped it from merging. So I built something that does.

vorsken is a GitHub Action that runs Semgrep + Claude on every
PR and posts a BLOCK verdict before bad code reaches main.

The Patterns AI Gets Wrong

I've been running static analysis on AI-generated PRs for a while
now, and the same issues keep coming up.

1. Hardcoded Secrets

This one's almost embarrassingly common. AI tools have seen
millions of examples where inlining credentials "works" — so
they do it without hesitation.

client = OpenAI(api_key="sk-proj-abc123...")

Once this merges, it lives in your git history. Forever.

2. SSRF (Server-Side Request Forgery)

Ask an AI to "fetch data from a URL the user provides" and it
writes exactly that — no validation, no allowlist.

response = requests.get(user_provided_url)

Point that at http://169.254.169.254 and you're pulling cloud
credentials out of the metadata service. Classic.
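
A minimal sketch of the missing validation, assuming the service only ever needs to reach a short list of known hosts. The function name `is_allowed_url` and the `ALLOWED_HOSTS` set are hypothetical, and the check is deliberately narrow: it does not follow redirects or resolve DNS, so it is a starting point rather than a complete SSRF defense.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts your service really needs.
ALLOWED_HOSTS = {"api.example.com"}


def is_allowed_url(url, allowed_hosts=ALLOWED_HOSTS):
    """Accept only http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, fall through to the allowlist check
    else:
        return False  # raw IP literals (e.g. 169.254.169.254) are rejected outright
    return host.lower() in allowed_hosts


print(is_allowed_url("http://169.254.169.254/latest/meta-data/"))  # False
print(is_allowed_url("https://api.example.com/v1/items"))          # True
```

A caller would run this before the request and also disable automatic redirects, since a permitted host can still redirect to an internal address.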

3. Broken Object Level Authorization (BOLA)

This is the sneaky one. The endpoint looks totally fine at first
glance.

@app.get("/orders/{order_id}")
def get_order(order_id: int):
    return db.query(Order).filter(Order.id == order_id).first()

Any authenticated user can read any order just by changing
the ID. This is API1, the top entry in the OWASP API Security
Top 10, and it's easy to miss in a normal code review.
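
The missing piece is an ownership check. Here is a framework-free sketch of the idea; the `Order` fields, the in-memory `ORDERS` store, and the `NotFound` exception are all hypothetical stand-ins for the real model, database query, and HTTP 404.

```python
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    owner_id: int
    total: float


# Hypothetical in-memory store standing in for the database query.
ORDERS = {
    1: Order(1, owner_id=10, total=25.0),
    2: Order(2, owner_id=20, total=99.0),
}


class NotFound(Exception):
    """Raised for missing orders and for orders owned by someone else."""


def get_order(order_id, current_user_id, orders=ORDERS):
    order = orders.get(order_id)
    # Same error in both cases, so the response doesn't reveal which IDs exist.
    if order is None or order.owner_id != current_user_id:
        raise NotFound(order_id)
    return order
```

In a real endpoint, current_user_id would come from the authenticated session rather than from the request, and the owner filter is usually better pushed into the query itself.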

4. SQL Injection via String Formatting

Even in 2026, AI still reaches for f-strings when building
queries — especially in less common ORMs or raw SQL contexts.

query = f"SELECT * FROM users WHERE username = '{username}'"

Not much to say here. We've known about this for 25 years.
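
For contrast, a parameterized version using the standard library's sqlite3 module. The table layout and the `find_user` helper are assumptions for the example; other drivers use the same idea with their own placeholder syntax.

```python
import sqlite3

# Throwaway in-memory database for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])


def find_user(conn, username):
    # The ? placeholder sends username as data, never as SQL text.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


print(find_user(conn, "alice"))        # [(1, 'alice')]
print(find_user(conn, "' OR '1'='1"))  # []
```

The classic injection string comes back empty because it is compared as a literal username instead of being spliced into the query.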

The Fix: A Policy Gate at the PR Level

The standard CI pipeline checks if code works.
It doesn't check if code is safe.

Linters catch style. Tests catch regressions. Neither of them
catches "this endpoint has no ownership check."

What you actually need is a layer that runs security policy
against the PR diff — before merge, every time, automatically.
That means static analysis rules tuned to your threat model,
some AI-assisted context on top (not just pattern matching),
and a clear verdict on every PR: BLOCK, FLAG, or PASS.

Quarterly pentests and post-merge audits don't cut it anymore.
The enforcement has to happen at the pull request.

How vorsken Does It

I built vorsken to solve exactly this. It's a GitHub Action
that runs Semgrep + Claude AI on every PR diff and posts a
verdict as a PR comment.

Setup takes about two minutes:

# .github/workflows/vorsken.yml
- uses: zetide/vorsken@v0.2.6
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

You can configure what gets blocked and what gets flagged:

# .stacksecai.yml
policy:
  block_on: ["ERROR"]
  flag_on: ["WARNING"]
claude:
  model: "claude-haiku-4-5"
  severity_block: ["CRITICAL", "HIGH"]
  severity_flag: ["MEDIUM"]

On every PR, you get something like this:

🚨 vorsken Policy Gate — BLOCK

Finding: Hardcoded API key detected
Risk: Credential exposure via git history
Fix: Use environment variables or a secrets manager
Rule: OWASP API8 – Security Misconfiguration

The PR can't merge until the finding is resolved. That's the
point.

Wrapping Up

AI coding tools aren't going away — and honestly, I don't want
them to. But the volume of AI-generated PRs is only going to
increase, and most pipelines aren't ready for what that means.

A policy gate at the PR level isn't a replacement for code
review. It's the layer that catches what humans miss when
they're moving fast.

If you're already shipping AI-generated code (and you probably
are), it's worth five minutes to see what's making it through.

→ vorsken on GitHub
→ GitHub Marketplace
→ vorsken.dev

Stop merging vulnerable API code — automate PR security gates with Semgrep + Claude AI

vorsken — Tue, 28 Apr 2026 00:53:20 +0000

Every team says "we'll fix it after the merge."
They rarely do.

I built vorsken — a GitHub Action that blocks pull requests containing
API vulnerabilities before they reach your main branch.

It combines Semgrep static analysis with Claude AI to post a plain-English
verdict directly in the PR comment: BLOCK / FLAG / PASS.

Here's how to add it to any repo in under 5 minutes.

What it does

When a PR is opened or updated:

1. Semgrep scans changed files using OWASP API Security Top 10 rules
2. Claude AI analyzes the findings and generates a human-readable report
3. A verdict is posted as a PR comment
4. A BLOCK verdict fails the required check, so the merge is prevented

PR opened
└─▶ Semgrep scans with OWASP API Top10 rules
└─▶ Claude AI explains each finding in plain English
└─▶ BLOCK / FLAG / PASS posted as PR comment
└─▶ BLOCK = merge prevented ✋

Why not just use Semgrep alone?

Semgrep gives you rule IDs and line numbers.
vorsken adds the context developers actually need:

	Semgrep alone	vorsken
Finding location	✅	✅
OWASP category	✅	✅
What the risk means	❌	✅ (Claude explains)
Concrete fix suggestion	❌	✅ (Claude suggests)
PR comment	❌	✅ (auto-posted)
Merge blocked on BLOCK	❌	✅

Setup (5 minutes)

1. Add your Anthropic API key

In your repository: Settings → Secrets and variables → Actions → New repository secret

Name: ANTHROPIC_API_KEY
Value: sk-ant-...

Don't have a key yet? Get one at console.anthropic.com.

2. Create the workflow file

Create .github/workflows/policy-gate.yml:

name: Policy Gate

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  gate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: zetide/vorsken@v0.2.6
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

That's it. Push this file and open a PR.

What the PR comment looks like

When a vulnerability is detected, vorsken posts a comment like this:

🚨 vorsken Policy Gate — BLOCK

Summary: A hardcoded API key was detected in the changed files.

Rule	Severity	OWASP	Description
hardcoded-api-key	CRITICAL	API8:2023	Hardcoded credential found in source.

Risk:
An attacker with read access to this repository can use the exposed
credential to authenticate as your service and access protected resources.

Fix:
Remove the hardcoded key. Read it from the environment instead:

api_key = os.environ["API_KEY"]

No need to look up the rule documentation — the context is right there in the PR.

OWASP API Security Top 10 coverage

vorsken ships with Semgrep rules covering all 10 OWASP API Security risks (2023 edition):

API1 — Broken Object Level Authorization
API2 — Broken Authentication
API3 — Broken Object Property Level Authorization
API4 — Unrestricted Resource Consumption
API5 — Broken Function Level Authorization
API6 — Unrestricted Access to Sensitive Business Flows
API7 — Server Side Request Forgery (SSRF)
API8 — Security Misconfiguration
API9 — Improper Inventory Management
API10 — Unsafe Consumption of APIs

Optional: customize the policy

Add a .stacksecai.yml to your repo root to tune the behavior:

policy:
  block_on: ["ERROR"]
  flag_on: ["WARNING"]

claude:
  model: "claude-haiku-4-5"
  severity_block: ["CRITICAL", "HIGH"]
  severity_flag: ["MEDIUM"]

rules:
  overrides:
    - rule_id: "hardcoded-password"
      action: "BLOCK"

Use your own Semgrep rules

Point semgrep-rules to your own rule directory:

- uses: zetide/vorsken@v0.2.6
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    semgrep-rules: ./rules/my-custom-rules

How it's built

Semgrep — static analysis engine
Claude API (claude-haiku-4-5) — AI analysis and plain-English output
tenacity — exponential backoff retry on rate limits
GitHub Actions — zero-infrastructure deployment
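
For readers unfamiliar with the retry pattern in that list, here is a stdlib-only sketch of exponential backoff. It deliberately does not use tenacity and is not how vorsken wires it up; `RateLimited`, `call_with_backoff`, and the default delays are hypothetical names and values chosen for the example.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a rate-limit error from an API client."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

The sleep function is injectable so the behavior can be tested without real waiting. A production version would normally add jitter and a maximum delay.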

The action is MIT licensed. Source: github.com/zetide/vorsken

Try it now

The easiest way to see it in action:

1. Fork or clone any Python API project
2. Add the workflow file above
3. Open a PR that touches a file with a hardcoded credential or missing auth check
4. Watch the BLOCK verdict appear in the PR comment

Available on GitHub Marketplace.

Feedback and contributions welcome — if you try it out, let me know what you think in the comments.

⭐ If this looks useful, a star on GitHub helps others find it:
github.com/zetide/vorsken