Siddharth Rathore

Posted on May 21

Vibe Coding Is Fun — Until Your AI Ships Code With No Auth, No Tests, and a SQL Injection Waiting to Happen

#ai #vibecoding #softwaredevelopment #security

AI coding agents are the fastest pair programmers you've ever had. They're also the most dangerously agreeable. Here's how to bake your engineering standards into every AI-assisted session without copy-pasting a prompt manifesto every time you open a terminal.

The Uncomfortable Truth About Vibe Coding

Vibe coding is real, it's productive, and it's not going anywhere.

You describe what you want in plain language. The AI builds it. You iterate, refine, ship. Features that used to eat three days of focus now land before lunch. Boilerplate evaporates. The flow state hits different — it's faster, more fluid, more fun than anything the industry has seen in years.

But here's the thing nobody says loudly enough:

AI coding agents are not junior developers with bad habits. They're perfect soldiers who follow orders to the letter, and who do nothing about the orders you forgot to give.

They don't carry your team's institutional knowledge. They don't know about that authentication bypass your security team patched last quarter. They've never read your architecture decision records. They have no idea that your compliance officer will reject any PR that touches PII without an audit trail.

They know what you told them. In this session. In this prompt. Nothing more.

So when you're in the zone — shipping fast, riding the wave — the agent is right there with you. Producing code that works. Code that does the thing you asked for. Code that might also have:

❌ No input validation on user-facing endpoints
❌ No error handling beyond letting exceptions bubble to the client
❌ No tests — not even a smoke test
❌ No auth check on that shiny new admin route
❌ A raw SQL query stitched together with f-strings, practically begging for exploitation

The AI didn't ignore your standards. You never told it what they were.

And that's not a prompt engineering problem. That's an architecture problem.
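To make the last item in the checklist above concrete, here is a minimal sketch using Python's standard sqlite3 module. The table name, the helper functions, and the payload are all hypothetical and chosen only for illustration. It shows the same lookup written twice: once with an f-string, once with a bound parameter.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Vulnerable: user input becomes part of the SQL text itself.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"  # classic injection string
    print("unsafe:", find_user_unsafe(conn, payload))  # matches every row
    print("safe:  ", find_user_safe(conn, payload))    # matches nothing
```

Both functions "work" on the happy path, which is why the unsafe version tends to survive a quick demo.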

Your SDLC Didn't Survive First Contact With Vibes

Software engineering has decades of battle-tested wisdom encoded into process: code review gates, static analysis, security scanning, test coverage thresholds, dependency audits, compliance checklists. None of it exists because someone thought bureaucracy was fun. Every rule in your SDLC is a scar from a production incident someone lived through.

Vibe coding, done carelessly, routes around all of it.

Not because the AI is incompetent — because it's obedient. It builds exactly what you describe. If your description doesn't mention security, the output won't include it. If you don't specify tests, none get written. If your compliance requirements live in a Confluence doc the AI has never seen, those requirements functionally don't exist.

The common response is predictable: "Write better prompts."

Include your standards. Remind the agent every session. Paste in requirements. Maintain a prompt template in Notion.

Sure. That works. It also means:

🔄 Every developer has to know which standards to paste, and when, and for which kind of project
🆕 Every new repo starts from zero on prompt setup — even if it's the third FastAPI service this quarter
📝 Every standards update has to be manually propagated across every team's prompt templates
🚫 Nothing enforces that anyone actually did any of it
👻 Nothing detects when someone's prompt drifted from the current baseline

You didn't eliminate the governance problem. You moved it from code review into an unversioned, unaudited text box.

What the Problem Actually Demands

You need your engineering standards — security rules, testing expectations, architecture constraints, compliance requirements — to be injected into the agent's context automatically. For every project. Every language. Every framework. Without anyone having to remember, copy-paste, or reinvent from scratch.

You need a policy layer that sits between your governance baseline and the half-dozen AI agents your team actually uses.

That's exactly what agent-policykit does.

pip install agent-policykit
agent-policykit init
agent-policykit generate

Three commands. Your AI agents — GitHub Copilot, Claude Code, Cursor, Aider, OpenAI Codex, Gemini CLI — all get instruction files that reflect your governance baseline, your stack's best practices, and your SDLC standards. Automatically. Consistently. Every time.

No prompt templates. No tribal knowledge. No hoping someone remembered.

A Policy Compiler for the Age of AI Agents

Think of agent-policykit the way you think about a compiler. You define policy once, in a structured format. The tool detects your repo's technology stack and compiles that policy into the native instruction files each agent reads at startup.

The pipeline is four stages:

1️⃣ Detect — scans the repo for languages, frameworks, and project type
2️⃣ Load — pulls in YAML rule packs covering governance, security, testing, and compliance
3️⃣ Merge — compiles everything into a single, coherent PolicyBundle
4️⃣ Render — outputs native instruction files for every configured agent target

The result isn't a vague "please write secure code" reminder bolted onto a system prompt. It's structured, contextual, stack-specific guidance that the agent processes as first-class instructions.
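The four stages can be pictured with a toy sketch like the one below. This is not agent-policykit's actual code or API. Every name here (RULE_PACKS, the four functions, the rule texts) is hypothetical, the rule packs are plain dictionaries rather than YAML files, and detection is reduced to a single file check. Only the PolicyBundle name is borrowed from the description above.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile

# Hypothetical in-memory stand-ins for YAML rule packs.
RULE_PACKS = {
    "governance": [
        "Validate all external input.",
        "Never build SQL with string interpolation.",
    ],
    "python": ["Use pytest for tests."],
    "fastapi": ["Enforce auth via Depends on every non-public route."],
}


@dataclass
class PolicyBundle:
    rules: list = field(default_factory=list)


def detect(repo: Path) -> list:
    """Stage 1: guess stack tags from marker files (greatly simplified)."""
    tags = []
    pyproject = repo / "pyproject.toml"
    if pyproject.exists():
        tags.append("python")
        if "fastapi" in pyproject.read_text().lower():
            tags.append("fastapi")
    return tags


def load(tags: list) -> list:
    """Stage 2: governance always applies; stack packs are added by tag."""
    return [RULE_PACKS["governance"]] + [
        RULE_PACKS[tag] for tag in tags if tag in RULE_PACKS
    ]


def merge(packs: list) -> PolicyBundle:
    """Stage 3: flatten packs into one bundle, dropping duplicate rules."""
    bundle = PolicyBundle()
    for pack in packs:
        for rule in pack:
            if rule not in bundle.rules:
                bundle.rules.append(rule)
    return bundle


def render(bundle: PolicyBundle, title: str) -> str:
    """Stage 4: emit one instruction file as Markdown text."""
    lines = [f"# {title}", ""] + [f"- {rule}" for rule in bundle.rules]
    return "\n".join(lines) + "\n"


def run_pipeline(repo: Path) -> str:
    return render(merge(load(detect(repo))), "Agent instructions")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "pyproject.toml").write_text('dependencies = ["fastapi"]\n')
        print(run_pipeline(repo))
```

The point of the shape is that policy is data and each agent file is just a rendering of it, so changing a rule in one place changes every output.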

58 Rule Packs. Real SDLC Coverage. Out of the Box.

agent-policykit ships with 58 rule packs across four categories, enough to cover many common production stacks without writing a single custom rule:

Category	Count	What It Covers
Governance	8 packs	Architecture, security baselines, compliance, operations, testing, review, output contracts
Languages	28 packs	Python, TypeScript, Go, Java, Rust, Ruby, PHP, Kotlin, Swift, C#, and more
Frameworks	13 packs	FastAPI, Django, Express, NestJS, Next.js, Spring Boot, Rails, Flask, and more
Project Types	9 packs	API service, web app, microservice, worker, CLI tool, SDK, monolith, and more

The governance packs are the load-bearing ones. When agent-policykit generates instruction files for your project, the AI agent starts every session already knowing:

🔐 Security baseline — input validation, auth patterns, dependency hygiene, secret management
🧪 Testing requirements — expected coverage, testing idioms for your stack, what qualifies as "tested"
🏗️ Architecture constraints — layer boundaries, allowed dependencies, communication patterns
📋 Compliance posture — data handling rules, audit trails, regulatory considerations
👁️ Review standards — what a proper code review looks like for this kind of project

This is the difference between telling a developer "be secure" and handing them a security checklist calibrated to their exact stack.

Stack-Aware Instructions — Because One Size Fits Nothing

A Python/FastAPI API service and a TypeScript/Next.js web app live in different security universes. They have different injection surfaces, different auth patterns, different testing idioms, different deployment models. A generic "follow best practices" prompt is worse than useless — it's actively misleading.

agent-policykit addresses this at the source. When you run init, it reads your repo. The detect command shows what it found:

agent-policykit detect
# → Python, FastAPI, api_service

Then generate compiles instructions that include:

FastAPI-specific security patterns (Depends-based auth, Pydantic validation, middleware ordering)
Python-specific testing conventions (pytest idioms, fixture patterns, coverage tooling)
API service-specific architecture guidance (request lifecycle, error response contracts, rate limiting)

All of it layered on top of your governance baseline. A Rails monolith gets different output — because it should.

Your developers stop guessing which standards apply. The agent already knows.

One Config. Every Agent. Zero Drift.

Here's what the configuration looks like in your pyproject.toml:

[tool.agent-policykit]
targets = ["copilot", "agents-md", "cursor", "claude-code", "aider", "gemini-cli"]
languages = ["python"]
frameworks = ["fastapi"]
project_type = "api_service"
review_mode = false

From this single source of truth, agent-policykit generate writes every instruction file your agents need:

.github/copilot-instructions.md              # GitHub Copilot
.github/instructions/project.instructions.md # VS Code Agents
AGENTS.md                                    # Generic agent instructions
CLAUDE.md                                    # Claude Code (project root)
.claude/rules/shared.md                      # Claude Code (rules)
.cursor/rules/project.mdc                    # Cursor
CONVENTIONS.md                               # Convention-based agents
.aider.conf.yml                              # Aider
GEMINI.md                                    # Gemini CLI
AGENT_POLICY.md                              # Universal policy reference

All consistent. All from the same policy. When your security standards evolve, you regenerate — and every file updates. No manual sync. No "oh, we forgot to update the Cursor rules."
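If you want an extra guard that none of these files has gone missing, a small script along these lines could run in CI. It is a hypothetical helper, not part of agent-policykit, and the path list is simply copied from the listing above, so it assumes all of those targets are enabled.

```python
from pathlib import Path
import tempfile

# Paths copied from the generated-file listing; adjust to your targets.
EXPECTED_FILES = [
    ".github/copilot-instructions.md",
    ".github/instructions/project.instructions.md",
    "AGENTS.md",
    "CLAUDE.md",
    ".claude/rules/shared.md",
    ".cursor/rules/project.mdc",
    "CONVENTIONS.md",
    ".aider.conf.yml",
    "GEMINI.md",
    "AGENT_POLICY.md",
]


def missing_instruction_files(repo_root: Path, expected=EXPECTED_FILES) -> list:
    """Return the expected instruction files that are absent from the repo."""
    return [rel for rel in expected if not (repo_root / rel).is_file()]


if __name__ == "__main__":
    # Demo against a scratch directory that only contains two of the files.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "AGENTS.md").write_text("# placeholder\n")
        (root / "CLAUDE.md").write_text("# placeholder\n")
        for rel in missing_instruction_files(root):
            print("missing:", rel)
```

This only checks presence. Whether the contents still match the current policy is a separate question, one the diff and validate commands described later are meant to answer.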

Review Mode: Turn the AI Into Your Toughest Reviewer

Standards enforcement isn't just about what code gets generated — it's about what code gets caught.

agent-policykit generate --mode review

Review mode activates a stricter behavioral overlay. The agent shifts posture: instead of a helpful pair programmer, it becomes a technically demanding code reviewer. Skeptical of missing safeguards. Explicit about security gaps. Thorough on test coverage. Vocal about architectural drift.

The same policy that guided code generation now guides code review — from the same source, with the same rules, coherently.

Safety Guarantees Built for Governance

Because agent-policykit manages files that carry real governance weight, it ships with hard safety properties:

🔒 Security downgrade blocking — If a regeneration would remove a security rule, the operation halts. You must pass --force to override. You cannot silently weaken your agents' security posture.

📐 Managed-section ownership — Generated content is clearly demarcated. Human-authored additions outside managed sections are preserved through every regeneration.

👁️ Dry-run everything — diff is always non-destructive. Both generate and update support --dry-run so you can audit changes before they land.

⚠️ Structured conflict surfacing — Non-security rule removals aren't silently dropped. They're flagged with clear explanations.
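Two of these properties are easy to picture in code. The sketch below is an illustration under stated assumptions, not the tool's implementation: the marker strings, function names, and exception class are hypothetical, and the real file format may differ. It shows a downgrade check that refuses to drop security rules unless forced, and a managed-section rewrite that leaves human-written text alone.

```python
# Hypothetical markers; the real tool may delimit managed content differently.
BEGIN_MARKER = "<!-- BEGIN MANAGED SECTION -->"
END_MARKER = "<!-- END MANAGED SECTION -->"


class SecurityDowngradeError(Exception):
    """Raised when regeneration would drop a security rule without force."""


def check_security_downgrade(old_rules, new_rules, force: bool = False) -> set:
    """Return the removed security rules, or raise unless force is set."""
    removed = set(old_rules) - set(new_rules)
    if removed and not force:
        raise SecurityDowngradeError(
            "would remove security rules: " + ", ".join(sorted(removed))
        )
    return removed


def replace_managed_section(existing: str, generated: str) -> str:
    """Swap only the text between the markers; keep everything outside."""
    block = f"{BEGIN_MARKER}\n{generated.rstrip()}\n{END_MARKER}"
    start = existing.find(BEGIN_MARKER)
    end = existing.find(END_MARKER)
    if start == -1 or end == -1 or end < start:
        # No managed section yet: append one after the human-written text.
        separator = "" if not existing or existing.endswith("\n") else "\n"
        return f"{existing}{separator}{block}\n"
    return existing[:start] + block + existing[end + len(END_MARKER):]


if __name__ == "__main__":
    current = (
        "Team notes: ask before touching billing.\n"
        f"{BEGIN_MARKER}\n- old rule\n{END_MARKER}\n"
    )
    print(replace_managed_section(current, "- new rule"))
    try:
        check_security_downgrade({"validate-input", "no-raw-sql"}, {"validate-input"})
    except SecurityDowngradeError as err:
        print("blocked:", err)
```

Raising by default and requiring an explicit override mirrors the --force behavior described above: weakening the policy stays possible, but never silent.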

The Day-to-Day Workflow

# First time in a repo
agent-policykit init        # Detect stack, write config
agent-policykit generate    # Compile policy, write all agent files

# When standards evolve
agent-policykit diff        # Preview what would change
agent-policykit update      # Regenerate safely, preserve human edits

# CI integration
agent-policykit validate    # Check structural correctness

This is the full loop. It's designed to live in your repo alongside your source code — checked in, versioned, CI-validated, reviewed in PRs like any other infrastructure change.

Who This Is For

👩‍💻 Developers who want to vibe code at full speed without silently shipping vulnerabilities
🏢 Engineering leads tired of standards being bypassed because nobody remembered the right prompt
🛡️ Platform & DevSecOps teams managing secure defaults across dozens of repos and agents
🤝 Consultancies onboarding clients onto different stacks with consistent governance
🌍 Open-source maintainers who want every contributor's AI agent following the same rules

The common thread: anyone who has watched an AI agent produce working, shippable, dangerously insecure code — because nobody told it not to.

Get Started in 60 Seconds

pip install agent-policykit
agent-policykit init
agent-policykit generate

sidrat2612 / agent-policykit

For teams using multiple AI coding agents: detect the repo stack, generate Copilot, Claude, Cursor, Codex, Aider, and Gemini instruction files from one policy, and update them safely.


The repo includes real, validated example fixtures for FastAPI, Next.js, and Rails projects — tested in CI. Inspect exactly what the generated output looks like before running anything against your own codebase.

Vibe coding is worth keeping. Your engineering standards are non-negotiable. agent-policykit is how you stop choosing between the two.

MIT licensed. Contributions welcome — see CONTRIBUTING.md.

Found this useful? Drop a 🦄 and follow for more on AI-assisted development, DevSecOps, and developer tooling.

Top comments (2)

Harjot Singh • May 31

This is the exact failure mode that keeps me up at night about the vibe-coding wave - the model optimizes for "looks like it works," and security/correctness are invisible in a demo, so they're precisely what gets skipped. No auth, raw string-interpolated SQL, zero tests: none of those show up when you click around the happy path, which is why they ship silently and detonate in production. The dangerous bugs are the ones the demo can't reveal.

The only real defense is moving these from "hope the AI remembered" to "the pipeline enforces it": auth as a verified default, parameterized queries by construction, a security lint/SAST gate that fails the build on an injection pattern, generated tests that must pass. Make the safe thing the default and the unsafe thing impossible to ship, rather than trusting the model to be careful. That gate-don't-trust approach is the core of Moonshift (a multi-agent pipeline that ships a prompt to a deployed SaaS) - the boring-but-critical 20% (auth, input handling, tests) is generated as verified defaults and gated, so a vibe-coded app can't ship the SQL injection you're describing. Important, necessary post. Of the three (auth, tests, injection), which do you see vibe-coded projects skip most often? My money's on tests, with injection the scariest.

Harjot Singh • May 31

This is the exact failure I built around. Vibe-coding nails the feature and silently skips the parts that don't show in a demo: auth, input validation, tests, the SQL-injection-shaped hole. The model optimizes for "looks done," and security plus edge-cases are invisible in a screenshot. The fix isn't a better prompt, it's a hard gate: does it have auth, does input get validated, do tests exist and pass, before anything is allowed to ship. That generate-then-verify-the-boring-stuff loop is the core of Moonshift, the agent has to wire the unglamorous 20% or the run doesn't count as done. Which gap scares you most in the wild, the missing auth or the injection holes?