Siddharth Rathore

Vibe Coding Is Fun — Until Your AI Ships Code With No Auth, No Tests, and a SQL Injection Waiting to Happen

AI coding agents are the fastest pair programmer you've ever had. They're also the most dangerously agreeable. Here's how to bake your engineering standards into every AI-assisted session — without copy-pasting a prompt manifesto every time you open a terminal.

The Uncomfortable Truth About Vibe Coding

Vibe coding is real, it's productive, and it's not going anywhere.

You describe what you want in plain language. The AI builds it. You iterate, refine, ship. Features that used to eat three days of focus now land before lunch. Boilerplate evaporates. The flow state hits different — it's faster, more fluid, more fun than anything the industry has seen in years.

But here's the thing nobody says loudly enough:

AI coding agents are not junior developers with bad habits. They're perfect soldiers who follow orders to the letter — including the ones you forgot to give.

They don't carry your team's institutional knowledge. They don't know about that authentication bypass your security team patched last quarter. They've never read your architecture decision records. They have no idea that your compliance officer will reject any PR that touches PII without an audit trail.

They know what you told them. In this session. In this prompt. Nothing more.

So when you're in the zone — shipping fast, riding the wave — the agent is right there with you. Producing code that works. Code that does the thing you asked for. Code that might also have:

  • ❌ No input validation on user-facing endpoints
  • ❌ No error handling beyond letting exceptions bubble to the client
  • ❌ No tests — not even a smoke test
  • ❌ No auth check on that shiny new admin route
  • ❌ A raw SQL query stitched together with f-strings, practically begging for exploitation

The AI didn't ignore your standards. You never told it what they were.

And that's not a prompt engineering problem. That's an architecture problem.
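
To make the last item on that list concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table and both function names are invented for illustration. The first function builds the query with an f-string, the second uses a placeholder so the driver keeps the value separate from the SQL text.

```python
import sqlite3


def find_user_unsafe(conn, username):
    # Vulnerable: user input is interpolated straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Parameterized: the value is bound separately and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    ).fetchall()


def make_demo_db():
    # Hypothetical two-row table, only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    print(find_user_unsafe(conn, payload))  # the OR clause matches every row
    print(find_user_safe(conn, payload))    # payload is treated as a literal name
```

Both versions "work" for normal input, which is exactly why an agent with no security guidance can hand you the first one without anything looking wrong.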


Your SDLC Didn't Survive First Contact With Vibes

Software engineering has decades of battle-tested wisdom encoded into process: code review gates, static analysis, security scanning, test coverage thresholds, dependency audits, compliance checklists. None of it exists because someone thought bureaucracy was fun. Every rule in your SDLC is a scar from a production incident someone lived through.

Vibe coding, done carelessly, routes around all of it.

Not because the AI is incompetent — because it's obedient. It builds exactly what you describe. If your description doesn't mention security, the output won't include it. If you don't specify tests, none get written. If your compliance requirements live in a Confluence doc the AI has never seen, those requirements functionally don't exist.

The common response is predictable: "Write better prompts."

Include your standards. Remind the agent every session. Paste in requirements. Maintain a prompt template in Notion.

Sure. That works. It also means:

  • 🔄 Every developer has to know which standards to paste, and when, and for which kind of project
  • 🆕 Every new repo starts from zero on prompt setup — even if it's the third FastAPI service this quarter
  • 📝 Every standards update has to be manually propagated across every team's prompt templates
  • 🚫 Nothing enforces that anyone actually did any of it
  • 👻 Nothing detects when someone's prompt drifted from the current baseline

You didn't eliminate the governance problem. You moved it from code review into an unversioned, unaudited text box.


What the Problem Actually Demands

You need your engineering standards — security rules, testing expectations, architecture constraints, compliance requirements — to be injected into the agent's context automatically. For every project. Every language. Every framework. Without anyone having to remember, copy-paste, or reinvent from scratch.

You need a policy layer that sits between your governance baseline and the half-dozen AI agents your team actually uses.

That's exactly what agent-policykit does.

pip install agent-policykit
agent-policykit init
agent-policykit generate

Three commands. Your AI agents — GitHub Copilot, Claude Code, Cursor, Aider, OpenAI Codex, Gemini CLI — all get instruction files that reflect your governance baseline, your stack's best practices, and your SDLC standards. Automatically. Consistently. Every time.

No prompt templates. No tribal knowledge. No hoping someone remembered.


A Policy Compiler for the Age of AI Agents

Think of agent-policykit the way you think about a compiler. You define policy once, in a structured format. The tool detects your repo's technology stack and compiles that policy into the native instruction files each agent reads at startup.

Figure: YAML rule packs for governance, security, and testing flow into a central compiler engine, which outputs agent-specific instruction files for six different AI coding tools.

The pipeline is four stages:

1️⃣ Detect — scans the repo for languages, frameworks, and project type
2️⃣ Load — pulls in YAML rule packs covering governance, security, testing, and compliance
3️⃣ Merge — compiles everything into a single, coherent PolicyBundle
4️⃣ Render — outputs native instruction files for every configured agent target

The result isn't a vague "please write secure code" reminder bolted onto a system prompt. It's structured, contextual, stack-specific guidance that the agent processes as first-class instructions.
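
As a rough mental model of the Load, Merge, and Render stages, here is a toy sketch. It is not agent-policykit's implementation: the pack names, the rule texts, the `PolicyBundle` fields, and every function name are assumptions made for illustration, and the packs are plain in-memory lists rather than YAML files.

```python
from dataclasses import dataclass, field

# Hypothetical rule packs. The real tool loads these from YAML; the keys and
# rule texts here are invented for illustration.
RULE_PACKS = {
    "governance/security": [
        "Validate all external input.",
        "Never build SQL with string formatting.",
    ],
    "language/python": ["Write tests with pytest."],
    "framework/fastapi": ["Enforce auth through Depends-based dependencies."],
}


@dataclass
class PolicyBundle:
    """Merged rules, grouped by the pack they came from."""
    sections: dict[str, list[str]] = field(default_factory=dict)


def load(pack_names):
    # Load: pick the packs that apply to the detected stack, skipping unknown names.
    return {name: RULE_PACKS[name] for name in pack_names if name in RULE_PACKS}


def merge(packs):
    # Merge: combine packs into one bundle, dropping rules already seen.
    bundle = PolicyBundle()
    seen = set()
    for name, rules in packs.items():
        unique = [rule for rule in rules if rule not in seen]
        seen.update(unique)
        if unique:
            bundle.sections[name] = unique
    return bundle


def render_markdown(bundle):
    # Render: emit one Markdown document for a single agent target.
    lines = []
    for name, rules in bundle.sections.items():
        lines.append(f"## {name}")
        lines.extend(f"- {rule}" for rule in rules)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    selected = ["governance/security", "language/python", "framework/fastapi"]
    print(render_markdown(merge(load(selected))))
```

The point of the shape is that policy is data until the very last step, so adding another agent target means adding another renderer, not rewriting the rules.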


58 Rule Packs. Real SDLC Coverage. Out of the Box.

agent-policykit ships with 58 rule packs across four categories — enough to cover the vast majority of production stacks without writing a single custom rule:

| Category | Count | What It Covers |
| --- | --- | --- |
| Governance | 8 packs | Architecture, security baselines, compliance, operations, testing, review, output contracts |
| Languages | 28 packs | Python, TypeScript, Go, Java, Rust, Ruby, PHP, Kotlin, Swift, C#, and more |
| Frameworks | 13 packs | FastAPI, Django, Express, NestJS, Next.js, Spring Boot, Rails, Flask, and more |
| Project Types | 9 packs | API service, web app, microservice, worker, CLI tool, SDK, monolith, and more |

The governance packs are the load-bearing ones. When agent-policykit generates instruction files for your project, the AI agent starts every session already knowing:

  • 🔐 Security baseline — input validation, auth patterns, dependency hygiene, secret management
  • 🧪 Testing requirements — expected coverage, testing idioms for your stack, what qualifies as "tested"
  • 🏗️ Architecture constraints — layer boundaries, allowed dependencies, communication patterns
  • 📋 Compliance posture — data handling rules, audit trails, regulatory considerations
  • 👁️ Review standards — what a proper code review looks like for this kind of project

This is the difference between telling a developer "be secure" and handing them a security checklist calibrated to their exact stack.


Stack-Aware Instructions — Because One Size Fits Nothing

A Python/FastAPI API service and a TypeScript/Next.js web app live in different security universes. They have different injection surfaces, different auth patterns, different testing idioms, different deployment models. A generic "follow best practices" prompt is worse than useless — it's actively misleading.

agent-policykit eliminates this problem at the source. When you run init, it reads your repo. The detect command shows what it found:

agent-policykit detect
# → Python, FastAPI, api_service

Then generate compiles instructions that include:

  • FastAPI-specific security patterns (Depends-based auth, Pydantic validation, middleware ordering)
  • Python-specific testing conventions (pytest idioms, fixture patterns, coverage tooling)
  • API service-specific architecture guidance (request lifecycle, error response contracts, rate limiting)

All of it layered on top of your governance baseline. A Rails monolith gets different output — because it should.

Your developers stop guessing which standards apply. The agent already knows.
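
The post doesn't describe how detection works internally, so treat the following as one plausible approach rather than the tool's actual logic: look for well-known marker files, then scan them for framework names. The marker table, the hint table, and `detect_stack` are all hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical marker files mapped to the language they suggest.
LANGUAGE_MARKERS = {
    "pyproject.toml": "python",
    "requirements.txt": "python",
    "tsconfig.json": "typescript",
    "Gemfile": "ruby",
    "go.mod": "go",
}

# Hypothetical substrings that hint at a framework when found in a marker file.
FRAMEWORK_HINTS = {
    "fastapi": "fastapi",
    "django": "django",
    "rails": "rails",
}


def detect_stack(repo_root):
    """Return the languages and frameworks suggested by files in repo_root."""
    root = Path(repo_root)
    languages = set()
    frameworks = set()
    for marker, language in LANGUAGE_MARKERS.items():
        path = root / marker
        if not path.is_file():
            continue
        languages.add(language)
        # A crude substring scan; a real detector would parse the manifest.
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        for needle, framework in FRAMEWORK_HINTS.items():
            if needle in text:
                frameworks.add(framework)
    return {"languages": sorted(languages), "frameworks": sorted(frameworks)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "requirements.txt").write_text("fastapi\nuvicorn\n", encoding="utf-8")
        print(detect_stack(tmp))
```

Whatever the real heuristics are, the useful property is the same: the stack comes from the repo itself, not from whatever the developer remembered to type into a prompt.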


One Config. Every Agent. Zero Drift.

Here's what the configuration looks like in your pyproject.toml:

[tool.agent-policykit]
targets = ["copilot", "agents-md", "cursor", "claude-code", "aider", "gemini-cli"]
languages = ["python"]
frameworks = ["fastapi"]
project_type = "api_service"
review_mode = false

From this single source of truth, agent-policykit generate writes every instruction file your agents need:

.github/copilot-instructions.md              # GitHub Copilot
.github/instructions/project.instructions.md # VS Code Agents
AGENTS.md                                    # Generic agent instructions
CLAUDE.md                                    # Claude Code (project root)
.claude/rules/shared.md                      # Claude Code (rules)
.cursor/rules/project.mdc                    # Cursor
CONVENTIONS.md                               # Convention-based agents
.aider.conf.yml                              # Aider
GEMINI.md                                    # Gemini CLI
AGENT_POLICY.md                              # Universal policy reference

All consistent. All from the same policy. When your security standards evolve, you regenerate — and every file updates. No manual sync. No "oh, we forgot to update the Cursor rules."


Review Mode: Turn the AI Into Your Toughest Reviewer

Standards enforcement isn't just about what code gets generated — it's about what code gets caught.

agent-policykit generate --mode review

Review mode activates a stricter behavioral overlay. The agent shifts posture: instead of a helpful pair programmer, it becomes a technically demanding code reviewer. Skeptical of missing safeguards. Explicit about security gaps. Thorough on test coverage. Vocal about architectural drift.

The same policy that guided code generation now guides code review — from the same source, with the same rules, coherently.


Safety Guarantees Built for Governance

Because agent-policykit manages files that carry real governance weight, it ships with hard safety properties:

🔒 Security downgrade blocking — If a regeneration would remove a security rule, the operation halts. You must pass --force to override. You cannot silently weaken your agents' security posture.

📐 Managed-section ownership — Generated content is clearly demarcated. Human-authored additions outside managed sections are preserved through every regeneration.

👁️ Dry-run everything: diff is always non-destructive. Both generate and update support --dry-run so you can audit changes before they land.

⚠️ Structured conflict surfacing — Non-security rule removals aren't silently dropped. They're flagged with clear explanations.
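
Two of these properties are easy to picture in code. The sketch below is an illustration under stated assumptions, not the tool's implementation: the HTML-comment markers, the `SEC-` rule ID prefix, and the function names are all invented here. It replaces only the managed span of a file and refuses to proceed if a security rule would disappear, unless `force` is set.

```python
import re

# Hypothetical markers; the post does not show the delimiters the tool writes.
BEGIN = "<!-- BEGIN MANAGED -->"
END = "<!-- END MANAGED -->"
_MANAGED = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


class SecurityDowngradeError(Exception):
    """Raised when a regeneration would drop a security rule without force."""


def removed_security_rules(old_rules, new_rules):
    # Rule IDs prefixed "SEC-" are an invented convention for this sketch.
    dropped = set(old_rules) - set(new_rules)
    return sorted(rule for rule in dropped if rule.startswith("SEC-"))


def regenerate(existing_text, new_body, old_rules, new_rules, force=False):
    dropped = removed_security_rules(old_rules, new_rules)
    if dropped and not force:
        raise SecurityDowngradeError(f"would remove: {', '.join(dropped)}")
    block = f"{BEGIN}\n{new_body}\n{END}"
    if _MANAGED.search(existing_text):
        # Replace only the managed span; text outside it is left untouched.
        return _MANAGED.sub(lambda _match: block, existing_text, count=1)
    # No managed section yet: append one and keep whatever was already there.
    prefix = existing_text.rstrip("\n")
    return (prefix + "\n\n" if prefix else "") + block + "\n"


if __name__ == "__main__":
    doc = f"# Team notes\nkeep me\n\n{BEGIN}\nold rules\n{END}\n"
    print(regenerate(doc, "new rules", ["SEC-1"], ["SEC-1", "SEC-2"]))
```

Making the failure an exception rather than a warning is the design choice that matters: weakening the policy has to be a deliberate act.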


The Day-to-Day Workflow

# First time in a repo
agent-policykit init        # Detect stack, write config
agent-policykit generate    # Compile policy, write all agent files

# When standards evolve
agent-policykit diff        # Preview what would change
agent-policykit update      # Regenerate safely, preserve human edits

# CI integration
agent-policykit validate    # Check structural correctness

This is the full loop. It's designed to live in your repo alongside your source code — checked in, versioned, CI-validated, reviewed in PRs like any other infrastructure change.


Who This Is For

  • 👩‍💻 Developers who want to vibe code at full speed without silently shipping vulnerabilities
  • 🏢 Engineering leads tired of standards being bypassed because nobody remembered the right prompt
  • 🛡️ Platform & DevSecOps teams managing secure defaults across dozens of repos and agents
  • 🤝 Consultancies onboarding clients onto different stacks with consistent governance
  • 🌍 Open-source maintainers who want every contributor's AI agent following the same rules

The common thread: anyone who has watched an AI agent produce working, shippable, dangerously insecure code — because nobody told it not to.


Get Started in 60 Seconds

pip install agent-policykit
agent-policykit init
agent-policykit generate

GitHub: sidrat2612 / agent-policykit

For teams using multiple AI coding agents: detect the repo stack, generate Copilot, Claude, Cursor, Codex, Aider, and Gemini instruction files from one policy, and update them safely.

One engineering policy in. Agent-specific instruction files out.

For teams using multiple AI coding agents in the same repository.

agent-policykit detects the stack in a repository, merges governance with language, framework, and project-type rules, and writes the exact instruction files each coding agent expects.

If your repo has Copilot, Cursor, Claude Code, Codex, Aider, or Gemini users, agent-policykit keeps them aligned on the same security, architecture, testing, and review guidance without hand-editing separate prompt files.

Why?

Most teams that adopt AI coding assistants hit the same problem quickly: every tool wants a different file, a different format, and a different maintenance path.

agent-policykit solves that with a compiler-style workflow:

| Situation | Without agent-policykit | With agent-policykit |
| --- | --- | --- |
| Multiple agents in one repo | Prompt files drift and contradict each other | One shared policy generates all outputs |
| Stack-specific guidance | Generic prompts ignore framework and project type | Packs inject Python, FastAPI, monolith, SDK, and other |





The repo includes real, validated example fixtures for FastAPI, Next.js, and Rails projects — tested in CI. Inspect exactly what the generated output looks like before running anything against your own codebase.

Vibe coding is worth keeping. Your engineering standards are non-negotiable. agent-policykit is how you stop choosing between the two.


MIT licensed. Contributions welcome — see CONTRIBUTING.md.


Found this useful? Drop a 🦄 and follow for more on AI-assisted development, DevSecOps, and developer tooling.
