DEV Community: Siddharth Rathore

Vibe Coding Is Fun — Until Your AI Ships Code With No Auth, No Tests, and a SQL Injection Waiting to Happen

Siddharth Rathore — Thu, 21 May 2026 08:27:43 +0000

AI coding agents are the fastest pair programmer you've ever had. They're also the most dangerously agreeable. Here's how to bake your engineering standards into every AI-assisted session — without copy-pasting a prompt manifesto every time you open a terminal.

The Uncomfortable Truth About Vibe Coding

Vibe coding is real, it's productive, and it's not going anywhere.

You describe what you want in plain language. The AI builds it. You iterate, refine, ship. Features that used to eat three days of focus now land before lunch. Boilerplate evaporates. The flow state hits different — it's faster, more fluid, more fun than anything the industry has seen in years.

But here's the thing nobody says loudly enough:

AI coding agents are not junior developers with bad habits. They're perfect soldiers who follow orders to the letter — including the ones you forgot to give.

They don't carry your team's institutional knowledge. They don't know about that authentication bypass your security team patched last quarter. They've never read your architecture decision records. They have no idea that your compliance officer will reject any PR that touches PII without an audit trail.

They know what you told them. In this session. In this prompt. Nothing more.

So when you're in the zone — shipping fast, riding the wave — the agent is right there with you. Producing code that works. Code that does the thing you asked for. Code that might also have:

❌ No input validation on user-facing endpoints
❌ No error handling beyond letting exceptions bubble to the client
❌ No tests — not even a smoke test
❌ No auth check on that shiny new admin route
❌ A raw SQL query stitched together with f-strings, practically begging for exploitation
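
The last item is the easiest to demonstrate. Here is a minimal sketch, assuming a hypothetical users table in an in-memory SQLite database; the table, function names, and payload are illustrative only and come from no real project:

```python
import sqlite3


def make_demo_db():
    # Hypothetical schema used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is interpolated directly into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value travels separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    print(find_user_unsafe(conn, payload))  # every row leaks
    print(find_user_safe(conn, payload))    # no rows match
```

Both functions "work" for ordinary names, which is why the unsafe version passes a casual check.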

The AI didn't ignore your standards. You never told it what they were.

And that's not a prompt engineering problem. That's an architecture problem.

Your SDLC Didn't Survive First Contact With Vibes

Software engineering has decades of battle-tested wisdom encoded into process: code review gates, static analysis, security scanning, test coverage thresholds, dependency audits, compliance checklists. None of it exists because someone thought bureaucracy was fun. Every rule in your SDLC is a scar from a production incident someone lived through.

Vibe coding, done carelessly, routes around all of it.

Not because the AI is incompetent — because it's obedient. It builds exactly what you describe. If your description doesn't mention security, the output won't include it. If you don't specify tests, none get written. If your compliance requirements live in a Confluence doc the AI has never seen, those requirements functionally don't exist.

The common response is predictable: "Write better prompts."

Include your standards. Remind the agent every session. Paste in requirements. Maintain a prompt template in Notion.

Sure. That works. It also means:

🔄 Every developer has to know which standards to paste, and when, and for which kind of project
🆕 Every new repo starts from zero on prompt setup — even if it's the third FastAPI service this quarter
📝 Every standards update has to be manually propagated across every team's prompt templates
🚫 Nothing enforces that anyone actually did any of it
👻 Nothing detects when someone's prompt drifted from the current baseline

You didn't eliminate the governance problem. You moved it from code review into an unversioned, unaudited text box.

What the Problem Actually Demands

You need your engineering standards — security rules, testing expectations, architecture constraints, compliance requirements — to be injected into the agent's context automatically. For every project. Every language. Every framework. Without anyone having to remember, copy-paste, or reinvent from scratch.

You need a policy layer that sits between your governance baseline and the half-dozen AI agents your team actually uses.

That's exactly what agent-policykit does.

pip install agent-policykit
agent-policykit init
agent-policykit generate

Three commands. Your AI agents — GitHub Copilot, Claude Code, Cursor, Aider, OpenAI Codex, Gemini CLI — all get instruction files that reflect your governance baseline, your stack's best practices, and your SDLC standards. Automatically. Consistently. Every time.

No prompt templates. No tribal knowledge. No hoping someone remembered.

A Policy Compiler for the Age of AI Agents

Think of agent-policykit the way you think about a compiler. You define policy once, in a structured format. The tool detects your repo's technology stack and compiles that policy into the native instruction files each agent reads at startup.

The pipeline is four stages:

1️⃣ Detect — scans the repo for languages, frameworks, and project type
2️⃣ Load — pulls in YAML rule packs covering governance, security, testing, and compliance
3️⃣ Merge — compiles everything into a single, coherent PolicyBundle
4️⃣ Render — outputs native instruction files for every configured agent target

The result isn't a vague "please write secure code" reminder bolted onto a system prompt. It's structured, contextual, stack-specific guidance that the agent processes as first-class instructions.
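
To make the four stages concrete, here is a toy sketch of a detect, load, merge, render pipeline. It is not agent-policykit's implementation: the rule text, the in-memory `RULE_PACKS` dictionary, the detection heuristics, and every function name are assumptions made for illustration. Only the stage names and the `PolicyBundle` term come from the description above.

```python
from dataclasses import dataclass, field

# Hypothetical rule packs; the real tool loads YAML packs instead.
RULE_PACKS = {
    "governance": ["Validate all external input.", "Require tests for new behaviour."],
    "python": ["Use pytest for tests."],
    "fastapi": ["Protect routes with Depends-based auth."],
}


@dataclass
class PolicyBundle:
    rules: list = field(default_factory=list)


def detect(files):
    # Stage 1: naive stack detection from file names and contents.
    stack = []
    if any(name.endswith(".py") for name in files):
        stack.append("python")
    if "fastapi" in files.get("pyproject.toml", ""):
        stack.append("fastapi")
    return stack


def load(stack):
    # Stage 2: governance always applies; stack packs are added on top.
    return [RULE_PACKS["governance"]] + [RULE_PACKS[n] for n in stack if n in RULE_PACKS]


def merge(packs):
    # Stage 3: flatten into one bundle, dropping duplicate rules.
    bundle = PolicyBundle()
    for pack in packs:
        for rule in pack:
            if rule not in bundle.rules:
                bundle.rules.append(rule)
    return bundle


def render(bundle, targets):
    # Stage 4: one output per target path (same body here for simplicity).
    body = "\n".join(f"- {rule}" for rule in bundle.rules)
    return {path: body for path in targets}


if __name__ == "__main__":
    repo = {"app.py": "", "pyproject.toml": "dependencies = ['fastapi']"}
    outputs = render(merge(load(detect(repo))), ["AGENTS.md", "CLAUDE.md"])
    print(outputs["AGENTS.md"])
```

The point of the compiler framing is that each stage has a single input and output, so policy changes flow through without anyone editing target files by hand.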

58 Rule Packs. Real SDLC Coverage. Out of the Box.

agent-policykit ships with 58 rule packs across four categories — enough to cover the vast majority of production stacks without writing a single custom rule:

Category	Count	What It Covers
Governance	8 packs	Architecture, security baselines, compliance, operations, testing, review, output contracts
Languages	28 packs	Python, TypeScript, Go, Java, Rust, Ruby, PHP, Kotlin, Swift, C#, and more
Frameworks	13 packs	FastAPI, Django, Express, NestJS, Next.js, Spring Boot, Rails, Flask, and more
Project Types	9 packs	API service, web app, microservice, worker, CLI tool, SDK, monolith, and more

The governance packs are the load-bearing ones. When agent-policykit generates instruction files for your project, the AI agent starts every session already knowing:

🔐 Security baseline — input validation, auth patterns, dependency hygiene, secret management
🧪 Testing requirements — expected coverage, testing idioms for your stack, what qualifies as "tested"
🏗️ Architecture constraints — layer boundaries, allowed dependencies, communication patterns
📋 Compliance posture — data handling rules, audit trails, regulatory considerations
👁️ Review standards — what a proper code review looks like for this kind of project

This is the difference between telling a developer "be secure" and handing them a security checklist calibrated to their exact stack.

Stack-Aware Instructions — Because One Size Fits Nothing

A Python/FastAPI API service and a TypeScript/Next.js web app live in different security universes. They have different injection surfaces, different auth patterns, different testing idioms, different deployment models. A generic "follow best practices" prompt is worse than useless — it's actively misleading.

agent-policykit eliminates this problem at the source. When you run init, it reads your repo. You can see what it found with detect:

agent-policykit detect
# → Python, FastAPI, api_service

Then generate compiles instructions that include:

FastAPI-specific security patterns (Depends-based auth, Pydantic validation, middleware ordering)
Python-specific testing conventions (pytest idioms, fixture patterns, coverage tooling)
API service-specific architecture guidance (request lifecycle, error response contracts, rate limiting)

All of it layered on top of your governance baseline. A Rails monolith gets different output — because it should.

Your developers stop guessing which standards apply. The agent already knows.

One Config. Every Agent. Zero Drift.

Here's what the configuration looks like in your pyproject.toml:

[tool.agent-policykit]
targets = ["copilot", "agents-md", "cursor", "claude-code", "aider", "gemini-cli"]
languages = ["python"]
frameworks = ["fastapi"]
project_type = "api_service"
review_mode = false

From this single source of truth, agent-policykit generate writes every instruction file your agents need:

.github/copilot-instructions.md              # GitHub Copilot
.github/instructions/project.instructions.md # VS Code Agents
AGENTS.md                                    # Generic agent instructions
CLAUDE.md                                    # Claude Code (project root)
.claude/rules/shared.md                      # Claude Code (rules)
.cursor/rules/project.mdc                    # Cursor
CONVENTIONS.md                               # Convention-based agents
.aider.conf.yml                              # Aider
GEMINI.md                                    # Gemini CLI
AGENT_POLICY.md                              # Universal policy reference

All consistent. All from the same policy. When your security standards evolve, you regenerate — and every file updates. No manual sync. No "oh, we forgot to update the Cursor rules."

Review Mode: Turn the AI Into Your Toughest Reviewer

Standards enforcement isn't just about what code gets generated — it's about what code gets caught.

agent-policykit generate --mode review

Review mode activates a stricter behavioral overlay. The agent shifts posture: instead of a helpful pair programmer, it becomes a technically demanding code reviewer. Skeptical of missing safeguards. Explicit about security gaps. Thorough on test coverage. Vocal about architectural drift.

The same policy that guided code generation now guides code review — from the same source, with the same rules, coherently.

Safety Guarantees Built for Governance

Because agent-policykit manages files that carry real governance weight, it ships with hard safety properties:

🔒 Security downgrade blocking — If a regeneration would remove a security rule, the operation halts. You must pass --force to override. You cannot silently weaken your agents' security posture.

📐 Managed-section ownership — Generated content is clearly demarcated. Human-authored additions outside managed sections are preserved through every regeneration.

👁️ Dry-run everything — diff is always non-destructive. Both generate and update support --dry-run so you can audit changes before they land.

⚠️ Structured conflict surfacing — Non-security rule removals aren't silently dropped. They're flagged with clear explanations.
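
Two of these properties can be illustrated in a few lines. This is a sketch of the general techniques, not the tool's code: the marker strings, the `security.` rule-ID prefix, and all function names are assumptions chosen for the example.

```python
import re

# Hypothetical markers delimiting the generated region of a file.
BEGIN = "<!-- BEGIN MANAGED -->"
END = "<!-- END MANAGED -->"


class SecurityDowngradeError(Exception):
    pass


def check_downgrade(old_rules, new_rules, force=False):
    # Halt if any security rule would disappear, unless explicitly forced.
    removed = {r for r in old_rules if r.startswith("security.")} - set(new_rules)
    if removed and not force:
        raise SecurityDowngradeError(f"would remove: {sorted(removed)}")
    return sorted(removed)


def replace_managed_section(document, generated):
    # Swap only the managed block; everything outside it is left untouched.
    block = f"{BEGIN}\n{generated}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(document):
        return pattern.sub(lambda _match: block, document, count=1)
    # No managed block yet: append one after the human-authored content.
    return document.rstrip("\n") + "\n\n" + block + "\n"


if __name__ == "__main__":
    doc = f"# Team notes\nKeep this.\n\n{BEGIN}\nold rules\n{END}\n"
    print(replace_managed_section(doc, "new rules"))
```

Failing closed on security removals means a weakened posture always shows up as an explicit, reviewable decision.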

The Day-to-Day Workflow

# First time in a repo
agent-policykit init        # Detect stack, write config
agent-policykit generate    # Compile policy, write all agent files

# When standards evolve
agent-policykit diff        # Preview what would change
agent-policykit update      # Regenerate safely, preserve human edits

# CI integration
agent-policykit validate    # Check structural correctness

This is the full loop. It's designed to live in your repo alongside your source code — checked in, versioned, CI-validated, reviewed in PRs like any other infrastructure change.

Who This Is For

👩‍💻 Developers who want to vibe code at full speed without silently shipping vulnerabilities
🏢 Engineering leads tired of standards being bypassed because nobody remembered the right prompt
🛡️ Platform & DevSecOps teams managing secure defaults across dozens of repos and agents
🤝 Consultancies onboarding clients onto different stacks with consistent governance
🌍 Open-source maintainers who want every contributor's AI agent following the same rules

The common thread: anyone who has watched an AI agent produce working, shippable, dangerously insecure code — because nobody told it not to.

Get Started in 60 Seconds

pip install agent-policykit
agent-policykit init
agent-policykit generate

sidrat2612 / agent-policykit

For teams using multiple AI coding agents: detect the repo stack, generate Copilot, Claude, Cursor, Codex, Aider, and Gemini instruction files from one policy, and update them safely.

The repo includes real, validated example fixtures for FastAPI, Next.js, and Rails projects — tested in CI. Inspect exactly what the generated output looks like before running anything against your own codebase.

Vibe coding is worth keeping. Your engineering standards are non-negotiable. agent-policykit is how you stop choosing between the two.

MIT licensed. Contributions welcome — see CONTRIBUTING.md.

Found this useful? Drop a 🦄 and follow for more on AI-assisted development, DevSecOps, and developer tooling.

Stop Treating Mixed Prompts Like One Task: Why I Built RouteSmith

Siddharth Rathore — Thu, 07 May 2026 14:24:28 +0000

I built RouteSmith because mixed prompts are workflows, not single tasks. It routes coding-agent work across real host constraints instead of pretending every environment works the same way.

TL;DR

I built RouteSmith because coding agents still make users do too much manual routing.

If a prompt says, "plan this feature, implement it, add tests, and write docs," that is not one task. It is a workflow.

RouteSmith detects the current host, decomposes the prompt into task types, maps those tasks to capability classes, and routes them using what the host can actually support. If the host supports switching, RouteSmith can suggest concrete models. If it does not, RouteSmith falls back honestly instead of pretending switching happened.

It is especially useful for:

people starting with coding agents
vibe coders who do not want to learn model tradeoffs first
solo builders doing mixed-task prompts
advanced users who want measurable, configurable routing

The Problem That Kept Annoying Me

I kept running into the same moment.

I would open a coding agent and give it one big prompt:

"Plan this feature, implement it, add tests, write docs, and review the result."

At first it felt smooth.

Then the flow broke.

I stopped thinking about the feature and started doing routing in my head.

Questions like:

should planning use a stronger reasoning model?
should coding use something different?
why am I spending the heaviest model on docs and formatting?
does this host even support switching the way I think it does?

That was the real problem.

The prompt was never one task. It was several different jobs bundled together.

What RouteSmith Is

RouteSmith is a host-aware routing layer for coding agents.

It is not another coding agent.

It is not an API gateway.

It sits between a mixed prompt and the host's real capabilities.

The basic flow looks like this:

detect the current host
classify the prompt into task types
map those task types to capability classes
resolve those capabilities against host-native models or strategies
preserve dependency order
track outcomes and improve routing over time

Why "Host-Aware" Is the Important Part

This is the part I care about most.

Too many conversations about multi-model workflows flatten away the host and act like every environment exposes the same control surface.

They do not.

Claude Code, Cursor, Copilot, Codex, Gemini CLI, and Aider do not all behave the same way. Some support real model switching. Some expose model choice differently. Some are much more host-controlled.

So RouteSmith is built around a simple rule:

the host is the source of truth.

If the host supports dynamic switching, RouteSmith can route tasks to concrete models.

If the host does not, RouteSmith does not fake it. It keeps the routing logic and applies prompt strategy instead.

That honest behavior matters more than a fake universal abstraction.

Who This Is For

This project is not just for people who already know the difference between reasoning models, coding models, fast utility models, and cost-optimized routing.

It is also for people who are new to all of that.

1. People starting with coding agents

If you are using agent tools but still do not know when to switch models, RouteSmith is meant to help reduce that decision burden.

2. People doing vibe coding

If your style is to describe the outcome in plain English and keep moving, RouteSmith helps because it treats the prompt like a workflow rather than a blob.

3. Solo builders and founders

If you are doing planning, implementation, tests, docs, and review yourself, task-aware routing becomes immediately useful.

4. Advanced users

If you care about policy overrides, plugins, telemetry, performance-aware routing, and host constraints, RouteSmith has room for that too.

The short version:

RouteSmith is for people who want the benefits of multi-model workflows without having to become experts in model routing first.

A Concrete Example

Say a beginner types this:

Build me a simple expense tracker with authentication, add tests, and write a README.

What that usually means is something like:

planning the feature structure
implementing the app
writing tests
documenting the result

Those are different kinds of work.

RouteSmith can treat them that way.

A conceptual route might look like this:

planning      -> deep_reasoning
coding        -> coding
testing       -> coding
documentation -> balanced

Then the host adapter decides what that means in practice.

If the host supports switching, RouteSmith can suggest concrete models for each step.

If the host does not, it still preserves the task-aware strategy without lying about model control.

How It Works Under the Hood

Deterministic planning

RouteSmith classifies prompts into task types such as:

planning
analysis
coding
testing
refactor
documentation
formatting
review

That planning is deterministic. It does not need live API calls just to understand the shape of the request.
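
As an illustration of what deterministic classification can look like, here is a keyword-based sketch. The keyword table and function name are assumptions and cover only some of the task types; RouteSmith's actual classifier may work quite differently.

```python
import re

# Hypothetical keyword prefixes per task type (insertion order is preserved).
TASK_KEYWORDS = {
    "planning": ("plan", "design"),
    "coding": ("implement", "build"),
    "testing": ("test",),
    "documentation": ("docs", "readme", "document"),
    "review": ("review",),
}


def classify(prompt):
    # Match on word prefixes so "tests" hits "test" but "explanation" misses "plan".
    words = re.findall(r"[a-z]+", prompt.lower())
    return [
        task
        for task, prefixes in TASK_KEYWORDS.items()
        if any(word.startswith(prefix) for word in words for prefix in prefixes)
    ]


if __name__ == "__main__":
    print(classify("Plan this feature, implement it, add tests, and write docs"))
```

The same prompt always yields the same task list, and no network call is involved.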

Capability classes

Instead of hardcoding routes directly to model names, RouteSmith maps tasks into capability classes like:

deep_reasoning
coding
balanced
fast

That makes the system portable across hosts.
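
A sketch of that indirection, using the task-to-capability mapping from the example route earlier. The model names, the `balanced` fallback for unmapped tasks, and the function name are placeholders I made up, not RouteSmith's API:

```python
# Mapping taken from the conceptual route shown earlier in the post.
TASK_TO_CAPABILITY = {
    "planning": "deep_reasoning",
    "coding": "coding",
    "testing": "coding",
    "documentation": "balanced",
}


def resolve(task_types, host_models):
    # host_models maps capability -> model name; None means the host
    # does not support switching, so no concrete model is suggested.
    routes = []
    for task in task_types:
        # Assumption: unmapped task types fall back to "balanced".
        capability = TASK_TO_CAPABILITY.get(task, "balanced")
        model = host_models.get(capability) if host_models else None
        routes.append((task, capability, model))
    return routes


if __name__ == "__main__":
    # Placeholder model names for a host that supports switching.
    switching_host = {"deep_reasoning": "model-large", "coding": "model-code"}
    print(resolve(["planning", "coding"], switching_host))
    print(resolve(["planning", "coding"], None))
```

Only the `host_models` table changes from host to host; the task-to-capability mapping stays the same.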

Dependency-aware execution

Mixed prompts are not just lists. Tests often depend on implementation. Docs usually follow the change. Review comes later.

RouteSmith keeps that order intact.
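
One standard way to preserve such an order is a topological sort. The sketch below uses Python's standard-library `graphlib`; the dependency table is my assumption based on the sentences above, not RouteSmith's internal graph.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each task lists the tasks it must follow.
DEPENDS_ON = {
    "coding": {"planning"},
    "testing": {"coding"},
    "documentation": {"coding"},
    "review": {"testing", "documentation"},
}


def execution_order(task_types):
    # Keep only dependencies on tasks that are actually in this prompt.
    present = set(task_types)
    graph = {task: DEPENDS_ON.get(task, set()) & present for task in task_types}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(["review", "documentation", "testing", "coding", "planning"]))
```

Tasks with no dependency between them (testing and documentation here) may come out in either order, so callers should rely only on the before/after constraints.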

Performance-aware routing

RouteSmith also records local telemetry such as:

model used
host name
task type
capability class
success or failure
duration
telemetry source

That data is not just for display.

If enough evidence shows that a default model is weak for a capability and a better host-available option exists, RouteSmith can de-prioritize the weaker model.

That turns performance tracking into an active routing signal.
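
A rough sketch of how recorded outcomes could feed back into model choice. The event shape, the `MIN_SAMPLES` threshold, and the "highest success rate wins" rule are all assumptions for illustration; the post does not specify RouteSmith's actual scoring.

```python
from collections import defaultdict

MIN_SAMPLES = 5  # assumed minimum evidence before a rate is trusted


def success_rates(events):
    # (capability, model) -> (success rate, number of runs)
    totals = defaultdict(lambda: [0, 0])
    for event in events:
        key = (event["capability"], event["model"])
        totals[key][0] += int(event["success"])
        totals[key][1] += 1
    return {key: (wins / runs, runs) for key, (wins, runs) in totals.items()}


def pick_model(capability, default, available, events):
    rates = success_rates(events)

    def trusted_rate(model):
        rate, runs = rates.get((capability, model), (0.0, 0))
        return rate if runs >= MIN_SAMPLES else None

    best, best_rate = default, trusted_rate(default)
    if best_rate is None:
        return default  # not enough evidence against the default yet
    for model in available:
        rate = trusted_rate(model)
        if rate is not None and rate > best_rate:
            best, best_rate = model, rate
    return best


if __name__ == "__main__":
    def event(model, ok):
        return {"capability": "coding", "model": model, "success": ok}

    log = [event("model-a", i == 0) for i in range(5)]
    log += [event("model-b", True) for _ in range(5)]
    print(pick_model("coding", "model-a", ["model-a", "model-b"], log))
```

The sample threshold keeps one unlucky run from flipping the default.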

How It Compares to Other Tools

I do not think the useful framing here is "RouteSmith vs everything else."

The useful framing is that adjacent tools solve different layers.

Agent products

Claude Code, Cursor, and Aider are agent products. They are the tools doing the coding work.

RouteSmith is not trying to replace them.

API and gateway infrastructure

LiteLLM and Portkey solve a different problem: multi-provider routing, control, and observability at the API layer.

That is useful, but it is not the same layer RouteSmith lives in.

Rules, skills, and instructions

Instruction surfaces help shape behavior, but they are not routing brains on their own.

RouteSmith sits between these layers as a host-aware routing layer for coding-agent workflows.

If I had to summarize it simply:

use Claude Code, Cursor, or Aider when you want a coding agent
use LiteLLM or Portkey when you want API-layer routing or gateway control
use RouteSmith when you want mixed-task coding prompts routed more intelligently inside real host constraints

What It Actually Gives You

The main benefit is not novelty. It is leverage.

RouteSmith helps by:

reducing model micromanagement
making mixed prompts more structured
respecting host-specific constraints
helping beginners benefit from better routing without needing deep model knowledge
giving advanced users telemetry, policy, and performance-aware adaptation

Try It

pip install routesmith
routesmith detect-host
routesmith explain "Plan this feature, implement it, add tests, and write docs"
routesmith run "Plan this feature, implement it, add tests, and write docs"
routesmith stats

And if you want to use it as a tool inside larger workflows:

routesmith serve-stdio

Final Thought

The interesting part of coding-agent workflows is no longer just the model.

It is the routing layer around the work.

If a prompt contains planning, coding, testing, documentation, and review, then treating it like one undifferentiated request is a bad fit for how software work actually happens.

That is the gap RouteSmith is trying to close.

GitHub link: github.com/sidrat2612/routesmith
PyPI link: pypi.org/project/routesmith

How a small misconfiguration cost me $10,000 in AWS bills!

Siddharth Rathore — Mon, 05 Jan 2026 06:39:18 +0000

This happened when I was just starting out with AWS. I came from a bare-metal background and knew little about cloud platforms. I had joined a startup whose entire production environment on AWS consisted of two EC2 instances: one hosting the web application and the other running the database.

After some time the startup gained traction, and as the user base grew, the database grew rapidly with it. To ensure data protection and business continuity, I was asked to design and implement a backup and disaster recovery (DR) strategy.

At the time, AWS did not offer a secondary region within the same country for our geography. Due to strict data compliance requirements, storing data outside the country was not an option, which effectively ruled out cross-region DR within AWS. So, after much discussion, I finalised the following plan:

Primary Hosting would remain on AWS.
Cold Disaster Recovery would be hosted on Google Cloud Platform (GCP), solely for worst-case scenarios.

To achieve this, I created three types of backup jobs:

Full Backup - Once every Friday night
Differential Backup - Every night
Transactional Backup - Every 30 minutes

I uploaded the backups to AWS S3 and then synced them to GCP. In GCP, I retained only two weeks of data to keep storage costs under control.

For the first few months, everything appeared to work as expected. However, after roughly four months, I noticed our AWS bill rising. When I looked closely, the spike came primarily from data transfer charges covering terabytes of traffic. This was puzzling. Our database was around 200 GB, and even with regular backups, my calculations suggested that monthly transfer costs should not exceed $500.

We raised a support ticket with AWS. After reviewing the case, AWS confirmed that the data transfer charges were legitimate and advised us to inspect our S3 buckets more closely.

After more investigation, I found terabytes of broken multipart uploads - incomplete files that had never been cleaned up. These broken multipart uploads were being picked up by the S3-to-GCP sync process and transferred repeatedly, massively increasing data transfer costs.

With the root cause found, the solution was simple. I applied an S3 lifecycle policy to automatically delete incomplete multipart uploads. Once this rule was in place, the unnecessary data transfers stopped, and the AWS bill came back down to normal in subsequent months.
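
For reference, a rule of this kind can be sketched with boto3 as follows. The 7-day window, the rule ID, and the function names are my assumptions (the post does not say which values were used); `AbortIncompleteMultipartUpload` and `put_bucket_lifecycle_configuration` are the standard S3 lifecycle action and boto3 call.

```python
def abort_incomplete_uploads_config(days=7):
    # Lifecycle configuration that aborts multipart uploads left
    # incomplete for `days` days, across the whole bucket.
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = every object
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_lifecycle_rule(bucket, days=7):
    # boto3 is imported here so the config helper works without it installed.
    import boto3

    s3 = boto3.client("s3")
    # Note: this call replaces the bucket's entire lifecycle configuration,
    # so merge in any existing rules first if the bucket already has some.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=abort_incomplete_uploads_config(days),
    )


if __name__ == "__main__":
    print(abort_incomplete_uploads_config())
```

Calling `apply_lifecycle_rule("my-backup-bucket")` (a placeholder bucket name) requires AWS credentials with permission to modify the bucket's lifecycle configuration.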

From that point onward, every new S3 bucket I created included a default lifecycle rule to clean up incomplete multipart uploads. This costly lesson taught me to make that rule part of our best practices, and also to check and verify the small configuration details of a cloud setup and its governance, which we so often overlook.