DEV Community

Cover image for AI Agent Prompts: Guardrails to Prevent Code Disasters
Iurii Rogulia
Iurii Rogulia

Posted on • Originally published at iurii.rogulia.fi

AI Agent Prompts: Guardrails to Prevent Code Disasters

I opened an older project last month — one I hadn't touched in a few weeks — and started a new AI session without checking whether CLAUDE.md was current. Within ten minutes the agent had introduced a dependency I didn't ask for, created a new utility file alongside an existing one it should have modified, and written a component using a pattern I'd replaced in the previous sprint. Nothing was broken. All three things were subtly wrong in exactly the ways I've described in the earlier articles in this series.

Then I updated CLAUDE.md with the changes from the previous sprint and started the session again. Same agent, same task. This time the output fit — correct file location, no new dependency, component in the right pattern.

That's not magic. It's context. The agent optimizes within what it can see, and what it can see at the start of a session determines what it will build. You can either deliver that context in every prompt — exhausting, inconsistent — or write it down once, in a place the agent reads automatically. CLAUDE.md is that place.

This is the most practical article in this series. No philosophy, no assessments of where AI is headed. Just what I put in CLAUDE.md, what I keep in my system prompt, and the short phrases I use to stop an agent mid-session when it's going sideways.

What CLAUDE.md Is and Why It Matters

CLAUDE.md is a file you place in the root of your repository. Claude Code reads it automatically at the start of every session. Other tools have equivalents: .cursorrules in Cursor, a system prompt in Aider's config, AGENTS.md if you use OpenAI's tools. The names differ; the function is similar. The strength of the effect varies considerably between agents — some treat the file as a hard preamble that conditions every response, others as one input among many. Most of what follows assumes the stronger end of that spectrum, which matches my own Claude Code workflow. Adapt accordingly if your agent treats persistent context more loosely.

The right mental model: CLAUDE.md is the architectural onboarding document for an engineer who joins your project fresh every single session. That engineer is capable, fast, and has zero memory of anything that happened before this conversation started. They will make reasonable-sounding decisions about everything you don't specify. Most of those decisions will be individually defensible. Some will conflict with decisions made three sessions ago. A few will introduce patterns you spent time eliminating.

Without CLAUDE.md, the agent reconstructs a mental model of your project from whatever code it can see in the current context. It makes inferences. Some are correct. The ones that aren't produce the kinds of problems I described in the first article: duplicate patterns, reinvented abstractions, dependencies added without consideration of what's already there.

The second article in this series introduced the idea of specification precision as the senior developer's core advantage in the AI era. CLAUDE.md is where that precision lives persistently — not reconstructed from scratch in every prompt, but written once and inherited by every session.

A good CLAUDE.md makes architectural decisions visible to the model before it writes a single line. That is the most a text file can do — and it is not the whole game. The rest comes from engineering controls, which I'll come back to in a moment.

Structure That Works

I've tried various formats over the past year. What follows is what I've settled on, with reasoning for each section.

Tech stack and versions

Start with what's actually in use. Don't assume the model knows. Framework versions matter — Next.js 15 and Next.js 16 have different async params behavior; Tailwind v3 and v4 have completely different configuration. If you don't state the version, the model writes for whatever is most common in its training data, which may not match what you're running.

## Stack

- Next.js 16 App Router (not Pages Router)
- TypeScript 5.x — strict mode enabled
- Tailwind CSS v4 with @tailwindcss/postcss
- Drizzle ORM on PostgreSQL (not Prisma)
- Velite for MDX content compilation
Enter fullscreen mode Exit fullscreen mode

That list takes thirty seconds to write and prevents a class of errors that would each take ten minutes to diagnose.

Critical configuration that differs from the norm

This is where you document the things that look wrong to someone unfamiliar with your setup but are correct for a reason. The model defaults to common patterns; this section overrides them.

## Critical config

- Fonts: use `geist` npm package (`GeistSans` from `geist/font/sans`).
  Do NOT use `next/font/google` — it makes live network calls in dev and slows SSR.
- Node.js 22 (not 20). Some APIs differ.
- OG images are static files under /public/images/, NOT generated via Satori or opengraph-image.tsx.
  Adding opengraph-image.tsx to any slug dir causes a Turbopack CPU hang. Don't.
- All dynamic route params are Promise<{ slug: string }> — Next.js 16 requirement.
Enter fullscreen mode Exit fullscreen mode

These are not conventions. They are facts about your environment that, if unknown to the model, will cause bugs that look like configuration mistakes. Document them.

Architecture decisions

Not a full ADR, but a short record of the choices that shape how new code should be written. The key is to include the outcome, not the deliberation.

## Architecture

- Content lives in content/, compiled by Velite into .velite/ (imported as @/.velite).
  Never import from .velite directly in application code.
- Authentication happens in middleware.ts. Route handlers do not re-check auth.
- Services are defined in lib/services.ts. Do not add new service definitions inline in page files.
- Prices are always in euros (€). No dollar amounts anywhere in the UI.
Enter fullscreen mode Exit fullscreen mode

This section prevents the model from making architectural decisions that conflict with existing ones. Without it, the model invents. Its inventions are locally coherent — which is exactly why they're hard to notice until you're three sessions downstream and the codebase has two authentication patterns.

Conventions

Naming, file placement, component structure. Wherever your project follows a consistent style, write it down. The model will match whatever it can see in context; if the relevant files aren't in the current session, it will invent something reasonable-looking.

## Conventions

- Components go in components/shared/ if used in more than one place.
- Page-specific components go in components/[feature]/.
- Use cn() from lib/utils for class composition (clsx + tailwind-merge).
- Event handlers: onClick not handleClick.
- File names: kebab-case for all files under app/ and components/.
Enter fullscreen mode Exit fullscreen mode

What not to do — the most important section

This section consistently does more work than any other. The negative rules are where you prevent the failures that have actually happened or that you're actively worried about.

The key is to include the reason alongside the prohibition. "Don't do X" is vague. "Don't do X, because last time we did it Y happened, and we now use Z instead" gives the model enough context to recognize similar situations it might not have seen before.

## Do not

- Add npm dependencies without asking. We had an incident where `axios` was
  added alongside `fetch` already in use everywhere. Adding a dependency is
  a project-level decision, not a per-task convenience.
- Create new files for functionality that belongs in an existing module.
  Check whether a relevant file already exists before creating a new one.
- Use `any` in TypeScript. If the type is genuinely unknown, use `unknown`
  and narrow explicitly.
- Add try/catch blocks that swallow errors. Errors should propagate.
  If you catch, you must either re-throw or handle concretely — not log and return null.
- Write tests that assert return types. Tests assert behavior:
  specific inputs, specific outputs, specific side effects.
Enter fullscreen mode Exit fullscreen mode

Critical files

A short reference list of the files a new contributor would need to know about first. This helps the model orient quickly rather than inferring the structure from incomplete context.

## Key files

- config/site.ts — siteConfig, nav links, availability status
- lib/services.ts — all 8 service definitions (getServiceBySlug)
- lib/utils.ts — cn(), isPublished(), formatDate(), absoluteUrl()
- velite.config.ts — content schema for Blog, Project, Review collections
Enter fullscreen mode Exit fullscreen mode

This is not a replacement for good documentation. It's a pointer that lets the model find existing patterns rather than inventing new ones.

What Doesn't Work in CLAUDE.md

Making it too long. The file is part of the model's context window. If it's 500 lines, that's 500 lines the model can't use for your actual code. I keep mine under 150 lines. The rule: if you'd only mention it in one specific task, it belongs in the prompt, not in CLAUDE.md. The file is for things that apply in every session.

Vague rules. "Write clean code." "Follow best practices." "Keep it simple." These are decorative. The model is already trying to do these things by its own assessment of what they mean. Write specific rules, not aspirations: "decompose functions whose cyclomatic complexity exceeds 10" beats "keep functions short."

Outdated rules you forgot to remove. If you changed your approach but didn't update CLAUDE.md, the model will follow the old rule — confidently, consistently, and incorrectly. The file needs maintenance. When you change an architectural decision, update the file in the same commit. Stale rules are worse than no rules because they actively misdirect.

Duplicating the README. CLAUDE.md is written for the model, not for humans. It should not contain marketing prose, "project vision" paragraphs, or onboarding instructions meant for a new hire. Write it like you're writing a config file: dense, specific, imperative. The README is for people. CLAUDE.md is for the agent. For a real-world example of what this discipline looks like across a production SaaS, the vatnode.dev codebase uses exactly this structure.

Where Automation Beats Text

slug="fractional-cto"
text="I help engineering teams set up the full stack: CLAUDE.md, lint rules, CI gates, and review discipline — the structure that makes AI-assisted development architecturally consistent."
/>

This is the section I'd skip in a less honest article. Many rules that end up in CLAUDE.md should not be there — they should be enforced by the toolchain.

"Don't use any in TypeScript" is a lint rule. ESLint's @typescript-eslint/no-explicit-any catches it deterministically; CLAUDE.md catches it probabilistically. "Don't add a dependency without asking" is a CI check on package.json diffs. "Don't import from .velite directly" is an ESLint no-restricted-imports rule with a one-line message. "Cyclomatic complexity should not exceed 10" is a static-analysis rule. "Tests assert behavior, not return types" can be partly caught by a custom lint that flags typeof X === 'number' assertions.

The pattern: anything that can be mechanically verified should be mechanically verified. Linters, formatters, CI gates, type checks, AST-based rules, package-approval steps in dependency management — all of these enforce constraints regardless of what the model decides to output. They don't rely on the model reading and following instructions correctly; they reject the bad output at the boundary. CLAUDE.md is what catches the things you cannot encode that way: the architectural choices, the historical reasons, the "we tried this and it didn't work" context. For everything else, lean on the tooling.

The reason this matters is the same reason text prompts have limits at all. Even a perfectly written instruction is followed probabilistically. A lint rule is followed deterministically. If you have a choice, choose the deterministic mechanism. Use CLAUDE.md for what only a senior engineer's memory could otherwise hold — not for things a config file could enforce.

System Prompts vs. CLAUDE.md

CLAUDE.md captures per-project conventions. System prompts — configured in your AI tool's settings, applied to all sessions — capture per-developer habits that span all projects.

The distinction matters because it determines where a rule belongs. "Use Drizzle ORM, not Prisma" is a project convention — it only applies here. "Always ask before installing a dependency" is a developer habit — I want it enforced regardless of which project I'm in.

My global system prompt is short. The parts I'd suggest as universally useful:

Before installing any package or dependency, ask whether it's needed
and what already exists that covers the same need.

When asked to add functionality, check whether a relevant file or
module already exists before creating a new one.

If you're uncertain which of two approaches fits the project better,
state both options with trade-offs rather than picking one silently.
Enter fullscreen mode Exit fullscreen mode

The third one is, in my experience, the most useful of the three. The model tends to pick an approach and implement it. Surfacing the choice explicitly costs a few seconds and has saved me from several decisions I would have needed to revisit.

Prompts That Stop a Session Mid-Track

CLAUDE.md handles the start of a session. These handle what happens when, mid-session, the output goes in a direction you don't want.

I described the signals that trigger these stops in the previous article — the new dependency that appeared without being asked for, the new file created instead of an existing one modified, the pattern reinvented rather than matched. These prompts are the responses to those signals.

Don't write a new file. Modify [path/to/existing/file].

Use this the moment you see the model create a new file for functionality that belongs in an existing one. Don't try to reconcile the two outputs — discard what was produced and redirect explicitly with the target file named. The model responds to precision; "modify the existing file" is less effective than "modify lib/queries/orders.ts."

Check whether this pattern already exists in [path]. If it does, match it instead.

When the model has reinvented something. Point it at the directory where the existing pattern lives. This usually produces a much better result than asking it to "follow the existing conventions" without a concrete reference.

Why this approach over [alternative]? What are the trade-offs?

When you're not sure whether what you received is the right solution. Forces the model to articulate reasoning rather than just output code. Useful when the output looks plausible but something feels off — often the explanation reveals an assumption that doesn't hold in your specific context.

This doesn't compile / fails this test / breaks this assumption. Re-read [file] and fix only the [specific thing], without changing [X].

The second half of this prompt is what matters. Without it, the model will fix the reported problem by also changing the test, or adjusting the schema, or modifying a caller — whatever it thinks makes the error go away. Specifying what must not change forces it to find a solution within the actual constraint.

Remove the try/catch. Errors should propagate.

Short and specific. The model adds defensive try/catch blocks because it has seen a lot of code that does this. When the catch block logs and returns null, it turns a recoverable failure into a silent one. I reject these every time.

What's the failure mode if [the external service / queue / database] is slow or unavailable here?

Use this on any code that calls an external dependency. The model writes for the happy path. This prompt forces it to address what happens in the unhappy one: timeout handling, retry behavior, what the caller receives when things go wrong. The answer is sometimes already covered by surrounding code — in which case, good. Sometimes it isn't, and you want to know before the code is in production.

Add a comment explaining why this number / threshold / timeout.

setTimeout(callback, 5000) — why 5000? maxRetries = 3 — where did 3 come from? The model uses sensible-looking defaults without documenting the reasoning. I either add the comment myself or ask the model to explain and then document it. Six months from now, when the service has changed and someone is reading that code, the comment will matter.

These prompts are the operational layer of what I mean by stop signals. They're not clever. They're specific, they're short, and they redirect with enough precision that the model can do something useful with them.

When Prompts Don't Help

This matters as much as the prompts themselves.

Prompts are a lever for execution. They make the model more likely to do the right thing within a well-defined task. They do not substitute for architectural thinking, and they don't work when the problem is that the task itself isn't clearly defined.

There are also structural limits no prompt overcomes. Models have finite attention; constraints stated clearly at the start of a long session lose weight as the conversation grows. Instruction-following is statistical, not deterministic — the same CLAUDE.md rule that's honored ninety times out of a hundred will be quietly ignored on the ninety-first, often without any signal that it happened. Across a long refactor that spans multiple sessions, prompts in different conversations can produce decisions that are individually reasonable but globally inconsistent. None of this is a failure of how the prompt is written. It is a property of how current models work, and pretending otherwise leads to a false sense of safety.

At larger scale, more failure modes appear. Multiple agents working in parallel will each make locally reasonable choices that conflict at merge time — the resulting reconciliation often has to be done by a person, regardless of how well-specified each individual prompt was. A CLAUDE.md that diverges across long-lived branches can produce contradictory output until the branches reunify. Repository content itself can act as a prompt injection vector — AI-generated docs, comments left by other agents, onboarding files written upstream, even seemingly innocuous markdown can carry instructions that the model treats with the same weight as yours. And even with the most careful instructions, models will occasionally produce confident-sounding references to internal APIs that don't exist — the constraint that the function "must use the existing helper" does not prevent the model from inventing a helper that looks like it should exist.

If I find myself reprompting the same thing five times and the output isn't converging — that's a signal to stop the session, not to prompt more carefully. Either the task needs to be broken down differently, or the relevant context isn't visible to the model, or the right solution requires judgment that can't be encoded in a prompt. In any of these cases, twenty minutes of working directly in the code is more effective than twenty minutes of prompted iteration.

The other situation where prompts stop helping: when I notice cognitive offload in myself — accepting a suggestion not because I've evaluated it but because I'm tired and it looked fine. No prompt addresses that. The right response is to close the session, take a break, and come back when I can evaluate the output properly. This is the failure mode that doesn't produce an error — it produces code that seems fine until it doesn't.

Prompts are also not a fix for a missing architect. The first article in this series describes what happens when AI tools drive a project without someone in the decision path who can evaluate the output globally. CLAUDE.md captures existing decisions — it cannot substitute for the human making them. A well-structured CLAUDE.md in a project with no senior engineering oversight produces better-formatted drift. The fundamentals still require judgment.

A Different Frame for Onboarding

An AI agent is a system that can produce engineer-like output very quickly, with no memory of what came before this session and no model of your project except what it can see in the current context. That's not a criticism — it's a structural fact about how these tools work.

When you hire a contractor, you don't hand them a laptop and say "go build something." You give them context: the codebase, the constraints, the decisions that have already been made and why. You tell them which things to not touch, what the existing patterns are, and where to find relevant examples. You don't do this once at the start and then never again — you do it as the project evolves, as decisions change, as new constraints are discovered.

CLAUDE.md and system prompts are that onboarding, made persistent. The stop prompts are the corrections you'd give any engineer mid-task when their work is diverging from what you intended. None of this is complicated. It is the ordinary communication that makes any working relationship productive — adapted for the particular constraint that persistent memory in current tools remains partial and unreliable, so context has to be reconstructed at the start of every session whether you do it deliberately or not.

Honest framing: prompts and persistent context don't eliminate drift. They reduce its amplitude and keep it inside a smaller circle. The agent still makes thousands of micro-decisions you didn't ask for, and a fraction of them will diverge from what you wanted. That is the most a text file can do for you. The deterministic protection lives in the toolchain — linters, type checks, CI gates, package-approval rules — and the architectural protection lives in the human reading the diff. CLAUDE.md is the middle layer. It is necessary. It is not sufficient.


If you're setting up AI-assisted development and want the workflow to produce architectural consistency rather than well-formatted divergence, my fractional CTO engagements often start by putting exactly this structure in place — the persistent context, the deterministic guardrails, the review discipline around them. If you've inherited a codebase where none of that was set up and inconsistency has already compounded, that's the territory of my rescue projects service.

For a broader look at what the AI-assisted workflow looks like day-to-day, see AI Coding Workflow for Senior Developers. For the theory behind why specification precision matters more than ever, read AI Coding Is Wind.

Top comments (0)