Jack M

Posted on Jun 2

AI Code Guardrails for SaaS: Stop Agent-Written Bugs Before They Reach PR

#agents #ai #security #softwareengineering

AI coding agents are fast enough to create a new problem: bad patterns now scale at machine speed.

A human developer might copy a risky error-handling shortcut once. An AI agent can repeat it across ten files, wrap it in confident comments, update the tests to match the mistake, and open a pull request nobody wants to review.

That does not mean AI coding tools are useless. It means SaaS teams need AI code guardrails: repo-level checks that catch fragile, unsafe, or off-pattern code before it reaches review.

This guide shows how to build those guardrails with pre-commit hooks, static analysis, tests, CI checks, and simple policy-as-code. No vendor pitch. No magic prompt. Just practical workflow design for builders shipping AI-assisted SaaS.

Why AI-Written Code Needs Guardrails

AI coding agents are good at producing plausible code. That is also the risk.

They can generate boilerplate, refactor several files, write tests, and connect APIs quickly. But they also tend to repeat patterns that look reasonable in isolation and become dangerous at scale:

Catching broad exceptions and continuing
Swallowing errors with console.error() only
Adding retries without limits
Creating new abstractions when a shared one exists
Changing tests to fit broken behavior
Mixing tenant IDs across helper functions
Logging sensitive values while debugging
Adding dependencies for tiny utilities

The old fix was "review more carefully." That does not scale when the diff is 800 lines and half the team is also using agents.

The better fix is to move recurring review feedback into code. If a pattern is never acceptable, do not rely on a reviewer to catch it every time. Make the repository reject it.

What Are AI Code Guardrails?

AI code guardrails are automated checks that constrain how code can be generated, changed, tested, and merged.

They sit in places developers and agents cannot easily ignore:

Local pre-commit hooks
Formatting and linting rules
AST-based custom checks
Unit and integration tests
Security scanners
Type checks
CI/CD policy checks
Pull request templates
CODEOWNERS review rules

The key idea: prompts are helpful, but checks are enforceable.

A prompt can say:

Do not swallow database errors.

A guardrail can fail the commit when it sees:

try {
  await db.invoice.update(...)
} catch (err) {
  console.error(err)
}

That difference matters. AI agents can forget instructions. Hooks do not.

The Practical Goal: Make Bad Code Hard to Commit

For SaaS builders, the goal is not to block AI. The goal is to make the safe path the easy path.

A good guardrail system should:

Catch common AI-generated mistakes early
Give clear fix messages
Run fast enough for daily use
Work locally and in CI
Protect tenant boundaries, billing logic, auth, and data access
Keep pull requests smaller and easier to review

If a guardrail takes six minutes locally, people will bypass it. If the error message says "policy failed," people will hate it. Fast, specific, local feedback is the win.

Start With the Failure Patterns Your Agents Actually Create

Do not begin with a giant policy framework. Begin with the last five annoying AI-generated diffs.

Ask questions like:

What did reviewers keep correcting?
Which bugs slipped into staging?
Which files did agents edit too aggressively?
Which tests were weakened?
Which production invariants are easy to express as rules?

For an AI SaaS product, common high-value targets are:

Area	Guardrail idea
Authentication	No direct user lookup without tenant scope
Billing	No price, credit, or refund change without domain service
Errors	No raw framework errors from business logic
Logging	No secrets, prompts, tokens, or customer content in logs
Database	No broad update/delete without tenant and limit checks
Agents	No tool execution without policy check
Tests	No `.only`, skipped tests, or snapshot churn without review
Dependencies	No new package without justification

Your first guardrails should target bugs you have already seen, not theoretical risks from a conference talk.
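
As one illustration of the "Dependencies" row above, here is a minimal sketch of a check that lists packages a diff introduces and reports the ones the PR description never mentions. The function names, the `PackageManifest` shape, and the "mentioned in the PR body" convention are assumptions made for this example, not part of any existing tool.

```typescript
// Minimal view of package.json; only the fields this check reads.
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// All package names declared in a manifest, runtime and dev combined.
function declaredNames(manifest: PackageManifest): string[] {
  return [
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
  ];
}

// Names present after the change but not before, de-duplicated and sorted.
function addedDependencies(before: PackageManifest, after: PackageManifest): string[] {
  const known = new Set(declaredNames(before));
  const added = declaredNames(after).filter(name => !known.has(name));
  return Array.from(new Set(added)).sort();
}

// Hypothetical convention: a new package counts as justified when the PR body names it.
function unjustifiedDependencies(added: string[], prBody: string): string[] {
  return added.filter(name => !prBody.includes(name));
}
```

A CI step could parse `package.json` from the base and head commits, call `addedDependencies`, and fail when `unjustifiedDependencies` returns anything.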

Layer 1: Pre-Commit Hooks for Fast Local Feedback

Pre-commit hooks are the best first layer because they run before the code leaves the developer's machine or the agent's workspace.

A basic setup might run:

Formatter
Linter
Type checker for changed packages
Secret scanner
Test file sanity checks
Custom policy checks

Example .pre-commit-config.yaml:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-yaml
      - id: detect-private-key

  - repo: local
    hooks:
      - id: no-skipped-tests
        name: Block skipped tests
        entry: node scripts/guards/no-skipped-tests.js
        language: system
        files: "\\.(test|spec)\\.(ts|tsx|js)$"

      - id: no-unsafe-console-catch
        name: Block swallowed catch blocks
        entry: node scripts/guards/no-unsafe-console-catch.js
        language: system
        files: "\\.(ts|tsx)$"

Add this to your coding-agent instructions:

Before marking the task complete:
1. Run formatting.
2. Run pre-commit hooks for changed files.
3. Run the smallest relevant test set.
4. If a hook fails, fix the root cause. Do not bypass hooks.
5. Report what passed and what you did not run.

The prompt helps. The hook enforces.
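
The config above references a `no-skipped-tests` guard without showing it. The following is a rough sketch of the detection logic such a script might use, written as a pure function over file contents so it is easy to test. The `Finding` shape and the function name are hypothetical, and the line-based regex is a heuristic: it will miss calls split across lines.

```typescript
// Hypothetical finding shape; not taken from any library.
interface Finding {
  line: number;  // 1-based line number
  match: string; // the offending call, e.g. "it.skip"
}

// describe/it/test followed by .only or .skip, plus the x-prefixed aliases (xit, xdescribe, xtest).
const FOCUSED_OR_SKIPPED =
  /\b(?:(?:describe|it|test)\.(?:only|skip)|x(?:describe|it|test))\s*\(/;

function findSkippedTests(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    // Skip lines that are entirely a line comment to avoid obvious false positives.
    if (text.trim().startsWith("//")) return;
    const hit = FOCUSED_OR_SKIPPED.exec(text);
    if (hit) {
      // Drop the trailing "(" so the message shows just the call name.
      findings.push({ line: index + 1, match: hit[0].replace(/\s*\($/, "") });
    }
  });
  return findings;
}
```

A hook wrapper would read each path that pre-commit passes on the command line, call `findSkippedTests` on the contents, print `file:line` for every finding, and exit non-zero if any exist.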

Layer 2: AST Rules for Bugs Regex Cannot See

Regex checks are useful for simple patterns. But AI-generated code often needs structure-aware checks.

This is risky:

try {
  await createInvoice(input)
} catch (error) {
  console.error(error)
}

This is better:

try {
  await createInvoice(input)
} catch (error) {
  logger.error({ error, invoiceId }, "invoice creation failed")
  throw new BillingOperationFailed("Could not create invoice")
}

An AST rule can ask better questions:

Is there a catch block?
Does it only log?
Does it rethrow?
Does it return a typed error?
Is the function in a critical domain folder?

A small TypeScript guard can scan changed files:

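// scripts/guards/no-unsafe-console-catch.ts
// The hook config above runs a .js file: compile this first, or point the hook entry at a TypeScript runner.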
import ts from "typescript"
import fs from "node:fs"

const files = process.argv.slice(2)
let failed = false

for (const file of files) {
  const source = ts.createSourceFile(
    file,
    fs.readFileSync(file, "utf8"),
    ts.ScriptTarget.Latest,
    true
  )

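  // True if `node` or any of its descendants satisfies `pred`.
  const contains = (node: ts.Node, pred: (n: ts.Node) => boolean): boolean =>
    pred(node) || ts.forEachChild(node, child => contains(child, pred)) === true

  function visit(node: ts.Node) {
    if (ts.isCatchClause(node)) {
      // Check AST nodes, not raw text, so a comment or string containing "throw" does not pass.
      const logs = node.block.getText(source).includes("console.error")
      const recovers = contains(node.block, n =>
        ts.isThrowStatement(n) || ts.isReturnStatement(n))

      if (logs && !recovers) {
        const pos = source.getLineAndCharacterOfPosition(node.getStart())
        console.error(`${file}:${pos.line + 1} catch block only logs; rethrow or return a typed error`)
        failed = true
      }
    }
    ts.forEachChild(node, visit)
  }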

  visit(source)
}

process.exit(failed ? 1 : 0)

This kind of rule is perfect for AI coding agents because it turns team taste into executable policy.

Layer 3: Protect SaaS Invariants, Not Just Style

Style checks are useful, but production safety comes from protecting invariants.

For a multi-tenant AI SaaS app, examples include:

Every customer query must include tenantId
Background jobs must include an idempotency key
Agent tool calls must go through a policy broker
Billing changes must use a billing domain service
Admin actions must write audit logs
Prompt and completion logs must be redacted
External webhooks must verify signatures

Turn these into rules.

Example: block direct database access to invoices outside the billing service.

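const fs = require("fs")

// Only these folders may write invoices directly.
const allowed = ["src/billing/", "src/tests/"]
const files = process.argv.slice(2)
let failed = false

for (const file of files) {
  const text = fs.readFileSync(file, "utf8")
  // Prefix match, so calls such as updateMany and deleteMany are caught too.
  const touchesInvoice = /db\.invoice\.(create|update|delete)/.test(text)
  // Paths are compared as given, so pass them relative to the repo root.
  const isAllowed = allowed.some(prefix => file.startsWith(prefix))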

  if (touchesInvoice && !isAllowed) {
    console.error(`${file}: invoice writes must go through src/billing services.`)
    failed = true
  }
}

process.exit(failed ? 1 : 0)

A lot of SaaS incidents are not caused by exotic failures. They come from boring boundary violations repeated under deadline pressure.

Layer 4: Stop Agents From Weakening Tests

AI agents often "fix" failing tests by changing the expectation instead of fixing the bug.

That is not always malicious. The agent is optimizing for task completion. If the instruction says "make tests pass," it may treat the test as part of the editable solution.

Add guardrails such as:

Block .only
Block describe.skip and it.skip
Flag large snapshot updates
Require review when deleting tests
Require human review for auth, billing, and tenant test changes
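
To make "flag large snapshot updates" and "require review when deleting tests" concrete, here is a small sketch that reads the output of `git diff --numstat` (one `added<TAB>deleted<TAB>path` row per file) and returns warnings. The 50-line threshold, the `ChurnWarning` shape, and the "test file shrank" heuristic are assumptions chosen for illustration.

```typescript
interface ChurnWarning {
  path: string;
  reason: string;
}

function flagTestChurn(numstat: string, maxSnapshotLines = 50): ChurnWarning[] {
  const warnings: ChurnWarning[] = [];
  for (const row of numstat.split("\n")) {
    const [addedRaw, deletedRaw, path] = row.split("\t");
    if (!path) continue; // blank or malformed row
    const added = Number(addedRaw);
    const deleted = Number(deletedRaw);
    // Binary files report "-" for both counts; skip them.
    if (Number.isNaN(added) || Number.isNaN(deleted)) continue;

    if (path.endsWith(".snap") && added + deleted > maxSnapshotLines) {
      warnings.push({ path, reason: `snapshot changed by ${added + deleted} lines` });
    } else if (/\.(test|spec)\.(ts|tsx|js)$/.test(path) && deleted > added) {
      // Heuristic: a test file that lost more lines than it gained deserves a look.
      warnings.push({ path, reason: `test file shrank by ${deleted - added} lines` });
    }
  }
  return warnings;
}
```

These are warnings rather than hard failures on purpose: legitimate refactors shrink tests too, so the aim is visibility for the reviewer, not a block.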

Example PR rule:

critical_test_review:
  if_changed:
    - "tests/auth/**"
    - "tests/billing/**"
    - "tests/tenant-isolation/**"
  require_review_from:
    - "@backend-owners"

For small SaaS teams, this may just be one senior developer. That is fine. The point is to make risky test changes visible.
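
The PR rule above is declarative. If your platform has no built-in equivalent, a CI script could evaluate it along these lines. This is a sketch under assumptions: the `ReviewRule` shape is invented for the example, and directory prefixes stand in for the `**` globs to avoid pulling in a glob library.

```typescript
interface ReviewRule {
  name: string;
  pathPrefixes: string[];      // simplified stand-in for the "**" globs in the rule above
  requireReviewFrom: string[];
}

// Returns every reviewer required by at least one rule whose paths were touched.
function requiredReviewers(changedFiles: string[], rules: ReviewRule[]): string[] {
  const reviewers = new Set<string>();
  for (const rule of rules) {
    const touched = changedFiles.some(file =>
      rule.pathPrefixes.some(prefix => file.startsWith(prefix)));
    if (touched) {
      rule.requireReviewFrom.forEach(reviewer => reviewers.add(reviewer));
    }
  }
  return Array.from(reviewers).sort();
}

const criticalTestReview: ReviewRule = {
  name: "critical_test_review",
  pathPrefixes: ["tests/auth/", "tests/billing/", "tests/tenant-isolation/"],
  requireReviewFrom: ["@backend-owners"],
};
```

The script would then compare the result against the approvals already on the pull request and fail the check while any required reviewer is missing.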

Layer 5: Add CI Checks Agents Cannot Skip

Local hooks are helpful, but they are not enough. Developers can bypass them with git commit --no-verify. Agents can run in environments where hooks are not installed. CI is the source of truth.

Your CI should rerun the important checks:

name: Guardrails

on:
  pull_request:

jobs:
  guardrails:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      - run: npm run format:check
      - run: npm run lint
      - run: npm run typecheck
      - run: npm run guardrails
      - run: npm run test:changed

The local hook protects flow. CI protects the branch.
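
The workflow calls `npm run guardrails` without defining it. One way to structure that script is a tiny runner that applies every guard to every file and turns the combined result into an exit code. The `Guard` type, the runner, and the example guard below are all hypothetical names for this sketch.

```typescript
// A guard inspects one file and returns zero or more violation messages.
type Guard = (file: string, source: string) => string[];

interface GuardReport {
  violations: string[];
  exitCode: 0 | 1;
}

// `files` maps a repo-relative path to that file's contents.
function runGuards(files: Record<string, string>, guards: Guard[]): GuardReport {
  const violations: string[] = [];
  for (const file of Object.keys(files)) {
    for (const guard of guards) {
      violations.push(...guard(file, files[file]));
    }
  }
  return { violations, exitCode: violations.length > 0 ? 1 : 0 };
}

// Example guard: focused tests must not be committed.
const noFocusedTests: Guard = (file, source) =>
  source.includes(".only(") ? [`${file}: remove .only before committing`] : [];
```

Keeping guards as plain functions means the same list can back both the pre-commit hook and the CI job, so the two layers cannot drift apart.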

Layer 6: Require a Reviewable Agent Work Log

AI-written pull requests are hard to review when the agent does not explain its choices.

Add a short PR template for AI-assisted work:

## AI assistance disclosure

- [ ] AI generated or edited part of this PR
- [ ] I reviewed the generated code line by line
- [ ] I ran pre-commit hooks
- [ ] I ran relevant tests

## Risk areas touched

- [ ] Auth
- [ ] Billing
- [ ] Tenant isolation
- [ ] Agent tool execution
- [ ] PII or prompt logging
- [ ] Database migrations

## Notes for reviewer

What should the reviewer inspect most carefully?

This makes the author slow down and gives reviewers a map. You are not asking people to distrust AI code automatically. You are asking them to review it with context.
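
A template only helps if someone fills it in. As a sketch, CI could parse the PR description and report which evidence boxes are still unchecked once the AI disclosure box is ticked. The function names are made up for this example, and the labels are matched literally against the template above, so they need to change together.

```typescript
interface ChecklistItem {
  label: string;
  checked: boolean;
}

// Extracts "- [ ] label" and "- [x] label" lines from a PR description.
function parseChecklist(prBody: string): ChecklistItem[] {
  const items: ChecklistItem[] = [];
  const pattern = /^\s*[-*]\s+\[( |x|X)\]\s+(.+?)\s*$/;
  for (const line of prBody.split(/\r?\n/)) {
    const hit = pattern.exec(line);
    if (hit) items.push({ label: hit[2], checked: hit[1] !== " " });
  }
  return items;
}

// If the PR declares AI assistance, every evidence box must also be checked.
function missingEvidence(prBody: string): string[] {
  const items = parseChecklist(prBody);
  const aiAssisted = items.some(i => i.checked && i.label.startsWith("AI generated"));
  if (!aiAssisted) return [];
  const required = [
    "I reviewed the generated code line by line",
    "I ran pre-commit hooks",
    "I ran relevant tests",
  ];
  return required.filter(label => !items.some(i => i.checked && i.label === label));
}
```

This checks that the author made a claim, not that the claim is true. It still gives reviewers something specific to hold the author to.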

What to Guard First in an AI SaaS Codebase

If your product includes LLM features, start with these rules.

1. No raw prompt or completion logs

Bad:

logger.info({ prompt, response }, "llm call complete")

Better:

logger.info({ model, tenantId, tokenCount, latencyMs }, "llm call complete")
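
A guard for this rule can start very simply. The sketch below scans for single-line `logger.*` or `console.*` calls whose first argument is an object literal containing a sensitive key. The field list and function name are assumptions to adapt; a name like `response` may also flag harmless HTTP logging, so expect to tune it.

```typescript
// Field names treated as raw LLM content. An assumption: adjust per codebase.
const SENSITIVE_LOG_FIELDS = new Set(["prompt", "completion", "response", "messages"]);

// A log call whose first argument is an object literal on the same line.
const LOG_CALL_WITH_OBJECT =
  /\b(?:logger|console)\.(?:debug|info|warn|error|log)\s*\(\s*\{([^}]*)\}/;

interface LogFinding {
  line: number;  // 1-based
  field: string; // the sensitive key that was logged
}

function findRawLlmLogging(source: string): LogFinding[] {
  const findings: LogFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    const hit = LOG_CALL_WITH_OBJECT.exec(text);
    if (!hit) return;
    for (const part of hit[1].split(",")) {
      // Handles both shorthand ({ prompt }) and explicit ({ prompt: value }) keys.
      const key = part.split(":")[0].trim();
      if (SENSITIVE_LOG_FIELDS.has(key)) findings.push({ line: index + 1, field: key });
    }
  });
  return findings;
}
```

Multi-line object literals slip past this version. If that turns out to matter in your codebase, this is a natural candidate for an AST rule like the one in Layer 2.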

2. No tool calls without policy checks

Bad:

await sendEmail({ to, subject, body })

Better:

await toolBroker.execute({
  tenantId,
  actorId,
  tool: "email.send",
  payload: { to, subject, body },
  risk: "medium"
})
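
The `toolBroker` in the snippet above is not defined in this article. A minimal sketch of what such a broker could look like follows. The class, the `Policy` type, and the example policy are all assumptions for illustration; a real broker would also write audit logs and handle approvals.

```typescript
type Risk = "low" | "medium" | "high";

interface ToolRequest {
  tenantId: string;
  actorId: string;
  tool: string;
  payload: unknown;
  risk: Risk;
}

type Decision = { allowed: true } | { allowed: false; reason: string };
type Policy = (request: ToolRequest) => Decision;
type ToolHandler = (payload: unknown) => Promise<unknown>;

class ToolBroker {
  private handlers = new Map<string, ToolHandler>();
  private policy: Policy;

  constructor(policy: Policy) {
    this.policy = policy;
  }

  register(tool: string, handler: ToolHandler): void {
    this.handlers.set(tool, handler);
  }

  // Policy check, kept separate from execution so it can be unit tested.
  check(request: ToolRequest): Decision {
    if (!this.handlers.has(request.tool)) {
      return { allowed: false, reason: `unknown tool ${request.tool}` };
    }
    return this.policy(request);
  }

  // The only path to a handler: every call passes through check() first.
  async execute(request: ToolRequest): Promise<unknown> {
    const decision = this.check(request);
    if (decision.allowed === false) {
      throw new Error(`tool call denied: ${decision.reason}`);
    }
    const handler = this.handlers.get(request.tool) as ToolHandler;
    return handler(request.payload);
  }
}

// Example policy: high-risk calls need an approval step this sketch does not implement.
const denyHighRisk: Policy = request =>
  request.risk === "high"
    ? { allowed: false, reason: "high-risk tools need approval" }
    : { allowed: true };
```

With handlers reachable only through the broker, a static guard can then ban direct imports of `sendEmail` and similar functions outside the broker's folder.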

3. No tenant-free queries

Bad:

const docs = await db.document.findMany({ where: { status: "ready" } })

Better:

const docs = await db.document.findMany({ where: { tenantId, status: "ready" } })
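
Static rules catch many tenant-free queries, and a small runtime helper can back them up. The sketch below builds the `where` clause and refuses to produce one without a tenant. `tenantScoped` is a hypothetical helper name, and it assumes `where` objects shaped like the ones in the query above.

```typescript
type Where = Record<string, unknown>;

// Returns a where clause guaranteed to carry the caller's tenantId.
function tenantScoped(
  tenantId: string | undefined,
  where: Where = {},
): Where & { tenantId: string } {
  if (!tenantId) {
    throw new Error("tenantId is required for customer data queries");
  }
  // Refuse a clause that was already aimed at another tenant.
  if ("tenantId" in where && where.tenantId !== tenantId) {
    throw new Error("where clause targets a different tenant");
  }
  return { ...where, tenantId };
}

// Usage, mirroring the query above:
//   db.document.findMany({ where: tenantScoped(tenantId, { status: "ready" }) })
```

Once the helper exists, the static guard becomes easier to write: flag any `findMany` in customer-data code whose `where` is not built by `tenantScoped`.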

4. No silent fallback to weaker models

Fallbacks are useful, but silent quality drops can break trust.

try {
  return await callModel(input)
} catch (error) {
  await recordModelFailure({ tenantId, model, error })
  // Tell the caller the answer came from the fallback model.
  return callFallbackModel(input, { qualityNotice: true })
}

5. No unbounded retries

AI APIs fail. Retrying forever makes cost and latency worse.

await retry(() => callModel(input), {
  retries: 2,
  timeoutMs: 15000,
  backoff: "exponential"
})
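
The `retry` helper above is not defined in this article. Here is one possible implementation, as a sketch: it always uses exponential backoff (so it has no `backoff` option), and the default delays are arbitrary choices. The timeout rejects the attempt but does not cancel the underlying request; wiring in an `AbortSignal` is left out for brevity.

```typescript
interface RetryOptions {
  retries: number;      // extra attempts after the first call
  timeoutMs: number;    // per-attempt timeout
  baseDelayMs?: number; // first backoff delay
  maxDelayMs?: number;  // cap on any single delay
}

// Exponential schedule: base, 2*base, 4*base, ... capped at maxDelayMs.
function backoffDelays(retries: number, baseDelayMs = 250, maxDelayMs = 4000): number[] {
  return Array.from({ length: retries }, (_, i) =>
    Math.min(baseDelayMs * 2 ** i, maxDelayMs));
}

// Rejects if `work` has not settled within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)), timeoutMs);
    work.then(
      value => { clearTimeout(timer); resolve(value); },
      error => { clearTimeout(timer); reject(error); },
    );
  });
}

async function retry<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  const delays = backoffDelays(options.retries, options.baseDelayMs, options.maxDelayMs);
  let lastError: unknown;
  // One initial attempt plus at most `retries` more: the loop is bounded by construction.
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await withTimeout(fn(), options.timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < options.retries) {
        await new Promise(resolve => setTimeout(resolve, delays[attempt]));
      }
    }
  }
  throw lastError;
}
```

Because the attempt count and every delay are capped, the worst-case latency of a call is a number you can compute and budget for, which is the point of the rule.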

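These five rules cover the LLM-specific risks most likely to show up in agent-written diffs: leaked content, unchecked actions, cross-tenant reads, hidden quality drops, and runaway cost.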

A Simple 7-Day Implementation Plan

You do not need a full platform to start.

Day 1: Collect recurring review comments. Open recent AI-assisted PRs and list repeated mistakes.
Day 2: Install baseline pre-commit hooks. Add formatting, linting, JSON/YAML checks, and secret detection.
Day 3: Add two custom guard scripts. Start with skipped tests and prompt/completion logging.
Day 4: Mirror hooks in CI. Make pull requests run the same rules.
Day 5: Protect one SaaS invariant. Pick tenant isolation, billing writes, auth checks, or agent tool execution.
Day 6: Update agent instructions. Tell the agent what checks exist and that bypassing them is not acceptable.
Day 7: Add PR evidence. Require commands run, risk areas touched, and reviewer notes.

After one week, you will not have perfect safety. You will have a repo that teaches both humans and agents where the boundaries are.

Common Mistakes to Avoid

Building too many rules at once

A noisy guardrail system gets ignored. Start with high-confidence rules.

Only running checks in CI

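That wastes time: the author waits minutes on a CI run for feedback a local hook could give in seconds. Run the fast checks locally and keep CI as the backstop.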

Writing vague failure messages

Bad:

Policy violation.

Good:

src/billing/refund.ts:42 Refund writes must use BillingService.issueRefund() so audit logs and idempotency keys are created.

Blocking without offering the safe path

Every rule should tell developers what to do instead.
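
One way to keep messages specific is to route every guard through a shared formatter that refuses to emit a message without a safe path and a reason. The `Violation` shape and `formatViolation` name are assumptions for this sketch; the output format copies the "Good" example above.

```typescript
interface Violation {
  file: string;
  line: number;
  rule: string; // what the code must do instead (the safe path)
  why: string;  // what the safe path guarantees
}

function formatViolation(v: Violation): string {
  // A message with no fix or no reason is the "Policy violation." anti-pattern.
  if (!v.rule.trim() || !v.why.trim()) {
    throw new Error("a guardrail message needs both a safe path and a reason");
  }
  return `${v.file}:${v.line} ${v.rule} so ${v.why}.`;
}
```

Making the formatter throw on an empty `rule` or `why` means a vague guard fails the first time someone runs it, before it can annoy anyone else.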

Treating AI code as automatically bad

The issue is not whether a human or model wrote the code. The issue is whether the code respects your system boundaries.

How This Fits a Larger AI SaaS Architecture

AI code guardrails are one piece of a broader production safety stack.

If you are building AI SaaS, connect this layer with:

Agent observability for traces, costs, and failures
Tool budgets for agent actions and API spend
Approval gates for risky production actions
Prompt injection tests for untrusted content
Tenant-aware audit logs
Model fallback policies

Think of it as a chain:

Code guardrails prevent fragile changes from entering the repo.
CI/CD guardrails prevent unsafe changes from merging.
Runtime guardrails prevent unsafe agent actions from executing.
Observability catches what still goes wrong.

You need all four if agents are touching real customers, billing, messages, or data.

Final Checklist

Before you trust AI-generated code in a SaaS repo, ask:

Do pre-commit hooks run locally?
Do critical checks run again in CI?
Are tenant boundaries enforced by tests or static rules?
Are prompt, completion, and secret logs blocked?
Are billing and auth changes routed through domain services?
Are skipped tests and snapshot churn visible?
Does the PR template show AI assistance and guardrail evidence?
Can reviewers see which risk areas changed?

If the answer is mostly no, the next productivity win is not a smarter prompt. It is a safer repo.

FAQ

What are AI code guardrails?

AI code guardrails are automated rules that stop unsafe, fragile, or off-pattern AI-generated code before it reaches production. They can include pre-commit hooks, static analysis, tests, CI checks, review rules, and runtime policy enforcement.

Are prompts enough to control AI coding agents?

No. Prompts are useful guidance, but they are not reliable enforcement. If a coding rule matters, put it in hooks, tests, CI, or policy-as-code so it runs every time.

What pre-commit hooks are best for AI-generated code?

Start with formatting, linting, secret detection, skipped-test detection, type checks for changed files, and one or two custom rules for your most common AI-generated mistakes. For SaaS apps, tenant isolation, billing writes, and unsafe logging are strong first targets.

Should AI-generated code require special review?

It should require clear review evidence, not panic. Ask authors to disclose AI assistance, list commands run, identify risk areas, and explain what reviewers should inspect. Review the code by risk, not by whether a model helped write it.

How do I stop AI agents from changing tests to pass broken code?

Add checks for skipped tests, .only, large snapshot changes, deleted tests, and critical test folder edits. Require human review for auth, billing, tenant isolation, and security test changes.

What is the difference between AI code guardrails and AI agent approval gates?

AI code guardrails protect the development workflow before code merges. AI agent approval gates protect runtime workflows before an agent performs risky actions such as sending emails, changing billing data, or updating customer records.

Do solo SaaS developers need this much process?

Yes, but keep it lightweight. A solo developer benefits from fast pre-commit hooks, clear custom rules, and a small PR checklist because there may be no second reviewer. Guardrails are a way to protect your future self.
