Critique

Posted on Jun 2

A security checklist for AI-generated pull requests

#security #ai #github #devplusplus

AI-generated code is not automatically insecure.

The problem is that it can create convincing pull requests faster than teams can inspect them. The diff may be formatted well, the helper names may look reasonable, and the tests may be green. None of that proves the change preserved the security rules your app depends on.

When I review AI-generated PRs, I use a short checklist. It is close to the way we wrote Critique's [critique-review](https://www.critique.sh/skills/critique-review) skill: establish scope, map blast radius, trace risky paths, check authorization, and only report findings that are grounded in the actual code.

No vague "this might be risky" comments. If there is a security concern, it should point to a real path and a real failure mode.

1. Start with blast radius

Before reading every line, mark the parts of the system the PR touches.

Pay extra attention to changes involving:

Auth
Billing
Permissions
Data export or import
Migrations
Webhooks
Background jobs
Infrastructure
Public APIs
AI agents, tool calls, or model output

Not every AI-generated diff deserves the same review depth. A copy tweak does not need the same pass as a webhook handler. A CSS fix is not token validation. A UI-only change is not the same as a database migration.

The first question is simple:

What is the worst thing this PR can affect if it is wrong?

That answer decides how hard you review.
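One way to make that triage repeatable is to map changed file paths to a review depth. The sketch below is a rough illustration, not a standard tool: the path patterns, the `ReviewDepth` labels, and the `reviewDepth` function are all hypothetical and would need to match your own repository layout.

```typescript
// Hypothetical risk tiers. The patterns are assumptions about repo layout.
type ReviewDepth = "light" | "standard" | "deep";

const HIGH_RISK: RegExp[] = [
  /(^|\/)auth\//,
  /(^|\/)billing\//,
  /(^|\/)migrations\//,
  /(^|\/)webhooks\//,
  /(^|\/)infra\//,
];

const LOW_RISK: RegExp[] = [/\.css$/, /\.md$/];

// The riskiest file in the diff sets the depth for the whole PR.
function reviewDepth(changedPaths: string[]): ReviewDepth {
  const isHigh = (p: string) => HIGH_RISK.some((re) => re.test(p));
  const isLow = (p: string) => LOW_RISK.some((re) => re.test(p));

  if (changedPaths.some(isHigh)) return "deep";
  if (changedPaths.length > 0 && changedPaths.every(isLow)) return "light";
  return "standard";
}
```

A PR that touches `src/styles/button.css` and `src/webhooks/stripe.ts` gets the deep pass, because one high-risk file is enough. A path list cannot see what the code does, so treat the result as a floor for review depth, not a ceiling.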

2. Trace untrusted input

Find anything that enters the system from outside:

Request bodies
Headers
Uploaded files
Webhook payloads
User-generated content
Retrieved documents
Model outputs
Agent instructions

Then follow where that data can go:

Database writes
Logs
Commands
Prompts
Tool calls
External APIs
Credentials

AI-generated code is often good at the happy path. It parses the payload, calls the helper, returns the response, and adds a test for the expected case.

Security review is mostly about the other cases.

What if the webhook payload is replayed? What if the uploaded file is bigger than expected? What if the retrieved document contains instructions for the model? What if a user passes another user's ID?

Write the path down if needed:

external input -> validation -> permission check -> side effect

If one of those steps is missing, that is where the review should slow down.
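As a minimal sketch of that path in code, here is a handler where each step is visible and the side effect comes last. Everything in it is hypothetical: the `renameProfile` function, the `Session` shape, the 80-character limit, and the in-memory `profiles` map that stands in for a database.

```typescript
// Hypothetical handler; the Map stands in for a real data store.
type Session = { userId: string };
type Result = { status: 200 | 400 | 403 | 404 };

const profiles = new Map<string, { ownerId: string; name: string }>([
  ["profile_1", { ownerId: "user_a", name: "Ada" }],
]);

function renameProfile(session: Session, profileId: string, body: unknown): Result {
  // 1. Validation: the body is untrusted, so check shape and size first.
  if (typeof body !== "object" || body === null) return { status: 400 };
  const name = (body as { name?: unknown }).name;
  if (typeof name !== "string" || name.length === 0 || name.length > 80) {
    return { status: 400 };
  }

  // 2. Permission check: the caller must own this specific profile.
  const profile = profiles.get(profileId);
  if (!profile) return { status: 404 };
  if (profile.ownerId !== session.userId) return { status: 403 };

  // 3. Side effect: reached only after validation and authorization pass.
  profile.name = name;
  return { status: 200 };
}
```

When reviewing a generated handler, it helps to label the steps like this in your head. A write that appears before step 1 or step 2 is the line to question.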

3. Check authorization, not just authentication

This is the mistake I see most often in generated code.

The PR checks that a user is logged in, but does not check whether that user can access the specific object.

Authentication asks:

Who are you?

Authorization asks:

Are you allowed to do this specific thing?

Ask:

Can user A access user B's object?
Can one tenant read another tenant's data?
Can a non-admin reach an admin-only path?
Did the change bypass an existing owner check?
Does the API enforce the same rule as the UI?

This is not enough:

if (!session.user) {
  throw new Error("Unauthorized")
}

You still need the object-level check:

const project = await getProject(projectId)

// A missing project and a project owned by someone else are both denied.
if (!project || project.ownerId !== session.user.id) {
  throw new Error("Forbidden")
}

In a real multi-tenant app, even that may be too simple. You might need organization membership, role checks, feature policy, or plan limits.

The point is not the exact code. The point is that "logged in" is rarely the whole rule.
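For the multi-tenant case, a sketch of an organization membership plus role check might look like the following. The `Role` and `Action` names, the `ALLOWED_ROLES` policy table, and `canAccessProject` are assumptions made for illustration; a real app would load memberships from its own data store and may have a richer policy.

```typescript
type Role = "viewer" | "editor" | "admin";
type Action = "read" | "write" | "delete";

interface Membership { userId: string; orgId: string; role: Role }
interface Project { id: string; orgId: string }

// Hypothetical policy table: which roles may perform which action.
const ALLOWED_ROLES: Record<Action, Role[]> = {
  read: ["viewer", "editor", "admin"],
  write: ["editor", "admin"],
  delete: ["admin"],
};

function canAccessProject(
  userId: string,
  project: Project,
  action: Action,
  memberships: Membership[],
): boolean {
  // The membership must be in the project's organization, not just any org.
  const membership = memberships.find(
    (m) => m.userId === userId && m.orgId === project.orgId,
  );
  if (!membership) return false; // no membership: deny by default
  return ALLOWED_ROLES[action].includes(membership.role);
}
```

The useful property is that the function denies by default: a user with no membership in the project's organization never reaches the role lookup.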

4. Treat model output as untrusted

If an LLM can influence a privileged action, its output is untrusted input.

That includes output used for:

Tool calls
File writes
Shell commands
API requests
Database updates
Workflow routing
Prompt construction

Prompt injection is not only a chatbot problem. It is also a tool authorization problem.

The risky pattern looks like this:

model reads untrusted content -> model decides action -> app executes action

The fix is not just "use a better prompt." Prompts help, but they are not a security boundary.

Use boring controls:

Allowlist tools
Validate tool arguments outside the model
Scope credentials tightly
Require confirmation for sensitive writes
Keep read tools separate from write tools
Log tool calls
Fail closed when the request is unclear

If a PR adds agent behavior, review it like a new public API. Ask what it can read, what it can write, and what happens when the input is hostile.
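Two of those controls, the tool allowlist and argument validation outside the model, can be sketched in a few lines. The tool names (`search_docs`, `read_file`), the argument rules, and `authorizeToolCall` are hypothetical; the idea is only that the app, not the model, decides whether a proposed call runs.

```typescript
type ToolCall = { tool: string; args: unknown };
type Decision = { allowed: true } | { allowed: false; reason: string };
type Validator = (args: unknown) => boolean;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Allowlist: only tools listed here can run, each with its own validator.
// A Map avoids accidental hits on inherited keys such as "constructor".
const TOOL_VALIDATORS = new Map<string, Validator>([
  ["search_docs", (args) => {
    if (!isRecord(args)) return false;
    const query = args.query;
    return typeof query === "string" && query.length > 0 && query.length <= 200;
  }],
  ["read_file", (args) => {
    if (!isRecord(args)) return false;
    const path = args.path;
    // Hypothetical rule: reads are confined to docs/ with no traversal.
    return typeof path === "string" && path.startsWith("docs/") && !path.includes("..");
  }],
]);

// Fail closed: unknown tools and invalid arguments are both rejected.
function authorizeToolCall(call: ToolCall): Decision {
  const validate = TOOL_VALIDATORS.get(call.tool);
  if (!validate) return { allowed: false, reason: "tool not allowlisted" };
  if (!validate(call.args)) return { allowed: false, reason: "invalid arguments" };
  return { allowed: true };
}
```

The model's proposed call is parsed into `ToolCall` and passed through `authorizeToolCall` before anything executes. Logging each `Decision` would also cover the "log tool calls" item.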

5. Validate the fix

For security-sensitive changes, do not accept "looks patched."

Ask for one of:

A regression test
A reproducer
A before/after exploit path
A clear invariant the code now enforces

Good validation sounds like this:

Before: User A could request User B's invoice by ID.
After: The API checks organization membership before loading invoice details.
Test: A user from org_1 gets a 403 when requesting an invoice from org_2.

That is much better than:

Fixed auth bug.

The same rule applies to tests. A generated PR may include tests, but check what they prove. Happy-path coverage is useful. Boundary coverage is what catches the security bug.

Look for negative tests:

Logged-out user cannot access the endpoint
Normal user cannot access admin action
Tenant A cannot update Tenant B's settings
Invalid webhook signature is rejected
Replayed webhook event does not double-apply
Model output cannot call a disallowed tool

If the PR changes authorization and only tests the allowed case, the test suite is still missing the important part.
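Here is a small sketch of what pairing the allowed case with denied cases can look like, using the invoice example from above. The `getInvoiceStatus` function, the `Session` shape, and the tiny `check` helper are hypothetical stand-ins for your real handler and test framework.

```typescript
type Session = { userId: string; orgIds: string[] } | null;
type Invoice = { id: string; orgId: string; totalCents: number };

const invoices: Invoice[] = [
  { id: "inv_1", orgId: "org_1", totalCents: 4200 },
  { id: "inv_2", orgId: "org_2", totalCents: 9900 },
];

// Hypothetical handler under test: returns only the HTTP status.
function getInvoiceStatus(session: Session, invoiceId: string): number {
  if (!session) return 401;
  const invoice = invoices.find((i) => i.id === invoiceId);
  if (!invoice) return 404;
  if (!session.orgIds.includes(invoice.orgId)) return 403;
  return 200;
}

// Minimal assertion helper so the sketch needs no test framework.
function check(name: string, actual: number, expected: number): void {
  if (actual !== expected) {
    throw new Error(`${name}: expected ${expected}, got ${actual}`);
  }
}

const member = { userId: "user_a", orgIds: ["org_1"] };

// The allowed case proves the feature works.
check("member reads own org's invoice", getInvoiceStatus(member, "inv_1"), 200);
// The denied cases prove the boundary holds.
check("member reads another org's invoice", getInvoiceStatus(member, "inv_2"), 403);
check("logged-out caller", getInvoiceStatus(null, "inv_1"), 401);
```

If the organization check were deleted from the handler, only the second `check` would fail. That is the test a happy-path suite does not have.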

6. Keep review comments specific

The least useful security review is a wall of generic warnings.

Bad:

Make sure permissions are correct.

Better:

This endpoint checks that a session exists, but it does not verify that the requested invoice belongs to the caller's organization. A user who can obtain another invoice ID may be able to read it. Load the invoice through an organization-scoped query or compare the invoice organization against the caller's memberships before returning it.

That gives the author something to fix.

This is the part of Critique's critique-review skill I like most. It pushes the reviewer to separate findings from guesses. A real finding needs a code path, an impact, and a fix direction. If the evidence is incomplete, call it an open question instead of pretending it is a confirmed bug.
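The finding-versus-open-question split can be written down as a small data structure. This is an illustrative sketch only, not the critique-review skill's actual format: `ReviewNote`, `Classified`, and `classifyNote` are hypothetical names.

```typescript
interface ReviewNote {
  codePath: string;       // where the issue lives, e.g. a file and function
  impact?: string;        // what a caller could do that they should not
  fixDirection?: string;  // how the author could address it
}

type Classified =
  | { kind: "finding"; text: string }
  | { kind: "open_question"; text: string };

// A note is reported as a finding only when all three parts are present.
function classifyNote(note: ReviewNote): Classified {
  if (note.impact && note.fixDirection) {
    return {
      kind: "finding",
      text: `${note.codePath}: ${note.impact} Suggested fix: ${note.fixDirection}`,
    };
  }
  return {
    kind: "open_question",
    text: `${note.codePath}: impact or fix direction not yet established.`,
  };
}
```

A note with only a code path comes back as an open question, which keeps unverified guesses out of the list of confirmed bugs.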

AI-generated code does not need a totally different review process.

It needs the usual one, applied more strictly.

Use the same standards you would use for human-written production code:

find the blast radius
trace untrusted input
check object-level authorization
treat model output as untrusted
require evidence for security fixes
keep findings grounded in code

The goal is not to block AI-generated PRs. The goal is to make them prove the same thing every production change should prove: the right users can do the right things, and the wrong users cannot.

If you want the review posture in reusable form, the public [critique-review](https://www.critique.sh/skills/critique-review) skill is built around that idea: fewer generic comments, more grounded findings.