JinHyuk Sung

Posted on Jun 15

I built a deterministic CI firewall for AI-generated pull requests

#devtools #security #ai #github

AI coding agents are getting good enough to open pull requests.

That is useful.

It also changes the review problem.

A normal code review asks:

Does this code look correct?

An AI-generated PR also raises a different question:

Did this agent change something I did not intend, and does this PR have enough evidence to merge?

That second question is why I built Agent Gate for AI PRs.

Agent Gate is still a prerelease, so I am starting with a narrow goal: make AI-generated PRs easier to inspect before merge.

The core idea

Agent Gate is a deterministic CI firewall for AI-generated pull requests.

It is not an LLM reviewer.

That distinction matters.

LLM reviewers help with judgment. Agent Gate verifies deterministic merge evidence.

An LLM reviewer can tell you whether code looks suspicious. Agent Gate checks whether the PR crossed policy boundaries that should be explainable and repeatable in CI.

The mental model is:

Use your LLM reviewer for judgment.

Use Agent Gate for deterministic merge evidence.

Agent Gate checks questions that should not require an LLM:

did the PR stay inside its declared scope?
did workflow permissions escalate?
did agent control-plane files drift?
did high-risk code change without matching test-file evidence?
did MCP config changes get surfaced?

These are not semantic code review questions. They are merge-boundary questions.

Why I wanted this

Imagine an agent is asked to fix an auth session bug.

The expected scope might be:

allowed_paths:
  - src/auth/**
  - tests/auth/**

But the PR also changes:

src/payments/webhook.ts
.github/workflows/release.yml
.mcp.json

A reviewer might catch that. An LLM reviewer might catch that. But I do not want this to depend only on someone noticing.

I want CI to say:

This PR crossed its declared scope.
This PR changed workflow permissions.
This PR changed the agent tool surface.
This PR needs human decision before merge.

That is the shape of Agent Gate.

What Agent Gate catches today

The current v0.1.2 release is intentionally focused on deterministic checks.

It can flag or block the following, depending on policy mode.

Out-of-contract edits

Agent Gate can parse a small PR body contract:

<!-- agent-gate-contract
version: 1
agent: codex
task: update auth session handling
allowed_paths:
  - src/auth/**
  - tests/auth/**
required_evidence:
  - matching auth tests changed
-->

If the PR changes files outside allowed_paths, Agent Gate reports that as a contract escape.
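
To make the contract check concrete, here is a minimal Python sketch of the idea, not Agent Gate's actual implementation. The helper names `parse_contract` and `contract_escapes` are hypothetical, the parser only handles the flat shape shown above, and `fnmatch` stands in for whatever glob semantics Agent Gate really uses.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical helpers for illustration; the real parser and glob rules may differ.
CONTRACT_RE = re.compile(r"<!--\s*agent-gate-contract\s*\n(.*?)-->", re.DOTALL)


def parse_contract(pr_body: str) -> dict:
    """Extract the contract comment and parse its flat `key: value` / list shape."""
    match = CONTRACT_RE.search(pr_body)
    if match is None:
        return {}
    contract: dict = {}
    current_list = None  # key of the list currently being filled, if any
    for raw in match.group(1).splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- ") and current_list is not None:
            contract[current_list].append(line[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                contract[key] = value
                current_list = None
            else:
                # An empty value means list items follow on the next lines.
                contract[key] = []
                current_list = key
    return contract


def contract_escapes(changed_files, allowed_paths):
    """Return changed files that match none of the allowed path patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_paths)
    ]
```

With the contract above, a PR that also touches `src/payments/webhook.ts` would have that path returned by `contract_escapes`, while `src/auth/session.ts` would not.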

Workflow permission escalation

GitHub Actions workflows are powerful. If a PR changes this:

permissions:
  contents: read

to this:

permissions:
  contents: write

that should be visible before merge.

Agent Gate checks workflow-level permission escalation and dangerous workflow patterns such as:

write-all
id-token: write
pull_request_target checking out PR head
unpinned third-party actions
added secrets.* usage
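
The permission comparison can be sketched roughly as follows. This is an illustrative Python sketch under simplifying assumptions, not the project's code: `permission_escalations` is a hypothetical name, both inputs are already-parsed `permissions` maps from the base and head versions of a workflow, and shorthand values such as `write-all` are not handled. Scopes missing from the base map are treated as `none`, which matches how GitHub treats unspecified scopes when a `permissions` block is present.

```python
# Ordering of GitHub Actions permission levels, lowest to highest.
LEVELS = {"none": 0, "read": 1, "write": 2}


def permission_escalations(base: dict, head: dict) -> list[str]:
    """Report every scope whose level is higher in head than in base."""
    findings = []
    for scope, new_level in head.items():
        # Assumption: a scope absent from the base map counts as "none".
        old_level = base.get(scope, "none")
        if LEVELS[new_level] > LEVELS[old_level]:
            findings.append(
                f"{scope} permission increased from {old_level} to {new_level}."
            )
    return findings
```

Because the comparison only looks at two parsed maps, the same inputs always produce the same findings, which is the property that makes this kind of check suitable for CI.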

Agent control-plane drift

Files like these can change how future agents behave:

AGENTS.md
CLAUDE.md
.cursor/**
.github/copilot-instructions.md
.mcp.json

A PR that changes .mcp.json is not just changing config. It may be changing which tools an agent can call.

Agent Gate treats those files as an agent control plane and reports drift.

Missing test evidence

Agent Gate can define high-risk paths:

high_risk_paths:
  auth:
    paths:
      - src/auth/**
    require_tests:
      - tests/auth/**
    severity: error

If auth code changes but no matching auth test file changes, Agent Gate reports missing test evidence.

This does not prove semantic test coverage. It only checks deterministic file-pattern evidence.

That limitation is intentional.
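
The file-pattern check described above might look something like this in Python. It is a sketch under assumptions: `missing_test_evidence` is a hypothetical name, the policy is passed in as an already-parsed dictionary, and `fnmatch` approximates the real glob matching.

```python
from fnmatch import fnmatchcase


def matches_any(path, patterns):
    """True if the path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def missing_test_evidence(changed_files, high_risk_paths):
    """Flag high-risk rules whose code changed without a matching test-file change."""
    findings = []
    for name, rule in high_risk_paths.items():
        touched = [p for p in changed_files if matches_any(p, rule["paths"])]
        required = rule.get("require_tests", [])
        if not touched or not required:
            continue  # rule not triggered, or it does not require tests
        if not any(matches_any(p, required) for p in changed_files):
            findings.append(
                {"rule": name, "severity": rule.get("severity"), "paths": touched}
            )
    return findings
```

Note that the check passes as soon as any file under `tests/auth/**` changes. It says nothing about whether that test exercises the changed code.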

What a report looks like

One piece of early feedback was that the report should not start with a wall of rule IDs.

It should answer the maintainer's first question:

What should I do with this PR?

So the Markdown report now leads with a human decision.

Example shape:

Agent Gate: NEEDS HUMAN DECISION

Why:
This PR changed `.github/workflows/release.yml` and added `secrets.*` usage.

Recommended next step:
Review the workflow change before merging.

Policy status:
Warning today; eligible to become a merge gate after tuning.

The detailed rule findings still appear underneath.

The machine-readable JSON decision remains simple:

{
  "decision": "warn"
}

The human-facing report can say NEEDS HUMAN DECISION, while the machine-readable result stays pass, warn, or block.

The trust boundary

Agent Gate is designed around a conservative trust boundary.

At runtime, the GitHub Action:

does not check out PR code
does not execute repository scripts
does not call LLMs
does not execute MCP servers
does not load policy from the PR head branch

It reads PR metadata and changed-file contents through GitHub APIs.

It loads agent-gate.yml from the PR base branch, not from the untrusted PR branch.

That matters because a PR should not be able to weaken its own policy.
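
One way to picture the base-branch rule is to look at which ref the policy request is pinned to. The Python sketch below builds a GitHub REST "get repository content" URL from a `pull_request` event payload. The function name is hypothetical, and pinning to `pull_request.base.sha` is my assumption for illustration; the action may resolve the base branch differently.

```python
from urllib.parse import quote


def policy_contents_url(event: dict, policy_path: str = "agent-gate.yml") -> str:
    """Build the contents-API URL for the policy file at the PR's base commit."""
    repo = event["repository"]["full_name"]
    # Assumption: pin to the base commit, never to pull_request.head.
    base_sha = event["pull_request"]["base"]["sha"]
    return (
        f"https://api.github.com/repos/{repo}/contents/"
        f"{quote(policy_path)}?ref={base_sha}"
    )
```

Whatever the PR does to `agent-gate.yml` on its own branch, a request built this way never sees it, because the head ref is not part of the URL.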

Installing it

Agent Gate is available on GitHub Marketplace:

https://github.com/marketplace/actions/agent-gate-for-ai-prs

A minimal workflow looks like this:

name: Agent Gate

on:
  pull_request:
    types:
      - opened
      - synchronize
      - reopened
      - edited
      - labeled
      - unlabeled
      - ready_for_review

permissions:
  contents: read
  pull-requests: read

jobs:
  agent-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: sjh9714/Agent-Gate@v0.1.2
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          mode: warn
          fail-on-block: false

I recommend starting with:

mode: warn
fail-on-block: false

That gives you an observe-only rollout: Agent Gate reports what it finds without failing the check.

First learn what Agent Gate finds in your repository. Then promote only the policies that are useful and low-noise into merge gates.

A small starting policy could be:

version: 1
mode: warn

contract:
  required_for:
    - agent
  allow_missing_in_observe_mode: true

agent_detection:
  labels:
    - ai
    - agent
    - codex
  branch_patterns:
    - "codex/**"
    - "ai/**"

high_risk_paths:
  workflows:
    paths:
      - ".github/workflows/**"
    severity: error
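
For intuition, the `agent_detection` block above could be evaluated along these lines. This is a Python sketch with a hypothetical `is_agent_pr` helper. I am assuming that either signal (a matching label or a matching head branch) is enough to treat the PR as agent-authored; the real rule may combine them differently.

```python
from fnmatch import fnmatchcase


def is_agent_pr(labels, head_branch, detection):
    """Treat a PR as agent-authored if a label or the head branch matches."""
    # Assumption: one matching label is sufficient.
    if set(labels) & set(detection.get("labels", [])):
        return True
    # Assumption: otherwise fall back to branch-name patterns.
    return any(
        fnmatchcase(head_branch, pattern)
        for pattern in detection.get("branch_patterns", [])
    )
```

A PR from `codex/fix-session` with no labels would count as an agent PR here, and so would a PR labeled `ai` from any branch.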

Local replay demo

The repository includes an unsafe-pr-zoo with deterministic fixtures.

After cloning the repo, you can install dependencies, build the CLI, and replay a fixture:

pnpm install
pnpm --filter agent-gate build
node packages/cli/dist/main.js replay fixtures/unsafe-pr-zoo/workflow-permission-escalation

Example output:

Agent Gate: BLOCKED

ERROR workflow/permission-escalation
contents permission increased from read to write.
Path: .github/workflows/release.yml

ERROR workflow/dangerous-pattern
.github/workflows/release.yml contains a dangerous GitHub Actions workflow pattern.
Path: .github/workflows/release.yml

Other fixtures cover:

agent control-plane drift
out-of-scope agent edits
missing test evidence
MCP config drift

What Agent Gate is not

Agent Gate is not a replacement for code review.

It does not try to find every semantic bug.

It does not know whether a function is logically correct.

It does not prove that tests are sufficient.

It does not replace a human reviewer, an LLM reviewer, or normal CI.

It answers a narrower question:

Did this PR cross deterministic policy boundaries before merge?

That is the problem I want it to solve well.

Current limitations

v0.1.2 is still a prerelease.

Known limitations:

APIs, rule names, reports, and config may change.
CODEOWNERS and reviewer evidence are not implemented yet.
Package and dependency drift rules are not implemented yet.
GitHub Actions job-level permission escalation comparison is limited.
Test evidence is file-pattern based and does not prove semantic coverage.
PR comment upsert requires issues: write and can warn on fork PRs with read-only tokens.

What I want feedback on

I am especially interested in feedback from people trying AI-generated PRs in real repositories.

The main questions:

Which findings should block by default?
Which findings should stay warning-only?
What high-risk path patterns do you use?
Would CODEOWNERS or reviewer evidence make this more useful?
Should package script and dependency drift be part of the gate?
What would make this too noisy to adopt?

Feedback issue:

https://github.com/sjh9714/Agent-Gate/issues/27

Repository:

https://github.com/sjh9714/Agent-Gate

Closing thought

LLM reviewers are useful.

But if AI-generated PRs become part of normal engineering workflows, teams will also need deterministic gates.

Not every review question should be probabilistic.

Some questions are simple:

Did this PR stay within scope?

Did workflow permissions escalate?

Did agent control-plane files change?

Is there matching test evidence?

That is the space Agent Gate is trying to explore.

Use your LLM reviewer for judgment.

Use Agent Gate for deterministic merge evidence.

Disclosure: I used AI assistance to help draft and edit this article, and I reviewed the technical claims before publishing.

Top comments (2)

Alex Shev • Jun 15

This is exactly where AI PRs need a different contract from human PRs. A human reviewer can infer intent from context; an agent should have to prove the change stayed inside the requested boundary.

The CI firewall idea is strong because it makes "unexpected scope expansion" detectable before the review turns into archaeology.

JinHyuk Sung • Jun 15

Thanks — this is exactly the distinction I’m trying to make.

Human PRs can rely more on shared context and reviewer inference. For agent-generated PRs, I think the requested boundary needs to become explicit enough that CI can check it.

“Unexpected scope expansion” is a good way to describe one of the main risks. My current goal is to keep that check deterministic: declared scope, changed paths, workflow permission changes, agent-control-plane drift, and matching test evidence before the review turns into archaeology.