JinHyuk Sung

I built a deterministic CI firewall for AI-generated pull requests

AI coding agents are getting good enough to open pull requests.

That is useful.

It also changes the review problem.

A normal code review asks:

Does this code look correct?

An AI-generated PR also raises a different question:

Did this agent change something I did not intend, and does this PR have enough evidence to merge?

That second question is why I built Agent Gate for AI PRs.

Agent Gate is still a prerelease, so I am starting with a narrow goal: make AI-generated PRs easier to inspect before merge.

The core idea

Agent Gate is a deterministic CI firewall for AI-generated pull requests.

It is not an LLM reviewer.

That distinction matters.

LLM reviewers help with judgment. Agent Gate verifies deterministic merge evidence.

An LLM reviewer can tell you whether code looks suspicious. Agent Gate checks whether the PR crossed policy boundaries that should be explainable and repeatable in CI.

The mental model is:

Use your LLM reviewer for judgment.

Use Agent Gate for deterministic merge evidence.

Agent Gate checks questions that should not require an LLM:

  • did the PR stay inside its declared scope?
  • did workflow permissions escalate?
  • did agent control-plane files drift?
  • did high-risk code change without matching test-file evidence?
  • did MCP config changes get surfaced?

These are not semantic code review questions. They are merge-boundary questions.

Why I wanted this

Imagine an agent is asked to fix an auth session bug.

The expected scope might be:

allowed_paths:
  - src/auth/**
  - tests/auth/**

But the PR also changes:

src/payments/webhook.ts
.github/workflows/release.yml
.mcp.json

A reviewer might catch that. An LLM reviewer might catch that. But I do not want this to depend only on someone noticing.

I want CI to say:

This PR crossed its declared scope.
This PR changed workflow permissions.
This PR changed the agent tool surface.
This PR needs human decision before merge.

That is the shape of Agent Gate.

What Agent Gate catches today

The current v0.1.2 release is intentionally focused on deterministic checks.

It can flag or block the following, depending on policy mode.

Out-of-contract edits

Agent Gate can parse a small PR body contract:

<!-- agent-gate-contract
version: 1
agent: codex
task: update auth session handling
allowed_paths:
  - src/auth/**
  - tests/auth/**
required_evidence:
  - matching auth tests changed
-->

If the PR changes files outside allowed_paths, Agent Gate reports that as a contract escape.
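
As a rough illustration of that check (a sketch, not Agent Gate's actual implementation), the comparison can be a glob match over the changed-file list. The helper names `globToRegExp` and `findContractEscapes` are hypothetical, and this simplified matcher only understands `*` and `**`.

```typescript
// Hypothetical helper: convert a simple glob into an anchored RegExp.
// "**" matches any characters including "/", "*" matches any characters except "/".
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        source += ".*";
        i += 2;
      } else {
        source += "[^/]*";
        i += 1;
      }
    } else {
      // Escape regex metacharacters so "." in a file name stays literal.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// Return every changed file that no allowed_paths pattern covers.
function findContractEscapes(changedFiles: string[], allowedPaths: string[]): string[] {
  const allowed = allowedPaths.map((glob) => globToRegExp(glob));
  return changedFiles.filter((file) => !allowed.some((re) => re.test(file)));
}

const escapes = findContractEscapes(
  [
    "src/auth/session.ts",
    "tests/auth/session.test.ts",
    "src/payments/webhook.ts",
    ".github/workflows/release.yml",
    ".mcp.json",
  ],
  ["src/auth/**", "tests/auth/**"]
);
console.log(escapes);
// ["src/payments/webhook.ts", ".github/workflows/release.yml", ".mcp.json"]
```

Because the check is a pure function of the file list and the contract, the same PR always produces the same result.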

Workflow permission escalation

GitHub Actions workflows are powerful. If a PR changes this:

permissions:
  contents: read

to this:

permissions:
  contents: write

that should be visible before merge.

Agent Gate checks workflow-level permission escalation and dangerous workflow patterns such as:

  • write-all
  • id-token: write
  • pull_request_target checking out PR head
  • unpinned third-party actions
  • added secrets.* usage
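
A minimal sketch of the permission comparison, assuming the `permissions` blocks from the base and head versions of a workflow have already been parsed into plain objects. The type and function names are hypothetical, and treating a missing scope as `none` is an assumption of this sketch rather than a documented Agent Gate behavior.

```typescript
// GitHub Actions permission levels, ordered from least to most access.
type Level = "none" | "read" | "write";
type Permissions = { [scope: string]: Level };

interface Escalation {
  scope: string;
  from: Level;
  to: Level;
}

function rank(level: Level): number {
  return level === "write" ? 2 : level === "read" ? 1 : 0;
}

// Report every scope whose level is higher in the PR head than in the base.
// Assumption: a scope that is absent from a map counts as "none".
function findEscalations(base: Permissions, head: Permissions): Escalation[] {
  const found: Escalation[] = [];
  for (const scope of Object.keys(head)) {
    const before: Level = base[scope] || "none";
    const after: Level = head[scope] || "none";
    if (rank(after) > rank(before)) {
      found.push({ scope: scope, from: before, to: after });
    }
  }
  return found;
}

console.log(findEscalations({ contents: "read" }, { contents: "write" }));
// [ { scope: "contents", from: "read", to: "write" } ]
```

The other patterns in the list (such as unpinned actions or added `secrets.*` usage) would need their own text-level checks and are not covered here.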

Agent control-plane drift

Files like these can change how future agents behave:

AGENTS.md
CLAUDE.md
.cursor/**
.github/copilot-instructions.md
.mcp.json

A PR that changes .mcp.json is not just changing config. It may be changing which tools an agent can call.

Agent Gate treats those files as an agent control plane and reports drift.
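
One way to picture the drift check is as a membership test over the changed files. The sketch below uses the paths listed above; the constant and function names are hypothetical, and matching only at the repository root is a simplifying assumption.

```typescript
// Paths taken from the list above; the matching logic is an illustrative assumption.
const CONTROL_PLANE_FILES = [
  "AGENTS.md",
  "CLAUDE.md",
  ".github/copilot-instructions.md",
  ".mcp.json",
];
const CONTROL_PLANE_DIRS = [".cursor/"];

// Return the changed files that belong to the agent control plane.
function findControlPlaneDrift(changedFiles: string[]): string[] {
  return changedFiles.filter(
    (file) =>
      CONTROL_PLANE_FILES.indexOf(file) !== -1 ||
      CONTROL_PLANE_DIRS.some((dir) => file.indexOf(dir) === 0)
  );
}

console.log(findControlPlaneDrift(["src/auth/session.ts", ".mcp.json", ".cursor/rules/style.md"]));
// [ ".mcp.json", ".cursor/rules/style.md" ]
```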

Missing test evidence

The policy file can define high-risk paths:

high_risk_paths:
  auth:
    paths:
      - src/auth/**
    require_tests:
      - tests/auth/**
    severity: error

If auth code changes but no matching auth test file changes, Agent Gate reports missing test evidence.

This does not prove semantic test coverage. It only checks deterministic file-pattern evidence.

That limitation is intentional.
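
To make the file-pattern nature of this check concrete, here is a small sketch. The `HighRiskRule` shape mirrors the YAML above but is hypothetical, as are the function names, and the matcher only handles globs of the form `dir/**` by treating them as directory prefixes.

```typescript
interface HighRiskRule {
  id: string;
  paths: string[];
  requireTests: string[];
  severity: "warning" | "error";
}

// Simplification: "dir/**" is handled as the directory prefix "dir/".
function matchesDirGlob(file: string, glob: string): boolean {
  const prefix = glob.replace(/\*\*$/, "");
  return file.indexOf(prefix) === 0;
}

// Return the ids of rules whose high-risk paths changed without any matching test file.
function missingTestEvidence(changedFiles: string[], rules: HighRiskRule[]): string[] {
  const missing: string[] = [];
  for (const rule of rules) {
    const touched = changedFiles.some((f) => rule.paths.some((g) => matchesDirGlob(f, g)));
    const tested = changedFiles.some((f) => rule.requireTests.some((g) => matchesDirGlob(f, g)));
    if (touched && !tested) {
      missing.push(rule.id);
    }
  }
  return missing;
}

const authRules: HighRiskRule[] = [
  { id: "auth", paths: ["src/auth/**"], requireTests: ["tests/auth/**"], severity: "error" },
];

console.log(missingTestEvidence(["src/auth/session.ts"], authRules));
// [ "auth" ]
console.log(missingTestEvidence(["src/auth/session.ts", "tests/auth/session.test.ts"], authRules));
// []
```

Note that the second call passes even if the test file change is trivial, which is exactly the limitation described above.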

What a report looks like

One piece of early feedback was that the report should not start with a wall of rule IDs.

It should answer the maintainer's first question:

What should I do with this PR?

So the Markdown report now leads with a human decision.

Example shape:

Agent Gate: NEEDS HUMAN DECISION

Why:
This PR changed `.github/workflows/release.yml` and added `secrets.*` usage.

Recommended next step:
Review the workflow change before merging.

Policy status:
Warning today; eligible to become a merge gate after tuning.

The detailed rule findings still appear underneath.

The machine-readable JSON decision remains simple:

{
  "decision": "warn"
}

The human-facing report can say NEEDS HUMAN DECISION, while the machine-readable result stays pass, warn, or block.
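
The split between the two outputs could be modeled roughly as follows. This is a sketch under stated assumptions: the `enforce` mode name, the rule that errors only block outside warn mode, and the `PASSED` headline are all hypothetical and not taken from Agent Gate.

```typescript
type Severity = "warning" | "error";
type Decision = "pass" | "warn" | "block";
// "warn" appears in the article; "enforce" is a hypothetical name for a blocking mode.
type Mode = "warn" | "enforce";

// Machine-readable decision: always one of pass, warn, or block.
// Assumption: an error finding blocks only when the mode is not "warn".
function decide(severities: Severity[], mode: Mode): Decision {
  if (severities.length === 0) {
    return "pass";
  }
  const hasError = severities.indexOf("error") !== -1;
  return hasError && mode === "enforce" ? "block" : "warn";
}

// Human-facing headline derived from the machine decision.
function headline(decision: Decision): string {
  if (decision === "block") {
    return "Agent Gate: BLOCKED";
  }
  if (decision === "warn") {
    return "Agent Gate: NEEDS HUMAN DECISION";
  }
  return "Agent Gate: PASSED"; // hypothetical wording
}

const machineDecision = decide(["error"], "warn");
console.log(JSON.stringify({ decision: machineDecision })); // {"decision":"warn"}
console.log(headline(machineDecision)); // Agent Gate: NEEDS HUMAN DECISION
```

Keeping the machine vocabulary small means downstream automation only ever has three values to branch on, whatever the report wording becomes.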

The trust boundary

Agent Gate is designed around a conservative trust boundary.

At runtime, the GitHub Action:

  • does not check out PR code
  • does not execute repository scripts
  • does not call LLMs
  • does not execute MCP servers
  • does not load policy from the PR head branch

It reads PR metadata and changed-file contents through GitHub APIs.

It loads agent-gate.yml from the PR base branch, not from the untrusted PR branch.

That matters because a PR should not be able to weaken its own policy.
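
To illustrate the base-versus-head distinction, the sketch below builds the two GitHub REST API requests such a design implies: the policy is read from the contents endpoint at the base commit, and the changed files come from the pull request files endpoint. The function names are hypothetical, the policy file is assumed to live at the repository root, and pinning to `base.sha` rather than the base branch name is a choice made for this example.

```typescript
// Subset of the pull_request webhook payload used by this sketch.
interface PullRequestEvent {
  number: number;
  repository: { full_name: string };
  pull_request: { base: { sha: string }; head: { sha: string } };
}

const GITHUB_API = "https://api.github.com";

// Policy is always resolved against the base commit, never the PR head.
function policyUrl(prEvent: PullRequestEvent): string {
  return (
    GITHUB_API +
    "/repos/" + prEvent.repository.full_name +
    "/contents/agent-gate.yml?ref=" + prEvent.pull_request.base.sha
  );
}

// Changed files are listed through the API instead of checking out the branch.
function changedFilesUrl(prEvent: PullRequestEvent): string {
  return GITHUB_API + "/repos/" + prEvent.repository.full_name + "/pulls/" + prEvent.number + "/files";
}

const sampleEvent: PullRequestEvent = {
  number: 7,
  repository: { full_name: "octo/repo" }, // placeholder repository
  pull_request: { base: { sha: "abc123" }, head: { sha: "def456" } },
};

console.log(policyUrl(sampleEvent));
// https://api.github.com/repos/octo/repo/contents/agent-gate.yml?ref=abc123
console.log(changedFilesUrl(sampleEvent));
// https://api.github.com/repos/octo/repo/pulls/7/files
```

Since `head.sha` never appears in the policy request, a PR that edits `agent-gate.yml` cannot influence the policy used to evaluate it.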

Installing it

Agent Gate is available on GitHub Marketplace:

https://github.com/marketplace/actions/agent-gate-for-ai-prs

A minimal workflow looks like this:

name: Agent Gate

on:
  pull_request:
    types:
      - opened
      - synchronize
      - reopened
      - edited
      - labeled
      - unlabeled
      - ready_for_review

permissions:
  contents: read
  pull-requests: read

jobs:
  agent-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: sjh9714/Agent-Gate@v0.1.2
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          mode: warn
          fail-on-block: false

I recommend starting with:

mode: warn
fail-on-block: false

That gives you an observe-only path: findings are reported, but they do not fail the check.

First learn what Agent Gate finds in your repository. Then promote only the policies that are useful and low-noise into merge gates.

A small starting policy could be:

version: 1
mode: warn

contract:
  required_for:
    - agent
  allow_missing_in_observe_mode: true

agent_detection:
  labels:
    - ai
    - agent
    - codex
  branch_patterns:
    - "codex/**"
    - "ai/**"

high_risk_paths:
  workflows:
    paths:
      - ".github/workflows/**"
    severity: error
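
The `agent_detection` section above suggests a simple rule for deciding whether a PR counts as agent-authored. The following is a sketch of one plausible reading, not the tool's actual logic: the `isAgentPr` name is hypothetical, a label match or a branch match is assumed to be sufficient on its own, and branch patterns ending in `**` are treated as prefixes.

```typescript
interface AgentDetection {
  labels: string[];
  branchPatterns: string[];
}

// Assumption: a PR is agent-authored if any label matches OR the head branch matches a pattern.
function isAgentPr(prLabels: string[], headBranch: string, detection: AgentDetection): boolean {
  const labelHit = prLabels.some((label) => detection.labels.indexOf(label) !== -1);
  const branchHit = detection.branchPatterns.some((pattern) => {
    // Simplification: "codex/**" is handled as the prefix "codex/".
    const prefix = pattern.replace(/\*\*$/, "");
    return headBranch.indexOf(prefix) === 0;
  });
  return labelHit || branchHit;
}

const detection: AgentDetection = {
  labels: ["ai", "agent", "codex"],
  branchPatterns: ["codex/**", "ai/**"],
};

console.log(isAgentPr(["bug"], "codex/fix-auth-session", detection)); // true
console.log(isAgentPr(["bug"], "feature/login", detection)); // false
```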

Local replay demo

The repository includes an unsafe-pr-zoo with deterministic fixtures.

After cloning the repo and installing dependencies, you can run:

pnpm install
pnpm --filter agent-gate build
node packages/cli/dist/main.js replay fixtures/unsafe-pr-zoo/workflow-permission-escalation

Example output:

Agent Gate: BLOCKED

ERROR workflow/permission-escalation
contents permission increased from read to write.
Path: .github/workflows/release.yml

ERROR workflow/dangerous-pattern
.github/workflows/release.yml contains a dangerous GitHub Actions workflow pattern.
Path: .github/workflows/release.yml

Other fixtures cover:

  • agent control-plane drift
  • out-of-scope agent edits
  • missing test evidence
  • MCP config drift

What Agent Gate is not

Agent Gate is not a replacement for code review.

It does not try to find every semantic bug.

It does not know whether a function is logically correct.

It does not prove that tests are sufficient.

It does not replace a human reviewer, an LLM reviewer, or normal CI.

It answers a narrower question:

Did this PR cross deterministic policy boundaries before merge?

That is the problem I want it to solve well.

Current limitations

v0.1.2 is still a prerelease.

Known limitations:

  • APIs, rule names, reports, and config may change.
  • CODEOWNERS and reviewer evidence are not implemented yet.
  • Package and dependency drift rules are not implemented yet.
  • GitHub Actions job-level permission escalation comparison is limited.
  • Test evidence is file-pattern based and does not prove semantic coverage.
  • PR comment upsert requires issues: write and can warn on fork PRs with read-only tokens.

What I want feedback on

I am especially interested in feedback from people trying AI-generated PRs in real repositories.

The main questions:

  • Which findings should block by default?
  • Which findings should stay warning-only?
  • What high-risk path patterns do you use?
  • Would CODEOWNERS or reviewer evidence make this more useful?
  • Should package script and dependency drift be part of the gate?
  • What would make this too noisy to adopt?

Feedback issue:

https://github.com/sjh9714/Agent-Gate/issues/27

Repository:

https://github.com/sjh9714/Agent-Gate

Closing thought

LLM reviewers are useful.

But if AI-generated PRs become part of normal engineering workflows, teams will also need deterministic gates.

Not every review question should be probabilistic.

Some questions are simple:

Did this PR stay within scope?

Did workflow permissions escalate?

Did agent control-plane files change?

Is there matching test evidence?

That is the space Agent Gate is trying to explore.

Use your LLM reviewer for judgment.

Use Agent Gate for deterministic merge evidence.

Disclosure: I used AI assistance to help draft and edit this article, and I reviewed the technical claims before publishing.
