Governing AI Agents in CI/CD with OPA and MCP

#aiagents #opa #mcp #cicd

Originally published on devopsstart.com. This article covers a two-layer approach to governing AI agents in CI/CD: MCP for tool scoping and OPA for policy-as-code gating, with practical steps and code examples.

If an AI agent can open a pull request, then with the wrong permissions it can also merge one, drop a table, or push to production. Governing that agent means two separate controls, not one: constrain what tools it can reach, then gate every action it proposes before the pipeline executes it. The Model Context Protocol (MCP) handles the first job by scoping which tools an agent even sees. Open Policy Agent (OPA) handles the second by evaluating each proposed action against policy written as code. Wire both into CI/CD and an agent stops being an unbounded actor with your credentials and becomes a constrained one whose every move is checked.

This is not theoretical. Agents already run inside pipelines: generating Terraform, triaging failed builds, drafting release notes, and increasingly proposing the deploy itself. The question is no longer whether to allow them, but how to bound them.

Why an agent in your pipeline is a new kind of risk

A human engineer has judgment, a login, and an audit trail. An agent has a token, a set of tools, and a prompt that anyone upstream can influence. That last part is the dangerous one. A recent audit of thousands of environments found that agents fail open far more often than they fail closed, and that most of the damage came from agents holding more access than the task in front of them required. If you want the raw numbers, the Snyk environment audit is worth reading in full.

Two failure modes dominate. The first is the over-permissioned agent: it holds a token good for forty operations when the task needs three. The second is tool poisoning, where a malicious instruction hidden in a document, a code comment, or a tool description convinces the agent to call something it should not. Both come down to the same root cause. The agent's blast radius was never bounded to the task.

You cannot fix this with a better prompt. Prompts are input, and input is attacker-controlled the moment your agent reads a file, a webhook, or a comment. You fix it with controls that sit outside the model.

Layer one: MCP scopes what the agent can touch

MCP is the protocol that connects an agent to its tools. Every capability an agent has, reading a repo, running a query, calling a deploy API, is exposed through an MCP server. That makes the MCP boundary the natural place to answer one question: what is this agent even allowed to touch?

The protocol itself does not enforce security. In the specification, authorization is optional: it is the deployment's responsibility to turn it on, and for HTTP transports it builds on OAuth 2.1. That means the defaults are wide open unless you close them. Three rules matter in a pipeline.

Scope tokens to the task. The recommended pattern is a minimal starting scope containing only low-risk read and discovery operations, with privileged scopes granted incrementally when a specific operation first needs them. A build-triage agent gets read on logs and issues, nothing else. A deploy agent gets a deploy scope that is valid for one run and one environment, not a standing production credential.

Expose the fewest tools that work. Every tool you register with an agent is both an attack surface and a distraction. Over-exposed tool sets degrade the model's ability to pick the right one and hand an attacker more to aim at. If the agent's job is to comment on PRs, do not also register a shell tool.

Run MCP servers on managed infrastructure, not laptops. Containerized servers on a cluster give you uniform sandboxing, centralized patching, and credentials issued to a service account rather than copied into a developer's environment. The same discipline that keeps secrets out of your build logs applies here; the patterns in GitHub Actions secret hygiene carry over almost directly. For the specific threats an MCP server faces, prompt injection and secret exfiltration through tool responses, see MCP server security.

Scoping decides what is reachable. It does not decide whether a reachable action is a good idea right now. That is the second layer.

Layer two: OPA gates what the agent proposes

OPA is a general-purpose policy engine. You express rules in Rego, its declarative policy language, and OPA answers a single question for any input: is this allowed? It already sits in Kubernetes admission control and Terraform plan review. Putting it in front of an agent's proposed action is the same move applied to a new actor.

The pattern is deny-by-default. The agent does not act directly. It emits a structured plan, OPA evaluates that plan against policy, and the pipeline executes only if OPA returns no denials. Here is a policy that blocks any deploy an agent proposes to a protected namespace without recorded human approval:

package ci.agent

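# On OPA 1.x these keywords are built in, so the imports are optional.
# They are kept so the policy also loads on older 0.x releases.
import future.keywords.contains
import future.keywords.if
import future.keywords.in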

# Deny agent deploys to protected namespaces without human sign-off.
deny contains msg if {
    input.action == "deploy"
    input.namespace in {"kube-system", "prod-payments"}
    not input.approved_by_human
    msg := sprintf("agent deploy to %q requires human approval", [input.namespace])
}

# Deny any tool call outside the allowlist for this pipeline.
deny contains msg if {
    some call in input.tool_calls
    not call.name in {"read_logs", "comment_pr", "open_pr"}
    msg := sprintf("tool %q is not permitted in this job", [call.name])
}

The agent writes its intended actions to a JSON file, and the pipeline evaluates that file. OPA's --fail-defined flag makes opa eval exit non-zero whenever the query returns a defined result. An empty set still counts as defined, so the query iterates over the members of the deny set (deny[_]) rather than naming the set itself. That way the query is defined only when at least one denial exists, and a violated policy fails the job the same way a failed test does:

$ curl -L -o opa https://openpolicyagent.org/downloads/v1.5.0/opa_linux_amd64_static
$ chmod 755 ./opa
$ ./opa eval --fail-defined --format pretty \
    --data policy/ \
    --input agent-plan.json \
    'data.ci.agent.deny[_]'

If any deny rule fires, opa eval exits non-zero and the pipeline stops before the agent's plan ever runs. If the plan is clean, the query is undefined, opa eval exits zero, and the job continues.
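To make the input side concrete, here is a minimal sketch of a plan file the gate could evaluate. The field names (action, namespace, approved_by_human, tool_calls, name) are the ones the policy above reads; the values, including the run_shell tool, are made up for illustration, and a real agent's plan format may carry more fields.

```shell
# Write a hypothetical agent plan. Only the field names come from the
# policy; every value here is illustrative.
cat > agent-plan.json <<'EOF'
{
  "action": "deploy",
  "namespace": "prod-payments",
  "approved_by_human": false,
  "tool_calls": [
    {"name": "read_logs"},
    {"name": "run_shell"}
  ]
}
EOF
```

Evaluated against the policy above, this plan should trip both deny rules: the deploy targets a protected namespace without approval, and run_shell is not in the tool allowlist.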

In GitHub Actions this is a two-step job using the official setup action:

- name: Set up OPA
  uses: open-policy-agent/setup-opa@v2
  with:
    version: 1.5.0

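- name: Gate the agent plan
  run: |
    # deny[_] is defined only when at least one denial exists, so
    # --fail-defined fails the job only on a real violation.
    opa eval --fail-defined --format pretty \
      --data policy/ --input agent-plan.json 'data.ci.agent.deny[_]'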

Test the policies themselves the same way you test application code. Rego has a native test runner, and you should run it on every change to the policy files:

$ ./opa test policy/ -v
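A minimal sketch of what such a test file might contain, written here with a heredoc so it can be dropped into a setup script. The file name policy/agent_test.rego, the test names, and the run_shell tool are assumptions for illustration; the package and rule names match the policy shown earlier. It uses import rego.v1, which assumes OPA 0.59 or newer.

```shell
mkdir -p policy
# Hypothetical unit tests for the ci.agent policy.
cat > policy/agent_test.rego <<'EOF'
package ci.agent_test

import rego.v1
import data.ci.agent

# An unapproved deploy to a protected namespace yields exactly one denial.
test_unapproved_protected_deploy_denied if {
    count(agent.deny) == 1 with input as {
        "action": "deploy",
        "namespace": "prod-payments",
        "tool_calls": []
    }
}

# The same deploy passes once human approval is recorded.
test_approved_protected_deploy_allowed if {
    count(agent.deny) == 0 with input as {
        "action": "deploy",
        "namespace": "prod-payments",
        "approved_by_human": true,
        "tool_calls": []
    }
}

# A tool outside the allowlist is denied (run_shell is a made-up name).
test_unlisted_tool_denied if {
    count(agent.deny) == 1 with input as {
        "action": "comment",
        "tool_calls": [{"name": "run_shell"}]
    }
}
EOF
```

Each test pins the input with the with keyword, so the policy is exercised without any real plan file. Covering both the denied and the allowed case for a rule guards against a policy that quietly denies everything or nothing.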

For plans expressed as raw HCL or multi-format config rather than JSON, reach for Conftest, which wraps the same Rego engine but parses formats opa eval cannot. This is the exact mechanism behind gating human-authored infrastructure too; the Terraform architecture firewall uses policy-as-code the same way, just with a person on the other side instead of an agent.

Wiring both layers into one pipeline

The two controls compose into a single flow. Neither is sufficient alone: MCP scoping without an OPA gate lets a correctly-scoped agent still make a bad call, and an OPA gate without MCP scoping means the agent can reach tools your policy never anticipated.

1. The agent starts with an MCP token scoped to this job's tools only, issued for this run and no longer.
2. Instead of executing, the agent writes a structured plan (actions, targets, tool calls) to agent-plan.json.
3. OPA evaluates the plan with --fail-defined. A single denial fails the job.
4. On a clean pass, the pipeline executes the plan using the same scoped credentials, so the agent still cannot exceed its MCP grant even during execution.
5. Every step, the plan, the OPA decision, and the execution, is written to an append-only audit log keyed to the agent identity.

This table makes the division of labor concrete:

Control	Question it answers	Mechanism	Failure it stops
MCP scoping	What can the agent touch?	OAuth 2.1 scopes, tool allowlist	Over-permissioned agent, tool poisoning
OPA gate	Is this action allowed now?	Rego policy, `opa eval --fail-defined`	Unapproved deploy, out-of-policy call
Audit log	What did it actually do?	Attribution logging per agent identity	Silent drift, no forensic trail
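The audit-log row can be sketched as one JSON line per pipeline stage, keyed to the agent identity. Everything named here (AUDIT_LOG, AGENT_ID, the audit function, the field names) is hypothetical, and a local file is not truly append-only; in practice that guarantee would have to come from the storage layer or a log service.

```shell
# Hypothetical audit helper: one JSON line per stage, keyed to the agent.
# Values are not JSON-escaped, so keep them to simple tokens.
AUDIT_LOG="${AUDIT_LOG:-audit.log}"
AGENT_ID="${AGENT_ID:-build-triage-agent}"

audit() {
    # $1 = stage (plan | decision | execution), $2 = outcome
    printf '{"ts":"%s","agent":"%s","stage":"%s","outcome":"%s"}\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$AGENT_ID" "$1" "$2" >> "$AUDIT_LOG"
}

audit plan written
audit decision allowed
audit execution succeeded
```

Writing the decision as its own line, separate from the execution, is what lets a reviewer later tell a plan that was denied apart from one that was allowed and then failed.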

Credentials: keep them short-lived and out of context

An agent that holds a long-lived production token is one prompt injection away from using it. The fix is the same one you already apply to workloads: issue dynamic, short-lived credentials at execution time and never place them where the model can read them. HashiCorp Vault Agent's auto-auth and templating features are built for exactly this: auto-auth authenticates on the workload's behalf, and templating renders credentials that can expire in minutes into the execution environment rather than the prompt. The Vault documentation covers the auto-auth methods in detail.

The rule to hold onto: a credential the model can read is a credential the model can leak. Keep the token in the execution environment, keep it out of the context window, and keep it valid for as short a window as the job allows.
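One way to enforce that rule mechanically is a pre-flight check that refuses to start the agent if the secret's value appears in anything the model will read. This is a sketch under assumptions: CONTEXT_DIR, DEPLOY_TOKEN, and the context_leaks_secret function are illustrative names, and it only catches verbatim copies of the value, not encoded or partial ones.

```shell
# Hypothetical guard: scan model-readable files for the literal secret.
context_leaks_secret() {
    # $1 = secret value, $2 = directory of files loaded into the context
    grep -rqF -- "$1" "$2"
}

CONTEXT_DIR="${CONTEXT_DIR:-agent-context}"
mkdir -p "$CONTEXT_DIR"

# Skip the scan when no token is set; an empty pattern would match everything.
if [ -n "${DEPLOY_TOKEN:-}" ] && context_leaks_secret "$DEPLOY_TOKEN" "$CONTEXT_DIR"; then
    echo "refusing to run: credential found in agent context" >&2
    exit 1
fi
```

The check runs before the agent is invoked, so a leaked credential stops the job instead of reaching the model.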

What OPA and MCP do not give you

Be honest about the boundary. This design bounds what an agent can do; it does not make the agent's judgment correct. An agent can still propose a technically-allowed action that is a bad idea, and OPA will pass it because policy said yes. Policy encodes the rules you thought to write, nothing more.

Three gaps are worth naming. First, your policies are only as good as the actions you anticipated; a novel action your Rego never modeled produces no denial, and an empty deny set passes, unless you also deny every action type that is not explicitly allowlisted. Second, the structured-plan approach assumes the agent honestly reports what it intends to do, so the execution step must use the same scoped credential rather than trusting the plan to be complete. Third, the audit log is worthless if nobody reads it; wire high-severity denials to the same alerting path as a failed deploy. Governance is a habit, not a config file you write once.
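The first gap can be narrowed with a rule that denies any action type the policy has not explicitly modeled. The sketch below adds a second file to the same ci.agent package; the file name, the known_actions set, and its members are assumptions, so substitute the action types your own plans use.

```shell
mkdir -p policy
# Hypothetical catch-all rules: anything not explicitly modeled is denied.
cat > policy/default_deny.rego <<'EOF'
package ci.agent

import rego.v1

# Illustrative allowlist of action types this pipeline has modeled.
known_actions := {"deploy", "comment", "open_pr"}

# Deny a plan that names an action outside the allowlist.
deny contains msg if {
    not input.action in known_actions
    msg := sprintf("action %q is not modeled by policy", [input.action])
}

# Deny a plan that omits the action field entirely; without this rule the
# sprintf above would be undefined and the plan would pass.
deny contains "plan has no action field" if not input.action
EOF
```

Because deny is a partial set, these rules add to the existing ones rather than replacing them, so the gate command does not need to change.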

None of that argues against the two layers. It argues for treating them as a floor, not a ceiling. Scope the agent so a compromised prompt cannot reach dangerous tools. Gate every action so a reachable tool still cannot run an out-of-policy operation. Log everything so a human can reconstruct what happened. Then keep tightening the policy as you learn what your agents actually try to do. That last loop, watching real agent behavior and turning surprises into new deny rules, is where the governance actually lives.

You do not need to solve agent governance for the whole industry. You need to bound the specific agents running in your specific pipeline, and both tools to do that are open source, already in your stack, and evaluate in milliseconds. Start with deny-by-default on your most dangerous action, deploy to production, and expand outward from there.