Pratheesh Satheesh Kumar

Posted on May 28

Is Agentic AI Security the Next Crisis for Platform Engineers in 2026?

#agenticaisecurity #platformengineering #geordieai

Quick Answer:
Geordie AI's $30M Series A is a clear signal that enterprise adoption of agentic AI is outpacing security controls. As a platform engineer, you need to start treating AI agents as first-class workloads with dedicated observability, access controls, and error budgeting before unmanaged agent behaviour creates cascading production incidents.

What You Will Learn

What agentic AI security means in a platform engineering context
Why existing observability and security patterns fall short for AI agents
How to apply SLOs and error budgets to agentic workloads
Concrete steps to integrate agent-native security into your CI/CD pipelines
Common mistakes teams make when deploying AI agents in production

What Is Agentic AI Security?

Agentic AI security is the discipline of ensuring that autonomous AI agents – systems that can plan, reason, and execute actions without human intervention – operate within defined security, reliability, and compliance boundaries. It combines real-time behavioural observability, fine-grained access control, and proactive risk governance. For platform engineers, this means agents are a new workload type that demands its own golden signals, error budgets, and incident runbooks.

Why Does Agentic AI Create a Security Problem?

Unpredictable execution paths: Unlike traditional microservices, agents can chain API calls and tool use in ways that are not statically predictable, breaking static security analysis.
Elevated lateral movement: An agent with excessive permissions can move across services, data stores, and cloud APIs faster than any human operator.
Blind spots in observability: Existing observability stacks track request/response latency and error rates, but not the intent or reasoning behind agent decisions.
Shift-left doesn’t work out of the box: Security scanning of agent code is necessary but insufficient because agent behaviour depends on runtime context.
No established SLIs for agent reliability: Without defined SLIs, teams have no way to defend error budgets or measure deployment success.

At-a-Glance Summary

Factor	Details
Core risk	Agents operate with autonomy, increasing blast radius of misconfigurations
Observability gap	Traditional golden signals (latency, traffic, errors, saturation) miss agent intent
Access control challenge	Agents need dynamic, least-privilege permissions that are hard to model with static IAM
Incident response	MTTR for agent-related incidents currently exceeds 4 hours in most early-adopter teams
Regulatory pressure	NIST AI RMF and CISA guidelines now reference agentic risk; compliance audits are coming
DORA metrics impact	Uncontrolled agent deployments degrade change failure rate and lead time for changes
Funding signal	Geordie AI's $30M round validates that agent security is a distinct market need

How to Secure Agentic AI in Your Platform

Step 1 — Define agent-specific SLIs

Start with three new golden signals: agent action success rate, permission violation frequency, and decision latency. These form the basis of an SLO for agent reliability. A practical approach used at Pratheesh-tech is to instrument every agent step with OpenTelemetry spans that capture the reasoning trace, not just the API call.
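As a minimal sketch, the three signals could be computed from recorded agent steps like this. The `AgentStep` schema and the `compute_slis` function are hypothetical names used for illustration, not part of OpenTelemetry; in practice the same fields would likely travel as attributes on each span. Reporting decision latency as a nearest-rank p95 is an assumption, since the text does not specify an aggregation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentStep:
    """One recorded agent action (hypothetical schema for illustration)."""
    succeeded: bool             # did the action complete as intended?
    permission_denied: bool     # was the call rejected by access control?
    decision_latency_ms: float  # time the agent spent deciding before acting


def compute_slis(steps: list[AgentStep]) -> dict[str, float]:
    """Return the three agent SLIs for a batch of recorded steps."""
    if not steps:
        raise ValueError("need at least one step to compute SLIs")
    total = len(steps)
    latencies = sorted(s.decision_latency_ms for s in steps)
    # Nearest-rank p95: rank = ceil(0.95 * total), converted to a 0-based index.
    p95_index = max(0, -(-95 * total // 100) - 1)
    return {
        "action_success_rate": sum(s.succeeded for s in steps) / total,
        "permission_violation_rate": sum(s.permission_denied for s in steps) / total,
        "decision_latency_p95_ms": latencies[p95_index],
    }
```

Keeping the SLI calculation separate from the tracing backend means the same function can run against exported spans, a log file, or test fixtures.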

Step 2 — Apply error budgets to agent deployments

Treat each agent version as a deployable unit with its own error budget. If its action success rate falls below the SLO threshold (e.g., 99.9%), the budget is exhausted: halt further canary rollouts and trigger an incident runbook. This prevents bad agent behaviours from escalating.
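A rough sketch of that gate, assuming the rollout controller can supply action counts for the agent version under test. The function names are hypothetical, and the 99.9% default simply mirrors the example threshold in the text.

```python
def error_budget_remaining(total_actions: int, failed_actions: int, slo: float = 0.999) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0.0 or below is spent."""
    if total_actions <= 0:
        raise ValueError("total_actions must be positive")
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in (0, 1]")
    allowed_failures = (1.0 - slo) * total_actions
    if allowed_failures == 0:
        # A 100% SLO leaves no budget: any failure exhausts it.
        return 0.0 if failed_actions else 1.0
    return 1.0 - failed_actions / allowed_failures


def canary_may_proceed(total_actions: int, failed_actions: int, slo: float = 0.999) -> bool:
    """Gate the next canary stage on having error budget left."""
    return error_budget_remaining(total_actions, failed_actions, slo) > 0.0


# Example: 10,000 actions at a 99.9% SLO allow roughly 10 failures.
# 5 failures leaves about half the budget; 20 failures overspends it.
if not canary_may_proceed(total_actions=10_000, failed_actions=20):
    print("halt rollout and open the agent incident runbook")
```

Expressing the gate as remaining budget rather than a raw success rate makes it easier to alert early, for example when less than a quarter of the budget is left.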

Step 3 — Implement behavioural canary testing

Before routing production traffic to a new agent, run it in a sandbox with simulated tool calls. Compare its action sequence against an allowed pattern. Reject any deviation. This is analogous to chaos engineering but for agent intent.
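One way to express an "allowed pattern" is as a set of permitted transitions between tool calls. The sketch below assumes the sandbox records each run as an ordered list of tool names; the tool names and the `first_deviation` helper are invented for illustration.

```python
# Permitted (previous_action, next_action) pairs for a hypothetical log-analysis agent.
ALLOWED_TRANSITIONS = frozenset({
    ("START", "read_logs"),
    ("read_logs", "summarise"),
    ("summarise", "open_ticket"),
    ("summarise", "END"),
    ("open_ticket", "END"),
})


def first_deviation(actions, allowed=ALLOWED_TRANSITIONS):
    """Return the first disallowed (previous, next) pair, or None if the run conforms."""
    path = ["START", *actions, "END"]
    for previous, following in zip(path, path[1:]):
        if (previous, following) not in allowed:
            return (previous, following)
    return None


def canary_passes(sandbox_runs) -> bool:
    """Reject the candidate agent if any sandbox run leaves the allowed pattern."""
    return all(first_deviation(run) is None for run in sandbox_runs)
```

A transition allowlist is stricter than a plain tool allowlist: it also catches legitimate tools being called in an unexpected order.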

Step 4 — Enforce zero-trust agent identity

Each agent must have a workload identity that is short-lived and scoped to exactly the APIs it needs. Use service mesh policies to enforce that only agents with valid signed JWTs can call internal endpoints. Revoke credentials as soon as the agent’s task completes.
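To make the idea of a short-lived, scoped credential concrete, here is a toy sketch using only the Python standard library. It is a stand-in for a real signed JWT or workload identity system, not a replacement for one: the token format, `issue_token`, and `authorize` are all assumptions made for illustration, and a shared HMAC secret would not be appropriate in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(secret: bytes, agent_id: str, scopes: list[str],
                ttl_seconds: int, now: Optional[float] = None) -> str:
    """Issue a signed token that names the agent, its scopes, and an expiry."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "scopes": sorted(scopes), "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def authorize(secret: bytes, token: str, required_scope: str,
              now: Optional[float] = None) -> bool:
    """Allow the call only if the signature is valid, the token is unexpired,
    and the required scope was granted."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # no signature part at all
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return now < claims["exp"] and required_scope in claims["scopes"]
```

The important properties are the ones the step describes: the expiry is part of the signed claims, and authorisation checks a specific scope rather than the agent's identity alone.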

Step 5 — Build agent incident runbooks

Your existing incident response process must include agent-specific steps: pause all agent activity, download the decision log, roll back to the last known-good model or prompt, and scrub any leaked data. DORA elite performers target MTTR under 1 hour, but without agent runbooks you’ll be debugging for days.
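The agent-specific steps can also be encoded as an ordered, executable runbook so they are not improvised during an incident. This is only a sketch: `run_runbook` is a hypothetical helper, and each step is assumed to be a callable that returns True on success.

```python
from typing import Callable

# A runbook step is a human-readable name plus a callable returning True on success.
RunbookStep = tuple[str, Callable[[], bool]]


def run_runbook(steps: list[RunbookStep]) -> list[tuple[str, str]]:
    """Run steps in order and stop at the first failure so a human can take over."""
    results = []
    for name, action in steps:
        try:
            succeeded = action()
        except Exception as exc:  # record the error instead of losing the audit trail
            results.append((name, f"error: {exc}"))
            break
        results.append((name, "done" if succeeded else "failed"))
        if not succeeded:
            break
    return results


# Placeholder actions; real ones would call your orchestrator, log store, and so on.
agent_incident_runbook = [
    ("pause all agent activity", lambda: True),
    ("export the decision log", lambda: True),
    ("roll back to last known-good model or prompt", lambda: True),
    ("scrub any leaked data", lambda: True),
]
```

Stopping at the first failed step is a deliberate choice here: rolling back before activity is confirmed paused could let the agent keep acting on stale state.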

Step 6 — Measure DORA metrics for agent pipelines

Track deployment frequency, lead time for changes, change failure rate, and MTTR for agent updates. If agents are deployed multiple times per day, they need the same rigour you apply to containerised workloads. Use GitOps-style approvals for prompt and tool configuration changes.
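As an illustration, the four metrics could be derived from a simple record of agent deployments. The `AgentDeployment` fields and `dora_summary` are hypothetical; a real pipeline would pull these timestamps from its Git and deployment tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class AgentDeployment:
    """One prompt, tool-config, or model change shipped to production (hypothetical schema)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once service is restored after a failure


def _mean_hours(durations: list[timedelta]) -> Optional[float]:
    if not durations:
        return None
    return sum(durations, timedelta()).total_seconds() / 3600 / len(durations)


def dora_summary(deployments: list[AgentDeployment], window_days: int) -> dict:
    """Summarise the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "mean_lead_time_hours": _mean_hours([d.deployed_at - d.committed_at for d in deployments]),
        "change_failure_rate": len(failures) / len(deployments),
        # MTTR only counts failures that have already been restored.
        "mttr_hours": _mean_hours(
            [d.restored_at - d.deployed_at for d in failures if d.restored_at is not None]
        ),
    }
```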

What Happens If You Ignore This?

Uncontrolled agent escalation: A single misconfigured agent could trigger a chain reaction across your infrastructure, costing hours of recovery time and lost data.
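Regulatory exposure: The NIST AI RMF is a voluntary framework, so it carries no fines of its own, but NIST and CISA guidance is moving toward expecting runtime auditing of AI agents. If auditors or sector regulators adopt that expectation, non-compliance may hit budgets directly.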
Reputation damage: Agent-led incidents that leak customer data or cause service outages erode trust with business stakeholders.
Wasted error budget: Without agent SLIs, you’ll burn budget on false positives while real problems slip through.
Missed innovation: Fear of unsecured agents will slow adoption, leaving your organisation behind competitors who solve it.

Photo by Paul Lichtblau on Pexels

Common Mistakes to Avoid

Mistake	Why It's a Problem	What to Do Instead
Applying existing CSPM tools to agents	CSPM scans snapshots, not runtime behaviour; agents change state between scans	Use runtime behavioural monitoring that captures agent action sequences
Giving agents human-like IAM roles	Over-privileged roles let agents access sensitive data they don't need	Issue scoped, short-lived tokens that expire after the agent’s task
Ignoring agent-to-agent communication	Agents may chatter laterally, bypassing normal API gateways	Enforce mTLS through the service mesh so every agent endpoint authenticates its peers
Skipping prompt injection testing	Attackers can manipulate agents via indirect prompt injection through external data sources	Include adversarial prompt testing in your CI/CD pipeline
Treating agents as stateless functions	Agents often maintain state across steps, leading to inconsistent audits	Persist decision logs and expose them through your observability stack
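For the last row, persisting decision logs can start as simply as appending one JSON record per agent step. The sketch below assumes an append-only text stream and a hypothetical record shape; the field names are illustrative and not taken from any particular observability product.

```python
import io
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def append_decision(log: TextIO, agent_id: str, step: int, tool: str,
                    reasoning_summary: str, timestamp: Optional[str] = None) -> dict:
    """Append one decision record as a JSON line and return it."""
    record = {
        "agent_id": agent_id,
        "step": step,
        "tool": tool,
        "reasoning_summary": reasoning_summary,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    }
    log.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def replay(log_text: str) -> list[dict]:
    """Rebuild the ordered decision trail for audit or post-incident review."""
    return [json.loads(line) for line in log_text.splitlines() if line.strip()]


# Usage with an in-memory buffer; a real deployment would write to durable storage.
buffer = io.StringIO()
append_decision(buffer, "log-analyser", 1, "read_logs", "fetch last hour of errors")
append_decision(buffer, "log-analyser", 2, "summarise", "group errors by service")
trail = replay(buffer.getvalue())
```

Because each line is self-contained JSON, the same records can be shipped to an existing log pipeline without a custom parser.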

Expert Tips

Instrument every agent action with OpenTelemetry: Use spans that capture the input, decision, output, and tool call for each step. This gives you the raw data for SLI calculation and post-incident analysis.
Start with a single task-specific agent: Do not deploy a general-purpose agent first. A scoped agent (e.g., log analysis) is easier to secure and measure. Learn from it before scaling.
Use KEDA to auto-scale agent pods based on action queue depth: This prevents resource spikes from overwhelming your cluster during event bursts.
Run weekly chaos drills for agent failure modes: Simulate a prompt injection attack or a permission escalation scenario. Measure your detection time and MTTR improvement.
Publish agent runtime KPIs on a dedicated team dashboard: Include agent action error rate, permission violation count, and mean decision latency. This builds shared ownership across DevOps and AI teams.

Photo by Christina Morillo on Pexels

Frequently Asked Questions

How is agentic AI security different from traditional API security?

Agentic AI security must account for intent and autonomy. Traditional API security blocks known bad requests, but an agent can chain multiple legitimate calls into an unintended outcome. You need behavioural observability that tracks the reasoning path, not just the HTTP verbs.

Can I use my existing SIEM to monitor AI agents?

Partially. A SIEM can ingest agent logs, but it won't understand the semantic meaning of an agent's decision. You need a dedicated platform that correlates tool calls, prompt inputs, and permission tokens in near real-time to spot deviations from allowed patterns.

What SLO should I set for agent reliability?

Start with an action success rate of 99.9% over a rolling 30-day window. This matches typical production SLOs for critical workloads. As you mature, add a permission violation rate SLO of <0.01% to catch entitlement creep early.
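To see what those two targets mean over a rolling window, here is a small sketch that checks them against daily counts. `rolling_slo_report` and the input shape are assumptions for illustration; the thresholds are the 99.9% and 0.01% figures from the answer above, compared with integer arithmetic to avoid floating-point edge cases.

```python
def rolling_slo_report(daily_counts, window_days: int = 30) -> dict:
    """Check both SLOs over the most recent window.

    daily_counts is a list of (total_actions, failed_actions, permission_violations)
    tuples, oldest first.
    """
    window = daily_counts[-window_days:]
    total = sum(day[0] for day in window)
    failed = sum(day[1] for day in window)
    violations = sum(day[2] for day in window)
    if total == 0:
        raise ValueError("no actions recorded in the window")
    return {
        "total_actions": total,
        # success rate >= 99.9%  <=>  successes * 1000 >= total * 999
        "success_slo_met": (total - failed) * 1000 >= total * 999,
        # violation rate < 0.01%  <=>  violations * 10000 < total
        "violation_slo_met": violations * 10_000 < total,
    }
```

At 1,000 actions per day, a 30-day window covers 30,000 actions, so the success SLO tolerates 30 failed actions and the violation SLO tolerates at most 2 permission violations.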

How often should I rotate agent credentials?

Rotate them with every agent deployment or every hour, whichever comes first. Agents are ephemeral by nature; long-lived tokens defeat the purpose of zero-trust identity. Use Vault or similar to issue tokens that expire automatically when the agent task completes.

Are DORA metrics applicable to AI agent pipelines?

Absolutely. Measure deployment frequency and lead time for prompt or tool configuration changes. If your change failure rate exceeds 5% or MTTR climbs past 1 hour, your agent delivery process needs the same rigour as any other CI/CD pipeline.

Top comments (1)

Mallory Haigh • Jun 9

The security framing here is right, but the hardest part isn't instrumentation so much as the identity and context layer underneath. Most teams I'm working with are treating agents like privileged service accounts and then wondering why the blast radius is so large; the actual fix is scoping agent identity to the Path it's executing, so the permissions are contextual and time-bounded, not static IAM policies bolted on after the fact.

I think this is part of why the observability gap feels so unsolvable right now: teams are trying to monitor agent behaviour without having defined the intended behaviour first. When you build a proper agent infrastructure layer with explicit path specifications, your SLIs basically write themselves, because you already know what a successful execution looks like versus a deviation. Without that foundational substrate, you're stuck in a forensics loop.