What Changes and What Stays the Same for SRE with AWS Frontier Agents

On March 31, 2026, AWS made DevOps Agent and Security Agent generally available — the first two of the autonomous AI agents announced at re:Invent 2025 under the "Frontier Agents" brand. A 2-month free trial is included, after which pay-as-you-go pricing kicks in.

https://aws.amazon.com/blogs/mt/announcing-general-availability-of-aws-devops-agent/

https://aws.amazon.com/blogs/machine-learning/aws-launches-frontier-agents-for-security-testing-and-cloud-operations/

The official announcements highlight numbers like "up to 75% MTTR reduction" and "penetration testing compressed from weeks to hours." The question that matters more is: how does this change the day-to-day work of an SRE team? Feature overviews are already plentiful, so this article focuses on what shifts to agents and what stays with humans.

What Is a Frontier Agent?

AWS announced three Frontier Agents at re:Invent 2025: Kiro Autonomous Agent (software development), DevOps Agent (operations), and Security Agent (security). Of these, DevOps Agent and Security Agent are now GA. Kiro Autonomous Agent remains in preview.

AWS defines Frontier Agents as systems that "work independently to achieve goals, scale massively to tackle concurrent tasks, and run persistently for hours or days." Frankly, that description could apply to existing AI agents like Claude Code or Devin. What AWS emphasizes is delivering "complete outcomes" rather than assisting with individual tasks, but this feels like a difference of degree, not kind.

In practice, it's probably best to think of them as domain-specialized autonomous agents — deeply integrated with DevOps and security workflows. "Frontier" is more of a marketing brand than a technical category: "AWS's first-party, domain-specific agent products" is a fair characterization.

What matters isn't the naming — it's how these agents affect SRE work.

What DevOps Agent Does and Doesn't Do

AWS describes DevOps Agent as an "always-available operations teammate." However, since it requires human approval for fixes and can't make business decisions, the reality is closer to an "always-on SRE apprentice" — it investigates and proposes, but can't decide or execute. Here's where that boundary lies.

What the Agent Does

Imagine an alert fires at 2 AM. Traditionally, the on-call engineer wakes up from a Datadog alert, opens their laptop, checks dashboards for metric anomalies, digs through logs, cross-references deployment history, and identifies the root cause. DevOps Agent automates this entire initial investigation.

Specifically, it correlates metrics and logs from monitoring tools (CloudWatch, Datadog, Dynatrace, New Relic, Splunk, Grafana), code repositories (GitHub, GitLab, Azure DevOps), and CI/CD deployment histories to build hypotheses like "this code change introduced in this deployment correlates with this metric anomaly." Investigation progress is shared via the web console and Slack, where you can ask follow-up questions or redirect the investigation.
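The correlation step can be sketched in miniature. This is not AWS's algorithm, just the shape of the reasoning: given deployment timestamps and an anomaly onset, the most recent prior deploy becomes the leading hypothesis.

```python
from datetime import datetime

# Toy sketch of deployment correlation (illustrative only, not AWS's method):
# hypothesize the most recent deploy that landed before the anomaly began.
def suspect_deploy(deploys: list[tuple[str, datetime]], anomaly_at: datetime):
    prior = [(name, t) for name, t in deploys if t <= anomaly_at]
    return max(prior, key=lambda d: d[1])[0] if prior else None

deploys = [
    ("api-v41", datetime(2026, 4, 1, 22, 10)),
    ("api-v42", datetime(2026, 4, 2, 1, 45)),
]
print(suspect_deploy(deploys, datetime(2026, 4, 2, 2, 0)))  # api-v42
```

The real agent cross-references this hypothesis against metrics and logs from the connected tools; the sketch only shows the timestamp half of the reasoning.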

The GA release adds Azure and on-premises environments as investigation targets. On-premises tools connect via MCP (Model Context Protocol), enabling consistent investigation across multicloud and hybrid setups.

Beyond incident response, DevOps Agent also provides proactive improvement recommendations — analyzing historical incident patterns to identify gaps in alert coverage, test coverage, code quality, and infrastructure configuration.

It's worth noting that Datadog's Bits AI SRE offers quite similar capabilities: autonomous alert investigation, source code analysis, and deployment correlation. The key difference is that DevOps Agent can simultaneously span multiple observability tools (Datadog + CloudWatch + Splunk, etc.) and include Azure and on-premises environments via MCP. If your organization is entirely within the Datadog ecosystem, Bits AI SRE may be sufficient. If you have multiple tools or a multicloud setup, DevOps Agent's cross-platform analysis adds value. More on this in the "How Does the Relationship with Existing Tools Change?" section.

The GA release also introduced "Learned Skills" and "Custom Skills." Learned Skills let the agent learn from your organization's investigation patterns and tool usage, improving accuracy over time. Custom Skills let you add organization-specific investigation procedures and best practices, configurable per incident type (triage, root cause analysis, mitigation). Code Indexing also enables code-level fix suggestions based on repository understanding.

What the Agent Doesn't Do

This is the critical part. DevOps Agent investigates and proposes — executing fixes is up to humans.

It can generate fix proposals and work with Kiro or Claude Code to produce fix code, but applying changes to production requires human approval. This is intentional — AWS has made the design decision that "an agent that modifies production without approval won't be trusted."

The other thing the agent doesn't do is business judgment. "This incident has major customer impact, so we need a company-wide response." "It's Friday night — let's apply a workaround and do the root cause fix Monday." These decisions require human context. Identifying the technical root cause and deciding how to respond are separate jobs. DevOps Agent handles the former; humans own the latter.

What Security Agent Does and Doesn't Do

Security Agent is the security counterpart — an "always-on penetration tester and security reviewer." It has three main capabilities: on-demand penetration testing, design document security review (Design Review), and PR security review (Code Review).

What the Agent Does

Penetration Testing

Traditionally, penetration testing meant "once or twice a year, for the most critical applications only, outsourced to specialists, taking weeks." Cost and time constraints leave most of the application portfolio untested. As the Security Agent FAQ notes, "most organizations limit manual penetration testing to their most critical applications and conduct these tests periodically." And even tested applications become partially unverified the moment new code is deployed.

Security Agent changes this structure. You create an "Agent Space," connect your GitHub repository, and the agent reads source code, architecture documents, and design docs to understand the application's structure before running automated penetration tests against endpoints.

The key difference from simply running a scanner: Security Agent validates discovered vulnerabilities by actually sending payloads to confirm exploitability. Reports include reproduction steps, dramatically reducing false positives. Per the official features page, testing covers OWASP Top 10 vulnerability types plus business logic flaws. According to the GA announcement blog, LG CNS reported significant false positive reduction, over 50% faster testing, and roughly 30% cost reduction.
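To make "validate, not just flag" concrete, here is a toy sketch of the general idea (not Security Agent's actual method): embed a unique marker in a probe and treat a finding as validated only if the probe comes back unescaped.

```python
import uuid

# Toy illustration of exploit confirmation for a reflected-XSS probe.
# This is a sketch of the general technique, not Security Agent's internals.
def build_probe() -> tuple[str, str]:
    marker = uuid.uuid4().hex
    return f"<script>/*{marker}*/</script>", marker

def confirms_reflection(response_body: str, probe: str) -> bool:
    """A finding counts as validated only if the full probe reflects verbatim."""
    return probe in response_body

probe, marker = build_probe()
# A response that HTML-escapes the probe would NOT validate:
escaped = probe.replace("<", "&lt;").replace(">", "&gt;")
print(confirms_reflection(escaped, probe))                  # False: likely a false positive
print(confirms_reflection(f"<html>{probe}</html>", probe))  # True: exploitable reflection
```

The point of the marker is that a "validated" report can include exact reproduction steps, which is what cuts the false-positive rate.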

Design Review

This capability reviews architecture documents and design docs from a security perspective before any code is written. It checks against AWS best practices and your organization's custom security requirements. Catching issues at the design stage avoids costly rework after implementation.

PR Review

Pull Request-level security review is the third capability. As of GA, it supports GitHub PRs — Security Agent automatically reviews PRs for security issues when they're created. You can configure it to check custom security requirement compliance, common security vulnerabilities, or both.

PR security checks aren't new — many organizations already have Claude Code or Codex review PRs with security instructions via CLAUDE.md, or have SAST tools in their CI/CD pipeline. Security Agent's difference is operational: security requirements are defined once in the console and automatically applied across all repositories. This removes the overhead of maintaining per-repository md files, but it doesn't enable something technically impossible before. Of the three capabilities, penetration test automation is where the real differentiation lies.

Design reviews are free up to 200/month, and code reviews up to 1,000/month. Only penetration testing is paid ($50/task-hour) — more on this in the "Cost Structure" section.

What the Agent Doesn't Do

Security Agent automates "discovery and validation" — "judgment and response" remain human territory.

Security policy decisions are human work. "Which risks to accept and which to address," "how to interpret compliance requirements" — these are outside the agent's scope. For example, "fixing this vulnerability requires a breaking API change, but we need to consider the impact on a major customer's release schedule" is a business trade-off that requires human judgment.

Social engineering (tricking employees into granting access) and vulnerabilities that can only be discovered by understanding the entire business workflow are also difficult to cover with automated testing alone. While the official documentation says business logic flaws are included in the test scope, the agent doesn't fully replace a human penetration tester's judgment of "what this operation means in the context of this business flow." Security Agent's strength is "broad, frequent, systematic testing" — complementing, not replacing, "deep, creative testing" by human experts.

How Does the Relationship with Existing Tools Change?

"Will Datadog or PagerDuty become unnecessary?" Short answer: no. The relationship changes, not the need.

Here's how the human steps change, using a late-night alert as an example:

Previously, humans handled all five steps; after adoption, only two remain human: reviewing the agent's findings and deciding how to respond.

Monitoring Tools (Datadog / CloudWatch, etc.)

DevOps Agent doesn't "replace" these tools — it uses them as "data sources." GA supports integrations with CloudWatch, Datadog, Dynatrace, New Relic, Splunk, and Grafana. This isn't about canceling Datadog and switching to DevOps Agent — it's about DevOps Agent analyzing the metrics and logs that Datadog collects.

You might wonder: "Could we drop Datadog, consolidate on CloudWatch, and use DevOps Agent / Security Agent to save costs?" DevOps Agent handles incident investigation and improvement recommendations — it doesn't include day-to-day monitoring features like APM, RUM, distributed tracing, or dashboards. Datadog's value extends beyond incident response, so a simple swap doesn't work. That said, there is overlap between Datadog's Bits AI SRE and DevOps Agent's incident investigation capabilities, so whether you need to pay for both is worth evaluating.

In fact, the quality of your monitoring setup directly affects DevOps Agent's effectiveness. The agent can only analyze what your tools collect. Sparse metrics and logs mean less accurate analysis. The direction isn't "adopt the agent so we can invest less in monitoring" — it's "invest in monitoring to maximize the agent's effectiveness."

Incident Notification and On-Call Management (PagerDuty / Datadog On-Call, etc.)

Alert routing and on-call management roles remain unchanged. DevOps Agent starts investigating the moment an alert fires, completing initial investigation before the notified human even logs in. On-call scheduling, escalation, and incident lifecycle management continue to be handled by tools like PagerDuty or Datadog On-Call. The GA release added PagerDuty integration as well.

What changes is "the first thing the on-call engineer does." Instead of "open the dashboard and check metrics," it becomes "read the Agent's investigation results shared in Slack." Per the official GA blog, Zenchef (a restaurant technology platform) submitted an issue to DevOps Agent during a hackathon and had the root cause identified in 20–30 minutes — an investigation that would normally take 1–2 hours, completed while the engineers stayed focused on the hackathon.

GitHub (Security Agent PR Review)

Security Agent automatically posts security review comments on GitHub PRs. Developers can review and address findings without leaving the PR interface. Merge decisions remain human. Details and differentiation points are covered in the "What Security Agent Does and Doesn't Do" section above.

Cost Structure

Frontier Agents have a distinctive pricing model, especially DevOps Agent's tie-in with AWS Support plans.

DevOps Agent: Support Credits

DevOps Agent costs $0.0083/agent-second (roughly $30/hr). The official pricing page shows usage examples: a small team (10 incident investigations/month, 8 minutes each) at ~$40/month, and an enterprise (500 incidents, 10 Agent Spaces) at ~$2,300/month.
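The per-second rate makes the small-team example easy to reproduce (the enterprise figure depends on per-incident durations the pricing page doesn't break out, so it's omitted here):

```python
# DevOps Agent billing rate from the official pricing page: $0.0083/agent-second.
RATE_PER_SECOND = 0.0083

def monthly_cost(investigations: int, minutes_each: float) -> float:
    """Estimated monthly cost for incident investigations alone."""
    return investigations * minutes_each * 60 * RATE_PER_SECOND

# Small team: 10 investigations/month at 8 minutes each.
small_team = monthly_cost(investigations=10, minutes_each=8)
print(f"${small_team:.2f}/month")  # $39.84/month, matching AWS's ~$40 example
```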

On top of this, per the pricing page, AWS Support customers receive monthly credits based on the prior month's Support spend:

Support Plan Credit Rate
Unified Operations 100% of prior month's Support spend
Enterprise Support 75% of prior month's Support spend
Business Support+ 30% of prior month's Support spend

For example, an organization paying $15,000/month for Enterprise Support would receive $11,250/month in DevOps Agent credits. If usage stays within that, the incremental cost is zero.
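The credit table reduces to a one-line calculation:

```python
# Credit rates by Support plan, from the DevOps Agent pricing page.
CREDIT_RATES = {
    "Unified Operations": 1.00,
    "Enterprise Support": 0.75,
    "Business Support+": 0.30,
}

def monthly_credits(plan: str, prior_month_support_spend: float) -> float:
    """DevOps Agent credits earned from the prior month's Support spend."""
    return CREDIT_RATES[plan] * prior_month_support_spend

print(monthly_credits("Enterprise Support", 15_000))  # 11250.0
```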

Behind this credit structure is the relationship with TAM (Technical Account Manager) in Enterprise Support. Traditionally, Enterprise Support customers get a TAM who provides architecture reviews and operational guidance. The Enterprise Support page now presents TAM and DevOps Agent side by side — TAM handles strategic guidance, DevOps Agent handles 24/7 automated investigation and improvement proposals. DevOps Agent is positioned as an "extension" of Support, not a replacement, which explains why credits come from Support spend.

Security Agent: Penetration Testing Cost Transformation

Security Agent penetration testing is billed at $50/task-hour (design and code reviews have free tiers as mentioned above). Per the official FAQ, a 2-month free trial is included post-GA.

Traditional third-party penetration testing typically costs thousands to tens of thousands of dollars per engagement and takes weeks. This "high unit cost × low frequency" structure transforms into "$50/task-hour × high frequency."

The implication: it becomes economically viable to expand penetration testing coverage. Organizations that could only test their most critical applications due to cost constraints can now continuously test across their entire portfolio.
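To put the restructuring in numbers, here is an illustrative comparison. The manual engagement cost and the task-hour count are assumptions for the sake of the example, not AWS figures:

```python
AGENT_RATE = 50  # $/task-hour, from Security Agent pricing

# Illustrative assumptions (not AWS figures): a manual engagement at
# $20,000, and an agent-driven test consuming roughly 10 task-hours.
manual_engagement = 20_000
agent_task_hours = 10

agent_test_cost = AGENT_RATE * agent_task_hours            # $500 per test
tests_per_manual_budget = manual_engagement // agent_test_cost
print(tests_per_manual_budget)  # 40 agent tests for one manual engagement's budget
```

Under these assumptions, one annual engagement's budget covers frequent testing across a much wider slice of the portfolio, which is the economic shift the section describes.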

How Does SRE Team Management Change?

Frontier Agents adoption has implications for how SRE teams operate.

The Nature of On-Call Pain Changes

DevOps Agent's biggest impact is automating initial investigation for late-night alerts. Per the official GA blog, WGU (Western Governors University) deployed it to production during preview, reducing estimated 2-hour investigations to 28 minutes.
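For context, the WGU figure works out to roughly the reduction AWS's headline numbers claim:

```python
def mttr_reduction(before_minutes: float, after_minutes: float) -> float:
    """Fractional reduction in investigation time."""
    return 1 - after_minutes / before_minutes

# WGU: estimated 2-hour investigations reduced to 28 minutes.
print(f"{mttr_reduction(120, 28):.0%}")  # 77%, in line with the "up to 75-80%" claims
```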

Traditional on-call pain comes from being woken up and having to start investigating from scratch in a less-than-ideal state. Opening dashboards, hunting for metric anomalies, pulling related logs, cross-referencing deployment history — this alone can take 30 minutes to an hour.

After DevOps Agent adoption, this "starting from zero" disappears. The on-call engineer's first action becomes "read the Agent's findings." The agent presents "this code change in this deployment correlates with this metric anomaly — here's the proposed fix" as the starting point.

However, a new kind of pressure may emerge: "Is the root cause the Agent identified actually correct? Are there blind spots?" There is a tension between trusting the agent's output (and risking a fix built on a wrong conclusion) and distrusting it (and re-investigating from scratch, losing the time savings). It's worth discussing as a team how to handle this before it happens.

Required Skill Sets Change

The shift goes from "investigate from scratch when alerted" to "review the Agent's findings and judge whether anything was missed." It's similar to when CI/CD became standard — the emphasis moved from "memorize manual deployment procedures" to "design pipelines and make judgment calls when they fail." As automation advances, the ability to audit automated output and handle exceptions that automation can't becomes more important than hands-on execution skills.

For less experienced team members, learning design becomes necessary. Previously, "investigating incidents yourself" was the primary way to build incident response skills. With agents handling initial investigation, these "learn by doing" opportunities shrink.

Possible approaches include "form your own hypothesis before looking at the Agent's findings," "review Agent output with intentionally injected errors," or "regular incident response drills without the Agent." It's the same thinking that says you should still understand manual deployment procedures even when you rely on CI/CD.

Can You Reduce Headcount?

For SRE team managers, this question is unavoidable. Can Frontier Agents adoption mean fewer people?

In the short term, "same headcount, broader coverage" is more realistic. Agent-handled initial investigations free up SRE team time. Whether that freed time goes to "headcount reduction" or "proactive improvements that were previously backlogged (SLO reviews, chaos engineering, architecture improvements)" is the real question. The latter likely delivers more organizational value.

An SRE team perpetually stuck in reactive incident response mode (the "firefighter" state) is a common problem. Frontier Agents adoption is a catalyst for accelerating the shift from "firefighter" to "fire prevention engineer."

Constraints and Caveats

Reading the Numbers

AWS's published figures — "up to 75% MTTR reduction," "up to 80% faster investigations," "94% root cause accuracy" — are all preview-period customer-reported values, explicitly qualified with "up to." Whether your environment sees similar results depends on application complexity, monitoring maturity, and incident characteristics. Treat them as reference points and validate in your own environment.

WGU's "estimated 2 hours → 28 minutes" and LG CNS's "over 50% faster testing" are also results from specific situations. This article cites these numbers as material for understanding implications, not as guarantees that generalize.

Region Limitations

Both DevOps Agent and Security Agent are available in the same six regions at GA: US East (N. Virginia), US West (Oregon), Europe (Frankfurt / Ireland), and Asia Pacific (Sydney / Tokyo). Tokyo region availability is a plus for teams in Japan.

However, per New Claw Times analysis, DevOps Agent inference processing occurs in US regions regardless of the selected region. Organizations with strict GDPR or data residency requirements should verify this.

Free Trial Limitations

Both agents include a 2-month free trial, but DevOps Agent has monthly caps. Per official pricing, the trial period allows up to 10 Agent Spaces, 20 hours of incident investigation, 15 hours of prevention evaluations, and 20 hours of on-demand SRE tasks per month. Excess usage incurs standard charges. Sufficient for pre-production evaluation, but watch the limits for large-scale PoCs.
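A quick way to check planned PoC usage against the trial caps. The caps come from the official pricing page; the usage values below are hypothetical:

```python
# Monthly free-trial caps for DevOps Agent, per the official pricing page.
TRIAL_CAPS = {
    "agent_spaces": 10,
    "incident_investigation_hours": 20,
    "prevention_evaluation_hours": 15,
    "on_demand_sre_task_hours": 20,
}

def over_cap(usage: dict) -> dict:
    """Return how far each metric exceeds its trial cap (0 if within)."""
    return {k: max(0, usage.get(k, 0) - cap) for k, cap in TRIAL_CAPS.items()}

# Hypothetical PoC: investigation hours exceed the free tier by 5,
# which would be billed at the standard per-second rate.
print(over_cap({"agent_spaces": 8, "incident_investigation_hours": 25}))
```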

Multicloud Support Reality

DevOps Agent supports AWS, Azure, and on-premises environments. On-premises tools connect via MCP, which requires configuring access to each target tool; the agent can't see other environments without that setup. Note that DevOps Agent does not explicitly support Google Cloud environments at GA.
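For reference, MCP is JSON-RPC 2.0 under the hood, and a tool invocation uses the "tools/call" method. The tool name and arguments below are purely hypothetical, since AWS doesn't document which tools DevOps Agent expects an on-premises MCP server to expose:

```python
import json

# Shape of an MCP tool-call request (JSON-RPC 2.0, method "tools/call").
# "query_logs" and its arguments are hypothetical illustrations only.
def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tool_call(1, "query_logs", {"service": "checkout", "minutes": 30})
print(msg)
```

The practical implication is that "on-premises support" means standing up (or enabling) an MCP server in front of each internal tool, plus the network access for the agent to reach it.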

Security Agent's penetration testing, per the official GA announcement, supports "AWS, Azure, GCP, other cloud providers, and on-premises" — it can test any reachable endpoint regardless of cloud provider.

Security Agent: GitHub Only

As mentioned earlier, Security Agent's PR review (Code Review) only supports GitHub at GA. Organizations primarily using GitLab or Bitbucket need to factor in this constraint.

Summary: Not "Replacement" but "Redesigning the Division of Labor"

Frontier Agents don't "eliminate" SRE work — they "partition" it.

Work that shifts to agents centers on pattern recognition and correlation analysis: detecting anomalies across metrics and logs, matching against historical incidents to form hypotheses, systematically scanning code for vulnerabilities. This "intellectual labor that demands volume and speed" is where agents excel.

Work that stays human centers on judgment and decision-making: whether to apply a fix, how to assess business impact, how much risk to accept, how to evolve the architecture. These are context-dependent, requiring organizational knowledge and business priority judgment — outside the agent's scope.

For SRE team management, this is an opportunity to redesign team skill composition and on-call structure around this new division of labor. Not "agents mean we need fewer people," but "agents handle more of the routine, so humans can focus on work that demands greater judgment."
