The 2026 CrowdStrike Global Threat Report recorded the fastest lateral movement at 27 seconds after initial access. In one case, data exfiltration started four minutes after entry. Most organizations still schedule penetration tests quarterly, if they're lucky, and wait weeks for the report. By the time the PDF lands in your inbox, the findings are stale.
You ship features weekly. Your security review happens annually. Somewhere in between, you're shipping code that hasn't been tested, because the pen test backlog is six applications deep and the next slot is in Q3.
A 2025 Checkmarx report found that 81% of organizations knowingly deploy vulnerable code to meet delivery deadlines. Not because they don't care about security, but because the process can't keep up with the release cycle.
AWS Security Agent was built to close that gap.
The Pen Test Bottleneck
The bottleneck isn't finding vulnerabilities. It's everything around it.
Traditional pen testing is a project. You scope it. You negotiate a contract, somewhere between $15,000 and $50,000 per engagement, sometimes six figures for enterprise. You wait for the consultant's calendar to open up. They test for a week or two, constrained by time and budget, making trade-offs about what to test and how deeply. Three weeks later, you get a PDF.
Your team works to fix the findings. But you rarely have the budget to bring the testers back to validate the fixes. So you hope. The report starts aging the moment it's delivered. New code gets pushed. New endpoints go live.
What you get is a single photograph of your security posture, when what you need is a live video feed.
Automated scanners have their own problem though: they don't understand your application. SAST looks at code without runtime context. DAST pokes at a running app without understanding what it's supposed to do. Neither knows your business logic or your security policies. They generate hundreds of findings, most of which are noise, and miss the ones that matter.
An Agent That Reads Your Code Before Breaking Your App
AWS Security Agent is an autonomous agent. It reads your design docs, studies your source code, ingests your API specs, and then figures out how to break your application the way a skilled human pen tester would. On demand. Hours instead of weeks.
What separates it from scanners is that it chains vulnerabilities together. Traditional tools find individual issues in isolation. AWS Security Agent connects them.
The GA blog post tells a story worth retelling:
- A stored XSS in a comment field. CVSS 6.1, medium severity. Every tool flags it. Nobody prioritizes it.
- That XSS captures an admin's session cookie. No tool detects this step. SAST analyzes code, not sessions. DAST crawls as a standard user. EDR sees valid HTTPS traffic.
- The hijacked admin session accesses /admin/config, which returns the production database connection string with credentials in plaintext. CVSS 9.8. No other tool discovered it because the code works exactly as designed.
Individually: a medium, an invisible, and a "functioning as intended." Chained together: a full customer data breach. The agent tested each step, proved the full attack works, and elevated the entire sequence to critical.
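To make the chaining idea concrete, here is a minimal Python sketch of the three steps above. It is not how AWS Security Agent scores findings internally. The `Step` type, the `chain_severity` function, and the rule "a fully proven chain takes the highest CVSS score among its steps" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One link in an attack chain (hypothetical type, for illustration only)."""
    description: str
    cvss: Optional[float]  # None when no tool assigns this step a score
    proven: bool           # True once the step was actually exploited


def chain_severity(steps: List[Step]) -> Optional[float]:
    """Assumed rule: a chain counts only if every step was proven,
    and it is then rated by the worst outcome it reaches."""
    if not steps or not all(step.proven for step in steps):
        return None
    scores = [step.cvss for step in steps if step.cvss is not None]
    return max(scores) if scores else None


chain = [
    Step("Stored XSS in a comment field", 6.1, True),
    Step("XSS captures an admin session cookie", None, True),
    Step("Admin session reads credentials from /admin/config", 9.8, True),
]

print(chain_severity(chain))  # 9.8
```

The point of the `proven` flag is that the elevated rating is earned by demonstrating every step, not by guessing that two findings might be related.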
Setting It Up
Setup takes about 30 minutes, one time.
You start by creating an Agent Space in the AWS Security Agent console. An Agent Space is a container for one application. The first time you create one, AWS spins up the Security Agent Web Application, a separate interface where your team runs reviews and tests.
For access, you pick between SSO via IAM Identity Center (good for teams) or IAM-only (simpler, no SSO config needed).
Then you define your security requirements. AWS provides managed ones based on industry standards, but the real value is in custom requirements. "All PII access must have session timeouts under 15 minutes." "Customer-managed KMS keys required for data at rest." You define them once, and they apply across every design review and code review in every Agent Space.
Connect your GitHub repos by installing the AWS Security Agent GitHub App. That gives the agent source code context for pen testing, automated security review on pull requests, and the ability to open remediation PRs when it finds something.
Last step: verify your domains. DNS TXT record or HTTP verification file, one time per domain, so the agent knows you own what it's about to test.
Running a Pen Test
Open the web app. Select your Agent Space. Create a new penetration test.
You give it a target URL (public, or private via VPC), authentication credentials for each role you want tested (standard user, admin, service account), and sign-in instructions for complex auth flows like OAuth or SAML, which the agent handles with LLM-based navigation. Add documentation too: API specs, architecture docs, threat models. Context makes the findings better.
The agent does reconnaissance, enumerates endpoints, builds a custom attack plan, and runs multi-step attack scenarios across 13 risk categories. It adapts based on what it discovers: status codes, error messages, new endpoints, unexpected behaviors.
Hours later, you have validated findings. Each one comes with a CVSS score, the full attack path (what the agent tried, what payloads it used, how it verified exploitation), reproduction steps, impact analysis in business terms ("attackers can modify product prices during checkout"), and code fixes ready to implement.
From Finding to Fix
You review a finding. Click "remediate." The agent opens a PR in your GitHub repo with the fix. Your developer reviews and merges. Re-run the pen test to verify. Ship.
Compare that to the traditional loop: find a vulnerability, write a report, send it to the dev team, wait for a fix, try to get budget for a retest, hope it worked. Months between "found" and "verified."
The agent compresses that to hours.
Cost
| | Traditional Pen Test | AWS Security Agent |
|---|---|---|
| Time to results | 3 to 6 weeks | Hours |
| Cost per test | $15,000 to $50,000+ | ~$400 to $2,400 |
| Frequency | Annual or quarterly | On demand |
| Remediation validation | Separate engagement | Re-run immediately |
| Coverage | Top 3 to 5 critical apps | Entire portfolio |
| Multicloud | Varies | AWS, Azure, GCP, on-prem |
Pen testing costs $50 per task-hour, metered per second. An average test runs about 24 task-hours, roughly $1,200. There's a 2-month free trial.
Design reviews (up to 200/month) and code reviews (up to 1,000/month) are free.
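The pen test arithmetic is easy to check. Here is a small Python sketch, assuming the bill is simply seconds used times the hourly rate, rounded to cents; the rounding is a guess, not documented billing behavior. The 8 and 48 task-hour figures are inferred from the ~$400 to $2,400 range in the table and are not stated by AWS.

```python
RATE_PER_TASK_HOUR = 50.0  # USD per task-hour, from the pricing above


def pen_test_cost(task_seconds: int, rate_per_hour: float = RATE_PER_TASK_HOUR) -> float:
    """Per-second metering: pay for the exact seconds used (rounding is assumed)."""
    return round(task_seconds * rate_per_hour / 3600, 2)


print(pen_test_cost(24 * 3600))  # 1200.0, the average test
print(pen_test_cost(8 * 3600))   # 400.0, low end of the table's range
print(pen_test_cost(48 * 3600))  # 2400.0, high end of the table's range
```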
It's Also a Design and Code Review Tool
Pen testing gets the attention, but the other two capabilities change the day-to-day workflow more.
Design review catches security issues before code is written. Upload your architecture doc, and the agent checks it against your organizational requirements. "Your design doesn't specify network segmentation between the payment processing layer and the user-facing tier." Flagged before sprint planning, not after the incident.
Code review runs on every pull request. If your org caps log retention at 90 days and a developer configures 365 days, the agent comments on the PR. Traditional tools miss this because the code is technically correct. The agent catches it because it knows your rules. It also checks for OWASP Top 10 vulnerabilities alongside your custom policies.
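The log retention case shows why a generic scanner stays quiet: nothing is wrong with the code until you compare it against an org-specific rule. Here is a rough Python sketch of that comparison. It assumes the 90-day rule is a maximum, and `MaxValueRule`, `check`, and the `log_retention_days` key are hypothetical names rather than anything from the AWS Security Agent API.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MaxValueRule:
    """An org-specific upper bound on one setting (hypothetical shape)."""
    name: str
    setting: str
    max_value: int


def check(rule: MaxValueRule, config: Mapping[str, int]) -> Optional[str]:
    """Return a review comment if the config breaks the rule, else None."""
    value = config.get(rule.setting)
    if value is None or value <= rule.max_value:
        return None
    return f"{rule.name}: {rule.setting}={value} exceeds the limit of {rule.max_value}"


retention = MaxValueRule("Log retention", "log_retention_days", 90)

print(check(retention, {"log_retention_days": 365}))
# Log retention: log_retention_days=365 exceeds the limit of 90
print(check(retention, {"log_retention_days": 90}))
# None
```

A value of 365 is perfectly valid configuration, which is exactly why it only shows up once the rule is part of the review.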
What It Won't Do
GitHub only for code review integration. No GitLab, Bitbucket, or CodeCommit.
Web apps and APIs only. Not your mobile app binary or IoT firmware.
It's an agent, not a security program. You still need threat modeling, incident response, and the rest.
Available in 6 AWS regions: us-east-1, us-west-2, eu-west-1, eu-central-1, ap-southeast-2, ap-northeast-1.
Where This Leaves Us
The security industry has spent a decade telling developers to "shift left." The problem was never willingness. It was tooling. You can't shift left with a 6-week engagement, a $30K budget, and a PDF that's outdated before the ink dries.
AWS Security Agent makes pen testing something you can run on a Tuesday afternoon and act on by Wednesday morning. Whether that replaces the annual engagement entirely or just fills the gaps between them depends on your org. But the gap between "how fast we ship" and "how fast we test" just got a lot smaller.




