Shambu Pujar

Open Source an AI Agent That Audits Your AWS Account

cloud-audit-agent is an open source CLI that uses Claude to reason over your live AWS data and produce a prioritized security and cost audit — no rule libraries to maintain, no agents to manage.


The Problem

AWS accounts drift. Access keys go unrotated. Security groups accumulate open ports. S3 buckets get misconfigured. Bills spike unexpectedly. None of this is surprising — it's the normal entropy of a live AWS account.

The tools that exist to catch this have real limitations:

  • Rule-based scanners (GuardDuty, Security Hub, Prowler) catch known patterns but miss cross-service context. They can't tell you that the misconfigured S3 bucket is the same one running up a $500/month storage bill.
  • Cost tools and security tools are separate. You need a different dashboard, a different workflow, and a different engineer to correlate them.
  • Maintaining rule libraries is expensive. Every new AWS service, every new attack pattern, every new compliance requirement means adding rules. The rules drift too.

The Insight

Instead of encoding rules, give raw AWS API responses to an LLM and let it reason.

A senior engineer auditing an AWS account would run aws iam list-users, aws s3api get-bucket-policy, and aws ce get-cost-and-usage — then read the results together and draw conclusions. cloud-audit-agent does exactly that. It calls read-only AWS APIs across five domains, hands Claude all the raw data, and gets back a prioritized finding report with remediation steps.

No rule engine. No maintenance burden. The reasoning improves as Claude improves.


What It Audits

| Scope | Tools | What gets checked |
|-------|-------|-------------------|
| iam | 7 | Access key age (flags >90 days), MFA status per user, role trust policies, inline and attached policies, account authorization snapshot |
| s3 | 6 | All four public access block settings, server-side encryption, bucket policies (detects wildcard Principal: *), per-bucket storage cost by storage class |
| ec2 | 4 | Security groups open to 0.0.0.0/0 or ::/0 on 10 sensitive ports (SSH 22, RDP 3389, MySQL 3306, Postgres 5432, MongoDB 27017, Redis 6379, and others), VPC flow log status, network ACLs |
| cost | 4 | Cost breakdown by service (daily/monthly granularity), 30-day forecast, top usage types per service, anomaly detection |
| compliance | 3 | Active Security Hub findings (filterable by severity), AWS Config rule compliance, CIS AWS Foundations Benchmark findings |
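
To make the ec2 row concrete, here is a minimal sketch of the "open to the world on a sensitive port" condition. It is not the tool's code: the `IngressRule` and `SecurityGroup` shapes and the `findWorldOpenPorts` name are hypothetical simplifications, and the port list contains only the six ports named in the table.

```typescript
// Hypothetical, simplified shapes. The real EC2 API response carries more fields.
interface IngressRule {
  fromPort: number;
  toPort: number;
  cidrs: string[]; // IPv4 and IPv6 ranges together
}

interface SecurityGroup {
  groupId: string;
  ingress: IngressRule[];
}

interface OpenPortFinding {
  groupId: string;
  port: number;
  service: string;
}

// Only the six ports the table names; the other four are not listed in this post.
const SENSITIVE_PORTS: ReadonlyArray<[number, string]> = [
  [22, "SSH"],
  [3389, "RDP"],
  [3306, "MySQL"],
  [5432, "Postgres"],
  [27017, "MongoDB"],
  [6379, "Redis"],
];

const WORLD_CIDRS = ["0.0.0.0/0", "::/0"];

function findWorldOpenPorts(groups: SecurityGroup[]): OpenPortFinding[] {
  const findings: OpenPortFinding[] = [];
  for (const group of groups) {
    for (const rule of group.ingress) {
      // Skip rules that are not open to every address.
      if (!rule.cidrs.some((cidr) => WORLD_CIDRS.includes(cidr))) continue;
      for (const [port, service] of SENSITIVE_PORTS) {
        // A rule may cover a port range, so test containment rather than equality.
        if (port >= rule.fromPort && port <= rule.toPort) {
          findings.push({ groupId: group.groupId, port, service });
        }
      }
    }
  }
  return findings;
}
```

Testing range containment matters here: a rule that opens ports 6000 to 7000 exposes Redis even though 6379 never appears in the rule itself.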

Quick Start

No install required:

npx @trellisclad/cloud-audit-agent --scope all

Common patterns:

# Scoped audit — IAM and S3 only, runs in ~2 minutes
npx @trellisclad/cloud-audit-agent --scope iam s3 --region eu-west-1

# Machine-parseable output for CI pipelines
npx @trellisclad/cloud-audit-agent --scope cost --format json | jq '.findings[]'

# Self-contained HTML report
npx @trellisclad/cloud-audit-agent --scope all --format html > report.html

# Redacted version safe to share publicly
npx @trellisclad/cloud-audit-agent --scope all --redact --format html > report-sanitized.html

Requirements: Node.js 18+, AWS credentials (ReadOnlyAccess managed policy or equivalent), Claude Max subscription or Anthropic API key.


Key Capabilities

Cost + Security Correlation

Most tools treat cost and security as separate domains. cloud-audit-agent correlates them. When a bucket has security findings AND significant storage spend, the dollar amount appears directly in the finding description:

"S3 bucket [BUCKET_A] has public read access enabled via bucket policy and no server-side encryption. Estimated storage cost: ~$340/month. Immediate remediation recommended."

This changes the business conversation. "This bucket is misconfigured" is easy to defer. "This misconfigured bucket costs $340/month and is publicly readable" is not.

Dual Anomaly Detection

The cost anomaly tool runs two independent detection algorithms simultaneously:

Trend comparison — Compares two consecutive 30-day windows. Flags services where BOTH the percentage change (default: >50%) AND the absolute change (default: >$10) exceed thresholds. The dual-gate prevents noise from small services with volatile-looking percentages.
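
A minimal sketch of the dual gate, assuming the two windows have already been summed per service. The threshold names match the `anomalyThresholds` option shown in the Programmatic API section; the function name, the choice to flag only increases, and the handling of a zero previous window are assumptions of this sketch.

```typescript
interface TrendThresholds {
  percentChangeThreshold: number; // percent; the post gives 50 as the default
  minAbsoluteChange: number; // dollars; the post gives 10 as the default
}

const DEFAULT_TREND_THRESHOLDS: TrendThresholds = {
  percentChangeThreshold: 50,
  minAbsoluteChange: 10,
};

// previous and current are total spend for two consecutive 30-day windows.
function isTrendAnomaly(
  previous: number,
  current: number,
  thresholds: TrendThresholds = DEFAULT_TREND_THRESHOLDS,
): boolean {
  const absoluteChange = current - previous;
  // Gate 1: the dollar change must be large enough to matter.
  if (absoluteChange <= thresholds.minAbsoluteChange) return false;
  // Assumption: with no prior spend a percentage is undefined, so gate 1 alone decides.
  if (previous <= 0) return true;
  // Gate 2: the relative change must also exceed its threshold.
  const percentChange = (absoluteChange / previous) * 100;
  return percentChange > thresholds.percentChangeThreshold;
}
```

With the defaults, a service going from $2 to $5 (a 150% jump worth $3) is not flagged, and neither is one going from $1,000 to $1,100 (a $100 jump worth 10%). A service going from $100 to $160 passes both gates.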

Statistical spike detection — Fetches 60 days of daily cost data per service. Computes a baseline mean and standard deviation. Flags any day in the last 30 days that exceeds mean + 2σ. Both algorithms are user-configurable.
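
The spike check can be sketched in a few lines. Two details below are assumptions, since the post does not specify them: the baseline is taken from the days before the most recent 30, and the standard deviation is the population form. The `detectSpikes` name and `Spike` shape are hypothetical.

```typescript
interface Spike {
  dayIndex: number; // index into the input series
  cost: number;
  threshold: number; // mean + sigmas * standard deviation of the baseline
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation (an assumption of this sketch).
function stdDev(values: number[]): number {
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) * (v - m))));
}

// dailyCosts is ordered oldest to newest, e.g. 60 entries for one service.
function detectSpikes(dailyCosts: number[], recentDays = 30, sigmas = 2): Spike[] {
  const splitAt = Math.max(dailyCosts.length - recentDays, 0);
  const baseline = dailyCosts.slice(0, splitAt);
  if (baseline.length === 0) return []; // not enough history to form a baseline
  const threshold = mean(baseline) + sigmas * stdDev(baseline);
  return dailyCosts
    .slice(splitAt)
    .map((cost, i) => ({ dayIndex: splitAt + i, cost, threshold }))
    .filter((day) => day.cost > threshold);
}
```

Keeping the baseline separate from the window being tested means a single large spike cannot inflate the standard deviation it is measured against.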

Running both simultaneously and combining results catches different failure modes: a service that gradually crept up 60% over a month (trend detection) vs. a service that had one anomalous day last Tuesday (spike detection).

Redact Mode

--redact replaces sensitive values throughout the entire report with stable placeholder tokens. It covers 12-digit account IDs, ARNs, security group IDs (sg-xxx), VPC IDs, subnet IDs, access key IDs, and real resource names. "babylon-mainnet-ledger-backups" becomes "web-assets-bucket", consistently, everywhere it appears.

This makes it safe to share findings with vendors, post them in Slack, or include them in tickets without leaking real infrastructure details.
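
The property that matters is stability: the same real value must map to the same placeholder every time, so a redacted report still shows which findings refer to the same resource. A minimal sketch of that idea follows. The `Redactor` class, the `[KIND_N]` token format, and the regular expressions are assumptions for illustration, and it covers only a few of the identifier types listed above.

```typescript
class Redactor {
  private readonly tokens = new Map<string, string>(); // real value -> placeholder
  private readonly counters = new Map<string, number>(); // kind -> last number used

  // Return the existing placeholder for a value, or mint the next one for its kind.
  private placeholder(kind: string, value: string): string {
    const key = `${kind}:${value}`;
    let token = this.tokens.get(key);
    if (token === undefined) {
      const next = (this.counters.get(kind) ?? 0) + 1;
      this.counters.set(kind, next);
      token = `[${kind}_${next}]`;
      this.tokens.set(key, token);
    }
    return token;
  }

  redact(text: string): string {
    return text
      .replace(/\bsg-[0-9a-f]+\b/g, (m) => this.placeholder("SG", m))
      .replace(/\bvpc-[0-9a-f]+\b/g, (m) => this.placeholder("VPC", m))
      .replace(/\bsubnet-[0-9a-f]+\b/g, (m) => this.placeholder("SUBNET", m))
      .replace(/\bAKIA[0-9A-Z]{16}\b/g, (m) => this.placeholder("ACCESS_KEY", m))
      .replace(/\b\d{12}\b/g, (m) => this.placeholder("ACCOUNT", m));
  }
}
```

Because the mapping lives on the instance, one `Redactor` should be reused for the whole report. A fresh instance per section would restart the numbering and break the consistency.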

Four Output Formats

| Format | Use case |
|--------|----------|
| markdown (default) | Terminal output, piping to other tools |
| human | Terminal display; groups by severity, strips raw JSON |
| html | Stakeholder reports; self-contained, color-coded severity badges |
| json | CI integration; typed AuditReport object with findings array and audit metadata |
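
For CI use, the json format can drive a pass/fail gate. The sketch below assumes a report shape: the post confirms only that AuditReport has a findings array, so the `severity` and `title` fields, the severity names, and the `blockingFindings` helper are hypothetical. Check the package's type definitions before relying on them.

```typescript
// Assumed shape. Only the existence of a findings array is stated in the post.
type Severity = "critical" | "high" | "medium" | "low";

interface Finding {
  severity: Severity;
  title: string;
}

interface AuditReport {
  findings: Finding[];
}

const SEVERITY_RANK: Record<Severity, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// Return the findings at or above the given severity; a CI step can fail when any exist.
function blockingFindings(reportJson: string, failAt: Severity = "high"): Finding[] {
  const report = JSON.parse(reportJson) as AuditReport;
  return report.findings.filter(
    (finding) => SEVERITY_RANK[finding.severity] >= SEVERITY_RANK[failAt],
  );
}
```

A pipeline step would read the tool's JSON output, call `blockingFindings`, and exit non-zero when the returned array is not empty.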

Programmatic API

The runAudit() function is the package's primary export for embedding audits in CI pipelines, Slack bots, or custom dashboards:

import { runAudit } from "@trellisclad/cloud-audit-agent";

const result = await runAudit({
  awsConfig: { region: "us-east-1", profile: "staging" },
  scopes: ["iam", "s3"],
  anomalyThresholds: { minAbsoluteChange: 25, percentChangeThreshold: 30 },
  analysisPeriodDays: 14,
  drillDownMinSpend: 100,
  includeCliCommands: true, // include `aws` CLI fix commands in recommendations
});

console.log(result.markdown);
// result.costUsd, result.durationMs, result.numTurns, result.inputTokens, result.outputTokens

How It's Built

This section is for developers who want to understand the architecture or adapt the pattern for their own Claude-powered tools.

Architecture

CLI (commander) → runAudit() → Claude Agent SDK (query())
                                     ↓
                             In-process MCP servers
                    ┌─────────────────────────────────────┐
                    │  aws-iam  aws-s3  aws-ec2           │
                    │  aws-cost  aws-compliance           │
                    └─────────────────────────────────────┘
                                     ↓
                           AWS SDK v3 (read-only calls)

Claude Agent SDK + In-Process MCP Servers

The agent uses @anthropic-ai/claude-agent-sdk with a set of MCP tool servers — but unlike typical MCP setups, there are no external server processes. Each server (createIamToolsServer(), createS3ToolsServer(), etc.) is an in-process object created via createSdkMcpServer().

Only servers for the requested --scope flags are instantiated. If you run --scope cost, Claude sees only the 4 cost tools, not all 24. This keeps the tool surface minimal and reduces unnecessary API calls.
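
The selection logic can be sketched as a lookup from scope to factory. The factories here are stand-ins that return plain objects rather than real MCP servers; the server names come from the architecture diagram and the tool counts from the table in What It Audits. `buildServers` and `visibleToolCount` are hypothetical names.

```typescript
type Scope = "iam" | "s3" | "ec2" | "cost" | "compliance";

// Stand-in for an in-process MCP server; the real objects come from createSdkMcpServer().
interface ToolServer {
  name: string;
  toolCount: number;
}

// Stand-ins for createIamToolsServer(), createS3ToolsServer(), and so on.
const SERVER_FACTORIES: Record<Scope, () => ToolServer> = {
  iam: () => ({ name: "aws-iam", toolCount: 7 }),
  s3: () => ({ name: "aws-s3", toolCount: 6 }),
  ec2: () => ({ name: "aws-ec2", toolCount: 4 }),
  cost: () => ({ name: "aws-cost", toolCount: 4 }),
  compliance: () => ({ name: "aws-compliance", toolCount: 3 }),
};

// Instantiate only the servers for the requested scopes, ignoring duplicates.
function buildServers(scopes: Scope[]): Partial<Record<Scope, ToolServer>> {
  const servers: Partial<Record<Scope, ToolServer>> = {};
  for (const scope of scopes) {
    if (servers[scope] === undefined) {
      servers[scope] = SERVER_FACTORIES[scope]();
    }
  }
  return servers;
}

// Number of tools the model would see for a given set of servers.
function visibleToolCount(servers: Partial<Record<Scope, ToolServer>>): number {
  let total = 0;
  for (const scope of Object.keys(servers) as Scope[]) {
    total += servers[scope]?.toolCount ?? 0;
  }
  return total;
}
```

Because factories run lazily, a scope that is not requested costs nothing: no server object is created and no tools from it reach the model.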

3-Phase System Prompt

The system prompt enforces a strict workflow:

  1. DETECT — Call ALL tools in the requested scopes in a single turn using parallel tool calls. Do not wait for one result before calling the next.
  2. ANALYZE — Reason over the collected data.
  3. RECOMMEND — Produce prioritized findings.

The parallel-first constraint in DETECT keeps turn counts low: 10 turns for a single scope, up to 32 for a full audit (25 base + 5 for cost drill-down + 2 for S3 cost estimation). The turn budget is auto-computed — users don't need to tune --max-turns manually.

Effect-TS Typed Error Handling

Every AWS SDK call is wrapped with awsCall(), which converts Promise rejections into a tagged error union:

AwsPermissionError | AwsThrottleError | AwsNotFoundError | AwsServiceError

Tool handlers use Effect.catchTag("AwsNotFoundError", ...) to handle "no bucket policy exists" as a data case rather than a crash. The TypeScript compiler tracks which errors can escape each handler — a more rigorous approach than try/catch chains where missing branches are silent.
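
To show the idea without the Effect dependency, here is a plain TypeScript stand-in built on a discriminated union. It is not the project's code and it does not reproduce Effect's compile-time tracking of the error channel. The `Result` type, `classifyAwsError`, `policyOrNull`, and the error-name lists are assumptions for illustration; real SDK error names vary by service.

```typescript
type AwsError =
  | { _tag: "AwsPermissionError"; message: string }
  | { _tag: "AwsThrottleError"; message: string }
  | { _tag: "AwsNotFoundError"; message: string }
  | { _tag: "AwsServiceError"; message: string };

type Result<A> = { ok: true; value: A } | { ok: false; error: AwsError };

// Illustrative names only; extend per service.
const PERMISSION_NAMES = ["AccessDenied", "AccessDeniedException"];
const THROTTLE_NAMES = ["Throttling", "ThrottlingException"];
const NOT_FOUND_NAMES = ["NoSuchBucketPolicy", "NoSuchEntity"];

// Map an arbitrary thrown value onto the tagged union, using the error's name.
function classifyAwsError(err: unknown): AwsError {
  const name = err instanceof Error ? err.name : "Unknown";
  const message = err instanceof Error ? err.message : String(err);
  if (PERMISSION_NAMES.includes(name)) return { _tag: "AwsPermissionError", message };
  if (THROTTLE_NAMES.includes(name)) return { _tag: "AwsThrottleError", message };
  if (NOT_FOUND_NAMES.includes(name)) return { _tag: "AwsNotFoundError", message };
  return { _tag: "AwsServiceError", message };
}

// Turn a Promise rejection into a value the caller must inspect.
async function awsCall<A>(call: () => Promise<A>): Promise<Result<A>> {
  try {
    return { ok: true, value: await call() };
  } catch (err) {
    return { ok: false, error: classifyAwsError(err) };
  }
}

// "No bucket policy exists" becomes data (null); every other error passes through.
function policyOrNull(result: Result<string>): Result<string | null> {
  if (!result.ok && result.error._tag === "AwsNotFoundError") {
    return { ok: true, value: null };
  }
  return result;
}
```

`policyOrNull` plays the role that Effect.catchTag("AwsNotFoundError", ...) plays in the real handlers: one tag is recovered into a normal value, and the rest stay errors.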

Parallel AWS calls within a tool use Effect.forEach with { concurrency: "unbounded" }. This means auditing 50 IAM users happens as 50 concurrent API calls, not a sequential loop.

OpenTelemetry Tracing

--trace integrates with Arize Phoenix via @arizeai/phoenix-otel. The span hierarchy:

| Span | Kind | Covers |
|------|------|--------|
| audit | AGENT | Entire audit run |
| turn_N | CHAIN | Each agent turn |
| tool:name | TOOL | Each tool call (input/output attributes) |

Spans use OpenInference semantic conventions (openinference.span.kind, input.value, output.value) for compatibility with Phoenix's LLM observability UI. Tool calls are intercepted by createTracedSdkMcpServer(), which wraps each handler transparently.

Bedrock Routing

Set CLAUDE_CODE_USE_BEDROCK=1 to route all LLM calls through AWS Bedrock. The agent queries ListInferenceProfiles to discover which Anthropic models are enabled in your account, then auto-configures the model IDs — preferring us.*-prefixed profiles and versioned IDs (containing a date string like 20250514) over unversioned aliases.
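
The preference order can be sketched as a scoring function over profile IDs. How the two preferences are weighted against each other is an assumption of this sketch, as are the function names; the post only says that us.-prefixed and date-versioned IDs are preferred.

```typescript
// Higher score means more preferred. The weights are an assumption of this sketch.
function scoreProfileId(id: string): number {
  let score = 0;
  if (id.startsWith("us.")) score += 2; // prefer us.-prefixed profiles
  if (/\d{8}/.test(id)) score += 1; // prefer IDs containing a date string like 20250514
  return score;
}

// Pick the most preferred Anthropic profile from a list of inference profile IDs.
function pickPreferredProfile(profileIds: string[]): string | undefined {
  return profileIds
    .filter((id) => id.includes("anthropic"))
    .sort((a, b) => scoreProfileId(b) - scoreProfileId(a))[0];
}
```

Given a us.-prefixed versioned ID, an eu.-prefixed versioned ID, and a us.-prefixed unversioned alias, this picks the first. It returns undefined when no Anthropic profile is present, which a caller should treat as "no model enabled in this account".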

This matters for organizations with data residency requirements or that want AWS-managed model access without exposing an Anthropic API key.


What It Doesn't Do

  • Modify AWS resources. Every tool is annotated readOnlyHint: true, destructiveHint: false. The agent prompt says explicitly: "NEVER attempt to modify, create, or delete any AWS resources." Combined with read-only credentials such as the ReadOnlyAccess managed policy, this makes it safe to run against production accounts.
  • Replace dedicated compliance tooling. For SOC 2, PCI, or HIPAA evidence collection, you need tools with audit trails and formal attestation workflows. cloud-audit-agent surfaces findings; it doesn't produce compliance artifacts.
  • Monitor continuously. This is an on-demand audit tool. Run it from CI on a schedule, from a Slack bot, or as a pre-release check — not as a real-time monitoring agent.

Try It

npx @trellisclad/cloud-audit-agent --scope all

Source and docs: github.com/trellisclad/cloud-audit-agent

If you find it useful, star the repo. If there are AWS services you want covered (RDS, Lambda, EKS, CloudTrail), open an issue.
