IAM Access Analyzer nuked our prod hotfix because I fundamentally misunderstood how Zelkova evaluates wildcards

#aws #security #terraform #devops

TL;DR: Spent 6 hours debugging why our GitOps pipeline kept blocking a critical deployment. Turns out IAM Access Analyzer doesn't care about your Permission Boundaries when evaluating trust policies. Principal: "AWS: *" + a StringLike ARN condition is still globally exploitable. Fixed it with aws:PrincipalOrgID.

The Incident

Our zero-critical-finding security gate hard-blocked a hotfix for our order processing engine. The Terraform pipeline died during validation with:

[FATAL] IAM Access Analyzer finding [RESOURCE_PUBLICLY_ACCESSIBLE] 
detected on aws_iam_role.cross_account_event_bus. 
Trust policy allows Principal 'AWS:*'. Deployment halted.

The False Leads (aka me being an idiot)

First, I assumed IAM eventual consistency was screwing with us. Forced a state refresh, manually triggered aws accessanalyzer start-resource-scan, finding came right back.

Second hypothesis: I had a strict Permission Boundary on the role with aws:SourceVpc and aws:SourceIp conditions. I thought Zelkova (the automated reasoning engine behind Access Analyzer) would be smart enough to calculate the intersection of the trust policy + boundary and realize no external actor could satisfy the network requirements.

Wrong. Access Analyzer evaluates trust policies in complete isolation. It doesn't aggregate Permission Boundaries or SCPs when determining if something is publicly accessible.

The Root Cause

The role's trust policy had Principal: { "AWS": "*" } to support dynamic cross-account access for worker nodes across sub-accounts. To "secure" it, I added:

Condition: {
  StringLike: { "aws:PrincipalArn": "arn:aws:iam::*:role/worker-node-*" }
}

Looks safe, right? Nope. Zelkova uses formal logic. Since AWS account IDs are globally addressable, any attacker could create a role named worker-node-exploit in their own AWS account. That ARN would match the StringLike condition. Zelkova correctly flagged this as exploitable by the entire AWS universe.

The Fix

Replaced the non-deterministic ARN wildcard with aws:PrincipalOrgID:

condition {
  test     = "StringEquals"
  variable = "aws:PrincipalOrgID"
  values   = ["o-xyz123abc9"]
}

This mathematically proves to Zelkova that AWS: * is bounded to our AWS Organization. Finding cleared instantly.

Honestly, I got so annoyed debugging IAM policy logic at 2 AM that I refuse to paste our configs into ChatGPT or third-party linters because of compliance. So over the weekend, I just hacked together a pure client-side WASM utility that runs locally in the browser and redacts secrets before checking for this exact issue. Put it up here if anyone else wants to validate their trust policies without leaking data to the cloud:
IAM Access Analyzer Public Principal Auditor