maryam mairaj for SUDO Consultants

Implementing AWS Security & Compliance: A Hands-On Guide to IAM, Recovery, and Governance

Introduction

When organizations move to AWS, one of the biggest misconceptions is that security and compliance are automatically handled by the cloud provider. In reality, AWS follows a shared responsibility model, where AWS secures the infrastructure, but everything inside your account is your responsibility.
This is where most real-world issues begin.

Teams often deploy workloads quickly but overlook:

  • Fine-grained access control in IAM
  • Proper audit logging across regions
  • Continuous compliance monitoring
  • Well-defined disaster recovery strategies

As a result, environments become difficult to audit, risky to operate, and non-compliant with enterprise or regulatory standards.

This guide takes a hands-on implementation approach to AWS cloud security and compliance. Instead of discussing theory, we will walk through how to actually configure:

  • Identity and Access Management (IAM)
  • Security monitoring and compliance services
  • Disaster recovery mechanisms
  • Governance using AWS Organizations

Each section includes console steps, CLI commands, and practical reasoning so you understand not just how, but why each control is important.

1. AWS Security & Compliance Architecture Overview

Before diving in, here is how a secure AWS environment is structured and why each layer matters.

A well-architected setup typically consists of multiple layers:

Identity Layer
IAM controls who can access what. This includes users, roles, and policies.

Security and Monitoring Layer
Services like CloudTrail, AWS Config, GuardDuty, and Security Hub provide visibility into activities, configuration changes, and threats.

Infrastructure Layer
Your workloads run inside a VPC with properly segmented subnets and controlled access.

Recovery Layer
Backup strategies, cross-region replication, and failover mechanisms ensure business continuity.

Governance Layer
AWS Organizations and Service Control Policies enforce rules across accounts and prevent misconfigurations.

The key idea is defense in depth. No single service guarantees security, but together they create a resilient system.

2. Implementing Identity and Access Management (IAM) in AWS

IAM is the most critical component of AWS cloud security. If access is not properly controlled, even the best monitoring setup cannot prevent misuse.

In real-world environments, misconfigured IAM permissions are one of the leading causes of security incidents in AWS.

Step 1: Create an IAM Role

IAM roles are preferred over users for most workloads because they provide temporary credentials and reduce long-term risk.

Console Steps

  • Navigate to IAM → Roles
  • Click on Create Role
  • Choose a trusted entity (for example, EC2 or custom)
  • Attach only required permissions

AWS CLI

aws iam create-role \
--role-name S3ReadOnlyRole \
--assume-role-policy-document file://trust-policy.json
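
The command above references trust-policy.json, which is not shown in the article. A minimal sketch of a trust policy that lets EC2 instances assume the role (assuming EC2 is your trusted entity, as in the console steps):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ec2.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
```

Swap the `Principal` for another service (for example, lambda.amazonaws.com) or an account ARN to match your chosen trusted entity.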

Why this matters

Using roles instead of static credentials aligns with AWS IAM best practices and reduces the risk of credential leakage.

Step 2: Apply Least Privilege Access

A common mistake is granting excessive permissions using wildcards. Instead, define precise access.

Example Policy

{
 "Version": "2012-10-17",
 "Statement": [{
  "Effect": "Allow",
  "Action": ["s3:GetObject"],
  "Resource": "arn:aws:s3:::example-bucket/*"
 }]
}

This inline policy grants read access to objects in a single bucket and nothing broader. Alternatively, you can attach the AWS managed policy AmazonS3ReadOnlyAccess, which grants read-only access across all of S3:

AWS CLI

aws iam attach-role-policy \
--role-name S3ReadOnlyRole \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

Best Practice

Always scope:

  • Actions
  • Resources
  • Conditions (if applicable)

This is essential for compliance frameworks like ISO 27001 and SOC 2.

Step 3: Enable Multi-Factor Authentication (MFA)

MFA adds a layer of security beyond passwords.

Here, MFA is configured using an authenticator app by scanning a QR code, which ensures that even if credentials are compromised, unauthorized access is still prevented.

AWS CLI

aws iam create-virtual-mfa-device \
--virtual-mfa-device-name MyMFADevice \
--outfile /tmp/mfa.png \
--bootstrap-method QRCodePNG
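
Creating the virtual device only generates the seed material. The device still has to be enabled for a specific user by supplying two consecutive codes from the authenticator app. A sketch, assuming a hypothetical user named alice:

```shell
# Enable the virtual MFA device for a user by providing two
# consecutive one-time codes generated by the authenticator app
aws iam enable-mfa-device \
--user-name alice \
--serial-number arn:aws:iam::ACCOUNT-ID:mfa/MyMFADevice \
--authentication-code1 123456 \
--authentication-code2 789012
```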

Insight
Many security breaches occur due to compromised credentials. MFA significantly reduces this risk.

Step 4: Organize Access Using IAM Groups

Instead of assigning permissions directly to users:

  • Create groups
  • Attach policies to groups
  • Add users to groups

An IAM group is created, and the AmazonS3ReadOnlyAccess policy is attached, ensuring that all users added to this group inherit consistent and controlled permissions.

This simplifies management and ensures consistency.
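
The group workflow above can be sketched with the CLI (group and user names are illustrative):

```shell
# Create the group, attach the read-only policy to it,
# then add a user so they inherit the group's permissions
aws iam create-group --group-name S3Readers
aws iam attach-group-policy \
--group-name S3Readers \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
aws iam add-user-to-group --group-name S3Readers --user-name alice
```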

Step 5: Enable IAM Access Analyzer

IAM Access Analyzer is a critical tool for identifying unintended external access to your AWS resources. It continuously analyzes resource-based policies and flags resources that are shared with external accounts, the public internet, or unknown principals.

Access Analyzer serves three key functions: finding externally exposed resources, generating least-privilege policies from actual CloudTrail activity, and detecting unused access. Together, these help you continuously right-size your IAM posture.

AWS CLI

aws accessanalyzer create-analyzer \
--analyzer-name MyAnalyzer \
--type ACCOUNT

Why it matters

Without Access Analyzer, you cannot systematically detect S3 buckets, KMS keys, SQS queues, or IAM roles inadvertently exposed to the public or external accounts. It is essential for both compliance validation and continuous least-privilege enforcement.

Step 6: Set IAM Permission Boundaries

IAM Permission Boundaries define the maximum permissions that an IAM entity (user or role) can have, regardless of what policies are attached to it. They are the primary mechanism for safely delegating role creation to developers or automation without enabling privilege escalation.

For example, a developer account may be granted permission to create IAM roles, but with a boundary policy that caps those roles at S3 read-only access. Even if a developer attaches AdministratorAccess to a role they create, the boundary silently limits effective permissions to the approved scope.

AWS CLI

aws iam put-role-permissions-boundary \
--role-name DeveloperRole \
--permissions-boundary arn:aws:iam::ACCOUNT-ID:policy/DeveloperBoundaryPolicy
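
The command references DeveloperBoundaryPolicy, which is not shown. A minimal sketch of a boundary document that caps effective permissions at S3 read-only, matching the example in the text:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": "*"
  }]
}
```

Remember that a boundary grants nothing by itself: an action is allowed only when both the boundary and the attached identity policies allow it.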

Step 7: Manage Secrets with AWS Secrets Manager

One of the most common compliance failures is hardcoding passwords, API keys, and database credentials in application code or environment variables. AWS Secrets Manager provides a secure, centralized store for application secrets with automatic rotation and KMS-backed encryption.

Secrets Manager integrates natively with RDS, Redshift, and DocumentDB to rotate credentials automatically without requiring application code changes. Each secret is encrypted with a KMS Customer Managed Key (CMK), giving you full control over key access and rotation.

AWS CLI

aws secretsmanager create-secret \
--name MyDatabasePassword \
--secret-string '{"username":"admin","password":"P@ssw0rd!"}' \
--kms-key-id alias/MyCMK

Best Practice
Enable automatic rotation for all database credentials, API keys, and OAuth tokens. Combine Secrets Manager with VPC endpoints so that Lambda functions and EC2 instances retrieve secrets without traversing the public internet. This is a baseline requirement for SOC 2 and PCI-DSS compliance.

3. Setting Up AWS Cloud Security Monitoring and Compliance

Security is not just about prevention. It is about visibility, detection, and response. The AWS services below give you all three.

Step 1: Enable CloudTrail
CloudTrail records all API activity, which is critical for auditing and investigations.

A multi-region CloudTrail is configured to ensure that all API activity across AWS regions is captured and stored securely in an S3 bucket for auditing and compliance purposes.

AWS CLI

aws cloudtrail create-trail \
--name MyTrail \
--s3-bucket-name my-cloudtrail-logs \
--is-multi-region-trail
aws cloudtrail start-logging --name MyTrail

Why it matters

Without CloudTrail, you cannot answer:

  • Who made a change
  • When it happened
  • What exactly was modified

Hardening CloudTrail: Log Integrity and Immutable Storage

Storing logs in S3 is not enough. An attacker who gains account access can delete CloudTrail logs to cover their tracks, making your entire audit trail worthless. You must enforce log integrity using the following controls:

Log file validation: CloudTrail can generate a digest file every hour that contains the hash of every log file delivered. Enable this so you can cryptographically prove that no log was tampered with after delivery.

S3 Object Lock (WORM storage): Enable Object Lock on your CloudTrail S3 bucket in Compliance mode with a retention period aligned to your compliance requirements (typically 90 days to 1 year). Once locked, no user, including the root account, can delete or overwrite those log objects during the retention window.

KMS encryption on the trail: Encrypt CloudTrail log files using a Customer Managed Key (CMK). This ensures that even if someone gains read access to S3, they cannot read logs without also having KMS decrypt permission, which you control through key policy.

AWS CLI

aws cloudtrail update-trail \
--name MyTrail \
--enable-log-file-validation \
--kms-key-id alias/CloudTrailCMK

Step 2: Enable AWS Config

AWS Config tracks configuration changes and evaluates compliance continuously.

AWS Config plays a critical role in detecting configuration drift, ensuring that resources remain aligned with defined security baselines over time.

AWS Config is enabled to record all resource configurations, including global resources like IAM, allowing continuous monitoring and compliance evaluation across the environment.

AWS CLI

aws configservice put-configuration-recorder \
--configuration-recorder name=default,roleARN=arn:aws:iam::ACCOUNT-ID:role/config-role \
--recording-group allSupported=true,includeGlobalResourceTypes=true
aws configservice start-configuration-recorder \
--configuration-recorder-name default

Example Rules

  • S3 buckets must not be public
  • Root account usage should be restricted
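
The first rule can be implemented with the AWS managed rule S3_BUCKET_PUBLIC_READ_PROHIBITED, for example:

```shell
# Deploy a managed Config rule that flags publicly readable S3 buckets
aws configservice put-config-rule \
--config-rule '{
  "ConfigRuleName": "s3-bucket-no-public-read",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED"
  }
}'
```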

Step 3: Enable GuardDuty

GuardDuty provides threat detection using anomaly detection and threat intelligence.

It uses machine learning and threat intelligence feeds to detect anomalies such as unauthorized access attempts and unusual API activity.

GuardDuty is enabled to continuously monitor the AWS environment for suspicious activity, unauthorized access, and potential threats, providing a centralized view of security findings.

AWS CLI

aws guardduty create-detector --enable

Step 4: Enable Security Hub
Security Hub aggregates findings and provides a compliance score.

Security Hub provides a centralized view of security findings and compliance posture by aggregating results from multiple AWS services, including GuardDuty, AWS Config, and IAM checks.

AWS CLI

aws securityhub enable-security-hub

Step 5: Enable Amazon Macie for Sensitive Data Discovery

You cannot claim compliance without knowing what data you actually have in your S3 buckets. Amazon Macie uses machine learning to automatically discover, classify, and protect sensitive data, including Personally Identifiable Information (PII), financial data, credentials, and API keys stored in S3.

Macie continuously inventories your S3 buckets and evaluates them for access controls, encryption status, and public exposure. It generates findings when sensitive data is discovered in unencrypted or publicly accessible buckets, which feed directly into Security Hub for centralized visibility.

AWS CLI

aws macie2 enable-macie
aws macie2 create-classification-job \
--job-type SCHEDULED \
--name SensitiveDataScan \
--s3-job-definition file://macie-job.json

Why it matters

GDPR, HIPAA, and PCI-DSS all require that you know where sensitive data lives. Without Macie, compliance is theoretical rather than real. A single misconfigured S3 bucket containing PII could trigger a reportable breach under GDPR, so catching it early matters.

Step 6: Enable AWS Audit Manager

So far, this guide has covered monitoring, detection, and logging. But compliance requires more than monitoring tools: you need a structured way to prove controls to auditors. AWS Audit Manager is the primary AWS service built for this purpose.

Audit Manager automates the collection of evidence against industry-standard frameworks, including SOC 2, PCI-DSS, HIPAA, GDPR, and CIS Benchmarks. It pulls evidence directly from AWS Config rules, CloudTrail activity, Security Hub findings, and IAM policies, then maps each piece of evidence to the specific control it satisfies. This creates an audit-ready package without manual spreadsheet work.

AWS CLI

aws auditmanager register-account
aws auditmanager create-assessment \
--name SOC2Assessment \
--framework-id <SOC2_FRAMEWORK_ID> \
--assessment-reports-destination file://destination.json \
--roles file://roles.json \
--scope file://scope.json

Why it matters

Without Audit Manager, there is no structured, control-to-evidence mapping between your AWS configuration and compliance requirements. Security Hub tells you what is failing. Audit Manager tells you what that means against SOC 2 CC6.1 or PCI-DSS Requirement 10, and packages it for auditors. Both are necessary for a production compliance program.

3a. Encryption: KMS, Customer Managed Keys, and Data-at-Rest / In-Transit

Encryption is foundational to any security and compliance program. In AWS, encryption covers two domains: data at rest (stored data) and data in transit (data moving between services or clients). Neither is optional for regulated workloads.

Step 1: Create a Customer Managed Key (CMK) in AWS KMS

AWS-managed keys are convenient but give you limited control. Customer Managed Keys (CMKs) let you define exactly who can use and administer the key through a key policy, enable automatic annual key rotation, and audit every cryptographic operation via CloudTrail. CMKs are the standard for compliance workloads.

AWS CLI

aws kms create-key \
--description "CMK for S3 and RDS encryption" \
--key-usage ENCRYPT_DECRYPT \
--origin AWS_KMS
aws kms enable-key-rotation --key-id <key-id>

Step 2: Enable SSE-KMS on S3

Apply SSE-KMS as the default encryption policy on all S3 buckets used for sensitive or regulated data. Every object written to the bucket is automatically encrypted using your CMK, and every decrypt operation is logged in CloudTrail. Combine this with a bucket policy that denies any PutObject request missing the x-amz-server-side-encryption header.

AWS CLI

aws s3api put-bucket-encryption \
--bucket my-sensitive-bucket \
--server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms","KMSMasterKeyID":"alias/MyCMK"}}]}'

Step 3: Enforce Encryption in Transit with TLS and ACM

All data in transit must be encrypted using TLS 1.2 or higher. AWS Certificate Manager (ACM) provides free, auto-renewing TLS certificates for use with ALB, CloudFront, API Gateway, and other services. For S3, enforce TLS by adding a bucket policy that denies any request where the condition aws:SecureTransport is false.
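
The TLS-enforcing bucket policy described above can be sketched as follows (the bucket name is illustrative):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyInsecureTransport",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:*",
    "Resource": [
      "arn:aws:s3:::my-sensitive-bucket",
      "arn:aws:s3:::my-sensitive-bucket/*"
    ],
    "Condition": {"Bool": {"aws:SecureTransport": "false"}}
  }]
}
```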

For RDS and other managed services, enable SSL/TLS connections at the parameter group level. For RDS MySQL, set require_secure_transport=ON. For PostgreSQL, set ssl=1 and enforce it using an IAM policy condition that requires rds:ssl.

Step 4: Envelope Encryption

Envelope encryption is how AWS KMS works at scale. Encrypting large amounts of data directly with the CMK is not practical because it has size limits and incurs per-API-call charges. Instead, AWS generates a Data Encryption Key (DEK) to encrypt your data locally, then uses the CMK to encrypt only the DEK. AWS SDKs handle this automatically. Understanding the model matters for compliance documentation and for any custom encryption built with the AWS Encryption SDK.
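
The DEK workflow can be seen directly with the CLI: generate-data-key returns both a plaintext key (used locally for encryption, then discarded) and a CMK-encrypted copy (stored alongside the ciphertext for later decryption):

```shell
# Ask KMS for a 256-bit data key under the CMK; the response contains
# Plaintext (for local encryption) and CiphertextBlob (to store)
aws kms generate-data-key \
--key-id alias/MyCMK \
--key-spec AES_256
```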

3b. VPC Security: Network Segmentation, Flow Logs, and VPC Endpoints

A VPC is mentioned in the architecture overview, but implementing it securely requires explicit hands-on configuration. The Infrastructure Layer is only as strong as its network controls. The following steps cover the critical components.

Step 1: Public/Private Subnet Segmentation

Never place databases, caches, or internal services in public subnets. The standard pattern is: public subnets contain only load balancers and NAT gateways; private subnets contain application servers; isolated subnets (no route to the internet) contain databases. This limits the blast radius if any tier is compromised.

Step 2: Configure Security Groups and NACLs

Security Groups are stateful firewalls that operate at the instance level. Use them to whitelist only required ports and source ranges: for example, allow port 443 inbound from 0.0.0.0/0 on the ALB security group, but allow port 3306 only from the application-tier security group on the database security group.

Network ACLs (NACLs) are stateless and operate at the subnet boundary. Use them as a second line of defense to explicitly deny known-malicious IP ranges and block unwanted outbound traffic that security groups might miss due to their stateful nature.

AWS CLI (Create Security Group)

aws ec2 create-security-group \
--group-name DatabaseSG \
--description "Allow MySQL from app tier only" \
--vpc-id vpc-xxxxxxxx
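
Rules like the database example above are then added with authorize-security-group-ingress. Referencing the application tier's security group ID instead of a CIDR keeps the rule tied to the tier rather than to IP addresses (group IDs are placeholders):

```shell
# Allow MySQL (3306) into the database SG only from the app-tier SG
aws ec2 authorize-security-group-ingress \
--group-id sg-0database0000000 \
--protocol tcp \
--port 3306 \
--source-group sg-0apptier00000000
```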

Step 3: Enable VPC Flow Logs

VPC Flow Logs capture metadata about all IP traffic entering and leaving your VPC, subnets, and individual ENIs. They are essential for incident investigation, detecting lateral movement, and proving to auditors that you have network-level visibility. Send flow logs to CloudWatch Logs or S3 for querying with Athena.

AWS CLI

aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids vpc-xxxxxxxx \
--traffic-type ALL \
--log-destination-type s3 \
--log-destination arn:aws:s3:::my-flow-logs-bucket

Step 4: Configure VPC Endpoints

Here is something worth knowing: even though you own both your EC2 instance and your S3 bucket, traffic between them is routed to S3's public endpoints by default, leaving your VPC through an internet or NAT gateway. VPC endpoints fix this by keeping all traffic on private AWS network paths, and they let you apply endpoint policies to restrict exactly which buckets or KMS keys are accessible from your VPC.

AWS CLI

aws ec2 create-vpc-endpoint \
--vpc-id vpc-xxxxxxxx \
--service-name com.amazonaws.us-east-1.s3 \
--vpc-endpoint-type Gateway \
--route-table-ids rtb-xxxxxxxx

4. Designing AWS Disaster Recovery for High Availability

A robust AWS disaster recovery strategy keeps your system available even when failures occur, minimising downtime and protecting business-critical workloads from regional outages.

Step 1: Configure AWS Backup
AWS Backup centralizes backup management across services.

An AWS Backup plan is created to automate daily backups with a defined retention period, ensuring that data can be recovered in case of failure or data loss.

AWS CLI

aws backup create-backup-plan \
--backup-plan file://backup-plan.json
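
backup-plan.json is referenced but not shown. A minimal sketch implementing the daily backup with a retention period described above (plan name, schedule, and retention are illustrative):

```json
{
  "BackupPlanName": "DailyBackups",
  "Rules": [{
    "RuleName": "DailyRule",
    "TargetBackupVaultName": "Default",
    "ScheduleExpression": "cron(0 5 * * ? *)",
    "Lifecycle": {"DeleteAfterDays": 35}
  }]
}
```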

Step 2: Enable S3 Cross-Region Replication

Cross-region replication ensures your data remains available even if an entire AWS region goes offline.

Cross-region replication is configured to automatically replicate objects from the source bucket to a destination bucket in another region, ensuring data durability and disaster recovery.

AWS CLI

aws s3api put-bucket-replication \
--bucket source-bucket \
--replication-configuration file://replication.json
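
replication.json is referenced but not shown. A minimal sketch (the IAM role and destination bucket are placeholders; versioning must be enabled on both buckets before replication will work):

```json
{
  "Role": "arn:aws:iam::ACCOUNT-ID:role/s3-replication-role",
  "Rules": [{
    "ID": "ReplicateAll",
    "Status": "Enabled",
    "Priority": 1,
    "Filter": {},
    "DeleteMarkerReplication": {"Status": "Disabled"},
    "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"}
  }]
}
```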

Step 3: Multi-Region Database Resilience

An important distinction: RDS Multi-AZ protects against Availability Zone failures, not regional failures. If an entire AWS Region becomes unavailable due to a large-scale event, Multi-AZ alone will not keep your database online. True disaster recovery requires a multi-region architecture using the following services:

  • RDS Cross-Region Read Replicas: Asynchronously replicate RDS instances to a secondary region. In a disaster, you can promote the read replica to a standalone primary.
  • Aurora Global Database: Aurora Global Database replicates across up to five secondary regions with a typical lag of under one second. Failover to a secondary region can be completed in under a minute, making it suitable for near-zero RPO workloads.
  • DynamoDB Global Tables: DynamoDB Global Tables provide fully managed, multi-region, multi-active replication. Every region can both read and write, and changes propagate globally in milliseconds. This is the AWS-native way to achieve active-active multi-region for DynamoDB workloads.
  • Multi-Region KMS Keys: KMS keys are regional by default. For cross-region DR, create multi-region KMS keys so that your encrypted data can be decrypted in the failover region without needing to re-encrypt it. This is essential for KMS-encrypted RDS snapshots, S3 objects, and Secrets Manager secrets that need to be accessible during a regional failover.
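
For example, a cross-region RDS read replica is created from the target region by referencing the source instance's ARN (identifiers and regions are placeholders):

```shell
# Run against the DR region; the source instance is referenced by ARN
aws rds create-db-instance-read-replica \
--db-instance-identifier mydb-replica \
--source-db-instance-identifier arn:aws:rds:us-east-1:ACCOUNT-ID:db:mydb \
--region us-west-2
```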

AWS CLI (Multi-Region KMS Key)

aws kms create-key \
--multi-region \
--description "Multi-Region CMK for DR" \
--key-usage ENCRYPT_DECRYPT

Note on High Availability vs. Disaster Recovery

Multi-AZ is a high availability feature: it protects against instance hardware failure and AZ-level outages with automatic failover in under two minutes. Cross-region replication is a disaster recovery feature: it protects against regional outages and requires a planned or unplanned failover event. Both are needed in production environments, and your RTO/RPO targets should drive which multi-region pattern you choose.

In a Multi-AZ deployment, RDS automatically fails over to a standby replica in another Availability Zone, keeping your database available during instance-level outages.

Step 4: Set up Route 53 Failover

Route 53 handles DNS-level failover automatically, redirecting traffic to a healthy endpoint the moment your primary goes down.

With failover routing configured, Route 53 monitors your primary endpoint via health checks and automatically switches traffic to the secondary when the primary becomes unhealthy. No manual intervention is needed.

AWS CLI

aws route53 change-resource-record-sets \
--hosted-zone-id ZONEID \
--change-batch file://failover.json
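
failover.json holds the change batch. A sketch for the primary record is shown below; the secondary record is analogous with "Failover": "SECONDARY" (domain, IP, and health check ID are placeholders):

```json
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "app.example.com",
      "Type": "A",
      "SetIdentifier": "primary",
      "Failover": "PRIMARY",
      "TTL": 60,
      "ResourceRecords": [{"Value": "203.0.113.10"}],
      "HealthCheckId": "your-health-check-id"
    }
  }]
}
```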

Understanding RTO and RPO

  • RTO (Recovery Time Objective) defines how quickly systems must recover after a failure
  • RPO (Recovery Point Objective) defines how much data loss is acceptable, measured in time

These values guide your architecture decisions.

5. Implementing AWS Governance Using Organizations and SCPs

Governance ensures consistency, especially in multi-account environments.

In enterprise environments, SCPs are commonly used to enforce guardrails such as restricting regions, preventing public access, and controlling critical actions.

Step 1: Set up AWS Organizations

AWS Organizations enables centralized management of multiple AWS accounts, allowing administrators to enforce policies, control access, and standardize configurations across environments.

Organizational Units (OUs) are created to logically separate environments such as Development and Production, enabling structured governance and policy enforcement across accounts.

  • Create organization
  • Add accounts
  • Define structure

Step 2: Apply Service Control Policies
SCPs act as organisation-wide guardrails. They define the maximum permissions any account in a given OU can exercise, regardless of what individual IAM policies allow.

The Service Control Policy is successfully attached to the Production Organizational Unit, restricting actions such as S3 bucket deletion across all accounts within the OU.

Examples:

  • Deny public S3 access
  • Restrict regions
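
The region-restriction guardrail, for example, can be expressed as an SCP that denies all actions outside approved regions (regions are illustrative; in practice you would also exempt global services via NotAction):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyOutsideApprovedRegions",
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {
      "StringNotEquals": {"aws:RequestedRegion": ["us-east-1", "eu-west-1"]}
    }
  }]
}
```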

Step 2b: Apply Resource Control Policies (RCPs)

SCPs control what your IAM principals (users, roles) are allowed to do. But they do not control who can access your resources from outside your organization. Resource Control Policies (RCPs) fill this gap by acting as the resource-side complement to SCPs.

An RCP is attached to a resource type (S3 buckets, KMS keys, SQS queues, and similar) and applies organization-wide. For example, an RCP can enforce that no S3 bucket in your organization can ever be accessed by principals outside the organization, regardless of what the bucket policy says. This provides a hard guardrail that cannot be overridden by individual account administrators.

Example RCP (Deny cross-org S3 access)
{
 "Version": "2012-10-17",
 "Statement": [{
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:*",
  "Resource": "*",
  "Condition": {
   "StringNotEqualsIfExists": {"aws:PrincipalOrgID": "o-xxxxxxxxxxxx"}
  }
 }]
}

Why this matters

SCPs alone only cover half the governance story. An SCP prevents your principals from doing things outside the organization. An RCP prevents external principals from accessing resources inside your organization. Together, they create a complete perimeter. For regulated industries such as financial services and healthcare, implementing both is a compliance requirement under data residency and data isolation controls.

Step 3: Continuous Compliance Monitoring

AWS Config continuously evaluates resource configurations against predefined rules and automatically identifies non-compliant resources, ensuring ongoing adherence to security best practices.

Compliance rules are evaluated periodically and on configuration changes, ensuring that any deviation from defined standards is immediately detected.

Step 4: Cost Governance

Cost governance is a key pillar of FinOps, helping organizations balance performance, cost, and operational efficiency.

AWS Cost Explorer provides detailed insights into cloud spending patterns, allowing teams to monitor usage trends, analyze costs by service, and identify opportunities for optimization.

6. Best Practices for AWS Security and Compliance

  • Always follow least privilege access
  • Enable logging across all regions
  • Use a multi-account strategy
  • Regularly review compliance reports
  • Automate backups and recovery testing
  • Encrypt all data at rest with Customer Managed Keys and enforce encryption in transit with TLS
  • Enable CloudTrail log file validation and protect logs with S3 Object Lock
  • Use AWS Secrets Manager with automatic rotation for all application credentials
  • Implement both SCPs and RCPs for a complete organizational governance perimeter
  • Use Aurora Global Database, DynamoDB Global Tables, and multi-region KMS keys for true cross-region DR
  • Use AWS Audit Manager to continuously collect compliance evidence for SOC 2, PCI-DSS, HIPAA, and GDPR

7. Common Mistakes in AWS Security and Compliance

  • Using overly permissive IAM roles
  • Not enabling logging across all regions
  • Ignoring compliance violations
  • No disaster recovery testing
  • Lack of governance controls
  • Storing secrets and credentials in code, environment variables, or S3 instead of Secrets Manager
  • Deploying workloads without encryption at rest or in transit, especially for regulated data
  • Confusing Multi-AZ high availability with cross-region disaster recovery
  • Not protecting CloudTrail logs against deletion, leaving the audit trail untrustworthy
  • Implementing SCPs without RCPs, leaving resources accessible to external accounts
  • Monitoring without Audit Manager, resulting in no structured compliance evidence for auditors

8. Final Thoughts

Building a secure and compliant AWS environment is an ongoing process, not a one-time setup. By layering identity management, encryption, network security, monitoring, disaster recovery, governance, and compliance automation, you build a cloud architecture that holds up under both attack and audit.

The key areas to focus on:

  • Strong IAM practices
  • Continuous monitoring
  • Reliable disaster recovery
  • Governance at scale
  • End-to-end encryption with KMS Customer Managed Keys
  • VPC network controls and secrets management
  • Structured compliance evidence collection with Audit Manager

Security in AWS is not a feature you turn on. It is a posture you build, layer by layer. Start with IAM, get logging in place, and everything else follows from there.

Whether you are a startup moving fast or an enterprise in a regulated industry, this layered approach gives you a production-ready foundation you can build on and audit with confidence.
