DEV Community

Manish Kumar

AWS EC2 Deep Dive: Architecture, Operations, and Best Practices

AWS EC2 Complete Working Reference Guide

Instance Types and Families

Instance Type Nomenclature

  • Format: [Family][Generation][Additional Capabilities].[Size]
  • Example: c7g.xlarge
    • c = Compute optimized family
    • 7 = 7th generation
    • g = AWS Graviton processor
    • xlarge = Size
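The naming format above can be sketched as a small parser. This is only an illustration: the regular expression and the `InstanceType` tuple are assumptions made for this sketch, not an AWS API, and the pattern does not cover every real name (for example `u-6tb1.metal`).

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str
    generation: int
    capabilities: str
    size: str


# Hypothetical pattern for this sketch: letters, digits, optional suffixes, dot, size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z0-9-]*)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as 'c7g.xlarge' into its four parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, capabilities, size = match.groups()
    return InstanceType(family, int(generation), capabilities, size)


print(parse_instance_type("c7g.xlarge"))
```

For `c7g.xlarge` this yields family `c`, generation `7`, capabilities `g`, and size `xlarge`, matching the breakdown in the list.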

Instance Families Overview

| Family | Category | Processor Options | Use Cases | Key Characteristics |
|---|---|---|---|---|
| T3, T3a, T4g | General Purpose | Intel, AMD, Graviton | Web servers, dev/test, microservices | Burstable CPU, cost-effective |
| M5, M6i, M7i | General Purpose | Intel, AMD, Graviton | Databases, application servers | Balanced CPU/memory/network |
| C5, C6i, C7g | Compute Optimized | Intel, AMD, Graviton | HPC, batch processing, gaming | High CPU-to-memory ratio |
| R5, R6i, R7g, X1, X2 | Memory Optimized | Intel, AMD, Graviton | In-memory databases, big data | High memory-to-CPU ratio |
| I3, I4i, D2, D3 | Storage Optimized | Intel, AMD | Data warehousing, NoSQL, distributed file systems | High IOPS on local NVMe (I families), dense local HDD storage (D families) |
| P4, P5, G5, Inf2, Trn1 | Accelerated Computing | NVIDIA GPUs, AWS Trainium/Inferentia | ML training/inference, rendering | GPUs and purpose-built ML accelerators |
| Mac | General Purpose | Intel (mac1), Apple silicon (mac2) | iOS/macOS development | Dedicated Mac hardware |
| Hpc7g | HPC Optimized | Graviton | Molecular dynamics, CFD simulations | Optimized for tightly coupled workloads |

Instance Sizes

  • nano, micro, small, medium
  • large, xlarge, 2xlarge, 4xlarge, 8xlarge, 12xlarge, 16xlarge, 24xlarge, 32xlarge, 48xlarge, 56xlarge, 112xlarge
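  • Within a family, each step up (large → xlarge → 2xlarge → 4xlarge) typically doubles vCPUs and memory; larger sizes such as 12xlarge or 24xlarge do not always follow the doubling pattern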
  • Metal instances provide access to physical server resources

Processor Variants

  • Intel: Standard option (M5, C5, R5)
  • AMD: Cost-optimized (M5a, C5a, R5a - typically 10% cheaper)
  • AWS Graviton: ARM-based, up to 40% better price-performance (M7g, C7g, R7g)
  • g suffix: Graviton processor
  • a suffix: AMD processor
  • n suffix: Network-optimized (higher network bandwidth)
  • d suffix: Instance store volumes included

Pricing Models Comparison

| Model | Commitment | Savings | Flexibility | Best For | Interruption Risk |
|---|---|---|---|---|---|
| On-Demand | None | None (baseline) | Full | Spiky workloads, dev/test | None |
| Reserved Instances | 1-3 years | Up to 72% | Instance family/region locked | Predictable, steady-state workloads | None |
| Savings Plans - Compute | 1-3 years | Up to 66% | Any instance type/region | Flexible compute usage | None |
| Savings Plans - EC2 | 1-3 years | Up to 72% | Instance family locked, region locked | Predictable EC2 usage in specific family | None |
| Spot Instances | None | Up to 90% | Full | Fault-tolerant, batch jobs | Yes (2-minute warning) |
| Dedicated Hosts | On-demand or 1-3 year | Additional RI discounts | Physical server control | BYOL, compliance | None |
| Capacity Reservations | On-demand | None (billed if unused) | AZ-specific capacity | Business-critical apps | None |

Spot Instance Characteristics

  • Variable pricing based on supply/demand
  • 2-minute interruption notification
  • Often around 85% cheaper than On-Demand during low-demand periods
  • Example (illustrative; Spot prices vary by Region, AZ, and time): c7i.2xlarge at \$0.054/hour (Spot) vs \$0.357/hour (On-Demand)
  • Best for: Stateless applications, CI/CD, data processing, containerized workloads
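As a quick check of the example prices above, the discount can be computed directly. This is a minimal sketch; the function name is made up for illustration, and the two prices are the ones quoted in the list, not live values.

```python
def spot_savings_percent(on_demand_hourly: float, spot_hourly: float) -> float:
    """Return how much cheaper the Spot price is, as a percentage of On-Demand."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return (on_demand_hourly - spot_hourly) / on_demand_hourly * 100


# Prices taken from the c7i.2xlarge example in the list above.
savings = spot_savings_percent(on_demand_hourly=0.357, spot_hourly=0.054)
print(f"{savings:.1f}% cheaper than On-Demand")
```

With these inputs the discount works out to roughly 85%, which lines up with the figure quoted for low-demand periods.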

Savings Plans Priority

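  • Applies only to eligible On-Demand usage; Spot usage is already discounted and is not covered by Savings Plans
  • The hourly commitment is consumed at Savings Plans rates, starting with the usage that earns the highest discount percentage
  • Usage beyond the commitment is billed at On-Demand rates; commitment left unused in an hour does not roll over
  • Example: with a \$100/hour commitment, \$80 of eligible usage (at Savings Plans rates) plus \$30 of Spot leaves \$20 of the commitment unused that hour, and the Spot usage is billed separately at Spot rates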

Reserved Instances Types

  • Standard RI: Maximum savings, least flexibility
  • Convertible RI: Can change instance family, lower discount
  • Scheduled RI: Reserved for specific time windows (deprecated)

Instance Launch Methods

Launch via AWS Console

  1. Navigate to EC2 Dashboard → Launch Instance
  2. Configure:
    • Name and tags
    • AMI selection (Amazon Linux, Ubuntu, Windows, etc.)
    • Instance type
    • Key pair (create or select existing)
    • Network settings (VPC, subnet, security groups)
    • Storage configuration
    • Advanced details (user data, IAM role, metadata options)
  3. Review and launch

Launch via AWS CLI

# Basic instance launch
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name MyKeyPair \
  --security-group-ids sg-0123456789abcdef0 \
  --subnet-id subnet-0bb1c79de3EXAMPLE \
  --count 1 \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=MyInstance}]'

# Launch with user data
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name MyKeyPair \
  --security-group-ids sg-0123456789abcdef0 \
  --subnet-id subnet-0bb1c79de3EXAMPLE \
  --user-data file://user-data.sh \
  --iam-instance-profile Name=MyInstanceProfile

# Launch Spot Instance
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --instance-market-options '{"MarketType":"spot","SpotOptions":{"MaxPrice":"0.05","SpotInstanceType":"one-time"}}' \
  --key-name MyKeyPair \
  --security-group-ids sg-0123456789abcdef0

User Data Script Example

#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html

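# Get instance metadata (IMDSv2: request a session token first, so this also works when IMDSv2 is required)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
AZ=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone)
echo "<p>Instance ID: $INSTANCE_ID</p>" >> /var/www/html/index.html
echo "<p>Availability Zone: $AZ</p>" >> /var/www/html/index.html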

Launch with Terraform

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.medium"
  key_name      = "MyKeyPair"

  vpc_security_group_ids = [aws_security_group.web.id]
  subnet_id              = aws_subnet.public.id

  iam_instance_profile = aws_iam_instance_profile.ec2_profile.name

  user_data = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y httpd
              systemctl start httpd
              systemctl enable httpd
              EOF

  root_block_device {
    volume_type = "gp3"
    volume_size = 30
    encrypted   = true
  }

  tags = {
    Name        = "WebServer"
    Environment = "Production"
  }

  monitoring = true
}

Launch with CloudFormation

Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0c55b159cbfafe1f0
      InstanceType: t3.medium
      KeyName: MyKeyPair
      SecurityGroupIds:
        - !Ref WebSecurityGroup
      SubnetId: !Ref PublicSubnet
      IamInstanceProfile: !Ref EC2InstanceProfile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum update -y
          yum install -y httpd
          systemctl start httpd
          systemctl enable httpd
      BlockDeviceMappings:
        - DeviceName: /dev/xvda
          Ebs:
            VolumeType: gp3
            VolumeSize: 30
            Encrypted: true
      Tags:
        - Key: Name
          Value: WebServer

Launch Templates

Create Launch Template via CLI

aws ec2 create-launch-template \
  --launch-template-name MyLaunchTemplate \
  --version-description "Version 1" \
  --launch-template-data '{
    "ImageId": "ami-0c55b159cbfafe1f0",
    "InstanceType": "t3.medium",
    "KeyName": "MyKeyPair",
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
    "IamInstanceProfile": {
      "Name": "MyInstanceProfile"
    },
    "BlockDeviceMappings": [{
      "DeviceName": "/dev/xvda",
      "Ebs": {
        "VolumeSize": 30,
        "VolumeType": "gp3",
        "DeleteOnTermination": true,
        "Encrypted": true
      }
    }],
    "Monitoring": {
      "Enabled": true
    },
    "UserData": "IyEvYmluL2Jhc2gKCnl1bSB1cGRhdGUgLXkKeXVtIGluc3RhbGwgLXkgaHR0cGQ="
  }'

Launch Template with Systems Manager Parameter

# Create SSM parameter for AMI ID
aws ssm put-parameter \
  --name "/golden-ami/latest" \
  --value "ami-0c55b159cbfafe1f0" \
  --type "String" \
  --data-type "aws:ec2:image"

# Create launch template referencing SSM parameter
aws ec2 create-launch-template \
  --launch-template-name MyTemplate \
  --launch-template-data '{
    "ImageId": "resolve:ssm:/golden-ami/latest",
    "InstanceType": "t3.medium"
  }'

Launch Template with Terraform

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.medium"
  key_name      = "MyKeyPair"

  # Security groups are attached through network_interfaces below; vpc_security_group_ids would conflict with it

  iam_instance_profile {
    name = aws_iam_instance_profile.app.name
  }

  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_size           = 30
      volume_type           = "gp3"
      iops                  = 3000
      throughput            = 125
      delete_on_termination = true
      encrypted             = true
    }
  }

  network_interfaces {
    associate_public_ip_address = true
    delete_on_termination       = true
    security_groups             = [aws_security_group.app.id]
  }

  monitoring {
    enabled = true
  }

  user_data = base64encode(<<-EOF
              #!/bin/bash
              yum update -y
              yum install -y httpd
              systemctl start httpd
              EOF
  )

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "AppServer"
    }
  }
}

Launch Instance from Template

aws ec2 run-instances \
  --launch-template LaunchTemplateName=MyLaunchTemplate,Version=1 \
  --count 2 \
  --subnet-id subnet-0bb1c79de3EXAMPLE

Update Launch Template (Create New Version)

aws ec2 create-launch-template-version \
  --launch-template-id lt-0abcd290751193123 \
  --source-version 1 \
  --launch-template-data '{"InstanceType":"t3.large"}'

Storage Options

Storage Type Comparison

| Type | Persistence | Performance | Use Case | Backup Method |
|---|---|---|---|---|
| EBS (gp3) | Yes (network-attached) | 3,000-16,000 IOPS | General purpose, boot volumes | EBS Snapshots |
| EBS (gp2) | Yes (network-attached) | Up to 16,000 IOPS | Legacy general purpose | EBS Snapshots |
| EBS (io2) | Yes (network-attached) | Up to 256,000 IOPS (io2 Block Express) | High-performance databases | EBS Snapshots |
| EBS (st1) | Yes (network-attached) | Throughput-optimized | Big data, data warehouses | EBS Snapshots |
| EBS (sc1) | Yes (network-attached) | Cold HDD, lowest cost | Infrequent access | EBS Snapshots |
| Instance Store | No (ephemeral) | Very high IOPS | Temporary data, caches | Must use application-level backup |

EBS Volume Types Detailed

gp3 (General Purpose SSD)

  • 3,000 IOPS baseline (configurable up to 16,000)
  • 125 MB/s throughput baseline (configurable up to 1,000 MB/s)
  • Price: \$0.08/GB-month (us-east-1; varies by Region)
  • Independent IOPS and throughput configuration
  • Recommended for most workloads

gp2 (General Purpose SSD - Legacy)

  • IOPS scales with volume size (3 IOPS per GB)
  • Burstable up to 3,000 IOPS for volumes < 1 TB
  • Throughput: up to 250 MB/s
  • Use gp3 for new deployments (better value)
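The gp2 sizing rule above is easy to get wrong when picking a volume size, so here is a minimal sketch of it. The 3 IOPS per GB ratio, the 3,000 IOPS burst, and the 16,000 IOPS ceiling come from this guide; the 100 IOPS floor for very small volumes is an addition based on AWS's documented minimum. The function names are made up for illustration.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS: 3 per GiB, floored at 100 and capped at 16,000."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return max(100, min(16_000, 3 * size_gib))


def gp2_peak_iops(size_gib: int) -> int:
    """Peak IOPS: small volumes can burst to 3,000; larger ones stay at baseline."""
    return max(gp2_baseline_iops(size_gib), 3_000)


for size in (10, 500, 2_000, 6_000):
    print(size, gp2_baseline_iops(size), gp2_peak_iops(size))
```

A 500 GiB gp2 volume gets a 1,500 IOPS baseline and only reaches 3,000 while burst credits last, whereas gp3 provides 3,000 IOPS at any size.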

io2 Block Express (Provisioned IOPS SSD)

  • Up to 256,000 IOPS per volume
  • 99.999% durability
  • Up to 4,000 MB/s throughput
  • Sub-millisecond latency
  • Use for critical databases

EBS Volume Operations

# Create EBS volume
aws ec2 create-volume \
  --availability-zone us-east-1a \
  --size 100 \
  --volume-type gp3 \
  --iops 3000 \
  --throughput 125 \
  --encrypted \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=MyVolume}]'

# Attach volume to instance
aws ec2 attach-volume \
  --volume-id vol-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 \
  --device /dev/sdf

# Modify volume (increase size and IOPS)
aws ec2 modify-volume \
  --volume-id vol-0123456789abcdef0 \
  --size 200 \
  --iops 5000

# Create snapshot
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Backup of MyVolume"

# Create volume from snapshot
aws ec2 create-volume \
  --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-1a \
  --volume-type gp3

# Detach volume
aws ec2 detach-volume \
  --volume-id vol-0123456789abcdef0

# Delete volume
aws ec2 delete-volume \
  --volume-id vol-0123456789abcdef0

EBS Snapshot Management

# Create multi-volume snapshot for entire instance
aws ec2 create-snapshots \
  --instance-specification InstanceId=i-0123456789abcdef0 \
  --description "Full instance backup"

# Copy snapshot to another region
aws ec2 copy-snapshot \
  --source-region us-east-1 \
  --source-snapshot-id snap-0123456789abcdef0 \
  --region us-west-2 \
  --description "DR copy"

# Create AMI from instance (includes all attached EBS volumes)
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "MyGoldenImage" \
  --description "Production baseline" \
  --no-reboot

# List snapshots
aws ec2 describe-snapshots \
  --owner-ids self \
  --filters "Name=status,Values=completed"

# Delete snapshot
aws ec2 delete-snapshot \
  --snapshot-id snap-0123456789abcdef0

Instance Store Characteristics

  • Physically attached to host server
  • Data lost on instance stop, hibernate, or terminate, and on underlying hardware failure (data survives a reboot)
  • Included in instance price (no additional cost)
  • Very high IOPS (millions)
  • Available on specific instance types (c5d, m5d, r5d, i3, i4i)

AMI Management

Create Custom AMI

# Create AMI from running instance (with reboot)
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "MyCustomAMI-$(date +%Y%m%d)" \
  --description "Custom application image"

# Create AMI without rebooting
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "MyCustomAMI" \
  --no-reboot

# Register AMI from snapshot
aws ec2 register-image \
  --name "MyAMI" \
  --root-device-name /dev/xvda \
  --block-device-mappings \
    "DeviceName=/dev/xvda,Ebs={SnapshotId=snap-0123456789abcdef0,VolumeType=gp3}"

AMI Operations

# List AMIs owned by you
aws ec2 describe-images \
  --owners self \
  --filters "Name=state,Values=available"

# Copy AMI to another region
aws ec2 copy-image \
  --source-region us-east-1 \
  --source-image-id ami-0123456789abcdef0 \
  --region us-west-2 \
  --name "MyAMI-Copy"

# Share AMI with another account
aws ec2 modify-image-attribute \
  --image-id ami-0123456789abcdef0 \
  --launch-permission "Add=[{UserId=123456789012}]"

# Make AMI public
aws ec2 modify-image-attribute \
  --image-id ami-0123456789abcdef0 \
  --launch-permission "Add=[{Group=all}]"

# Deregister AMI
aws ec2 deregister-image \
  --image-id ami-0123456789abcdef0

AMI User Data

  • User data NOT stored in AMI
  • Must specify user data each time launching from AMI
  • User data embedded in launch templates persists across launches
  • AMI captures: OS, applications, configurations, attached EBS volume snapshots
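Because user data lives in the launch template rather than the AMI, the `UserData` field in a launch template's JSON has to carry the script base64-encoded. A minimal sketch of that encoding follows; the helper name is made up for illustration, and the script text is the one encoded in the launch template example earlier in this guide.

```python
import base64


def encode_user_data(script: str) -> str:
    """Base64-encode a user data script for a launch template's UserData field."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


script = "#!/bin/bash\n\nyum update -y\nyum install -y httpd"
print(encode_user_data(script))
```

The printed value is the same string that appears as `UserData` in the CLI launch template example, so decoding that value is a quick way to check what a template will run.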

Networking Configuration

Security Groups vs Network ACLs

| Feature | Security Groups | Network ACLs |
|---|---|---|
| Scope | Instance-level | Subnet-level |
| State | Stateful (return traffic auto-allowed) | Stateless (must explicitly allow return) |
| Rules | Allow rules only | Allow and Deny rules |
| Rule Processing | All rules evaluated | Rules evaluated in order, lowest number first |
| Default Behavior | Deny all inbound, allow all outbound | Default NACL allows all; a custom NACL denies all until rules are added |
| Assignment | Must be explicitly assigned | Automatically applied to subnet |
| Rule Limit | 60 inbound + 60 outbound per group (default quota) | 20 inbound + 20 outbound per NACL (default quota) |
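The difference in rule processing can be sketched in a few lines. This is a simplified model, not how AWS implements it: the `Rule` class and both functions are assumptions for illustration, and CIDR matching is left out so that only the ordering behavior is visible.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    protocol: str                    # "tcp", "icmp", or "-1" for all protocols
    port_from: Optional[int] = None  # None means the rule has no port range
    port_to: Optional[int] = None
    number: int = 0                  # only meaningful for NACL rules
    action: str = "allow"            # security group rules are always "allow"


def rule_matches(rule: Rule, protocol: str, port: Optional[int]) -> bool:
    if rule.protocol not in ("-1", protocol):
        return False
    if rule.port_from is None or rule.port_to is None or port is None:
        return True
    return rule.port_from <= port <= rule.port_to


def evaluate_nacl(rules: Sequence[Rule], protocol: str, port: Optional[int] = None) -> str:
    """NACL: the lowest-numbered matching rule decides; otherwise deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule_matches(rule, protocol, port):
            return rule.action
    return "deny"


def evaluate_security_group(rules: Sequence[Rule], protocol: str, port: Optional[int] = None) -> str:
    """Security group: any matching allow rule permits the traffic."""
    return "allow" if any(rule_matches(r, protocol, port) for r in rules) else "deny"


nacl = [
    Rule("tcp", 80, 80, number=100, action="allow"),
    Rule("tcp", 80, 80, number=90, action="deny"),
]
print(evaluate_nacl(nacl, "tcp", 80))                             # deny: rule 90 is evaluated first
print(evaluate_security_group([Rule("tcp", 80, 80)], "tcp", 80))  # allow
```

In the NACL case the deny at rule 90 wins because it is evaluated before rule 100, while a security group has no ordering and no deny rules at all.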

Security Group Configuration

# Create security group
aws ec2 create-security-group \
  --group-name WebServerSG \
  --description "Security group for web servers" \
  --vpc-id vpc-0123456789abcdef0

# Add inbound rules
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 80 \
  --cidr 0.0.0.0/0

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 22 \
  --cidr 203.0.113.0/24

# Allow traffic from another security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 3306 \
  --source-group sg-9876543210abcdef0

# Remove rule (for example, an SSH rule that was opened to 0.0.0.0/0)
aws ec2 revoke-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 22 \
  --cidr 0.0.0.0/0

# Add outbound rule
aws ec2 authorize-security-group-egress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0

Security Group with Terraform

resource "aws_security_group" "web" {
  name        = "web-server-sg"
  description = "Security group for web servers"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description     = "SSH from bastion"
    from_port       = 22
    to_port         = 22
    protocol        = "tcp"
    security_groups = [aws_security_group.bastion.id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "WebServerSG"
  }
}

Network ACL Configuration

# Create Network ACL
aws ec2 create-network-acl \
  --vpc-id vpc-0123456789abcdef0

# Add inbound rule (allow HTTP)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --ingress \
  --rule-number 100 \
  --protocol tcp \
  --port-range From=80,To=80 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow

# Add deny rule (lower rule number, so it is evaluated before rule 100)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --ingress \
  --rule-number 99 \
  --protocol icmp \
  --icmp-type-code Code=-1,Type=-1 \
  --cidr-block 0.0.0.0/0 \
  --rule-action deny

# Add outbound rule for ephemeral ports
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --egress \
  --rule-number 100 \
  --protocol tcp \
  --port-range From=1024,To=65535 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow

# Associate NACL with subnet
aws ec2 replace-network-acl-association \
  --association-id aclassoc-0123456789abcdef0 \
  --network-acl-id acl-0123456789abcdef0

Elastic Network Interface (ENI)

# Create ENI with static private IP
aws ec2 create-network-interface \
  --subnet-id subnet-0123456789abcdef0 \
  --description "Primary network interface" \
  --groups sg-0123456789abcdef0 \
  --private-ip-address 10.0.1.10

# Attach ENI to instance
aws ec2 attach-network-interface \
  --network-interface-id eni-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 \
  --device-index 1

# Assign secondary private IP
aws ec2 assign-private-ip-addresses \
  --network-interface-id eni-0123456789abcdef0 \
  --private-ip-addresses 10.0.1.11 10.0.1.12

# Detach ENI
aws ec2 detach-network-interface \
  --attachment-id eni-attach-0123456789abcdef0

# Delete ENI
aws ec2 delete-network-interface \
  --network-interface-id eni-0123456789abcdef0

Elastic IP (EIP)

# Allocate Elastic IP
aws ec2 allocate-address --domain vpc

# Associate EIP with instance
aws ec2 associate-address \
  --instance-id i-0123456789abcdef0 \
  --allocation-id eipalloc-0123456789abcdef0

# Associate EIP with ENI
aws ec2 associate-address \
  --network-interface-id eni-0123456789abcdef0 \
  --allocation-id eipalloc-0123456789abcdef0

# Disassociate EIP
aws ec2 disassociate-address \
  --association-id eipassoc-0123456789abcdef0

# Release EIP
aws ec2 release-address \
  --allocation-id eipalloc-0123456789abcdef0

Enhanced Networking

  • SR-IOV (Single Root I/O Virtualization): Higher PPS, lower latency, lower jitter
  • ENA (Elastic Network Adapter): Up to 100 Gbps, required for current generation instances
  • Intel 82599 VF: Up to 10 Gbps, legacy instances
  • Placement Groups: Cluster, Partition, Spread

Placement Groups

# Create cluster placement group (low latency)
aws ec2 create-placement-group \
  --group-name HPC-Cluster \
  --strategy cluster

# Create partition placement group (distributed)
aws ec2 create-placement-group \
  --group-name BigData-Partition \
  --strategy partition \
  --partition-count 7

# Create spread placement group (high availability)
aws ec2 create-placement-group \
  --group-name Critical-Spread \
  --strategy spread

# Launch instance in placement group
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type c5n.18xlarge \
  --placement "GroupName=HPC-Cluster"

| Placement Strategy | Max Instances | Use Case | Characteristics |
|---|---|---|---|
| Cluster | No fixed limit (bounded by available capacity) | HPC, low-latency apps | Single AZ, instances placed close together on the network |
| Partition | 7 partitions per AZ | Distributed systems (Hadoop, Cassandra) | Isolated hardware per partition |
| Spread | 7 running instances per AZ per group | Critical applications | Each instance on separate hardware |
Instance Lifecycle Management

Instance States

| State | Description | Billing | Operations Allowed |
|---|---|---|---|
| pending | Launching, preparing | Not billed | Wait |
| running | Instance is running | Billed | Stop, reboot, hibernate, terminate |
| stopping | Preparing to stop or hibernate | Not billed when stopping; billed while hibernating | Wait |
| stopped | Instance shut down, can restart | Not billed (storage charges apply) | Start, terminate |
| shutting-down | Preparing to terminate | Not billed | Wait |
| terminated | Permanently deleted | Not billed | None (cannot restart) |
| stopped (hibernated) | RAM saved to EBS root volume, quick restart | Billed only while entering hibernation; storage charges apply | Start, terminate |

Instance Operations

# Start instance
aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Stop instance
aws ec2 stop-instances --instance-ids i-0123456789abcdef0

# Reboot instance
aws ec2 reboot-instances --instance-ids i-0123456789abcdef0

# Terminate instance
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0

# Enable termination protection
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --disable-api-termination

# Disable termination protection
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --no-disable-api-termination

# Change instance type (must stop first, and wait until the instance is fully stopped)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --instance-type "{\"Value\": \"t3.large\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0

Hibernation

  • RAM contents saved to EBS root volume
  • Must be enabled at launch
  • Instance resumes with same instance ID and private IP
  • Faster startup than stop/start
  • Requirements:
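    • Supported instance families include C3-C5, M3-M5, R3-R5, and T2-T3; newer generations are also supported, so check the current EC2 documentation
    • RAM must be < 150 GB
    • Root volume must be EBS, encrypted, and large enough to hold the RAM contents
    • Cannot hibernate > 60 days

# Launch instance with hibernation enabled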
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type m5.large \
  --hibernation-options Configured=true \
  --block-device-mappings \
    "DeviceName=/dev/xvda,Ebs={VolumeSize=30,Encrypted=true}"

# Hibernate instance
aws ec2 stop-instances \
  --instance-ids i-0123456789abcdef0 \
  --hibernate

Instance Metadata Service (IMDS)

# IMDSv1 (legacy)
curl http://169.254.169.254/latest/meta-data/

# IMDSv2 (token-based, more secure)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/

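# Common metadata endpoints (reuse $TOKEN from above so the calls also work when IMDSv2 is enforced)
# Instance ID
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id

# Availability Zone
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone

# IAM role credentials
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE-NAME

# User data
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/user-data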

Enforce IMDSv2

aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-tokens required \
  --http-put-response-hop-limit 1

Auto Scaling

Auto Scaling Components

  • Launch Template: Defines instance configuration
  • Auto Scaling Group (ASG): Manages instance fleet
  • Scaling Policies: Define when to scale
  • Load Balancer: Distributes traffic

Create Auto Scaling Group

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name MyASG \
  --launch-template "LaunchTemplateName=MyLaunchTemplate,Version=1" \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --vpc-zone-identifier "subnet-0123,subnet-4567,subnet-89ab" \
  --target-group-arns "arn:aws:elasticloadbalancing:region:account:targetgroup/my-tg/abc123" \
  --health-check-type ELB \
  --health-check-grace-period 300 \
  --tags "Key=Name,Value=WebServer,PropagateAtLaunch=true"

Auto Scaling with Terraform

resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 3
  health_check_type   = "ELB"
  health_check_grace_period = 300
  vpc_zone_identifier = aws_subnet.private[*].id
  target_group_arns   = [aws_lb_target_group.web.arn]

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "WebServer"
    propagate_at_launch = true
  }

  enabled_metrics = [
    "GroupDesiredCapacity",
    "GroupInServiceInstances",
    "GroupMinSize",
    "GroupMaxSize"
  ]
}

Scaling Policies

Target Tracking Scaling

aws autoscaling put-scaling-policy \
  --auto-scaling-group-name MyASG \
  --policy-name target-tracking-cpu \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 70.0
  }'

Step Scaling

# Create the step scaling policy first and capture its ARN
POLICY_ARN=$(aws autoscaling put-scaling-policy \
  --auto-scaling-group-name MyASG \
  --policy-name scale-up-policy \
  --policy-type StepScaling \
  --adjustment-type ChangeInCapacity \
  --step-adjustments \
    "MetricIntervalLowerBound=0,MetricIntervalUpperBound=10,ScalingAdjustment=1" \
    "MetricIntervalLowerBound=10,ScalingAdjustment=2" \
  --query PolicyARN \
  --output text)

# Create CloudWatch alarm that triggers the policy
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu \
  --alarm-description "Scale up when CPU > 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=AutoScalingGroupName,Value=MyASG \
  --alarm-actions "$POLICY_ARN"
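The step adjustments above are defined as offsets from the alarm threshold (80% here), which is easy to misread. The sketch below shows how a metric value maps to a capacity change, assuming the documented rule that the lower bound is inclusive and the upper bound exclusive for breaches above the threshold. The function and the `STEPS` list are illustrative names, not part of any AWS SDK.

```python
from typing import Optional, Sequence, Tuple

# (lower_bound, upper_bound, scaling_adjustment); None means unbounded.
Step = Tuple[Optional[float], Optional[float], int]

# Mirrors the two step adjustments in the policy above.
STEPS: Sequence[Step] = [(0, 10, 1), (10, None, 2)]


def scaling_adjustment(metric_value: float, threshold: float,
                       steps: Sequence[Step] = STEPS) -> int:
    """Return how many instances to add for a given metric value."""
    breach = metric_value - threshold
    if breach <= 0:
        # The alarm uses GreaterThanThreshold, so there is no breach here.
        return 0
    for lower, upper, adjustment in steps:
        above_lower = lower is None or breach >= lower
        below_upper = upper is None or breach < upper
        if above_lower and below_upper:
            return adjustment
    return 0


for cpu in (80, 85, 90, 95):
    print(cpu, scaling_adjustment(cpu, threshold=80))
```

So CPU at 85% adds one instance, while 90% and above adds two, because the second step starts at an offset of exactly 10.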

Scheduled Scaling

aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name MyASG \
  --scheduled-action-name ScaleUpMorning \
  --start-time "2026-01-20T08:00:00Z" \
  --recurrence "0 8 * * MON-FRI" \
  --min-size 5 \
  --max-size 20 \
  --desired-capacity 10

Lifecycle Hooks

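# SNS and SQS targets also need a role that lets Auto Scaling publish to them (the role ARN below is a placeholder)
aws autoscaling put-lifecycle-hook \
  --lifecycle-hook-name instance-launching-hook \
  --auto-scaling-group-name MyASG \
  --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
  --default-result CONTINUE \
  --heartbeat-timeout 300 \
  --notification-target-arn arn:aws:sns:region:account:my-topic \
  --role-arn arn:aws:iam::account:role/my-lifecycle-hook-role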

Load Balancing

Load Balancer Types

| Type | OSI Layer | Protocol | Use Case | Key Features |
|---|---|---|---|---|
| Application Load Balancer (ALB) | Layer 7 | HTTP/HTTPS | Web applications, microservices | Path/host routing, WebSocket, HTTP/2 |
| Network Load Balancer (NLB) | Layer 4 | TCP/UDP/TLS | High-performance, low latency | Static IP, millions of requests per second, preserves source IP |
| Gateway Load Balancer (GWLB) | Layer 3 | IP | Third-party virtual appliances | Traffic inspection, firewall integration |
| Classic Load Balancer (CLB) | Layer 4/7 | TCP/HTTP | Legacy applications | Previous generation; not recommended for new deployments |
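The ALB's path routing can be pictured as a priority-ordered list of patterns with a default action at the end. The sketch below approximates that with Python's `fnmatch`; it is only a rough model, since `fnmatch` also understands `[...]` character classes that ALB path patterns do not. The `route` function and the target names `api` and `web` are illustrative.

```python
from fnmatch import fnmatchcase
from typing import Sequence, Tuple

# (priority, path_pattern, target_group_name)
ListenerRule = Tuple[int, str, str]


def route(path: str, rules: Sequence[ListenerRule], default: str) -> str:
    """Return the target of the lowest-priority-number rule matching the path."""
    for _priority, pattern, target in sorted(rules):
        if fnmatchcase(path, pattern):
            return target
    return default  # the listener's default action


RULES = [(100, "/api/*", "api")]
print(route("/api/users", RULES, default="web"))
print(route("/health", RULES, default="web"))
```

Requests under `/api/` go to the `api` target, and everything else falls through to the default `web` target, which is the behavior a listener rule with a `path_pattern` condition aims for.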

Application Load Balancer with Auto Scaling

# Create target group
aws elbv2 create-target-group \
  --name web-tg \
  --protocol HTTP \
  --port 80 \
  --vpc-id vpc-0123456789abcdef0 \
  --health-check-enabled \
  --health-check-protocol HTTP \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

# Create ALB
aws elbv2 create-load-balancer \
  --name web-alb \
  --subnets subnet-0123 subnet-4567 \
  --security-groups sg-0123456789abcdef0 \
  --scheme internet-facing \
  --type application

# Create listener
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:region:account:loadbalancer/app/web-alb/abc123 \
  --protocol HTTP \
  --port 80 \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/web-tg/xyz789

# Add HTTPS listener with SSL certificate
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:region:account:loadbalancer/app/web-alb/abc123 \
  --protocol HTTPS \
  --port 443 \
  --certificates CertificateArn=arn:aws:acm:region:account:certificate/cert-id \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/web-tg/xyz789

ALB with Terraform

resource "aws_lb" "web" {
  name               = "web-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id

  enable_deletion_protection = true
  enable_http2              = true
}

resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/health"
    protocol            = "HTTP"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.web.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.web.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

# Path-based routing (assumes aws_lb_target_group.api is defined elsewhere)
resource "aws_lb_listener_rule" "api" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }

  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }
}

ELB Health Checks

  • Auto Scaling uses health checks to replace unhealthy instances
  • Health check types:
    • EC2: Instance status checks
    • ELB: Load balancer health checks (recommended)
  • Grace period: Time after launch during which Auto Scaling ignores failed health checks, giving the instance time to boot and warm up

Monitoring and CloudWatch

CloudWatch Metrics for EC2

Basic Monitoring (Free, 5-minute intervals)

  • CPUUtilization
  • DiskReadOps, DiskWriteOps
  • DiskReadBytes, DiskWriteBytes
  • NetworkIn, NetworkOut
  • NetworkPacketsIn, NetworkPacketsOut
  • StatusCheckFailed, StatusCheckFailed_Instance, StatusCheckFailed_System (reported at 1-minute intervals)

Detailed Monitoring (Paid, 1-minute intervals)

# Enable detailed monitoring
aws ec2 monitor-instances --instance-ids i-0123456789abcdef0

# Disable detailed monitoring
aws ec2 unmonitor-instances --instance-ids i-0123456789abcdef0

Custom Metrics

  • Memory utilization (not included by default)
  • Disk space utilization
  • Application-specific metrics

CloudWatch Agent Installation

# Download and install CloudWatch agent
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
sudo rpm -U ./amazon-cloudwatch-agent.rpm

# Configure agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

# Start agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config \
  -m ec2 \
  -s \
  -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json

CloudWatch Agent Configuration (JSON)

{
  "metrics": {
    "namespace": "CustomMetrics/EC2",
    "metrics_collected": {
      "mem": {
        "measurement": [
          {"name": "mem_used_percent", "rename": "MemoryUtilization", "unit": "Percent"}
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          {"name": "used_percent", "rename": "DiskUtilization", "unit": "Percent"}
        ],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/httpd/access_log",
            "log_group_name": "/aws/ec2/httpd",
            "log_stream_name": "{instance_id}/access_log"
          }
        ]
      }
    }
  }
}
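
Alarms must reference the exact namespace and metric name the agent publishes, so it can help to list them from the config before deploying it. The sketch below is a hypothetical helper, not part of the agent tooling. It reads a trimmed copy of the configuration above and only handles `rename`; metrics without a `rename` may be published under a plugin-prefixed name, which this helper does not model.

```python
import json

# Trimmed copy of the agent configuration above
SAMPLE_CONFIG = """
{
  "metrics": {
    "namespace": "CustomMetrics/EC2",
    "metrics_collected": {
      "mem": {"measurement": [{"name": "mem_used_percent", "rename": "MemoryUtilization", "unit": "Percent"}]},
      "disk": {"measurement": [{"name": "used_percent", "rename": "DiskUtilization", "unit": "Percent"}],
               "resources": ["*"]}
    }
  }
}
"""


def published_metric_names(config_text):
    """Return (namespace, metric name) pairs that alarms would need to reference."""
    metrics = json.loads(config_text).get("metrics", {})
    namespace = metrics.get("namespace", "CWAgent")  # CWAgent is the agent's default namespace
    pairs = []
    for settings in metrics.get("metrics_collected", {}).values():
        for entry in settings.get("measurement", []):
            # Entries may be plain strings or objects with an optional "rename"
            name = entry if isinstance(entry, str) else entry.get("rename", entry["name"])
            pairs.append((namespace, name))
    return pairs


for namespace, name in published_metric_names(SAMPLE_CONFIG):
    print(f"{namespace}: {name}")
```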

CloudWatch Alarms

# Create CPU alarm
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu-alarm \
  --alarm-description "Alert when CPU exceeds 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --alarm-actions arn:aws:sns:region:account:my-topic

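# Create disk space alarm (custom metric)
# Dimensions must match the published metric exactly: InstanceId is only present
# if the agent config adds it (e.g. via append_dimensions), and the agent also
# attaches device and fstype dimensions to disk metrics by default
aws cloudwatch put-metric-alarm \
  --alarm-name high-disk-usage \
  --alarm-description "Alert when disk usage > 80%" \
  --metric-name DiskUtilization \
  --namespace CustomMetrics/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=path,Value=/ \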
  --alarm-actions arn:aws:sns:region:account:my-topic

# Create alarm with EC2 action (stop instance)
aws cloudwatch put-metric-alarm \
  --alarm-name stop-instance-on-high-cpu \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 95 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --alarm-actions arn:aws:automate:region:ec2:stop
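
The interaction of `--period`, `--evaluation-periods`, and `--threshold` can be sketched as a sliding window over per-period averages. This is a simplified model of the first alarm above (every datapoint in the window must breach). It ignores the separate "datapoints to alarm" setting and missing-data handling that CloudWatch also supports.

```python
def alarm_states(datapoints, threshold=80.0, evaluation_periods=2):
    """Return the alarm state after each period's datapoint arrives."""
    states = []
    for i in range(len(datapoints)):
        # The most recent `evaluation_periods` datapoints, oldest first
        window = datapoints[max(0, i - evaluation_periods + 1): i + 1]
        if len(window) < evaluation_periods:
            states.append("INSUFFICIENT_DATA")
        elif all(value > threshold for value in window):  # GreaterThanThreshold is strict
            states.append("ALARM")
        else:
            states.append("OK")
    return states


# Hypothetical 5-minute CPU averages
cpu_averages = [70, 85, 90, 75, 95]
print(alarm_states(cpu_averages))
```

With a 300-second period and 2 evaluation periods, CPU has to stay above 80% for about 10 minutes before the alarm fires; a single spike (the final 95 here) is not enough.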

Alarm Actions

  • SNS notification
  • EC2 action: stop, terminate, reboot, recover
  • Auto Scaling action
  • Systems Manager action
  • Lambda function invocation

CloudWatch Logs

# Create log group
aws logs create-log-group --log-group-name /aws/ec2/application

# Set retention policy
aws logs put-retention-policy \
  --log-group-name /aws/ec2/application \
  --retention-in-days 7

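# Create metric filter
# The bare term ERROR matches any event containing it; square brackets would
# instead define a space-delimited field pattern
aws logs put-metric-filter \
  --log-group-name /aws/ec2/application \
  --filter-name ErrorCount \
  --filter-pattern "ERROR" \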
  --metric-transformations \
    metricName=ApplicationErrors,metricNamespace=CustomApp,metricValue=1
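
What the metric filter computes can be sketched locally: each log event that matches the pattern contributes `metricValue` (here 1) to the metric. The sketch approximates term matching as a case-sensitive substring test, which is an assumption rather than the exact CloudWatch Logs matching rules, and the log lines are made up for illustration.

```python
def count_matching_events(log_lines, term="ERROR", metric_value=1):
    """Sum metric_value over every log line that contains the term (case-sensitive)."""
    return sum(metric_value for line in log_lines if term in line)


# Hypothetical application log lines
log_lines = [
    "2024-03-01 12:00:01 INFO  request served in 12ms",
    "2024-03-01 12:00:02 ERROR database connection refused",
    "2024-03-01 12:00:03 error lowercase is not matched",
    "2024-03-01 12:00:04 ERROR upstream timeout",
]
print(count_matching_events(log_lines))
```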

IAM Roles and Instance Profiles

Create IAM Role for EC2

# Create trust policy
cat > ec2-trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create IAM role
aws iam create-role \
  --role-name EC2-S3-Access-Role \
  --assume-role-policy-document file://ec2-trust-policy.json

# Attach policy to role
aws iam attach-role-policy \
  --role-name EC2-S3-Access-Role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

# Create instance profile
aws iam create-instance-profile \
  --instance-profile-name EC2-S3-Access-Profile

# Add role to instance profile
aws iam add-role-to-instance-profile \
  --instance-profile-name EC2-S3-Access-Profile \
  --role-name EC2-S3-Access-Role

# Attach instance profile to running instance
aws ec2 associate-iam-instance-profile \
  --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=EC2-S3-Access-Profile

IAM Role with Terraform

resource "aws_iam_role" "ec2_role" {
  name = "ec2-app-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "s3_access" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2-app-profile"
  role = aws_iam_role.ec2_role.name
}

resource "aws_instance" "app" {
  ami                  = "ami-0c55b159cbfafe1f0" # Example ID; AMI IDs are region-specific
  instance_type        = "t3.medium"
  iam_instance_profile = aws_iam_instance_profile.ec2_profile.name
}

Instance Management Commands

List and Describe Instances

# List all instances
aws ec2 describe-instances

# List instances with specific state
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running"

# List instances with specific tag
aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=Production"

# Get instance details in table format
aws ec2 describe-instances \
  --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PrivateIpAddress,PublicIpAddress,Tags[?Key==`Name`].Value|[0]]' \
  --output table

# Get specific instance details
aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0

# Get instance status
aws ec2 describe-instance-status \
  --instance-ids i-0123456789abcdef0
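
When the JSON output of `describe-instances` is post-processed in a script rather than with `--query`, the same columns can be pulled out as below. The sample response is hypothetical and heavily trimmed; unlike the `--query` expression, which keeps one nested list per reservation, this sketch returns a flat list of rows.

```python
def summarize(response):
    """Flatten a describe-instances response into one tuple per instance."""
    rows = []
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            # First tag whose key is "Name", or None when the instance is unnamed
            name = next((t["Value"] for t in inst.get("Tags", []) if t["Key"] == "Name"), None)
            rows.append((
                inst["InstanceId"],
                inst["InstanceType"],
                inst["State"]["Name"],
                inst.get("PrivateIpAddress"),
                inst.get("PublicIpAddress"),  # absent for instances without a public IP
                name,
            ))
    return rows


# Hypothetical, trimmed response
response = {
    "Reservations": [
        {"Instances": [{
            "InstanceId": "i-0123456789abcdef0",
            "InstanceType": "t3.medium",
            "State": {"Name": "running"},
            "PrivateIpAddress": "10.0.1.10",
            "Tags": [{"Key": "Name", "Value": "WebServer"}],
        }]}
    ]
}
for row in summarize(response):
    print(row)
```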

Tagging Operations

# Create tags
aws ec2 create-tags \
  --resources i-0123456789abcdef0 \
  --tags Key=Name,Value=WebServer Key=Environment,Value=Production

# Delete tags
aws ec2 delete-tags \
  --resources i-0123456789abcdef0 \
  --tags Key=OldTag

Console Access and Troubleshooting

# Get console output
aws ec2 get-console-output \
  --instance-id i-0123456789abcdef0

# Get console screenshot
aws ec2 get-console-screenshot \
  --instance-id i-0123456789abcdef0

# Get password data (Windows)
aws ec2 get-password-data \
  --instance-id i-0123456789abcdef0 \
  --priv-launch-key MyKeyPair.pem

Cost Optimization Best Practices

Right-Sizing

  • Use AWS Compute Optimizer for recommendations
  • Monitor CloudWatch metrics for actual utilization
  • Start with burstable instances (T3/T4g) for variable workloads
  • Use AWS Cost Explorer to identify underutilized instances

Instance Selection

  • Prefer Graviton instances (T4g, M7g, C7g) for up to 40% better price-performance
  • Use AMD instances (T3a, M5a, C5a) for roughly 10% lower cost than the comparable Intel instances
  • Consider Spot instances for fault-tolerant workloads (up to 90% savings)
  • Implement Savings Plans for committed usage (up to 72% savings)
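
To see what these percentages mean for a bill, the arithmetic can be sketched as below. The hourly rate is a made-up placeholder rather than a real AWS price, and the discounts are the "up to" figures from the list, so the results are best-case numbers and not a forecast.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_cost(hourly_rate, discount=0.0, hours=HOURS_PER_MONTH):
    """Monthly cost in USD for one always-on instance at a given discount."""
    return round(hourly_rate * (1 - discount) * hours, 2)


on_demand_rate = 0.10  # hypothetical USD per hour, not an actual AWS price

options = {
    "On-Demand": 0.0,
    "AMD variant": 0.10,
    "Savings Plan (best case)": 0.72,
    "Spot (best case)": 0.90,
}
for label, discount in options.items():
    print(f"{label}: ${monthly_cost(on_demand_rate, discount):.2f}/month")
```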

Storage Optimization

  • Use gp3 instead of gp2 (up to 20% cheaper per GB, with IOPS and throughput provisioned independently of volume size)
  • Delete unused EBS volumes and snapshots
  • Implement lifecycle policies for snapshot retention
  • Use S3 for infrequently accessed data
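
The core of a snapshot retention policy is an age cutoff. A minimal sketch of that selection step, using hypothetical snapshot records shaped like `describe-snapshots` entries; in practice a managed service such as Amazon Data Lifecycle Manager would typically handle this rather than a hand-written script.

```python
from datetime import datetime, timedelta, timezone


def expired_snapshots(snapshots, retention_days, now=None):
    """Return IDs of snapshots whose StartTime is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical snapshot records
snapshots = [
    {"SnapshotId": "snap-aaa", "StartTime": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-bbb", "StartTime": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]
fixed_now = datetime(2024, 3, 15, tzinfo=timezone.utc)
print(expired_snapshots(snapshots, retention_days=30, now=fixed_now))  # ['snap-aaa']
```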

Auto Scaling Configuration

  • Set appropriate min/max/desired capacity
  • Use target tracking for dynamic scaling
  • Implement scheduled scaling for predictable patterns
  • Configure scale-in protection for long-running tasks
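
Target tracking can be reasoned about with a simple proportional model: scale capacity by the ratio of the observed metric to the target, then clamp to the group's min/max. This is only an approximation for intuition, not the algorithm AWS uses, and it ignores cooldowns, warm-up, and the more conservative behavior on scale-in.

```python
import math


def desired_capacity(current_capacity, current_metric, target_metric, min_size, max_size):
    """Estimate the capacity needed to bring the average metric back to the target."""
    proposed = math.ceil(current_capacity * current_metric / target_metric)
    # Never go outside the Auto Scaling group's configured bounds
    return max(min_size, min(max_size, proposed))


# 4 instances at 75% average CPU with a 50% target
print(desired_capacity(4, current_metric=75, target_metric=50, min_size=2, max_size=10))  # 6
```

The clamp is why min/max sizing matters: a spike that would call for 16 instances still stops at `max_size`, and a quiet period never drops below `min_size`.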

Monitoring and Cleanup

  • Tag all resources for cost allocation
  • Set up billing alerts
  • Regularly review and terminate unused instances
  • Use AWS Trusted Advisor for optimization recommendations

Security Best Practices

Network Security

  • Deploy instances in private subnets
  • Use security groups with least privilege
  • Implement Network ACLs for subnet-level filtering
  • Enable VPC Flow Logs for traffic analysis
  • Use AWS PrivateLink for service access

Access Control

  • Use IAM roles instead of access keys
  • Implement least privilege IAM policies
  • Enable MFA for privileged operations
  • Use Systems Manager Session Manager instead of SSH (no key management)
  • Where SSH is still required, rotate keys regularly

Data Protection

  • Enable EBS encryption by default
  • Encrypt snapshots
  • Use encrypted AMIs
  • Implement backup strategies
  • Enable termination protection for critical instances

Instance Hardening

  • Keep OS and applications updated
  • Use AWS Systems Manager Patch Manager
  • Implement host-based firewalls
  • Disable unnecessary services
  • Use IMDSv2 for metadata access
  • Enable CloudWatch Logs for audit trails
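
IMDSv2 differs from IMDSv1 in requiring a session token: a `PUT` request obtains the token, and every metadata `GET` then carries it in a header. A minimal sketch using only the standard library; the function names are illustrative, and `get_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS = "http://169.254.169.254"  # link-local metadata endpoint


def token_request(ttl_seconds=21600):
    """Build the PUT request that asks IMDSv2 for a session token (max TTL 21600 s)."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path, token):
    """Build the GET request for a metadata path, authenticated with the token."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def get_metadata(path, timeout=2):
    """Fetch a metadata value; call this only from inside an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, `get_metadata("instance-id")` would return the instance ID. Requests without a token are rejected once the instance is configured to require IMDSv2.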

Monitoring and Compliance

  • Enable CloudTrail for API logging
  • Use AWS Config for compliance monitoring
  • Implement AWS Security Hub
  • Set up CloudWatch alarms for security events
  • Run regular security assessments and penetration tests

This reference collects the essential working details for AWS EC2 operations in a structured, point-wise format for quick lookup and implementation.
