AWS EC2 Complete Working Reference Guide
Instance Types and Families
Instance Type Nomenclature
- Format: [Family][Generation][Additional Capabilities].[Size]
- Example: c7g.xlarge
  - c = Compute optimized family
  - 7 = 7th generation
  - g = AWS Graviton processor
  - xlarge = Size
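The naming scheme above can be split mechanically. The following is a minimal sketch, assuming a name follows the `[Family][Generation][Additional Capabilities].[Size]` pattern exactly. `parse_instance_type` is a hypothetical helper, not an AWS tool, and unusual names (for example the high-memory `u-` types) will not match.

```shell
# Hypothetical helper: split an EC2 instance type name into its parts.
# Prints one line on success; returns 1 if the name does not fit the pattern.
parse_instance_type() {
  parts=$(printf '%s\n' "$1" | sed -n -E \
    's/^([a-z]+)([0-9]+)([a-z0-9-]*)\.([a-z0-9]+)$/family=\1 generation=\2 capabilities=\3 size=\4/p')
  if [ -z "$parts" ]; then
    return 1
  fi
  printf '%s\n' "$parts"
}

parse_instance_type c7g.xlarge
parse_instance_type m5.large
```

An empty `capabilities=` field means the type has no suffix letters, as with `m5.large`.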
Instance Families Overview
| Family | Category | Processor Options | Use Cases | Key Characteristics |
|---|---|---|---|---|
| T3, T3a, T4g | General Purpose | Intel, AMD, Graviton | Web servers, dev/test, microservices | Burstable CPU, cost-effective |
| M5, M6i, M7i | General Purpose | Intel, AMD, Graviton | Databases, application servers | Balanced CPU/memory/network |
| C5, C6i, C7g | Compute Optimized | Intel, AMD, Graviton | HPC, batch processing, gaming | High CPU-to-memory ratio |
| R5, R6i, R7g, X1, X2 | Memory Optimized | Intel, AMD, Graviton | In-memory databases, big data | High memory-to-CPU ratio |
| I3, I4i, D2, D3 | Storage Optimized | Intel, AMD | Data warehousing, NoSQL, distributed file systems | High IOPS, local NVMe storage |
| P4, P5, G5, Inf2, Trn1 | Accelerated Computing | NVIDIA GPUs, AWS Trainium/Inferentia | ML training/inference, rendering | GPUs and purpose-built ML accelerators (Trainium, Inferentia) |
| Mac | General Purpose | Apple Silicon | iOS/macOS development | Dedicated Mac hardware |
| Hpc7g | HPC Optimized | Graviton | Molecular dynamics, CFD simulations | Optimized for tightly coupled workloads |
Instance Sizes
- nano, micro, small, medium
- large, xlarge, 2xlarge, 4xlarge, 8xlarge, 12xlarge, 16xlarge, 24xlarge, 32xlarge, 48xlarge, 56xlarge, 112xlarge
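- Within a family, each power-of-two step (large, xlarge, 2xlarge, 4xlarge, ...) typically doubles vCPUs and memory; intermediate sizes such as 12xlarge scale proportionally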
- Metal instances provide access to physical server resources
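As a rough illustration of how sizes scale, the sketch below encodes a pattern common to many families (for example M5 and C5): `large` has 2 vCPUs, `xlarge` has 4, and `Nxlarge` has 4 × N. This is an assumption for illustration, not a guarantee; `typical_vcpus` is a hypothetical helper, and the authoritative figures come from `aws ec2 describe-instance-types`.

```shell
# Hypothetical helper: estimate vCPUs from the size suffix alone.
# Assumes the common "large = 2, xlarge = 4, Nxlarge = 4 * N" pattern.
typical_vcpus() {
  case $1 in
    large)  echo 2 ;;
    xlarge) echo 4 ;;
    [0-9]*xlarge) echo $(( ${1%xlarge} * 4 )) ;;
    *) return 1 ;;   # nano..medium and metal do not follow a simple rule
  esac
}

typical_vcpus 2xlarge
typical_vcpus 12xlarge
```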
Processor Variants
- Intel: Standard option (M5, C5, R5)
- AMD: Cost-optimized (M5a, C5a, R5a - typically 10% cheaper)
- AWS Graviton: ARM-based, up to 40% better price-performance (M7g, C7g, R7g)
- g suffix: Graviton processor
- a suffix: AMD processor
- n suffix: Network-optimized (higher network bandwidth)
- d suffix: Instance store volumes included
Pricing Models Comparison
| Model | Commitment | Savings | Flexibility | Best For | Interruption Risk |
|---|---|---|---|---|---|
| On-Demand | None | None (baseline) | Full | Spiky workloads, dev/test | None |
| Reserved Instances | 1 or 3 years | Up to 72% | Instance family/region locked | Predictable, steady-state workloads | None |
| Savings Plans - Compute | 1 or 3 years | Up to 66% | Any instance family, size, or region | Flexible compute usage | None |
| Savings Plans - EC2 Instance | 1 or 3 years | Up to 72% | Instance family and region locked | Predictable EC2 usage in specific family | None |
| Spot Instances | None | Up to 90% | Full | Fault-tolerant, batch jobs | Yes (2-minute warning) |
| Dedicated Hosts | On-demand or 1- or 3-year reservation | Additional RI discounts | Physical server control | BYOL, compliance | None |
| Capacity Reservations | On-demand | None (billed if unused) | AZ-specific capacity | Business-critical apps | None |
Spot Instance Characteristics
- Variable pricing based on supply/demand
- 2-minute interruption notification
- Can be 85% cheaper than On-Demand during low demand periods
- Example: c7i.2xlarge at \$0.054/hour (Spot) vs \$0.357/hour (On-Demand)
- Best for: Stateless applications, CI/CD, data processing, containerized workloads
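The discount in the example above can be checked with a one-line calculation. This is a small sketch using `awk` for the floating-point arithmetic; `spot_savings_percent` is a hypothetical helper name, and the two prices are the c7i.2xlarge figures quoted in the list, which will vary by region and over time.

```shell
# Hypothetical helper: percentage saved by Spot relative to On-Demand.
# $1 = Spot hourly price, $2 = On-Demand hourly price
spot_savings_percent() {
  awk -v spot="$1" -v od="$2" 'BEGIN { printf "%.1f\n", (1 - spot / od) * 100 }'
}

spot_savings_percent 0.054 0.357   # prints 84.9
```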
Savings Plans Priority
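- Applies only to eligible On-Demand usage (EC2, plus Fargate and Lambda for Compute Savings Plans)
- Does not apply to Spot usage; Spot Instances are always billed at the Spot price
- Commitment left unused in a given hour is not carried over and is still charged
- Example: \$100/hour commitment with \$80/hour of eligible usage (at Savings Plans rates) + \$30/hour of Spot = the plan covers the \$80, Spot is billed separately at \$30, and the remaining \$20 of commitment goes unused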
Reserved Instances Types
- Standard RI: Maximum savings, least flexibility
- Convertible RI: Can change instance family, lower discount
- Scheduled RI: Reserved for specific time windows (deprecated)
Instance Launch Methods
Launch via AWS Console
- Navigate to EC2 Dashboard → Launch Instance
- Configure:
- Name and tags
- AMI selection (Amazon Linux, Ubuntu, Windows, etc.)
- Instance type
- Key pair (create or select existing)
- Network settings (VPC, subnet, security groups)
- Storage configuration
- Advanced details (user data, IAM role, metadata options)
- Review and launch
Launch via AWS CLI
# Basic instance launch
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type t3.medium \
--key-name MyKeyPair \
--security-group-ids sg-0123456789abcdef0 \
--subnet-id subnet-0bb1c79de3EXAMPLE \
--count 1 \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=MyInstance}]'
# Launch with user data
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type t3.medium \
--key-name MyKeyPair \
--security-group-ids sg-0123456789abcdef0 \
--subnet-id subnet-0bb1c79de3EXAMPLE \
--user-data file://user-data.sh \
--iam-instance-profile Name=MyInstanceProfile
# Launch Spot Instance
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type t3.medium \
--instance-market-options '{"MarketType":"spot","SpotOptions":{"MaxPrice":"0.05","SpotInstanceType":"one-time"}}' \
--key-name MyKeyPair \
--security-group-ids sg-0123456789abcdef0
User Data Script Example
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
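# Get instance metadata via IMDSv2 (a session token is required when IMDSv1 is disabled)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
AZ=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone)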
echo "<p>Instance ID: $INSTANCE_ID</p>" >> /var/www/html/index.html
echo "<p>Availability Zone: $AZ</p>" >> /var/www/html/index.html
Launch with Terraform
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.medium"
key_name = "MyKeyPair"
vpc_security_group_ids = [aws_security_group.web.id]
subnet_id = aws_subnet.public.id
iam_instance_profile = aws_iam_instance_profile.ec2_profile.name
user_data = <<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
EOF
root_block_device {
volume_type = "gp3"
volume_size = 30
encrypted = true
}
tags = {
Name = "WebServer"
Environment = "Production"
}
monitoring = true
}
Launch with CloudFormation
Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0c55b159cbfafe1f0
      InstanceType: t3.medium
      KeyName: MyKeyPair
      SecurityGroupIds:
        - !Ref WebSecurityGroup
      SubnetId: !Ref PublicSubnet
      IamInstanceProfile: !Ref EC2InstanceProfile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum update -y
          yum install -y httpd
          systemctl start httpd
          systemctl enable httpd
      BlockDeviceMappings:
        - DeviceName: /dev/xvda
          Ebs:
            VolumeType: gp3
            VolumeSize: 30
            Encrypted: true
      Tags:
        - Key: Name
          Value: WebServer
Launch Templates
Create Launch Template via CLI
aws ec2 create-launch-template \
--launch-template-name MyLaunchTemplate \
--version-description "Version 1" \
--launch-template-data '{
"ImageId": "ami-0c55b159cbfafe1f0",
"InstanceType": "t3.medium",
"KeyName": "MyKeyPair",
"SecurityGroupIds": ["sg-0123456789abcdef0"],
"IamInstanceProfile": {
"Name": "MyInstanceProfile"
},
"BlockDeviceMappings": [{
"DeviceName": "/dev/xvda",
"Ebs": {
"VolumeSize": 30,
"VolumeType": "gp3",
"DeleteOnTermination": true,
"Encrypted": true
}
}],
"Monitoring": {
"Enabled": true
},
"UserData": "IyEvYmluL2Jhc2gKCnl1bSB1cGRhdGUgLXkKeXVtIGluc3RhbGwgLXkgaHR0cGQ="
}'
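The `UserData` field in launch template data must already be base64-encoded, unlike `run-instances --user-data file://...`, where the CLI encodes the file for you. The sketch below shows one way to produce such a string; `encode_user_data` is a hypothetical helper, and it assumes a `base64` utility is installed (standard on Linux and macOS).

```shell
# Hypothetical helper: read a user data script on stdin and print it as
# single-line base64, suitable for the "UserData" field of launch template data.
encode_user_data() {
  # base64 wraps long output on some platforms; strip the newlines
  base64 | tr -d '\n'
}

printf '#!/bin/bash\nyum update -y\nyum install -y httpd\n' | encode_user_data
echo
```

To inspect an existing value, such as the one in the template above, pipe it through `base64 --decode`.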
Launch Template with Systems Manager Parameter
# Create SSM parameter for AMI ID
aws ssm put-parameter \
--name "/golden-ami/latest" \
--value "ami-0c55b159cbfafe1f0" \
--type "String" \
--data-type "aws:ec2:image"
# Create launch template referencing SSM parameter
aws ec2 create-launch-template \
--launch-template-name MyTemplate \
--launch-template-data '{
"ImageId": "resolve:ssm:/golden-ami/latest",
"InstanceType": "t3.medium"
}'
Launch Template with Terraform
resource "aws_launch_template" "app" {
name_prefix = "app-"
image_id = "ami-0c55b159cbfafe1f0"
instance_type = "t3.medium"
key_name = "MyKeyPair"
# Security groups are set in network_interfaces below; vpc_security_group_ids conflicts with that block
iam_instance_profile {
name = aws_iam_instance_profile.app.name
}
block_device_mappings {
device_name = "/dev/xvda"
ebs {
volume_size = 30
volume_type = "gp3"
iops = 3000
throughput = 125
delete_on_termination = true
encrypted = true
}
}
network_interfaces {
associate_public_ip_address = true
delete_on_termination = true
security_groups = [aws_security_group.app.id]
}
monitoring {
enabled = true
}
user_data = base64encode(<<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
EOF
)
tag_specifications {
resource_type = "instance"
tags = {
Name = "AppServer"
}
}
}
Launch Instance from Template
aws ec2 run-instances \
--launch-template LaunchTemplateName=MyLaunchTemplate,Version=1 \
--count 2 \
--subnet-id subnet-0bb1c79de3EXAMPLE
Update Launch Template (Create New Version)
aws ec2 create-launch-template-version \
--launch-template-id lt-0abcd290751193123 \
--source-version 1 \
--launch-template-data '{"InstanceType":"t3.large"}'
Storage Options
Storage Type Comparison
| Type | Persistence | Performance | Use Case | Backup Method |
|---|---|---|---|---|
| EBS (gp3) | Yes (network-attached) | 3,000-16,000 IOPS | General purpose, boot volumes | EBS Snapshots |
| EBS (gp2) | Yes (network-attached) | Up to 16,000 IOPS | Legacy general purpose | EBS Snapshots |
| EBS (io2) | Yes (network-attached) | Up to 256,000 IOPS (io2 Block Express) | High-performance databases | EBS Snapshots |
| EBS (st1) | Yes (network-attached) | Throughput-optimized | Big data, data warehouses | EBS Snapshots |
| EBS (sc1) | Yes (network-attached) | Cold HDD, lowest cost | Infrequent access | EBS Snapshots |
| Instance Store | No (ephemeral) | Very high IOPS | Temporary data, caches | Must use application-level backup |
EBS Volume Types Detailed
gp3 (General Purpose SSD)
- 3,000 IOPS baseline (configurable up to 16,000)
- 125 MB/s throughput baseline (configurable up to 1,000 MB/s)
- Price: \$0.08/GB-month
- Independent IOPS and throughput configuration
- Recommended for most workloads
gp2 (General Purpose SSD - Legacy)
- IOPS scales with volume size (3 IOPS per GB, minimum 100, maximum 16,000)
- Burstable up to 3,000 IOPS for volumes < 1 TB
- Throughput: up to 250 MB/s
- Use gp3 for new deployments (better value)
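The gp2 sizing rule can be written as a short calculation. This sketch assumes the 3 IOPS per GB rule stated above, the 16,000 IOPS ceiling from the comparison table, and a floor of 100 IOPS for small volumes; `gp2_baseline_iops` is a hypothetical helper name.

```shell
# Hypothetical helper: baseline IOPS of a gp2 volume.
# $1 = volume size in GiB (integer)
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi       # floor for small volumes
  if [ "$iops" -gt 16000 ]; then iops=16000; fi   # per-volume ceiling
  echo "$iops"
}

gp2_baseline_iops 100    # prints 300
gp2_baseline_iops 6000   # prints 16000
```

A 100 GB gp2 volume therefore has a baseline of only 300 IOPS and relies on bursting, whereas gp3 starts at 3,000 IOPS regardless of size.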
io2 Block Express (Provisioned IOPS SSD)
- Up to 256,000 IOPS per volume
- 99.999% durability
- Up to 4,000 MB/s throughput
- Sub-millisecond latency
- Use for critical databases
EBS Volume Operations
# Create EBS volume
aws ec2 create-volume \
--availability-zone us-east-1a \
--size 100 \
--volume-type gp3 \
--iops 3000 \
--throughput 125 \
--encrypted \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=MyVolume}]'
# Attach volume to instance
aws ec2 attach-volume \
--volume-id vol-0123456789abcdef0 \
--instance-id i-0123456789abcdef0 \
--device /dev/sdf
# Modify volume (increase size and IOPS)
aws ec2 modify-volume \
--volume-id vol-0123456789abcdef0 \
--size 200 \
--iops 5000
# Create snapshot
aws ec2 create-snapshot \
--volume-id vol-0123456789abcdef0 \
--description "Backup of MyVolume"
# Create volume from snapshot
aws ec2 create-volume \
--snapshot-id snap-0123456789abcdef0 \
--availability-zone us-east-1a \
--volume-type gp3
# Detach volume
aws ec2 detach-volume \
--volume-id vol-0123456789abcdef0
# Delete volume
aws ec2 delete-volume \
--volume-id vol-0123456789abcdef0
EBS Snapshot Management
# Create multi-volume snapshot for entire instance
aws ec2 create-snapshots \
--instance-specification InstanceId=i-0123456789abcdef0 \
--description "Full instance backup"
# Copy snapshot to another region (send the request to the destination region)
aws ec2 copy-snapshot \
--region us-west-2 \
--source-region us-east-1 \
--source-snapshot-id snap-0123456789abcdef0 \
--description "DR copy"
# Create AMI from instance (includes all attached EBS volumes)
aws ec2 create-image \
--instance-id i-0123456789abcdef0 \
--name "MyGoldenImage" \
--description "Production baseline" \
--no-reboot
# List snapshots
aws ec2 describe-snapshots \
--owner-ids self \
--filters "Name=status,Values=completed"
# Delete snapshot
aws ec2 delete-snapshot \
--snapshot-id snap-0123456789abcdef0
Instance Store Characteristics
- Physically attached to host server
- Data persists across reboots but is lost on instance stop, hibernate, terminate, or underlying hardware failure
- Included in instance price (no additional cost)
- Very high IOPS (millions)
- Available on specific instance types (c5d, m5d, r5d, i3, i4i)
AMI Management
Create Custom AMI
# Create AMI from running instance (with reboot)
aws ec2 create-image \
--instance-id i-0123456789abcdef0 \
--name "MyCustomAMI-$(date +%Y%m%d)" \
--description "Custom application image"
# Create AMI without rebooting
aws ec2 create-image \
--instance-id i-0123456789abcdef0 \
--name "MyCustomAMI" \
--no-reboot
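# Register AMI from snapshot (virtualization type defaults to paravirtual unless set)
aws ec2 register-image \
--name "MyAMI" \
--architecture x86_64 \
--virtualization-type hvm \
--ena-support \
--root-device-name /dev/xvda \
--block-device-mappings \
"DeviceName=/dev/xvda,Ebs={SnapshotId=snap-0123456789abcdef0,VolumeType=gp3}"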
AMI Operations
# List AMIs owned by you
aws ec2 describe-images \
--owners self \
--filters "Name=state,Values=available"
# Copy AMI to another region
aws ec2 copy-image \
--source-region us-east-1 \
--source-image-id ami-0123456789abcdef0 \
--region us-west-2 \
--name "MyAMI-Copy"
# Share AMI with another account
aws ec2 modify-image-attribute \
--image-id ami-0123456789abcdef0 \
--launch-permission "Add=[{UserId=123456789012}]"
# Make AMI public
aws ec2 modify-image-attribute \
--image-id ami-0123456789abcdef0 \
--launch-permission "Add=[{Group=all}]"
# Deregister AMI
aws ec2 deregister-image \
--image-id ami-0123456789abcdef0
AMI User Data
- User data NOT stored in AMI
- Must specify user data each time launching from AMI
- User data embedded in launch templates persists across launches
- AMI captures: OS, applications, configurations, attached EBS volume snapshots
Networking Configuration
Security Groups vs Network ACLs
| Feature | Security Groups | Network ACLs |
|---|---|---|
| Scope | Instance-level | Subnet-level |
| State | Stateful (return traffic auto-allowed) | Stateless (must explicitly allow return) |
| Rules | Allow rules only | Allow and Deny rules |
| Rule Processing | All rules evaluated | Rules evaluated in order |
| Default Behavior | New groups deny all inbound, allow all outbound | Default NACL allows all; custom NACLs deny all until rules are added |
| Assignment | Must be explicitly assigned | Automatically applied to subnet |
| Rule Limit | 60 inbound + 60 outbound per group | 20 inbound + 20 outbound per NACL |
Security Group Configuration
# Create security group
aws ec2 create-security-group \
--group-name WebServerSG \
--description "Security group for web servers" \
--vpc-id vpc-0123456789abcdef0
# Add inbound rules
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 22 \
--cidr 203.0.113.0/24
# Allow traffic from another security group
aws ec2 authorize-security-group-ingress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 3306 \
--source-group sg-9876543210abcdef0
# Remove rule (parameters must match the existing rule exactly)
aws ec2 revoke-security-group-ingress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 22 \
--cidr 203.0.113.0/24
# Add outbound rule
aws ec2 authorize-security-group-egress \
--group-id sg-0123456789abcdef0 \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0
Security Group with Terraform
resource "aws_security_group" "web" {
name = "web-server-sg"
description = "Security group for web servers"
vpc_id = aws_vpc.main.id
ingress {
description = "HTTP from anywhere"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTPS from anywhere"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "SSH from bastion"
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.bastion.id]
}
egress {
description = "All outbound traffic"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "WebServerSG"
}
}
Network ACL Configuration
# Create Network ACL
aws ec2 create-network-acl \
--vpc-id vpc-0123456789abcdef0
# Add inbound rule (allow HTTP)
aws ec2 create-network-acl-entry \
--network-acl-id acl-0123456789abcdef0 \
--ingress \
--rule-number 100 \
--protocol tcp \
--port-range From=80,To=80 \
--cidr-block 0.0.0.0/0 \
--rule-action allow
# Add deny rule (lower rule number, so it is evaluated before rule 100)
aws ec2 create-network-acl-entry \
--network-acl-id acl-0123456789abcdef0 \
--ingress \
--rule-number 99 \
--protocol icmp \
--icmp-type-code Code=-1,Type=-1 \
--cidr-block 0.0.0.0/0 \
--rule-action deny
# Add outbound rule for ephemeral ports
aws ec2 create-network-acl-entry \
--network-acl-id acl-0123456789abcdef0 \
--egress \
--rule-number 100 \
--protocol tcp \
--port-range From=1024,To=65535 \
--cidr-block 0.0.0.0/0 \
--rule-action allow
# Associate NACL with subnet
aws ec2 replace-network-acl-association \
--association-id aclassoc-0123456789abcdef0 \
--network-acl-id acl-0123456789abcdef0
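Because NACL rules are evaluated in ascending rule-number order and the first match wins, rule 99 above takes effect before rule 100. The sketch below simulates that ordering locally. It is a simplified model: `evaluate_nacl` and the five-column rule format are hypothetical, CIDR matching is omitted, and ICMP is given a dummy port range because it has no ports.

```shell
# Hypothetical simulator. Rules arrive on stdin, one per line:
#   RULE_NUMBER ACTION PROTOCOL FROM_PORT TO_PORT
# $1 = protocol of the packet, $2 = destination port
evaluate_nacl() {
  want_proto=$1
  want_port=$2
  # Sort by rule number, then stop at the first rule that matches
  verdict=$(sort -n | while read -r num action proto from to; do
    if [ "$proto" = "$want_proto" ] || [ "$proto" = all ]; then
      if [ "$want_port" -ge "$from" ] && [ "$want_port" -le "$to" ]; then
        echo "$action (rule $num)"
        break
      fi
    fi
  done)
  # Every NACL ends with an implicit "*" rule that denies unmatched traffic
  echo "${verdict:-deny (rule *)}"
}

rules='100 allow tcp 80 80
99 deny icmp 0 65535'

printf '%s\n' "$rules" | evaluate_nacl icmp 0   # deny (rule 99)
printf '%s\n' "$rules" | evaluate_nacl tcp 80   # allow (rule 100)
printf '%s\n' "$rules" | evaluate_nacl tcp 22   # deny (rule *)
```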
Elastic Network Interface (ENI)
# Create ENI with static private IP
aws ec2 create-network-interface \
--subnet-id subnet-0123456789abcdef0 \
--description "Primary network interface" \
--groups sg-0123456789abcdef0 \
--private-ip-address 10.0.1.10
# Attach ENI to instance
aws ec2 attach-network-interface \
--network-interface-id eni-0123456789abcdef0 \
--instance-id i-0123456789abcdef0 \
--device-index 1
# Assign secondary private IP
aws ec2 assign-private-ip-addresses \
--network-interface-id eni-0123456789abcdef0 \
--private-ip-addresses 10.0.1.11 10.0.1.12
# Detach ENI
aws ec2 detach-network-interface \
--attachment-id eni-attach-0123456789abcdef0
# Delete ENI
aws ec2 delete-network-interface \
--network-interface-id eni-0123456789abcdef0
Elastic IP (EIP)
# Allocate Elastic IP
aws ec2 allocate-address --domain vpc
# Associate EIP with instance
aws ec2 associate-address \
--instance-id i-0123456789abcdef0 \
--allocation-id eipalloc-0123456789abcdef0
# Associate EIP with ENI
aws ec2 associate-address \
--network-interface-id eni-0123456789abcdef0 \
--allocation-id eipalloc-0123456789abcdef0
# Disassociate EIP
aws ec2 disassociate-address \
--association-id eipassoc-0123456789abcdef0
# Release EIP
aws ec2 release-address \
--allocation-id eipalloc-0123456789abcdef0
Enhanced Networking
- SR-IOV (Single Root I/O Virtualization): Higher PPS, lower latency, lower jitter
- ENA (Elastic Network Adapter): Up to 100 Gbps, required for current generation instances
- Intel 82599 VF: Up to 10 Gbps, legacy instances
- Placement Groups: Cluster, Partition, Spread
Placement Groups
# Create cluster placement group (low latency)
aws ec2 create-placement-group \
--group-name HPC-Cluster \
--strategy cluster
# Create partition placement group (distributed)
aws ec2 create-placement-group \
--group-name BigData-Partition \
--strategy partition \
--partition-count 7
# Create spread placement group (high availability)
aws ec2 create-placement-group \
--group-name Critical-Spread \
--strategy spread
# Launch instance in placement group
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type c5n.18xlarge \
--placement "GroupName=HPC-Cluster"
| Placement Strategy | Max Instances | Use Case | Characteristics |
|---|---|---|---|
| Cluster | No fixed limit (subject to available capacity) | HPC, low-latency apps | Single AZ, instances packed close together for low-latency, high-throughput networking |
| Partition | 7 partitions per AZ | Distributed systems (Hadoop, Cassandra) | Isolated hardware per partition |
| Spread | 7 instances per AZ | Critical applications | Each instance on separate hardware |
Instance Lifecycle Management
Instance States
| State | Description | Billing | Operations Allowed |
|---|---|---|---|
| pending | Launching, preparing | Not billed | Wait |
| running | Instance is running | Billed | Stop, reboot, hibernate, terminate |
| stopping | Preparing to stop or hibernate | Not billed when stopping; billed while preparing to hibernate | Wait |
| stopped | Instance shutdown, can restart | Not billed (storage charges apply) | Start, terminate |
| shutting-down | Preparing to terminate | Not billed | Wait |
| terminated | Permanently deleted | Not billed | None (cannot restart) |
| stopped (hibernated) | RAM saved to EBS root volume, quick resume | Not billed for instance usage; EBS storage, including the saved RAM, is billed | Start, terminate |
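The table above can be used as a guard in automation scripts. The following sketch maps a state name to the operations the table lists for it; `allowed_operations` is a hypothetical helper that only encodes the table and does not call AWS.

```shell
# Hypothetical helper: operations permitted in a given instance state,
# taken directly from the states table.
allowed_operations() {
  case $1 in
    running)                        echo "stop reboot hibernate terminate" ;;
    stopped)                        echo "start terminate" ;;
    pending|stopping|shutting-down) echo "wait" ;;
    terminated)                     echo "none" ;;
    *) return 1 ;;   # not a known EC2 instance state
  esac
}

allowed_operations running
allowed_operations stopping
```

The current state can be read with `aws ec2 describe-instances --instance-ids i-0123456789abcdef0 --query 'Reservations[0].Instances[0].State.Name' --output text` and passed to the helper.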
Instance Operations
# Start instance
aws ec2 start-instances --instance-ids i-0123456789abcdef0
# Stop instance
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
# Reboot instance
aws ec2 reboot-instances --instance-ids i-0123456789abcdef0
# Terminate instance
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
# Enable termination protection
aws ec2 modify-instance-attribute \
--instance-id i-0123456789abcdef0 \
--disable-api-termination
# Disable termination protection
aws ec2 modify-instance-attribute \
--instance-id i-0123456789abcdef0 \
--no-disable-api-termination
# Change instance type (instance must be fully stopped first)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute \
--instance-id i-0123456789abcdef0 \
--instance-type "{\"Value\": \"t3.large\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0
Hibernation
- RAM contents saved to EBS root volume
- Must be enabled at launch
- Instance resumes with same instance ID and private IP
- Faster startup than stop/start
- Requirements:
- Supported instance families: C3-C5, M3-M5, R3-R5, T2-T3, plus many newer generations (check the current EC2 hibernation prerequisites)
- RAM must be < 150 GB
- Root volume must be EBS, encrypted
- Cannot hibernate > 60 days
# Launch instance with hibernation enabled
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type m5.large \
--hibernation-options Configured=true \
--block-device-mappings \
"DeviceName=/dev/xvda,Ebs={VolumeSize=30,Encrypted=true}"
# Hibernate instance
aws ec2 stop-instances \
--instance-ids i-0123456789abcdef0 \
--hibernate
Instance Metadata Service (IMDS)
# IMDSv1 (legacy)
curl http://169.254.169.254/latest/meta-data/
# IMDSv2 (token-based, more secure)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
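# Common metadata endpoints (reuse $TOKEN from the IMDSv2 request above)
# Instance ID
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id
# Availability Zone
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone
# IAM role credentials
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE-NAME
# User data
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/user-data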
Enforce IMDSv2
aws ec2 modify-instance-metadata-options \
--instance-id i-0123456789abcdef0 \
--http-tokens required \
--http-put-response-hop-limit 1
Auto Scaling
Auto Scaling Components
- Launch Template: Defines instance configuration
- Auto Scaling Group (ASG): Manages instance fleet
- Scaling Policies: Define when to scale
- Load Balancer: Distributes traffic
Create Auto Scaling Group
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name MyASG \
--launch-template "LaunchTemplateName=MyLaunchTemplate,Version=1" \
--min-size 2 \
--max-size 10 \
--desired-capacity 3 \
--vpc-zone-identifier "subnet-0123,subnet-4567,subnet-89ab" \
--target-group-arns "arn:aws:elasticloadbalancing:region:account:targetgroup/my-tg/abc123" \
--health-check-type ELB \
--health-check-grace-period 300 \
--tags "Key=Name,Value=WebServer,PropagateAtLaunch=true"
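The `--min-size` and `--max-size` values above bound everything the group does: scaling policies and scheduled actions never move the desired capacity outside that range. The sketch below is a simplified model of that clamping, not the service's actual implementation; `clamp_capacity` is a hypothetical helper.

```shell
# Hypothetical helper: clamp a requested capacity into the group's bounds.
# $1 = requested capacity, $2 = min size, $3 = max size
clamp_capacity() {
  if [ "$1" -lt "$2" ]; then echo "$2"
  elif [ "$1" -gt "$3" ]; then echo "$3"
  else echo "$1"
  fi
}

clamp_capacity 3 2 10    # prints 3
clamp_capacity 15 2 10   # prints 10
clamp_capacity 1 2 10    # prints 2
```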
Auto Scaling with Terraform
resource "aws_autoscaling_group" "web" {
name = "web-asg"
min_size = 2
max_size = 10
desired_capacity = 3
health_check_type = "ELB"
health_check_grace_period = 300
vpc_zone_identifier = aws_subnet.private[*].id
target_group_arns = [aws_lb_target_group.web.arn]
launch_template {
id = aws_launch_template.web.id
version = "$Latest"
}
tag {
key = "Name"
value = "WebServer"
propagate_at_launch = true
}
enabled_metrics = [
"GroupDesiredCapacity",
"GroupInServiceInstances",
"GroupMinSize",
"GroupMaxSize"
]
}
Scaling Policies
Target Tracking Scaling
aws autoscaling put-scaling-policy \
--auto-scaling-group-name MyASG \
--policy-name target-tracking-cpu \
--policy-type TargetTrackingScaling \
--target-tracking-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ASGAverageCPUUtilization"
},
"TargetValue": 70.0
}'
Step Scaling
# Create scaling policy first and capture its ARN
POLICY_ARN=$(aws autoscaling put-scaling-policy \
--auto-scaling-group-name MyASG \
--policy-name scale-up-policy \
--policy-type StepScaling \
--adjustment-type ChangeInCapacity \
--step-adjustments \
"MetricIntervalLowerBound=0,MetricIntervalUpperBound=10,ScalingAdjustment=1" \
"MetricIntervalLowerBound=10,ScalingAdjustment=2" \
--query PolicyARN \
--output text)
# Create CloudWatch alarm that triggers the policy
aws cloudwatch put-metric-alarm \
--alarm-name high-cpu \
--alarm-description "Scale up when CPU > 80%" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--evaluation-periods 2 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=AutoScalingGroupName,Value=MyASG \
--alarm-actions "$POLICY_ARN"
Scheduled Scaling
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name MyASG \
--scheduled-action-name ScaleUpMorning \
--start-time "2026-01-20T08:00:00Z" \
--recurrence "0 8 * * MON-FRI" \
--min-size 5 \
--max-size 20 \
--desired-capacity 10
Lifecycle Hooks
aws autoscaling put-lifecycle-hook \
--lifecycle-hook-name instance-launching-hook \
--auto-scaling-group-name MyASG \
--lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
--default-result CONTINUE \
--heartbeat-timeout 300 \
--notification-target-arn arn:aws:sns:region:account:my-topic \
--role-arn arn:aws:iam::account:role/my-lifecycle-hook-role
Load Balancing
Load Balancer Types
| Type | OSI Layer | Protocol | Use Case | Key Features |
|---|---|---|---|---|
| Application Load Balancer (ALB) | Layer 7 | HTTP/HTTPS | Web applications, microservices | Path/host routing, WebSocket, HTTP/2 |
| Network Load Balancer (NLB) | Layer 4 | TCP/UDP/TLS | High-performance, low latency | Static IP, millions RPS, preserve source IP |
| Gateway Load Balancer (GWLB) | Layer 3 | IP | Third-party virtual appliances | Traffic inspection, firewall integration |
| Classic Load Balancer (CLB) | Layer 4/7 | TCP/HTTP | Legacy applications | Previous generation; not recommended for new deployments |
Application Load Balancer with Auto Scaling
# Create target group
aws elbv2 create-target-group \
--name web-tg \
--protocol HTTP \
--port 80 \
--vpc-id vpc-0123456789abcdef0 \
--health-check-enabled \
--health-check-protocol HTTP \
--health-check-path /health \
--health-check-interval-seconds 30 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3
# Create ALB
aws elbv2 create-load-balancer \
--name web-alb \
--subnets subnet-0123 subnet-4567 \
--security-groups sg-0123456789abcdef0 \
--scheme internet-facing \
--type application
# Create listener
aws elbv2 create-listener \
--load-balancer-arn arn:aws:elasticloadbalancing:region:account:loadbalancer/app/web-alb/abc123 \
--protocol HTTP \
--port 80 \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/web-tg/xyz789
# Add HTTPS listener with SSL certificate
aws elbv2 create-listener \
--load-balancer-arn arn:aws:elasticloadbalancing:region:account:loadbalancer/app/web-alb/abc123 \
--protocol HTTPS \
--port 443 \
--certificates CertificateArn=arn:aws:acm:region:account:certificate/cert-id \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/web-tg/xyz789
ALB with Terraform
resource "aws_lb" "web" {
name = "web-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = aws_subnet.public[*].id
enable_deletion_protection = true
enable_http2 = true
}
resource "aws_lb_target_group" "web" {
name = "web-tg"
port = 80
protocol = "HTTP"
vpc_id = aws_vpc.main.id
health_check {
path = "/health"
protocol = "HTTP"
interval = 30
timeout = 5
healthy_threshold = 2
unhealthy_threshold = 3
}
}
resource "aws_lb_listener" "http" {
load_balancer_arn = aws_lb.web.arn
port = "80"
protocol = "HTTP"
default_action {
type = "redirect"
redirect {
port = "443"
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
}
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.web.arn
port = "443"
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
certificate_arn = aws_acm_certificate.web.arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.web.arn
}
}
# Path-based routing
resource "aws_lb_listener_rule" "api" {
listener_arn = aws_lb_listener.https.arn
priority = 100
action {
type = "forward"
target_group_arn = aws_lb_target_group.api.arn
}
condition {
path_pattern {
values = ["/api/*"]
}
}
}
ELB Health Checks
- Auto Scaling uses health checks to replace unhealthy instances
- Health check types:
- EC2: Instance status checks
- ELB: Load balancer health checks (recommended)
- Grace period: Time after launch during which Auto Scaling ignores failed health checks, giving the instance time to start up
Monitoring and CloudWatch
CloudWatch Metrics for EC2
Basic Monitoring (Free, 5-minute intervals)
- CPUUtilization
- DiskReadOps, DiskWriteOps (instance store volumes only)
- DiskReadBytes, DiskWriteBytes (instance store volumes only)
- NetworkIn, NetworkOut
- NetworkPacketsIn, NetworkPacketsOut
- StatusCheckFailed, StatusCheckFailed_Instance, StatusCheckFailed_System
Detailed Monitoring (Paid, 1-minute intervals)
# Enable detailed monitoring
aws ec2 monitor-instances --instance-ids i-0123456789abcdef0
# Disable detailed monitoring
aws ec2 unmonitor-instances --instance-ids i-0123456789abcdef0
Custom Metrics
- Memory utilization (not included by default)
- Disk space utilization
- Application-specific metrics
CloudWatch Agent Installation
# Download and install CloudWatch agent
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
sudo rpm -U ./amazon-cloudwatch-agent.rpm
# Configure agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
# Start agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-s \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
CloudWatch Agent Configuration (JSON)
{
"metrics": {
"namespace": "CustomMetrics/EC2",
"append_dimensions": {
"InstanceId": "${aws:InstanceId}"
},
"metrics_collected": {
"mem": {
"measurement": [
{"name": "mem_used_percent", "rename": "MemoryUtilization", "unit": "Percent"}
],
"metrics_collection_interval": 60
},
"disk": {
"measurement": [
{"name": "used_percent", "rename": "DiskUtilization", "unit": "Percent"}
],
"metrics_collection_interval": 60,
"resources": ["*"]
}
}
},
"logs": {
"logs_collected": {
"files": {
"collect_list": [
{
"file_path": "/var/log/httpd/access_log",
"log_group_name": "/aws/ec2/httpd",
"log_stream_name": "{instance_id}/access_log"
}
]
}
}
}
}
CloudWatch Alarms
# Create CPU alarm
aws cloudwatch put-metric-alarm \
--alarm-name high-cpu-alarm \
--alarm-description "Alert when CPU exceeds 80%" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--evaluation-periods 2 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
--alarm-actions arn:aws:sns:region:account:my-topic
# Create disk space alarm (custom metric)
# Dimensions must match the published metric exactly; list every dimension the agent attaches (for example device and fstype)
aws cloudwatch put-metric-alarm \
--alarm-name high-disk-usage \
--alarm-description "Alert when disk usage > 80%" \
--metric-name DiskUtilization \
--namespace CustomMetrics/EC2 \
--statistic Average \
--period 300 \
--evaluation-periods 1 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=path,Value=/ \
--alarm-actions arn:aws:sns:region:account:my-topic
# Create alarm with EC2 action (stop instance)
aws cloudwatch put-metric-alarm \
--alarm-name stop-instance-on-high-cpu \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--evaluation-periods 3 \
--threshold 95 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
--alarm-actions arn:aws:automate:region:ec2:stop
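The `--period` and `--evaluation-periods` flags together decide how long a breach must last before an alarm fires. As a small sketch, assuming `--datapoints-to-alarm` is not set (so every evaluated period must breach), the minimum sustained breach is their product; `alarm_window_seconds` is a hypothetical helper name.

```shell
# Hypothetical helper: seconds of sustained breach needed before the alarm fires.
# $1 = --period (seconds), $2 = --evaluation-periods
alarm_window_seconds() {
  echo $(( $1 * $2 ))
}

alarm_window_seconds 300 2   # high-cpu-alarm: prints 600
alarm_window_seconds 300 3   # stop-instance-on-high-cpu: prints 900
```

The stop action above therefore triggers only after roughly 15 minutes of CPU above 95%, which guards against stopping an instance on a short spike.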
Alarm Actions
- SNS notification
- EC2 action: stop, terminate, reboot, recover
- Auto Scaling action
- Systems Manager action
- Lambda function invocation
CloudWatch Logs
# Create log group
aws logs create-log-group --log-group-name /aws/ec2/application
# Set retention policy
aws logs put-retention-policy \
--log-group-name /aws/ec2/application \
--retention-in-days 7
# Create metric filter
# Square brackets are filter-pattern syntax, so the literal term [ERROR] is wrapped in double quotes
aws logs put-metric-filter \
--log-group-name /aws/ec2/application \
--filter-name ErrorCount \
--filter-pattern '"[ERROR]"' \
--metric-transformations \
metricName=ApplicationErrors,metricNamespace=CustomApp,metricValue=1
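`put-retention-policy` accepts only a fixed set of retention periods, not an arbitrary number of days. The sketch below checks a value before it is passed to `--retention-in-days`. The accepted list is an assumption taken from the CloudWatch Logs API documentation at the time of writing, and `is_valid_retention_days` is a hypothetical helper.

```shell
# Hypothetical helper: check a value before passing it to --retention-in-days.
# The accepted values are an assumption based on the CloudWatch Logs API docs;
# verify the list before relying on it.
is_valid_retention_days() {
  case "$1" in
    1|3|5|7|14|30|60|90|120|150|180|365|400|545|731|1096|1827|2192|2557|2922|3288|3653)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

for days in 7 10 30; do
  if is_valid_retention_days "$days"; then
    echo "$days: accepted"
  else
    echo "$days: rejected"
  fi
done
```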
IAM Roles and Instance Profiles
Create IAM Role for EC2
# Create trust policy
cat > ec2-trust-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
# Create IAM role
aws iam create-role \
--role-name EC2-S3-Access-Role \
--assume-role-policy-document file://ec2-trust-policy.json
# Attach policy to role
aws iam attach-role-policy \
--role-name EC2-S3-Access-Role \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
# Create instance profile
aws iam create-instance-profile \
--instance-profile-name EC2-S3-Access-Profile
# Add role to instance profile
aws iam add-role-to-instance-profile \
--instance-profile-name EC2-S3-Access-Profile \
--role-name EC2-S3-Access-Role
# Attach instance profile to running instance
aws ec2 associate-iam-instance-profile \
--instance-id i-0123456789abcdef0 \
--iam-instance-profile Name=EC2-S3-Access-Profile
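The role, policy attachment, and instance profile steps above always run in the same order, so they can be wrapped in one function. This is a sketch only: `run` and `setup_ec2_role` are hypothetical names, and the function prints the commands by default (dry run) rather than executing them. Set `DRY_RUN=0` to run them for real, with `ec2-trust-policy.json` present in the current directory.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the steps above.
# Args: role name, instance profile name, managed policy ARN.
setup_ec2_role() {
  role_name="$1"
  profile_name="$2"
  policy_arn="$3"
  run aws iam create-role --role-name "$role_name" \
    --assume-role-policy-document file://ec2-trust-policy.json || return 1
  run aws iam attach-role-policy --role-name "$role_name" \
    --policy-arn "$policy_arn" || return 1
  run aws iam create-instance-profile \
    --instance-profile-name "$profile_name" || return 1
  run aws iam add-role-to-instance-profile \
    --instance-profile-name "$profile_name" --role-name "$role_name" || return 1
}

setup_ec2_role EC2-S3-Access-Role EC2-S3-Access-Profile \
  arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```

Stopping at the first failed step avoids leaving an instance profile without a role attached.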
IAM Role with Terraform
resource "aws_iam_role" "ec2_role" {
  name = "ec2-app-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "s3_access" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2-app-profile"
  role = aws_iam_role.ec2_role.name
}

resource "aws_instance" "app" {
  ami                  = "ami-0c55b159cbfafe1f0"
  instance_type        = "t3.medium"
  iam_instance_profile = aws_iam_instance_profile.ec2_profile.name
}
Instance Management Commands
List and Describe Instances
# List all instances
aws ec2 describe-instances
# List instances with specific state
aws ec2 describe-instances \
--filters "Name=instance-state-name,Values=running"
# List instances with specific tag
aws ec2 describe-instances \
--filters "Name=tag:Environment,Values=Production"
# Get instance details in table format
aws ec2 describe-instances \
--query 'Reservations[*].Instances[*].[InstanceId,InstanceType,State.Name,PrivateIpAddress,PublicIpAddress,Tags[?Key==`Name`].Value|[0]]' \
--output table
# Get specific instance details
aws ec2 describe-instances \
--instance-ids i-0123456789abcdef0
# Get instance status
aws ec2 describe-instance-status \
--instance-ids i-0123456789abcdef0
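The state and tag filters above can be combined in one call. Filters passed together are ANDed, while multiple comma-separated values inside one filter are ORed. The sketch below only prints the combined command. `tag_filter` and `state_filter` are hypothetical helpers that format the `--filters` entries.

```shell
# Hypothetical helper: format one --filters entry for a tag key/value pair.
tag_filter() {
  echo "Name=tag:$1,Values=$2"
}

# Hypothetical helper: format a filter on instance state.
state_filter() {
  echo "Name=instance-state-name,Values=$1"
}

# Print the combined command; both filters must match for an instance to be returned.
echo aws ec2 describe-instances --filters \
  "$(state_filter running)" "$(tag_filter Environment Production)"
```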
Tagging Operations
# Create tags
aws ec2 create-tags \
--resources i-0123456789abcdef0 \
--tags Key=Name,Value=WebServer Key=Environment,Value=Production
# Delete tags
aws ec2 delete-tags \
--resources i-0123456789abcdef0 \
--tags Key=OldTag
Console Access and Troubleshooting
# Get console output
aws ec2 get-console-output \
--instance-id i-0123456789abcdef0
# Get console screenshot
aws ec2 get-console-screenshot \
--instance-id i-0123456789abcdef0
# Get password data (Windows)
aws ec2 get-password-data \
--instance-id i-0123456789abcdef0 \
--priv-launch-key-file MyKeyPair.pem
Cost Optimization Best Practices
Right-Sizing
- Use AWS Compute Optimizer for recommendations
- Monitor CloudWatch metrics for actual utilization
- Start with burstable instances (T3/T4g) for variable workloads
- Use AWS Cost Explorer to identify underutilized instances
Instance Selection
- Prefer Graviton instances (T4g, M7g, C7g) for up to 40% better price-performance
- Use AMD instances (T3a, M5a, C5a) for roughly 10% lower cost than the comparable Intel instances
- Consider Spot instances for fault-tolerant workloads (up to 90% savings)
- Implement Savings Plans for committed usage (up to 72% savings)
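To see what the percentages above mean for a monthly bill, the sketch below applies a discount to an hourly rate over an assumed 730 hours per month. The 0.10 USD/hour rate is purely illustrative and not a real price, and `monthly_cost` is a hypothetical helper. The Spot and Savings Plans figures are best-case ("up to") discounts.

```shell
# Hypothetical helper: estimated monthly cost for one instance.
# Args: hourly on-demand rate in USD, discount percentage (0-100).
# Assumes 730 hours per month.
monthly_cost() {
  awk -v rate="$1" -v discount="$2" \
    'BEGIN { printf "%.2f\n", rate * 730 * (1 - discount / 100) }'
}

# 0.10 USD/hour is an illustrative rate, not a real price.
monthly_cost 0.10 0    # on-demand baseline
monthly_cost 0.10 10   # AMD variant at roughly 10% less
monthly_cost 0.10 72   # best-case Savings Plans discount
monthly_cost 0.10 90   # best-case Spot discount
```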
Storage Optimization
- Use gp3 instead of gp2 (up to 20% cheaper per GB, with a 3,000 IOPS and 125 MiB/s baseline that does not depend on volume size)
- Delete unused EBS volumes and snapshots
- Implement lifecycle policies for snapshot retention
- Use S3 for infrequently accessed data
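The performance side of the gp2 versus gp3 comparison comes from how baseline IOPS is determined. The sketch below assumes gp2 provides 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000, while gp3 starts at 3,000 IOPS regardless of size. `gp2_baseline_iops` is a hypothetical helper.

```shell
# Hypothetical helper: gp2 baseline IOPS for a volume size in GiB,
# assuming 3 IOPS per GiB with a floor of 100 and a ceiling of 16000.
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

GP3_BASELINE_IOPS=3000

for size_gib in 20 500 2000; do
  echo "${size_gib} GiB: gp2=$(gp2_baseline_iops "$size_gib") gp3=${GP3_BASELINE_IOPS}"
done
```

Under these assumptions, gp2 only exceeds the gp3 baseline above 1,000 GiB, so smaller volumes gain the most from migrating.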
Auto Scaling Configuration
- Set appropriate min/max/desired capacity
- Use target tracking for dynamic scaling
- Implement scheduled scaling for predictable patterns
- Configure scale-in protection for long-running tasks
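Target tracking and the min/max bounds interact in a way that is easy to check by hand. The sketch below approximates the idea as `ceil(current capacity * current metric / target)`, clamped to the group's limits. It is a simplification for intuition, not the service's exact algorithm, and `target_tracking_capacity` is a hypothetical helper that takes integer inputs only.

```shell
# Hypothetical helper: rough desired capacity under target tracking.
# Args: current capacity, current metric value, target value, min, max.
# Computes ceil(current * metric / target), then clamps to [min, max].
target_tracking_capacity() {
  current="$1"; metric="$2"; target="$3"; min="$4"; max="$5"
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

target_tracking_capacity 4 90 60 2 10   # scale out: 4 instances at 90% CPU, target 60%
target_tracking_capacity 4 90 60 2 5    # same load, capped by max capacity
target_tracking_capacity 4 20 60 2 10   # scale in, bounded below by min capacity
```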
Monitoring and Cleanup
- Tag all resources for cost allocation
- Set up billing alerts
- Regularly review and terminate unused instances
- Use AWS Trusted Advisor for optimization recommendations
Security Best Practices
Network Security
- Deploy instances in private subnets unless they must accept traffic directly from the internet
- Use security groups with least privilege
- Implement Network ACLs for subnet-level filtering
- Enable VPC Flow Logs for traffic analysis
- Use AWS PrivateLink for service access
Access Control
- Use IAM roles instead of access keys
- Implement least privilege IAM policies
- Enable MFA for privileged operations
- Use Systems Manager Session Manager instead of SSH (no SSH keys or inbound port 22 to manage)
- Where SSH is still required, rotate keys regularly
Data Protection
- Enable EBS encryption by default
- Encrypt snapshots
- Use encrypted AMIs
- Implement backup strategies
- Enable termination protection for critical instances
Instance Hardening
- Keep OS and applications updated
- Use AWS Systems Manager Patch Manager
- Implement host-based firewalls
- Disable unnecessary services
- Use IMDSv2 for metadata access
- Ship OS and application logs to CloudWatch Logs for audit trails
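IMDSv2 replaces the plain GET of IMDSv1 with a session token obtained through a PUT request. A minimal sketch of that flow follows. The helper names are hypothetical, and the two `curl` functions only work when run on an EC2 instance, so they are defined here but not called.

```shell
IMDS_BASE="http://169.254.169.254/latest"

# Hypothetical helper: build a metadata URL for a path such as "instance-id".
imds_url() {
  echo "${IMDS_BASE}/meta-data/$1"
}

# Request an IMDSv2 session token, valid here for 6 hours (21600 seconds).
imds_token() {
  curl -s -X PUT "${IMDS_BASE}/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"
}

# Fetch one metadata path using a fresh token. Only works on an EC2 instance.
imds_get() {
  token="$(imds_token)" || return 1
  curl -s -H "X-aws-ec2-metadata-token: ${token}" "$(imds_url "$1")"
}

# On an instance:             imds_get instance-id
# On an instance with a role: imds_get iam/security-credentials/
```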
Monitoring and Compliance
- Enable CloudTrail for API logging
- Use AWS Config for compliance monitoring
- Implement AWS Security Hub
- Set up CloudWatch alarms for security events
- Run regular security assessments and penetration tests
This reference collects the working details needed for day-to-day AWS EC2 operations in a point-wise format for quick lookup.