Ntombizakhona Mabaso

for AWS Community Builders

Posted on Apr 4

Design Cost-Optimized Storage Solutions

#aws #certification #cloud #solutionsarchitect

Exam Guide: Solutions Architect - Associate
⚡ Domain 4: Design Cost-Optimized Architectures
📘 Task Statement 4.1

🎯 Designing Cost-Optimized Storage Solutions is about storing data in the lowest-cost way that still meets business requirements.

Start with storage type (object, file, block), then check access frequency, performance needs, retention, backup/archive, and transfer method.

You are not just picking “cheap storage.”

You are picking the cheapest storage that still works.

Knowledge

1 | Access Options

Requester Pays

Sometimes storage costs are affected by who pays for access.

S3 Requester Pays

With Requester Pays, the requester pays the request and data transfer charges instead of the bucket owner. The bucket owner still pays for storing the data.

Requester Pays is Useful When:

You share large datasets with many external consumers (requests must be authenticated; anonymous access is not supported)
You want to reduce the owner’s cost for downloads or access

“Dataset is shared with external users, and owner wants to reduce access cost” → Requester Pays.
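As a rough sketch of how this might look with the AWS SDK for Python, the helpers below only build the parameter dictionaries. The bucket and key names are hypothetical, and the boto3 calls they would feed are shown in comments so the block runs without AWS credentials.

```python
def requester_pays_config(bucket):
    """Arguments for enabling Requester Pays on a bucket.

    Intended use (assumes a boto3 S3 client named `s3`):
        s3.put_bucket_request_payment(**requester_pays_config("my-dataset"))
    """
    return {
        "Bucket": bucket,
        "RequestPaymentConfiguration": {"Payer": "Requester"},
    }


def requester_get_args(bucket, key):
    """Arguments a requester passes to acknowledge that they will be charged.

    Intended use (assumes a boto3 S3 client named `s3`):
        s3.get_object(**requester_get_args("my-dataset", "data/file.csv"))
    """
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


if __name__ == "__main__":
    print(requester_pays_config("my-dataset"))
    print(requester_get_args("my-dataset", "data/file.csv"))
```

The requester has to opt in explicitly on each request; a request without that acknowledgement is rejected.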

2 | AWS Cost Management Service Features

Cost Allocation Tags & Multi-Account Billing

Tags and Multi-Account Billing do not reduce cost directly, but they help track and control cost.

Cost Allocation Tags: track cost by team/app/environment
AWS Organizations Consolidated Billing: central billing across accounts
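To make the idea of tracking cost by tag concrete, here is a minimal sketch that groups billing line items by a tag key. The record layout is a made-up simplification for illustration, not the actual Cost and Usage Report schema, and the numbers are invented.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum cost per tag value; items missing the tag go under 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        tag_value = item.get("tags", {}).get(tag_key, "untagged")
        totals[tag_value] += item["cost"]
    return dict(totals)


# Hypothetical line items (illustrative values only).
LINE_ITEMS = [
    {"service": "S3", "cost": 12.5, "tags": {"team": "analytics"}},
    {"service": "EBS", "cost": 30.0, "tags": {"team": "web"}},
    {"service": "S3", "cost": 7.5, "tags": {"team": "analytics"}},
    {"service": "EFS", "cost": 4.0, "tags": {}},
]

if __name__ == "__main__":
    print(cost_by_tag(LINE_ITEMS, "team"))
```

The "untagged" bucket is the useful part: it shows how much spend cannot yet be attributed to a team, app, or environment.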

3 | AWS Cost Management Tools

Cost Explorer, Budgets, Cost and Usage Report

1 AWS Cost Explorer: Visualize and analyze spending trends
2 AWS Budgets: Set thresholds and alerts for cost or usage
3 AWS Cost and Usage Report (CUR): Detailed billing data for deep analysis

“Need alerts when costs exceed target” → Budgets
“Need detailed billing data for analysis” → CUR
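The threshold idea behind budget alerts can be sketched in a few lines. This is only an illustration of the logic, not how AWS Budgets is implemented; the function name and the default 80% and 100% thresholds are assumptions.

```python
def crossed_thresholds(budget_amount, actual_spend, thresholds_pct=(80, 100)):
    """Return the percentage thresholds that actual spend has reached."""
    if budget_amount <= 0:
        raise ValueError("budget_amount must be positive")
    used_pct = actual_spend / budget_amount * 100
    return [pct for pct in thresholds_pct if used_pct >= pct]


if __name__ == "__main__":
    # Spend of 85 against a budget of 100 reaches the 80% threshold only.
    print(crossed_thresholds(100, 85))
```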

4 | AWS Storage Services With Appropriate Use Cases

FSx, EFS, S3, EBS

4.1 Amazon S3

Best for:

Cheapest scalable object storage for large amounts of data
Logs, backups, archives, static files, data lakes

4.2 Amazon EFS

Best for:

Shared file storage for Linux workloads
More expensive than S3; use when POSIX/shared file access is actually needed

4.3 Amazon EBS

Best for:

Block storage attached to EC2
Use only when the workload really needs block storage

4.4 Amazon FSx

Best for:

Managed file systems with specific compatibility/performance needs
Examples:
1 FSx for Windows File Server
2 FSx for Lustre
3 FSx for NetApp ONTAP
4 FSx for OpenZFS

Cost Mindset:

Don’t choose EFS or FSx if S3 is enough.

Don’t choose EBS if shared file or object storage fits better.

5 | Backup Strategies

Cost-optimized backups mean:
1 Keep backups only as long as needed
2 Move old backups to cheaper tiers
3 Use centralized backup policies where helpful

Common options:

AWS Backup
EBS snapshots
S3 versioning + lifecycle
Archive backups to Glacier tiers

6 | Block Storage Options

HDD vs SSD Volume Types

For EBS, cost depends heavily on volume type.

6.1 SSD-backed

gp3 / gp2: general purpose SSD
io1 / io2: provisioned IOPS SSD for very high IOPS

6.2 HDD-backed

st1: throughput-optimized HDD (good for large, sequential workloads)
sc1: cold HDD (lowest cost EBS, infrequent access)

“Sequential throughput, large datasets, low cost” → st1
“Very infrequent block access, cheapest EBS” → sc1
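The volume-type hints above can be written as a small decision function. This is a simplified sketch for exam-style reasoning, not an official selection algorithm; the parameter names and the order of the checks are assumptions.

```python
def choose_ebs_volume_type(needs_very_high_iops=False,
                           sequential_throughput=False,
                           infrequent_access=False):
    """Map coarse workload traits to an EBS volume type."""
    if needs_very_high_iops:
        return "io2"  # provisioned IOPS SSD
    if infrequent_access:
        return "sc1"  # cold HDD, lowest-cost EBS
    if sequential_throughput:
        return "st1"  # throughput-optimized HDD
    return "gp3"      # general purpose SSD as the default


if __name__ == "__main__":
    print(choose_ebs_volume_type(sequential_throughput=True))
    print(choose_ebs_volume_type(infrequent_access=True))
```

Checking the performance requirement first reflects the rule that cost only matters among options that still work.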

7 | Data Lifecycles

Lifecycle planning is one of the biggest cost optimization topics.

This is where S3 Lifecycle shines.

Examples:

New files are frequently accessed for 30 days
Older files are rarely accessed
After 1 year, they should be archived or deleted

8 | Hybrid Storage Options

DataSync, Transfer Family, Storage Gateway

8.1 DataSync

Good for:

Recurring large-scale data transfer from on-prem to AWS
Faster and easier than building custom copy jobs

8.2 Transfer Family

Good for:

Managed SFTP/FTPS/FTP into S3 or EFS

8.3 Storage Gateway

Good for:

Hybrid access where on-prem apps still need file, block, or tape interfaces backed by AWS

9 | Storage Access Patterns

Choose storage or tier based on how often data is accessed.

Access Pattern	Typical Storage
Frequently accessed	S3 Standard / EBS SSD / EFS Standard
Infrequently accessed	S3 Standard-IA / One Zone-IA / EFS IA
Archive / long-term retention	S3 Glacier Instant Retrieval / Flexible Retrieval / Deep Archive

10 | Storage Tiering

This is mostly an S3 topic, but also appears in EFS.

10.1 S3 Storage classes

S3 Standard: hot data
S3 Standard-IA: infrequent access, multi-AZ
S3 One Zone-IA: infrequent access, single AZ, cheaper
S3 Intelligent-Tiering: unknown or changing access patterns
S3 Glacier Instant Retrieval: archive but still quick retrieval
S3 Glacier Flexible Retrieval: archive, retrieval in minutes to hours
S3 Glacier Deep Archive: lowest cost, slowest retrieval

10.2 EFS Tiering

EFS lifecycle management can move files to EFS Infrequent Access automatically.

11 | Storage Types With Associated Characteristics

Object, File, Block

11.1 Object = S3

Cheapest at scale
Best for unstructured data, backups, logs, media, static files

11.2 File = EFS / FSx

Use when apps need mounted shared file systems

11.3 Block = EBS

Use when apps need low-latency disk attached to EC2

Skills

A | Design Appropriate Storage Strategies

Batch Uploads vs Individual Uploads

Sometimes the cheapest design is not just the storage type, but how data is uploaded.

Examples:
1 Batch uploads can reduce request overhead
2 Multipart upload is better for very large files
3 Aggregating small files can improve efficiency in analytics or data lake designs
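One way to picture aggregation is bundling many small files into a single compressed archive before upload, so that one PUT request replaces many. The sketch below uses only the standard library; the file names are hypothetical and the upload step itself is left out.

```python
import io
import tarfile


def bundle_small_files(files):
    """Pack a {name: bytes} mapping into one gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_bundle(blob):
    """Return the member names stored in a bundle created above."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    small_files = {f"logs/event-{i}.json": b'{"ok": true}' for i in range(1000)}
    blob = bundle_small_files(small_files)
    print(f"{len(small_files)} files bundled into 1 object of {len(blob)} bytes")
```

The trade-off is that individual files are no longer addressable on their own, so this suits logs and analytics inputs better than data that is read one file at a time.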

B | Determine The Correct Storage Size For A Workload

Don’t massively overprovision:
1 Right-size EBS volumes
2 Estimate backup retention growth
3 Plan capacity based on actual growth trends, not a vague “just in case” buffer
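Sizing from a growth trend is simple compound-growth arithmetic. The sketch below projects a volume size and adds explicit headroom; the growth rate, planning horizon, and 20% headroom are illustrative assumptions, not recommendations.

```python
import math


def projected_size_gib(current_gib, monthly_growth_rate, months):
    """Compound growth: size after `months` at a fixed monthly rate."""
    return current_gib * (1 + monthly_growth_rate) ** months


def provisioned_size_gib(current_gib, monthly_growth_rate, months, headroom=0.2):
    """Projected size plus headroom, rounded up to a whole GiB."""
    projected = projected_size_gib(current_gib, monthly_growth_rate, months)
    return math.ceil(projected * (1 + headroom))


if __name__ == "__main__":
    # 100 GiB growing 5% per month, planned 12 months ahead.
    print(round(projected_size_gib(100, 0.05, 12), 1))
    print(provisioned_size_gib(100, 0.05, 12))
```

Because an EBS volume can be grown later, a shorter planning horizon is usually cheaper than provisioning for several years up front.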

C | Determine The Lowest-Cost Method Of Transferring Data To AWS Storage

Online recurring transfers → DataSync
Managed file transfer protocol → Transfer Family
Hybrid file/block/tape integration → Storage Gateway
Very large offline migration → Snow Family

D | Determine When Storage Auto Scaling Is Required

Auto scaling or storage elasticity matters when growth is uncertain.

Examples:

S3 scales automatically
EFS scales automatically
EBS requires sizing decisions (though it can be modified)
Some file systems or databases need explicit storage autoscaling settings

E | Manage S3 Object Lifecycles

1 Move old data to cheaper storage classes
2 Expire temporary or obsolete data
3 Transition logs or backups to archive classes automatically

F | Select Appropriate Backup And/Or Archival Solution

Examples:

Operational restore → snapshots / AWS Backup
Compliance archive → Glacier tiers / Object Lock if required
Long-term, low-cost retention → Deep Archive

G | Select The Appropriate Service For Data Migration To Storage Services

1 DataSync for recurring transfer
2 Transfer Family for SFTP needs
3 Storage Gateway for hybrid storage interfaces
4 Snowball for large offline migration

H | Select The Appropriate Storage Tier

Unknown access pattern → S3 Intelligent-Tiering
Rare access but quick retrieval → S3 Standard-IA or Glacier Instant Retrieval
Very rare long-term archive → Glacier Deep Archive
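The mapping above can be sketched as a lookup that returns S3 storage class identifiers. The category labels ("unknown", "frequent", "rare", "archive") are hypothetical inputs invented for this example; real decisions also weigh retrieval fees and minimum storage durations.

```python
def select_s3_storage_class(access, needs_instant_archive_retrieval=False):
    """Map a coarse access category to an S3 storage class identifier."""
    if access == "unknown":
        return "INTELLIGENT_TIERING"
    if access == "frequent":
        return "STANDARD"
    if access == "rare":
        # Rare access with quick retrieval: Standard-IA, or Glacier Instant
        # Retrieval when the data is archival but must still return quickly.
        return "GLACIER_IR" if needs_instant_archive_retrieval else "STANDARD_IA"
    if access == "archive":
        return "DEEP_ARCHIVE"
    raise ValueError(f"unrecognized access category: {access!r}")


if __name__ == "__main__":
    for category in ("unknown", "frequent", "rare", "archive"):
        print(category, "->", select_s3_storage_class(category))
```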

I | Select The Correct Data Lifecycle

Hot for 30 days → Standard
Warm for 60 days → Standard-IA
Archive after 90 days → Glacier
Delete after 7 years → lifecycle expiration
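The lifecycle above could be expressed as an S3 lifecycle configuration along these lines. This is a sketch under assumptions: the rule ID and prefix are hypothetical, 7 years is approximated as 2555 days, and the boto3 call that would apply the configuration appears only in a comment so the block runs offline.

```python
import json


def build_lifecycle_configuration(prefix=""):
    """Standard for 30 days, Standard-IA until day 90, then Glacier, then expire."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 7 * 365},  # roughly 7 years
            }
        ]
    }


if __name__ == "__main__":
    config = build_lifecycle_configuration(prefix="logs/")
    print(json.dumps(config, indent=2))
    # Intended use (assumes a boto3 S3 client named `s3` and your own bucket):
    # s3.put_bucket_lifecycle_configuration(
    #     Bucket="my-bucket", LifecycleConfiguration=config)
```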

J | Select The Most Cost-Effective Storage Service For A Workload

Use S3 if object storage works
Use EFS/FSx only if file semantics are needed
Use EBS only when block storage is required
Archive to Glacier tiers when retrieval is rare

Cheat Sheet

Requirement	Choice
Massive unstructured data, lowest scalable cost	S3
Unknown or changing access patterns	S3 Intelligent-Tiering
Rare access, still needs fast retrieval	S3 Standard-IA or Glacier Instant Retrieval
Long-term archive, lowest cost	Glacier Deep Archive
Shared Linux file system	EFS
Windows file shares	FSx for Windows File Server
Low-cost block storage for infrequent access	EBS sc1
Sequential throughput-heavy block workload	EBS st1
Recurring on-prem → AWS data transfer	DataSync
Managed SFTP into AWS storage	Transfer Family
Hybrid storage interface for on-prem apps	Storage Gateway
External users should pay for S3 downloads	S3 Requester Pays

Recap Checklist ✅

1. [ ] I can choose object vs file vs block storage based on workload needs

2. [ ] I can match storage tiers to access frequency (hot, warm, cold, archive)

3. [ ] I can use S3 lifecycle policies to reduce cost automatically over time

4. [ ] I know when to use S3 Intelligent-Tiering for unknown access patterns

5. [ ] I can choose the right EBS volume type for cost or performance needs

6. [ ] I know which hybrid transfer/storage service fits the situation (DataSync, Transfer Family, Storage Gateway)

7. [ ] I can choose cost-effective backup/archive solutions (AWS Backup, snapshots, Glacier tiers)

8. [ ] I understand cost tracking tools (Cost Explorer, Budgets, CUR, tags) at a basic level

AWS Whitepapers and Official Documentation

Core Storage Services

S3 Lifecycle And Storage Classes

Backup And Archive

EBS Pricing Or Performance Direction

Hybrid transfer and migration

Cost Visibility And Governance