Design Cost-Optimized Storage Solutions

Exam Guide: Solutions Architect - Associate
⚡ Domain 4: Design Cost-Optimized Architectures
📘 Task Statement 4.1

🎯 Designing Cost-Optimized Storage Solutions is about storing data in the lowest-cost way that still meets business requirements.

Start with storage type (object, file, block), then check access frequency, performance needs, retention, backup/archive, and transfer method.

You are not just picking “cheap storage.”

You are picking the cheapest storage that still works.


Knowledge

1 | Access Options

Requester Pays

Sometimes storage costs are affected by who pays for access.

S3 Requester Pays

With Requester Pays, the requester pays for request and data transfer charges instead of the bucket owner.

Requester Pays is Useful When:

  • You share large datasets publicly or with many external consumers
  • You want to reduce the owner’s cost for downloads or access

“Dataset is shared with external users, and owner wants to reduce access cost” → Requester Pays
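As a minimal sketch of what Requester Pays looks like in practice, the snippet below builds the arguments that boto3's S3 client calls `put_bucket_request_payment` and `get_object` would take. The bucket and key names are hypothetical, and the code only builds the argument dictionaries so it runs without AWS credentials; the real calls are shown in comments.

```python
# Hypothetical bucket and key names, for illustration only.
BUCKET = "example-shared-dataset"
KEY = "data/part-0001.parquet"


def owner_enable_requester_pays(bucket: str) -> dict:
    """Arguments the bucket owner would pass to turn on Requester Pays."""
    return {
        "Bucket": bucket,
        "RequestPaymentConfiguration": {"Payer": "Requester"},
    }


def requester_download_args(bucket: str, key: str) -> dict:
    """Arguments a requester passes to acknowledge that they will be charged."""
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


owner_args = owner_enable_requester_pays(BUCKET)
download_args = requester_download_args(BUCKET, KEY)

# With boto3 installed and credentials configured, the calls would be:
#   s3 = boto3.client("s3")
#   s3.put_bucket_request_payment(**owner_args)   # run by the bucket owner
#   s3.get_object(**download_args)                # run by the requester
```

The requester has to opt in explicitly on each request; a request without the `RequestPayer` field against a Requester Pays bucket is rejected.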

2 | AWS Cost Management Service Features

Cost Allocation Tags & Multi-Account Billing

Tags and multi-account billing do not reduce cost directly, but they help you track and control it.

  • Cost Allocation Tags: track cost by team/app/environment
  • AWS Organizations Consolidated Billing: central billing across accounts
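To illustrate why tags matter for cost tracking, here is a small sketch that groups spend by a tag key. The line items, tag names, and dollar amounts are made up for illustration; real data would come from a billing export rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical billing line items: (service, tags, cost in USD).
line_items = [
    ("S3", {"team": "analytics", "env": "prod"}, 120.0),
    ("EBS", {"team": "analytics", "env": "dev"}, 30.0),
    ("EFS", {"team": "web", "env": "prod"}, 45.5),
    ("S3", {}, 10.0),  # untagged resource
]


def cost_by_tag(items, tag_key):
    """Sum cost per value of one tag key; untagged items are grouped together."""
    totals = defaultdict(float)
    for _service, tags, cost in items:
        totals[tags.get(tag_key, "(untagged)")] += cost
    return dict(totals)


by_team = cost_by_tag(line_items, "team")
print(by_team)
```

The "(untagged)" bucket is the useful part: spend that nobody owns is usually the first place to look for waste.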

3 | AWS Cost Management Tools

Cost Explorer, Budgets, Cost and Usage Report

1 AWS Cost Explorer: Visualize and analyze spending trends
2 AWS Budgets: Set thresholds and alerts for cost or usage

  • “Need alerts when costs exceed target” → Budgets

3 AWS Cost and Usage Report (CUR): Detailed billing data for deep analysis

  • “Need detailed billing data for analysis” → CUR
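The idea behind a budget alert can be sketched in a few lines. This is not the AWS Budgets API, only the threshold logic it is built around; the function name, the default thresholds, and the dollar figures are assumptions for illustration.

```python
def crossed_thresholds(actual_spend, budget_limit, thresholds_pct=(50, 80, 100)):
    """Return the percentage thresholds that actual spend has already reached."""
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    used_pct = actual_spend / budget_limit * 100
    return [t for t in thresholds_pct if used_pct >= t]


# Hypothetical month: $850 spent against a $1,000 budget.
alerts = crossed_thresholds(actual_spend=850.0, budget_limit=1000.0)
print(alerts)
```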

4 | AWS Storage Services With Appropriate Use Cases

FSx, EFS, S3, EBS

4.1 Amazon S3

Best for:

  • Cheapest scalable object storage for large amounts of data
  • Logs, backups, archives, static files, data lakes

4.2 Amazon EFS

Best for:

  • Shared file storage for Linux workloads
  • More expensive than S3; use when POSIX/shared file access is actually needed

4.3 Amazon EBS

Best for:

  • Block storage attached to EC2
  • Use only when the workload really needs block storage

4.4 Amazon FSx

Best for:

  • Managed file systems with specific compatibility/performance needs
  • Examples: FSx for Windows File Server, FSx for Lustre, FSx for NetApp ONTAP, FSx for OpenZFS

Cost Mindset:

Don’t choose EFS or FSx if S3 is enough.

Don’t choose EBS if shared file or object storage fits better.

5 | Backup Strategies

Cost-optimized backups mean:
1 Keep backups only as long as needed
2 Move old backups to cheaper tiers
3 Use centralized backup policies where helpful

Common options:

  • AWS Backup
  • EBS snapshots
  • S3 versioning + lifecycle
  • Archive backups to Glacier tiers

6 | Block Storage Options

HDD vs SSD Volume Types

For EBS, cost depends heavily on volume type.

6.1 SSD-backed

  • gp3 / gp2: general purpose SSD
  • io1 / io2: provisioned IOPS SSD for very high IOPS

6.2 HDD-backed

  • st1: throughput-optimized HDD (good for large, sequential workloads)
  • sc1: cold HDD (lowest cost EBS, infrequent access)

“Sequential throughput, large datasets, low cost” → st1

“Very infrequent block access, cheapest EBS” → sc1
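The exam-style mapping above can be written as a small decision function. This is a simplified sketch, not an official selection rule: the parameter names and the order of checks are assumptions, and real sizing would also look at IOPS and throughput numbers.

```python
def choose_ebs_volume_type(needs_very_high_iops=False, boot_volume=False,
                           infrequent_access=False, sequential_throughput=False):
    """Pick an EBS volume type from coarse workload traits (simplified)."""
    if needs_very_high_iops:
        return "io2"   # provisioned IOPS SSD
    if boot_volume:
        return "gp3"   # HDD-backed types (st1, sc1) cannot be boot volumes
    if infrequent_access:
        return "sc1"   # cold HDD, lowest-cost EBS
    if sequential_throughput:
        return "st1"   # throughput-optimized HDD
    return "gp3"       # general purpose default


print(choose_ebs_volume_type(sequential_throughput=True))
print(choose_ebs_volume_type(infrequent_access=True))
```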

7 | Data Lifecycles

Lifecycle planning is one of the biggest cost optimization topics.

This is where S3 Lifecycle shines.

Examples:

  • New files are frequently accessed for 30 days
  • Older files are rarely accessed
  • After 1 year, they should be archived or deleted

8 | Hybrid Storage Options

DataSync, Transfer Family, Storage Gateway

8.1 DataSync

Good for:

  • Recurring large-scale data transfer from on-prem to AWS
  • Faster and easier than building custom copy jobs

8.2 Transfer Family

Good for:

  • Managed SFTP/FTPS/FTP into S3 or EFS

8.3 Storage Gateway

Good for:

  • Hybrid access where on-prem apps still need file, block, or tape interfaces backed by AWS

9 | Storage Access Patterns

Choose storage or tier based on how often data is accessed.

| Access Pattern | Typical Storage |
| --- | --- |
| Frequently accessed | S3 Standard / EBS SSD / EFS Standard |
| Infrequently accessed | S3 Standard-IA / One Zone-IA / EFS IA |
| Archive / long-term retention | S3 Glacier Instant Retrieval / Flexible Retrieval / Deep Archive |

10 | Storage Tiering

This is mostly an S3 topic, but also appears in EFS.

10.1 S3 Storage classes

  • S3 Standard: hot data
  • S3 Standard-IA: infrequent access, multi-AZ
  • S3 One Zone-IA: infrequent access, single AZ, cheaper
  • S3 Intelligent-Tiering: unknown or changing access patterns
  • S3 Glacier Instant Retrieval: archive but still quick retrieval
  • S3 Glacier Flexible Retrieval: archive, retrieval in minutes to hours
  • S3 Glacier Deep Archive: lowest cost, slowest retrieval

10.2 EFS Tiering

EFS lifecycle management can move files to EFS Infrequent Access automatically.
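As a rough sketch, the arguments for boto3's EFS `put_lifecycle_configuration` call could look like the following. The file system ID is hypothetical and the 30-day window is an assumed choice; the code only builds the argument dictionary so it runs without AWS access.

```python
FILE_SYSTEM_ID = "fs-0123456789abcdef0"  # hypothetical file system ID


def efs_lifecycle_args(file_system_id, transition_to_ia="AFTER_30_DAYS"):
    """Arguments that move files not accessed for the given period to EFS IA."""
    return {
        "FileSystemId": file_system_id,
        "LifecyclePolicies": [{"TransitionToIA": transition_to_ia}],
    }


efs_args = efs_lifecycle_args(FILE_SYSTEM_ID)

# With boto3 installed and credentials configured, the call would be:
#   efs = boto3.client("efs")
#   efs.put_lifecycle_configuration(**efs_args)
```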

11 | Storage Types With Associated Characteristics

Object, File, Block

11.1 Object = S3

  • Cheapest at scale
  • Best for unstructured data, backups, logs, media, static files

11.2 File = EFS / FSx

  • Use when apps need mounted shared file systems

11.3 Block = EBS

  • Use when apps need low-latency disk attached to EC2

Skills

A | Design Appropriate Storage Strategies

Batch Uploads vs Individual Uploads

Sometimes the cheapest design is not just the storage type, but how data is uploaded.

Examples:
1 Batch uploads can reduce request overhead
2 Multipart upload is better for very large files
3 Aggregating small files can improve efficiency in analytics or data lake designs
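The effect of these choices on request counts is easy to estimate. The sketch below assumes a preferred part size of 64 MiB and a batch size of 1,000 files, both arbitrary; the 5 MiB minimum part size and the 10,000-part ceiling are the S3 multipart upload limits.

```python
import math

MIB = 1024 ** 2
MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def plan_multipart(object_size, preferred_part_size=64 * MIB):
    """Return (part_size, part_count) that stays within the S3 multipart limits."""
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Grow the part size if the preferred size would need more than MAX_PARTS.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    part_count = max(1, math.ceil(object_size / part_size))
    return part_size, part_count


def put_requests(file_count, files_per_batch=1):
    """Number of PUT requests if small files are bundled before upload."""
    return math.ceil(file_count / files_per_batch)


part_size, part_count = plan_multipart(10 * 1024 * MIB)  # a 10 GiB object
individual = put_requests(1_000_000)
batched = put_requests(1_000_000, files_per_batch=1_000)
print(part_count, individual, batched)
```

Because S3 charges per request, bundling a million small files into a thousand archives cuts the PUT count by the same factor, at the cost of needing to unpack them before use.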

B | Determine The Correct Storage Size For A Workload

Don’t massively overprovision:
1 Right-size EBS volumes
2 Estimate backup retention growth
3 Plan capacity based on actual growth trends, not vague “just in case” estimates
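One way to turn a growth trend into a size is a simple compound-growth projection. The starting size, the 5% monthly growth rate, and the 20% headroom below are assumed numbers for illustration, not recommendations.

```python
import math


def projected_size_gib(current_gib, monthly_growth_rate, months):
    """Project storage use assuming a constant compound monthly growth rate."""
    return current_gib * (1 + monthly_growth_rate) ** months


def recommended_volume_gib(current_gib, monthly_growth_rate, months, headroom=0.2):
    """Projected size plus a fixed headroom fraction, rounded up to a whole GiB."""
    projected = projected_size_gib(current_gib, monthly_growth_rate, months)
    return math.ceil(projected * (1 + headroom))


# Hypothetical workload: 200 GiB today, growing about 5% per month.
projected = projected_size_gib(200, 0.05, 12)
recommended = recommended_volume_gib(200, 0.05, 12)
print(round(projected, 1), recommended)
```

Since EBS volumes can be modified later, planning for the next year and resizing when the trend changes is usually cheaper than provisioning for a worst case up front.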

C | Determine The Lowest-Cost Method Of Transferring Data To AWS Storage

  • Online recurring transfers → DataSync
  • Managed file transfer protocol → Transfer Family
  • Hybrid file/block/tape integration → Storage Gateway
  • Very large offline migration → Snow Family
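Whether a migration counts as “very large” depends on the network link. The sketch below estimates transfer time from data size and bandwidth; the 80% link utilization and the deadline-based rule for switching to offline transfer are assumptions for illustration.

```python
def transfer_days(data_tb, link_mbps, utilization=0.8):
    """Estimated days to move data_tb (decimal terabytes) over a network link."""
    bits = data_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return bits / effective_bits_per_second / 86_400


def suggest_transfer_method(data_tb, link_mbps, deadline_days):
    """Go offline only when the network cannot meet the deadline (assumed rule)."""
    if transfer_days(data_tb, link_mbps) <= deadline_days:
        return "online (for example DataSync)"
    return "offline (Snow Family)"


days = transfer_days(100, 1000)  # 100 TB over a 1 Gbps link
print(round(days, 1))
print(suggest_transfer_method(100, 1000, deadline_days=30))
print(suggest_transfer_method(1000, 1000, deadline_days=30))
```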

D | Determine When Storage Auto Scaling Is Required

Auto scaling or storage elasticity matters when growth is uncertain.

Examples:

  • S3 scales automatically
  • EFS scales automatically
  • EBS requires sizing decisions (though it can be modified)
  • Some file systems or databases need explicit storage autoscaling settings

E | Manage S3 Object Lifecycles

1 Move old data to cheaper storage classes
2 Expire temporary or obsolete data
3 Transition logs or backups to archive classes automatically

F | Select Appropriate Backup And/Or Archival Solution

Examples:

  • Operational restore → snapshots / AWS Backup
  • Compliance archive → Glacier tiers / Object Lock if required
  • Long-term, low-cost retention → Deep Archive

G | Select The Appropriate Service For Data Migration To Storage Services

1 DataSync for recurring transfer
2 Transfer Family for SFTP needs
3 Storage Gateway for hybrid storage interfaces
4 Snowball for large offline migration

H | Select The Appropriate Storage Tier

  • Unknown access pattern → S3 Intelligent-Tiering
  • Rare access but quick retrieval → S3 Standard-IA or Glacier Instant Retrieval
  • Very rare long-term archive → Glacier Deep Archive
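The same mapping, written as a function that returns S3 storage class identifiers. This is a simplified sketch with assumed parameter names; it returns `STANDARD_IA` for the “rare but quick” case, where `GLACIER_IR` would be the alternative for archive-style data.

```python
def choose_s3_storage_class(access_pattern_known, accessed_often=False,
                            needs_fast_retrieval=True):
    """Map coarse access traits to an S3 storage class identifier (simplified)."""
    if not access_pattern_known:
        return "INTELLIGENT_TIERING"
    if accessed_often:
        return "STANDARD"
    if needs_fast_retrieval:
        return "STANDARD_IA"   # GLACIER_IR is the alternative for archive data
    return "DEEP_ARCHIVE"


print(choose_s3_storage_class(access_pattern_known=False))
print(choose_s3_storage_class(True, accessed_often=False, needs_fast_retrieval=False))
```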

I | Select The Correct Data Lifecycle

  • Hot for 30 days → Standard
  • Warm for 60 days → Standard-IA
  • Archive after 90 days → Glacier
  • Delete after 7 years → lifecycle expiration
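That lifecycle can be sketched as an S3 lifecycle configuration in the shape boto3's `put_bucket_lifecycle_configuration` expects. The rule ID and the empty prefix (apply to every object) are assumptions, and 7 years is approximated as 7 × 365 days. The helper function is only there to show which class an object of a given age would be in.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-then-expire",    # hypothetical rule name
            "Filter": {"Prefix": ""},    # empty prefix: applies to all objects
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 7 * 365},
        }
    ]
}


def storage_class_at(age_days, rule):
    """Storage class an object of this age would be in, or None once expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 400, 3000):
    print(age, storage_class_at(age, rule))

# With boto3 installed and credentials configured, applying it would be:
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```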

J | Select The Most Cost-Effective Storage Service For A Workload

  • Use S3 if object storage works
  • Use EFS/FSx only if file semantics are needed
  • Use EBS only when block storage is required
  • Archive to Glacier tiers when retrieval is rare

Cheat Sheet

| Requirement | Choice |
| --- | --- |
| Massive unstructured data, lowest scalable cost | S3 |
| Unknown or changing access patterns | S3 Intelligent-Tiering |
| Rare access, still needs fast retrieval | S3 Standard-IA or Glacier Instant Retrieval |
| Long-term archive, lowest cost | Glacier Deep Archive |
| Shared Linux file system | EFS |
| Windows file shares | FSx for Windows File Server |
| Low-cost block storage for infrequent access | EBS sc1 |
| Sequential throughput-heavy block workload | EBS st1 |
| Recurring on-prem → AWS data transfer | DataSync |
| Managed SFTP into AWS storage | Transfer Family |
| Hybrid storage interface for on-prem apps | Storage Gateway |
| External users should pay for S3 downloads | S3 Requester Pays |

Recap Checklist ✅

1. [ ] I can choose object vs file vs block storage based on workload needs

2. [ ] I can match storage tiers to access frequency (hot, warm, cold, archive)

3. [ ] I can use S3 lifecycle policies to reduce cost automatically over time

4. [ ] I know when to use S3 Intelligent-Tiering for unknown access patterns

5. [ ] I can choose the right EBS volume type for cost or performance needs

6. [ ] I know which hybrid transfer/storage service fits the situation (DataSync, Transfer Family, Storage Gateway)

7. [ ] I can choose cost-effective backup/archive solutions (AWS Backup, snapshots, Glacier tiers)

8. [ ] I understand cost tracking tools (Cost Explorer, Budgets, CUR, tags) at a basic level



🚀
