Exam Guide: Solutions Architect - Associate
⚡ Domain 4: Design Cost-Optimized Architectures
📘 Task Statement 4.1
🎯 Designing Cost-Optimized Storage Solutions is about storing data in the lowest-cost way that still meets business requirements.
Start with storage type (object, file, block), then check access frequency, performance needs, retention, backup/archive, and transfer method.
You are not just picking “cheap storage.”
You are picking the cheapest storage that still works.
Knowledge
1 | Access Options
Requester Pays
Sometimes storage costs are affected by who pays for access.
S3 Requester Pays
With Requester Pays, the requester pays the request and data transfer charges instead of the bucket owner. The bucket owner still pays for storing the data.
Requester Pays is Useful When:
- You share large datasets with many external consumers (requests must be authenticated, so anonymous access does not work)
- You want to reduce the owner’s cost for downloads or access
“Dataset is shared with external users, and owner wants to reduce access cost” → Requester Pays.
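To make the split concrete, here is a minimal sketch of who pays what in a month. The prices and the `monthly_cost_split` helper are assumptions for illustration only; real S3 pricing varies by Region and changes over time.

```python
# Hypothetical prices for illustration only (not current AWS pricing).
PRICE_PER_GB_STORED = 0.023     # assumed USD per GB-month of storage
PRICE_PER_GB_TRANSFER = 0.09    # assumed USD per GB transferred out
PRICE_PER_1000_GETS = 0.0004    # assumed USD per 1,000 GET requests


def monthly_cost_split(stored_gb, downloaded_gb, get_requests, requester_pays):
    """Return (owner_cost, requester_cost) in USD for one month."""
    storage = stored_gb * PRICE_PER_GB_STORED
    access = (downloaded_gb * PRICE_PER_GB_TRANSFER
              + get_requests / 1000 * PRICE_PER_1000_GETS)
    if requester_pays:
        # Owner still pays for storage; requester covers requests and transfer.
        return storage, access
    # Default behaviour: the bucket owner pays for everything.
    return storage + access, 0.0


owner, requester = monthly_cost_split(1000, 5000, 2_000_000, requester_pays=True)
print(f"owner=${owner:.2f} requester=${requester:.2f}")
```

With boto3, the bucket setting itself is applied with `put_bucket_request_payment` and a `Payer` value of `Requester`; the sketch above only models the billing effect.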
2 | AWS Cost Management Service Features
Cost Allocation Tags & Multi-Account Billing
Tags and Multi-Account Billing do not reduce cost directly, but they help you track and control cost.
- Cost Allocation Tags: track cost by team/app/environment
- AWS Organizations Consolidated Billing: central billing across accounts
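As a rough illustration of what cost allocation tags buy you, the sketch below groups spend by a tag key. The `line_items` records are made-up sample data, not a real CUR or Cost Explorer schema, and `cost_by_tag` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical billing line items; real data would come from CUR or Cost Explorer.
line_items = [
    {"service": "S3", "cost": 120.0, "tags": {"team": "analytics", "env": "prod"}},
    {"service": "EC2", "cost": 80.0, "tags": {"team": "web", "env": "prod"}},
    {"service": "EFS", "cost": 40.0, "tags": {"team": "analytics", "env": "dev"}},
    {"service": "EBS", "cost": 15.0, "tags": {}},  # untagged resource
]


def cost_by_tag(items, tag_key):
    """Sum cost per value of the given tag; untagged items are grouped together."""
    totals = defaultdict(float)
    for item in items:
        totals[item["tags"].get(tag_key, "untagged")] += item["cost"]
    return dict(totals)


print(cost_by_tag(line_items, "team"))
# {'analytics': 160.0, 'web': 80.0, 'untagged': 15.0}
```

The "untagged" bucket is the useful part: spend you cannot attribute is spend you cannot optimize.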
3 | AWS Cost Management Tools
Cost Explorer, Budgets, Cost and Usage Report
1 AWS Cost Explorer: Visualize and analyze spending trends
2 AWS Budgets: Set thresholds and alerts for cost or usage
- “Need alerts when costs exceed target” → Budgets
3 AWS Cost and Usage Report (CUR): Detailed billing data for deep analysis
- “Need detailed billing data for analysis” → CUR
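The idea behind budget alerts is simple threshold checking. This is only a sketch of that logic, not the AWS Budgets API; `crossed_thresholds` and the default percentages are assumptions.

```python
def crossed_thresholds(actual_spend, budget_limit, thresholds_pct=(50, 80, 100)):
    """Return the alert thresholds (percent of budget) that spend has reached."""
    # Compare with multiplication instead of division to avoid float rounding.
    return [t for t in thresholds_pct if actual_spend * 100 >= t * budget_limit]


print(crossed_thresholds(850, 1000))   # [50, 80]
print(crossed_thresholds(1200, 1000))  # [50, 80, 100]
```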
4 | AWS Storage Services With Appropriate Use Cases
FSx, EFS, S3, EBS
4.1 Amazon S3
Best for:
- Cheapest scalable object storage for large amounts of data
- Logs, backups, archives, static files, data lakes
4.2 Amazon EFS
Best for:
- Shared file storage for Linux workloads
- More expensive than S3; use when POSIX/shared file access is actually needed
4.3 Amazon EBS
Best for:
- Block storage attached to EC2
- Use only when the workload really needs block storage
4.4 Amazon FSx
Best for:
- Managed file systems with specific compatibility/performance needs
- Examples:
1 FSx for Windows File Server
2 FSx for Lustre
3 FSx for NetApp ONTAP
4 FSx for OpenZFS
Cost Mindset:
Don’t choose EFS or FSx if S3 is enough.
Don’t choose EBS if shared file or object storage fits better.
5 | Backup Strategies
Cost-optimized backups mean:
1 Keep backups only as long as needed
2 Move old backups to cheaper tiers
3 Use centralized backup policies where helpful
Common options:
- AWS Backup
- EBS snapshots
- S3 versioning + lifecycle
- Archive backups to Glacier tiers
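"Keep backups only as long as needed" boils down to a retention cutoff. A minimal sketch, assuming backups are simple `(name, created_date)` pairs; services like AWS Backup apply retention rules for you, so this only shows the logic.

```python
from datetime import date, timedelta


def expired_backups(backups, retention_days, today):
    """Return names of backups created before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    return [name for name, created in backups if created < cutoff]


# Hypothetical backup inventory for illustration.
backups = [
    ("snap-old", date(2024, 1, 1)),
    ("snap-mid", date(2024, 3, 1)),
    ("snap-new", date(2024, 3, 25)),
]

print(expired_backups(backups, retention_days=30, today=date(2024, 4, 1)))
# ['snap-old', 'snap-mid']
```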
6 | Block Storage Options
HDD vs SSD Volume Types
For EBS, cost depends heavily on volume type.
6.1 SSD-backed
- gp3 / gp2: general purpose SSD
- io1 / io2: provisioned IOPS SSD for very high IOPS
6.2 HDD-backed
- st1: throughput-optimized HDD (good for large, sequential workloads)
“Sequential throughput, large datasets, low cost” → st1
- sc1: cold HDD (lowest cost EBS, infrequent access)
“Very infrequent block access, cheapest EBS” → sc1
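The exam-style mapping above can be written as a small decision helper. This is a simplified sketch; `pick_ebs_volume_type` and the order of its checks are assumptions, and real choices should also weigh size, IOPS, and throughput limits.

```python
def pick_ebs_volume_type(needs_high_iops=False, boot_volume=False,
                         sequential=False, infrequent=False):
    """Map coarse workload traits to an EBS volume type (simplified)."""
    if needs_high_iops:
        return "io2"   # provisioned IOPS SSD
    if boot_volume:
        return "gp3"   # HDD-backed types (st1, sc1) cannot be boot volumes
    if infrequent:
        return "sc1"   # cold HDD, lowest cost
    if sequential:
        return "st1"   # throughput-optimized HDD
    return "gp3"       # general purpose default


print(pick_ebs_volume_type(sequential=True))   # st1
print(pick_ebs_volume_type(infrequent=True))   # sc1
```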
7 | Data Lifecycles
Lifecycle planning is one of the biggest cost optimization topics.
This is where S3 Lifecycle shines.
Examples:
- New files are frequently accessed for 30 days
- Older files are rarely accessed
- After 1 year, they should be archived or deleted
8 | Hybrid Storage Options
DataSync, Transfer Family, Storage Gateway
8.1 DataSync
Good for:
- Recurring large-scale data transfer from on-prem to AWS
- Faster and easier than building custom copy jobs
8.2 Transfer Family
Good for:
- Managed SFTP/FTPS/FTP into S3 or EFS
8.3 Storage Gateway
Good for:
- Hybrid access where on-prem apps still need file or block or tape interfaces backed by AWS
9 | Storage Access Patterns
Choose storage or tier based on how often data is accessed.
| Access Pattern | Typical Storage |
|---|---|
| Frequently accessed | S3 Standard / EBS SSD / EFS Standard |
| Infrequently accessed | S3 Standard-IA / One Zone-IA / EFS IA |
| Archive / long-term retention | S3 Glacier Instant Retrieval / Flexible Retrieval / Deep Archive |
10 | Storage Tiering
This is mostly an S3 topic, but also appears in EFS.
10.1 S3 Storage classes
- S3 Standard: hot data
- S3 Standard-IA: infrequent access, multi-AZ
- S3 One Zone-IA: infrequent access, single AZ, cheaper
- S3 Intelligent-Tiering: unknown or changing access patterns
- S3 Glacier Instant Retrieval: archive but still quick retrieval
- S3 Glacier Flexible Retrieval
- S3 Glacier Deep Archive: lowest cost, slowest retrieval
10.2 EFS Tiering
EFS lifecycle management can move files to EFS Infrequent Access automatically.
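For reference, an EFS lifecycle setting is a short list of policies. The sketch below builds that list as plain data; `efs_lifecycle_policies` is a hypothetical helper, and the allowed day values are a subset listed here as an assumption.

```python
# Subset of day values for TransitionToIA, assumed here for validation.
ALLOWED_IA_AFTER_DAYS = {1, 7, 14, 30, 60, 90}


def efs_lifecycle_policies(ia_after_days=30, return_on_access=True):
    """Build a LifecyclePolicies list for an EFS file system."""
    if ia_after_days not in ALLOWED_IA_AFTER_DAYS:
        raise ValueError(f"unsupported value: {ia_after_days}")
    unit = "DAY" if ia_after_days == 1 else "DAYS"
    policies = [{"TransitionToIA": f"AFTER_{ia_after_days}_{unit}"}]
    if return_on_access:
        # Move a file back to the primary storage class once it is read again.
        policies.append({"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"})
    return policies


print(efs_lifecycle_policies(30))
# With boto3, this list would be passed as LifecyclePolicies to
# efs.put_lifecycle_configuration(FileSystemId=..., LifecyclePolicies=...).
```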
11 | Storage Types With Associated Characteristics
Object, File, Block
11.1 Object = S3
- Cheapest at scale
- Best for unstructured data, backups, logs, media, static files
11.2 File = EFS / FSx
- Use when apps need mounted shared file systems
11.3 Block = EBS
- Use when apps need low-latency disk attached to EC2
Skills
A | Design Appropriate Storage Strategies
Batch Uploads vs Individual Uploads
Sometimes the cheapest design is not just the storage type, but how data is uploaded.
Examples:
1 Batch uploads can reduce request overhead
2 Multipart upload is better for very large files
3 Aggregating small files can improve efficiency in analytics or data lake designs
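A quick bit of arithmetic shows why upload strategy matters. The sketch below counts PUT requests with and without aggregation and plans multipart part counts. The helper names and the 64 MiB default part size are assumptions; the 5 MiB minimum part size and 10,000-part maximum are S3 multipart limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (last part may be smaller)
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def put_requests(file_count, files_per_archive=1):
    """PUT requests needed if small files are bundled before upload."""
    return ceil_div(file_count, files_per_archive)


def multipart_plan(file_size_bytes, part_size_bytes=64 * MIB):
    """Return (part_size, part_count), growing parts to stay under MAX_PARTS."""
    part_size = max(part_size_bytes, MIN_PART_SIZE)
    part_size = max(part_size, ceil_div(file_size_bytes, MAX_PARTS))
    return part_size, max(1, ceil_div(file_size_bytes, part_size))


print(put_requests(1_000_000))                          # one PUT per file
print(put_requests(1_000_000, files_per_archive=1000))  # bundled uploads
print(multipart_plan(10 * 1024 * MIB))                  # a 10 GiB file
```

One million individual uploads means one million PUT requests; bundling 1,000 files per archive cuts that to 1,000.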
B | Determine The Correct Storage Size For A Workload
Don’t massively overprovision:
1 Right-size EBS volumes
2 Estimate backup retention growth
3 Plan capacity based on actual growth trends, not vague “just in case”
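Planning from a growth trend can be as simple as compound growth plus a modest buffer. A minimal sketch; `projected_size_gib`, the growth rate, and the 20% headroom are assumptions you would replace with measured numbers.

```python
import math


def projected_size_gib(current_gib, monthly_growth_rate, months, headroom=0.2):
    """Project storage need with compound monthly growth plus headroom."""
    projected = current_gib * (1 + monthly_growth_rate) ** months
    return math.ceil(projected * (1 + headroom))


# 100 GiB today, growing 5% per month, planned 12 months ahead.
print(projected_size_gib(100, 0.05, 12))  # 216
```

That is 216 GiB rather than a "just in case" 1 TiB volume, and EBS volumes can be grown later if the trend changes.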
C | Determine The Lowest-Cost Method Of Transferring Data To AWS Storage
- Online recurring transfers → DataSync
- Managed file transfer protocol → Transfer Family
- Hybrid file/block/tape integration → Storage Gateway
- Very large offline migration → Snow Family
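The online vs offline choice usually comes down to how long the data would take over your link. A rough sketch, assuming decimal terabytes, a link speed in Mbps, and 80% usable bandwidth; both helper names are hypothetical.

```python
def online_transfer_days(data_tb, link_mbps, utilization=0.8):
    """Estimate days needed to push data_tb over a link of link_mbps."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


def suggest_transfer(data_tb, link_mbps, deadline_days):
    """Pick online or offline transfer based on the estimated duration."""
    if online_transfer_days(data_tb, link_mbps) <= deadline_days:
        return "online (e.g. DataSync)"
    return "offline (e.g. Snow Family)"


print(round(online_transfer_days(100, 1000), 1))  # 100 TB over 1 Gbps
print(suggest_transfer(100, 1000, deadline_days=30))
print(suggest_transfer(500, 100, deadline_days=30))
```

100 TB over a 1 Gbps link takes roughly 11.6 days at 80% utilization, so online transfer fits a 30-day window; 500 TB over 100 Mbps does not.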
D | Determine When Storage Auto Scaling Is Required
Auto scaling or storage elasticity matters when growth is uncertain.
Examples:
- S3 scales automatically
- EFS scales automatically
- EBS requires sizing decisions (though it can be modified)
- Some file systems or databases need explicit storage autoscaling settings
E | Manage S3 Object Lifecycles
1 Move old data to cheaper storage classes
2 Expire temporary or obsolete data
3 Transition logs or backups to archive classes automatically
F | Select Appropriate Backup And/Or Archival Solution
Examples:
- Operational restore → snapshots / AWS Backup
- Compliance archive → Glacier tiers / Object Lock if required
- Long-term, low-cost retention → Deep Archive
G | Select The Appropriate Service For Data Migration To Storage Services
1 DataSync for recurring transfer
2 Transfer Family for SFTP needs
3 Storage Gateway for hybrid storage interfaces
4 Snowball for large offline migration
H | Select The Appropriate Storage Tier
- Unknown access pattern → S3 Intelligent-Tiering
- Rare access but quick retrieval → S3 Standard-IA or Glacier Instant Retrieval
- Very rare long-term archive → Glacier Deep Archive
I | Select The Correct Data Lifecycle
- Hot for the first 30 days → Standard
- Warm for the next 60 days (days 30 to 90) → Standard-IA
- Archive after 90 days → Glacier
- Delete after 7 years → lifecycle expiration
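That timeline maps onto a single S3 lifecycle rule. The sketch below builds the rule as plain data in the shape S3 lifecycle configurations use; `lifecycle_rule`, the rule ID, and the `logs/` prefix are assumptions for illustration.

```python
def lifecycle_rule(prefix, ia_after_days=30, archive_after_days=90,
                   expire_after_days=7 * 365):
    """Build one S3 lifecycle rule: Standard -> Standard-IA -> Glacier -> delete."""
    if ia_after_days < 30:
        # S3 requires objects to be at least 30 days old before moving to Standard-IA.
        raise ValueError("Standard-IA transition needs at least 30 days")
    if not ia_after_days < archive_after_days < expire_after_days:
        raise ValueError("days must increase: IA < archive < expiration")
    return {
        "ID": "tier-then-expire",          # hypothetical rule name
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


config = {"Rules": [lifecycle_rule("logs/")]}
print(config)
# With boto3, this dict would be passed as LifecycleConfiguration to
# s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config).
```

Here 7 years is approximated as 2,555 days; adjust if your retention policy counts leap days.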
J | Select The Most Cost-Effective Storage Service For A Workload
- Use S3 if object storage works
- Use EFS/FSx only if file semantics are needed
- Use EBS only when block storage is required
- Archive to Glacier tiers when retrieval is rare
Cheat Sheet
| Requirement | Choice |
|---|---|
| Massive unstructured data, lowest scalable cost | S3 |
| Unknown or changing access patterns | S3 Intelligent-Tiering |
| Rare access, still needs fast retrieval | S3 Standard-IA or Glacier Instant Retrieval |
| Long-term archive, lowest cost | Glacier Deep Archive |
| Shared Linux file system | EFS |
| Windows file shares | FSx for Windows File Server |
| Low-cost block storage for infrequent access | EBS sc1 |
| Sequential throughput-heavy block workload | EBS st1 |
| Recurring on-prem → AWS data transfer | DataSync |
| Managed SFTP into AWS storage | Transfer Family |
| Hybrid storage interface for on-prem apps | Storage Gateway |
| External users should pay for S3 downloads | S3 Requester Pays |
Recap Checklist ✅
1. [ ] I can choose object vs file vs block storage based on workload needs
2. [ ] I can match storage tiers to access frequency (hot, warm, cold, archive)
3. [ ] I can use S3 lifecycle policies to reduce cost automatically over time
4. [ ] I know when to use S3 Intelligent-Tiering for unknown access patterns
5. [ ] I can choose the right EBS volume type for cost or performance needs
6. [ ] I know which hybrid transfer/storage service fits the situation (DataSync, Transfer Family, Storage Gateway)
7. [ ] I can choose cost-effective backup/archive solutions (AWS Backup, snapshots, Glacier tiers)
8. [ ] I understand cost tracking tools (Cost Explorer, Budgets, CUR, tags) at a basic level
AWS Whitepapers and Official Documentation
Core Storage Services
1. Amazon S3
2. Amazon EFS
3. Amazon EBS
4. Amazon FSx
S3 Lifecycle And Storage Classes
1. S3 Lifecycle
2. S3 storage classes
3. S3 Requester Pays
4. Multipart upload
Backup And Archive
1. AWS Backup
2. EBS snapshots
EBS Pricing Or Performance Direction
Hybrid Transfer And Migration
1. AWS DataSync
2. AWS Transfer Family
3. AWS Storage Gateway
4. AWS Snow Family
Cost Visibility And Governance
1. Cost Explorer
2. AWS Budgets
3. Cost and Usage Report (CUR)
4. Cost allocation tags
🚀