
Design Cost-Optimized Database Solutions

Exam Guide: Solutions Architect - Associate
⚡ Domain 4: Design Cost-Optimized Architectures
📘 Task Statement 4.3

🎯 Designing Cost-Optimized Database Solutions is about choosing the least expensive database design that still meets application requirements for performance, durability, retention, and availability.

Start with the data model and access pattern, then choose the database type, then optimize cost with right-sizing, caching, retention, connection management, and backup strategy.

You are often deciding between:

1. Relational vs non-relational
2. Provisioned vs serverless or on-demand
3. Read replicas vs caching
4. Keep data online vs archive or delete

Knowledge

1 | AWS Cost Management Features

Cost Allocation Tags & Multi-account Billing

Database cost should be tracked clearly by:

1 Environment (prod/dev/test)
2 Team or business unit
3 Application

We covered Cost Allocation Tags & Multi-account Billing in Task Statement 4.1 and Task Statement 4.2, but repetition is key even if it's monotonous.

1.1 Cost Allocation Tags

  • Track compute spend by app, team, environment, owner, cost center

1.2 Multi-Account Billing | Consolidated Billing

  • Manage cost centrally across multiple AWS accounts
  • Often used with AWS Organizations
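
As a small illustration of tag-based cost tracking, here is a sketch that groups hypothetical CUR-style line items by a cost allocation tag. The line items, tag keys, and dollar amounts are all invented for illustration, not real billing data.

```python
from collections import defaultdict

# Hypothetical line items shaped like rows from a Cost and Usage Report
# export; services, tags, and amounts are made up for illustration.
line_items = [
    {"service": "AmazonRDS",      "cost": 412.50, "tags": {"env": "prod", "team": "payments"}},
    {"service": "AmazonRDS",      "cost": 96.20,  "tags": {"env": "dev",  "team": "payments"}},
    {"service": "AmazonDynamoDB", "cost": 58.75,  "tags": {"env": "prod", "team": "catalog"}},
]

def cost_by_tag(items, tag_key):
    """Sum cost per value of a cost allocation tag (untagged -> 'untagged')."""
    totals = defaultdict(float)
    for item in items:
        totals[item["tags"].get(tag_key, "untagged")] += item["cost"]
    return dict(totals)

print(cost_by_tag(line_items, "env"))
# {'prod': 471.25, 'dev': 96.2}
```

The same grouping by `team` or `cost center` is what makes a tagging standard worth enforcing before the bill gets big.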

2 | AWS Cost Management tools

Cost Explorer, Budgets, CUR

These help analyze and govern database spending.

We covered AWS Cost Management Tools in Task Statement 4.1 and Task Statement 4.2, but what's worse than repetition? A Bill Shock!

1 Cost Explorer: trend analysis

2 AWS Budgets: alerts

3 Cost and Usage Report (CUR): detailed data
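
To make the Budgets idea concrete, here is a toy sketch of actual-vs-forecasted threshold alerts, loosely mimicking how AWS Budgets alert types behave. The thresholds and spend figures are invented.

```python
def budget_alerts(actual, forecast, budget, thresholds=(0.8, 1.0)):
    """Return which thresholds have been crossed, preferring an ACTUAL
    alert over a FORECASTED one when both would fire at a threshold."""
    alerts = []
    for t in thresholds:
        if actual >= budget * t:
            alerts.append(("ACTUAL", t))
        elif forecast >= budget * t:
            alerts.append(("FORECASTED", t))
    return alerts

# Mid-month: $850 spent of a $1,000 database budget, trending toward $1,300.
print(budget_alerts(actual=850, forecast=1300, budget=1000))
# [('ACTUAL', 0.8), ('FORECASTED', 1.0)]
```

The forecasted alert is the one that prevents bill shock: it fires before the overspend actually happens.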

3 | Caching Strategies

Caching can reduce DB cost by reducing:

1 Read load
2 Need for large instance sizes
3 Number of replicas

Caching Services

  • ElastiCache (Redis/Memcached) for app-side caching
  • DAX for DynamoDB read caching

"Reduce read cost and latency" → caching is often cheaper than scaling the database.
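
A quick back-of-the-envelope sketch of why caching is a cost lever: the cache hit rate directly determines how many reads still reach the database, and therefore how big the database (or its replica fleet) has to be. The traffic numbers are illustrative.

```python
def reads_reaching_db(total_reads_per_sec, cache_hit_rate):
    """Reads that still hit the database once a cache absorbs the hits."""
    return total_reads_per_sec * (1 - cache_hit_rate)

# 10,000 reads/sec with a 90% cache hit rate leaves ~1,000 reads/sec on the
# database -- often the difference between needing read replicas and not.
print(round(reads_reaching_db(10_000, 0.90)))  # 1000
```

A single cache node absorbing 90% of reads can be far cheaper than the replicas or larger instance that would otherwise carry that load.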

4 | Data Retention Policies

Retention policy is a major cost lever:

1 Keep only 30 days of operational data in the live DB
2 Archive old records elsewhere
3 Delete data after policy/legal period expires

Cost Principle:

Databases are expensive places to keep cold historical data.

Often, old data should move to S3 or archive systems.
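
To put rough numbers on that principle, here is a sketch comparing the monthly storage cost of cold data in a live database volume versus an S3 archive tier. The per-GB prices are illustrative placeholders, not current AWS quotes.

```python
# Rough monthly storage price per GB (illustrative placeholders, NOT quotes).
PRICE_PER_GB = {
    "rds_gp3": 0.115,        # live database storage
    "s3_standard": 0.023,    # warm archive
    "s3_glacier_ir": 0.004,  # cold archive
}

def monthly_storage_cost(gb, tier):
    """Monthly storage cost for one tier, ignoring request/retrieval fees."""
    return gb * PRICE_PER_GB[tier]

# 500 GB of cold history: live DB vs archived to a Glacier-style tier.
in_db    = monthly_storage_cost(500, "rds_gp3")
archived = monthly_storage_cost(500, "s3_glacier_ir")
print(f"live DB: ${in_db:.2f}/mo, archive: ${archived:.2f}/mo")
```

Retrieval and request charges matter for archive tiers, but for data that is rarely touched, the storage gap usually dominates.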

5 | Database Capacity Planning

Capacity Units & Instance Sizing

1 RDS/Aurora: cost depends on instance size, storage, IOPS, backups, replicas
2 DynamoDB: cost depends on read/write capacity mode and access pattern
3 Overprovisioning: costs money every hour
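
A sketch of the DynamoDB capacity-mode trade-off: provisioned capacity bills for the peak around the clock, while on-demand bills per request, so average traffic drives its cost. The unit prices below are illustrative placeholders, not current quotes.

```python
# Illustrative per-unit prices (placeholders, NOT current AWS quotes).
ON_DEMAND_PER_M_WRITES = 1.25  # $ per million write request units
PROV_WCU_HOUR = 0.00065        # $ per provisioned WCU-hour
HOURS_PER_MONTH = 730

def on_demand_monthly(avg_writes_per_sec):
    """On-demand bills per request, so average traffic drives cost."""
    writes = avg_writes_per_sec * 3600 * HOURS_PER_MONTH
    return writes / 1e6 * ON_DEMAND_PER_M_WRITES

def provisioned_monthly(peak_writes_per_sec):
    """Provisioned capacity must cover the peak, every hour of the month."""
    return peak_writes_per_sec * PROV_WCU_HOUR * HOURS_PER_MONTH

# Spiky workload (peak 100 writes/s, average 5): on-demand is cheaper.
# Steady workload (flat 100 writes/s): provisioned is cheaper.
print(f"spiky:  on-demand ${on_demand_monthly(5):.2f} vs provisioned ${provisioned_monthly(100):.2f}")
print(f"steady: on-demand ${on_demand_monthly(100):.2f} vs provisioned ${provisioned_monthly(100):.2f}")
```

The crossover depends on how spiky the traffic is, which is exactly why access pattern comes before capacity mode.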

6 | Database Connections And Proxies

Connection storms can force you to over-size a database unnecessarily.

Amazon RDS Proxy

1 Pools/reuses connections
2 Especially useful for Lambda or many short-lived app connections
3 Can reduce need to scale DB just for connection handling
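
The pooling idea can be shown with a toy in-process pool (a stand-in for what RDS Proxy does as a managed, shared service): many requests are served by a small, reused set of connections instead of each opening its own.

```python
import queue

class TinyPool:
    """Toy connection pool: callers borrow and return connections
    instead of opening a new one per request."""
    def __init__(self, factory, size):
        self._pool = queue.Queue()
        self.opened = 0
        for _ in range(size):
            self._pool.put(factory())  # open the fixed set up front
            self.opened += 1

    def acquire(self):
        return self._pool.get()        # waits instead of opening more

    def release(self, conn):
        self._pool.put(conn)

# A fake "connection" factory stands in for a real DB driver here.
pool = TinyPool(factory=lambda: object(), size=2)
for _ in range(100):                   # 100 sequential requests...
    conn = pool.acquire()
    pool.release(conn)
print(pool.opened)  # 2 -- connections reused, not 100 opened
```

This is why RDS Proxy pairs so well with Lambda: thousands of short-lived invocations no longer translate into thousands of database connections.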

7 | Database Engines

Relational Databases Engines

1 MySQL & Aurora MySQL: common compatibility choice
2 PostgreSQL & Aurora PostgreSQL: rich SQL features
3 Oracle & SQL Server: when app and licensing requirements demand them

Migration Patterns

  • Homogeneous migration: same engine → same engine
  • Heterogeneous migration: different engine → different engine

AWS DMS is common for moving data with minimal downtime.

8 | Database Replication

Read Replicas

Read replicas cost money, so use them when they solve a real problem:

1 Read scaling
2 Reporting queries
3 Cross-region reads

They are not automatically the cheapest answer.

Sometimes caching is cheaper than adding replicas.

9 | Database Types And Services

9.1 Amazon RDS

  • Managed relational database
  • Good when standard SQL engine support is needed

9.2 Amazon Aurora

  • High-performance managed relational database
  • Often more scalable and higher-performing than standard RDS engines
  • Cost-effective when that performance benefit matters

9.3 Amazon DynamoDB

  • Key-value database
  • Massive scale, low latency, no instance management
  • Often highly cost-effective for the right access pattern

9.4 Serverless Database Patterns

  • Aurora Serverless v2
  • DynamoDB On-Demand

Serverless Database Patterns are useful when workload is highly variable and you want to avoid always-on overprovisioning.


Skills

A | Design Appropriate Backup And Retention Policies

Cost-optimized backup strategy means:

1 Back up often enough to meet recovery requirements
2 Don't retain backups forever unless required
3 Use snapshots and automated backups with retention matched to the business need

Examples:

  • Daily snapshots for dev, longer retention for prod
  • Archive historical exports to S3 instead of keeping them in the live DB
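
A retention policy is easy to express in code. Here is a sketch that picks snapshot deletion candidates under different retention windows, using synthetic dates; the 7-day dev and 35-day prod windows are just example policies.

```python
from datetime import date, timedelta

def snapshots_to_delete(snapshot_dates, today, retention_days):
    """Snapshots older than the retention window are deletion candidates."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in snapshot_dates if d < cutoff]

today = date(2024, 6, 30)
daily = [today - timedelta(days=i) for i in range(40)]  # 40 daily snapshots

# Dev keeps 7 days, prod keeps 35: same snapshots, very different bills.
print(len(snapshots_to_delete(daily, today, 7)))   # 32
print(len(snapshots_to_delete(daily, today, 35)))  # 4
```

In practice AWS Backup or the RDS automated-backup retention setting applies this policy for you; the point is that the retention number is a direct cost dial.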

B | Determine An Appropriate Database Engine

MySQL vs PostgreSQL

Choose based on:

1 App compatibility
2 Required features
3 Licensing/cost constraints
4 Migration complexity

Don't migrate to a more expensive/complex engine without a requirement.

Pick the simplest engine that meets needs.

C | Determine Cost-Effective AWS Database Services

DynamoDB vs Amazon RDS vs Serverless

1 Need joins, transactions, and a relational schema → RDS or Aurora
2 Need key-value at massive scale with simple access patterns → DynamoDB
3 Highly variable relational workload → Aurora Serverless v2 may fit
4 Variable NoSQL traffic → DynamoDB On-Demand

D | Determine Cost-Effective Database Types II

Time Series & Columnar

  • Time-series or event-style data often fits NoSQL better than relational
  • Analytical and columnar needs are often better outside traditional OLTP databases

Don't force every dataset into a relational OLTP database; different types of databases exist for different workloads. 'Purpose built', as they say.

E | Migrate Database Schemas And Data To Different Locations And / Or Engines

  • AWS DMS for ongoing and minimal-downtime migration
  • AWS Schema Conversion Tool (SCT) for heterogeneous schema conversion
  • Same-engine migration → often easier, lower risk
  • Different-engine migration → SCT + DMS pattern is common

Cheat Sheet

| Requirement | Database |
| --- | --- |
| Relational app with joins and transactions | RDS or Aurora |
| Massive-scale key-value/document workload | DynamoDB |
| Highly variable workload, avoid overprovisioning | Serverless / on-demand model |
| Reduce DB reads cheaply | ElastiCache (or DAX for DynamoDB) |
| Lambda/app opening too many DB connections | RDS Proxy |
| Need relational performance at scale | Aurora |
| Move data with minimal downtime | AWS DMS |
| Change DB engine during migration | AWS SCT + DMS |
| Read-heavy workload | Read replicas or caching (compare cost) |
| Old data is rarely used | Archive/export it instead of keeping it in the primary DB |

Recap Checklist ✅

1. [ ] I can choose RDS/Aurora vs DynamoDB based on data model and access pattern

2. [ ] I understand that caching can be cheaper than constantly scaling the database

3. [ ] I know when read replicas are useful and when they may be unnecessary cost

4. [ ] I can match retention policies to business needs instead of keeping all data online forever

5. [ ] I can choose serverless/on-demand database options for variable workloads

6. [ ] I understand connection pooling with RDS Proxy

7. [ ] I know the difference between homogeneous and heterogeneous migrations

8. [ ] I can pick backup frequency/retention that meets recovery goals without overspending


AWS Whitepapers and Official Documentation

Cost Visibility And Governance

1. Cost Explorer

2. AWS Budgets

3. Cost and Usage Report (CUR)

4. Cost allocation tags

Core Database Services

1. Amazon RDS

2. Amazon Aurora

3. Aurora Serverless v2

4. Amazon DynamoDB

Performance

1. ElastiCache
2. DynamoDB DAX

3. RDS Proxy
4. RDS Read Replicas

Capacity And Pricing

1. DynamoDB capacity modes

2. RDS storage concepts

Backup And Migration

1. AWS Backup

2. AWS DMS

3. AWS Schema Conversion Tool (SCT)

🚀
