Bare Metal vs. AWS RDS: Storage Baseline — Longhorn vs Local SSD vs Managed Cloud
Before tuning anything, we needed to answer a simpler question: does the storage backend matter more than the platform itself?
This is Part 1 of a 2-part series. This article establishes bare metal storage baselines across four configurations and compares them against AWS managed PostgreSQL. Part 2 covers CPU/NUMA pinning and HugePages, where bare metal overtakes Aurora on write throughput.
Most PostgreSQL performance comparisons jump straight to config tuning. We didn't.
Before touching CPU governors or HugePages, we wanted to know how much the storage backend alone affects performance on bare metal Kubernetes. We ran four storage configurations against three AWS managed offerings — and the results show exactly where the bottleneck lives.
The Setup
Every database instance was capped at 2 vCPU / 8 GB RAM. Our bare metal node is a 32-core, NUMA-aware host with Samsung SM863a enterprise SSDs in RAID 1 (SAS). The AWS environments run on t3.large in ap-southeast-3.
Single-instance comparison throughout — one CNPG pod vs one RDS instance vs one Aurora instance. No read replicas, no Multi-AZ, no connection pooling.
PostgreSQL config — intentionally matched to AWS defaults:
- shared_buffers = ~1.9 GB
- effective_cache_size = ~3.8 GB
- work_mem = 4 MB
- max_connections = 839
- wal_buffers = ~60 MB
- maintenance_work_mem = 128 MB
By using the same config across all environments, performance differences come purely from platform and storage architecture — not tuning.
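For readers reproducing this on CloudNativePG, the settings above translate into a Cluster spec along these lines. This is a sketch, not our exact manifest — the cluster name and storage class are illustrative, and the ~1.9 GB / ~3.8 GB figures are rounded to whole megabyte values:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-bench          # illustrative name
spec:
  instances: 1            # single instance, matching the test setup
  resources:              # cap the pod at the 2 vCPU / 8 GB used everywhere
    requests:
      cpu: "2"
      memory: 8Gi
    limits:
      cpu: "2"
      memory: 8Gi
  postgresql:
    parameters:
      shared_buffers: "1920MB"         # ~1.9 GB
      effective_cache_size: "3840MB"   # ~3.8 GB
      work_mem: "4MB"
      max_connections: "839"
      wal_buffers: "60MB"
      maintenance_work_mem: "128MB"
  storage:
    size: 50Gi
    storageClass: local-ssd   # swapped per test: longhorn, etc.
```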
Benchmark: pgbench · Scale factor 100 (~10M rows) · 60s per run · 39 runs per environment
Four bare metal storage configurations tested:
| Label | Storage | Replicas | Disk |
|---|---|---|---|
| CNPG Local SSD | Direct-attached Samsung SM863a SAS | — | Dedicated |
| CNPG Longhorn 1R | Longhorn distributed storage | 1 replica | Dedicated |
| CNPG Longhorn 2R | Longhorn distributed storage | 2 replicas | Dedicated |
| CNPG Longhorn 2R+shared | Longhorn distributed storage | 2 replicas | Shared with OS/worker |
Results: AWS RDS Standard (t3.large)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 1,677 | 0.60 ms | 253 | 3.95 ms | 178 | 5.63 ms |
| 10 | 13,955 | 0.72 ms | 1,881 | 5.32 ms | 1,460 | 6.85 ms |
| 25 | 12,859 | 1.94 ms | 2,839 | 8.80 ms | 1,864 | 13.41 ms |
| 50 | 10,397 | 4.81 ms | 2,620 | 19.09 ms | 1,646 | 30.37 ms |
| 100 | 10,627 | 9.41 ms | 2,585 | 38.68 ms | 1,623 | 61.61 ms |
Results: AWS Aurora IO-Optimized (t3.large)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 2,607 | 0.38 ms | 285 | 3.51 ms | 218 | 4.58 ms |
| 10 | 10,928 | 0.92 ms | 984 | 10.16 ms | 739 | 13.53 ms |
| 25 | 9,265 | 2.70 ms | 1,278 | 19.57 ms | 880 | 28.42 ms |
| 50 | 8,163 | 6.12 ms | 1,472 | 33.96 ms | 990 | 50.49 ms |
| 100 | 7,783 | 12.85 ms | 1,623 | 61.63 ms | 1,027 | 97.41 ms |
Results: AWS Aurora Standard (t3.large)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 1,540 | 0.65 ms | 191 | 5.23 ms | 150 | 6.66 ms |
| 10 | 10,020 | 1.00 ms | 922 | 10.85 ms | 690 | 14.48 ms |
| 25 | 9,189 | 2.72 ms | 1,179 | 21.20 ms | 800 | 31.23 ms |
| 50 | 8,014 | 6.24 ms | 1,384 | 36.13 ms | 897 | 55.77 ms |
| 100 | 7,665 | 13.05 ms | 1,557 | 64.22 ms | 970 | 103.10 ms |
Results: CNPG Local SSD
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 749 | 1.34 ms | 134 | 7.48 ms | 99 | 10.10 ms |
| 10 | 7,675 | 1.30 ms | 1,425 | 7.02 ms | 1,031 | 9.70 ms |
| 25 | 6,788 | 3.68 ms | 1,560 | 16.02 ms | 1,073 | 23.30 ms |
| 50 | 6,430 | 7.78 ms | 1,550 | 32.27 ms | 996 | 50.18 ms |
| 100 | 6,092 | 16.41 ms | 1,464 | 68.32 ms | 902 | 110.92 ms |
Results: CNPG Longhorn 1 Replica (Dedicated Disk)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 754 | 1.33 ms | 119 | 8.43 ms | 90 | 11.12 ms |
| 10 | 7,713 | 1.30 ms | 940 | 10.64 ms | 748 | 13.37 ms |
| 25 | 7,311 | 3.42 ms | 1,254 | 19.93 ms | 1,015 | 24.64 ms |
| 50 | 6,587 | 7.59 ms | 1,384 | 36.12 ms | 1,064 | 46.98 ms |
| 100 | 6,109 | 16.37 ms | 1,453 | 68.83 ms | 1,009 | 99.14 ms |
Results: CNPG Longhorn 2 Replicas (Dedicated Disk)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 752 | 1.33 ms | 103 | 9.71 ms | 81 | 12.39 ms |
| 10 | 7,655 | 1.31 ms | 712 | 14.04 ms | 607 | 16.47 ms |
| 25 | 7,379 | 3.39 ms | 996 | 25.11 ms | 835 | 29.93 ms |
| 50 | 6,699 | 7.46 ms | 1,144 | 43.71 ms | 908 | 55.07 ms |
| 100 | 6,063 | 16.49 ms | 1,269 | 78.78 ms | 852 | 117.39 ms |
Results: CNPG Longhorn 2 Replicas (Shared Disk)
| Clients | RO TPS | RO Lat | RW TPS | RW Lat | TPC-B TPS | TPC-B Lat |
|---|---|---|---|---|---|---|
| 1 | 741 | 1.35 ms | 97 | 10.32 ms | 76 | 13.08 ms |
| 10 | 8,399 | 1.19 ms | 681 | 14.69 ms | 598 | 16.72 ms |
| 25 | 8,279 | 3.02 ms | 957 | 26.11 ms | 802 | 31.17 ms |
| 50 | 7,406 | 6.75 ms | 1,081 | 46.27 ms | 873 | 57.29 ms |
| 100 | 6,697 | 14.93 ms | 1,206 | 82.89 ms | 829 | 120.67 ms |
The Combined Average Summary
Averaged across all 13 client/thread combinations per workload type:
| Config | Avg RO TPS | Avg RW TPS | Avg RW Lat | Overall Avg TPS |
|---|---|---|---|---|
| AWS RDS Standard | 10,724 | 2,250 | 17.30 ms | 4,826 |
| AWS Aurora IO-Optimized | 8,370 | 1,234 | 29.72 ms | 3,480 |
| AWS Aurora Standard | 8,039 | 1,162 | 31.45 ms | 3,326 |
| CNPG Local SSD | 6,111 | 1,355 | 30.02 ms | 2,796 |
| CNPG Longhorn 1R | 6,376 | 1,152 | 32.46 ms | 2,797 |
| CNPG Longhorn 2R | 6,356 | 935 | 39.35 ms | 2,671 |
| CNPG Longhorn 2R+shared | 7,052 | 892 | 40.95 ms | 2,885 |
What The Data Tells Us
Finding 1: Overall average is misleading for storage comparison.
Overall average TPS (RO+RW+TPC-B combined) shows all four bare metal configs within ~2,671–2,885 — nearly identical. Read tests dominate the volume, and read performance is largely unaffected by the storage backend, so the aggregate hides the write-path differences. Always disaggregate by workload type.
Finding 2: Read performance — storage backend is irrelevant.
All four bare metal configs land between 6,111 and 7,052 Avg RO TPS — variation within normal test noise. Every AWS configuration reads faster: RDS Standard leads all configs at 10,724, and both Aurora variants (8,039–8,370) benefit from Aurora's distributed, read-optimized storage layer.
Finding 3: Write performance — Local SSD wins clearly.
Local SSD delivers 1,355 Avg RW TPS versus 935 for Longhorn 2R (45% higher) and 892 for Longhorn 2R on a shared disk (52% higher). Each Longhorn replica adds roughly 3–4 ms of write latency — the network round-trip for replication acknowledgment.
Finding 4: Dedicated vs shared disk makes almost no difference.
Longhorn 2R dedicated (935 Avg RW TPS) vs Longhorn 2R shared (892) is only a 4.6% drop. The bottleneck is network replication, not disk I/O contention.
Finding 5: Bare metal Local SSD write TPS beats Aurora IO-Optimized.
Local SSD Avg RW TPS (1,355) vs Aurora IO-Optimized (1,234) — a 9.8% advantage at baseline, before any CPU or kernel tuning. Aurora's write path pays network replication overhead, just as Longhorn does.
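The percentage deltas in Findings 3–5 follow directly from the Avg RW TPS column of the summary table; a quick arithmetic check:

```python
# Avg RW TPS values taken from the combined summary table above.
avg_rw_tps = {
    "local_ssd": 1355,
    "longhorn_1r": 1152,
    "longhorn_2r": 935,
    "longhorn_2r_shared": 892,
    "aurora_io_optimized": 1234,
}

def pct_gain(a, b):
    """Percent advantage of a over b, relative to b."""
    return round((a - b) / b * 100, 1)

def pct_drop(a, b):
    """Percent drop from a to b, relative to a."""
    return round((a - b) / a * 100, 1)

print(pct_gain(1355, 935))   # Local SSD vs Longhorn 2R dedicated -> 44.9
print(pct_gain(1355, 892))   # Local SSD vs Longhorn 2R shared    -> 51.9
print(pct_gain(1355, 1234))  # Local SSD vs Aurora IO-Optimized   -> 9.8
print(pct_drop(935, 892))    # Dedicated -> shared Longhorn 2R    -> 4.6
```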
Finding 6: RDS Standard leads overall — but it's burstable.
RDS Standard's 4,826 overall average and 2,250 Avg RW TPS lean on t3 CPU burst credits. Once credits are exhausted under sustained load, performance drops significantly.
Recommendations
| Workload | Recommendation |
|---|---|
| Write-intensive OLTP | Local SSD — 45% higher write TPS than Longhorn 2R |
| Read-heavy (API, reporting) | Longhorn is fine — zero read overhead vs local SSD |
| HA with block-level durability | Longhorn 2R — accept write penalty, gain replication |
| Best write + HA | Local SSD + CNPG streaming replication — no storage network in write path |
| Managed simplicity | Aurora Standard — competitive write TPS, no ops overhead |
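The "best write + HA" row deserves a sketch: with more than one instance, CNPG replicates via PostgreSQL WAL streaming between pods, so durability comes from the database layer and the write path never touches a storage network. Names, sizes, and the storage class are illustrative:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-ha-local         # illustrative name
spec:
  instances: 3              # 1 primary + 2 streaming replicas; failover handled by the operator
  storage:
    storageClass: local-ssd # hypothetical node-local StorageClass (one dedicated disk per node)
    size: 50Gi
```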
Environment Details
- CloudNativePG: v1.24 on Kubernetes 1.31
- Host: Bare Metal 32-Core (16 Physical / 16 HT), NUMA-Aware
- Storage: Samsung SM863a Enterprise SSD RAID 1 (SAS Interface)
- PostgreSQL config: Intentionally matched to AWS t3.large defaults for fair comparison
- Deployment: Single instance — no HA, no read replicas, no connection pooling
- AWS Region: ap-southeast-3 (Indonesia)
- Scale Factor: 100 (~10M rows, ~1.5 GB table)
- Benchmark runner: Kubernetes-native pgbench Job — source on GitHub
— Iwan Setiawan, Hybrid Cloud & Platform Architect · portfolio.kangservice.cloud