You must store IoT sensor readings that arrive at a rate of 10,000 writes per second.
Each reading includes:
- `deviceId` (string, partition key)
- `timestamp` (ISO‑8601, sort key)
- `temperature`, `humidity`, `pressure` (numeric)
- `metadata` (JSON blob, optional)
Requirements:
- Fast point‑lookup for the latest reading of a given `deviceId`.
- Efficient range queries to retrieve all readings for a device within a time window (e.g., last 24 h).
- Retention policy: keep data for 30 days, then automatically expire.
- Cost‑optimized for the high write throughput while keeping read latency < 50 ms.
1. Table Schema & Primary Key
| Attribute | Type | Role |
|---|---|---|
| `deviceId` | String | Partition key |
| `timestamp` | String (ISO‑8601, e.g., `2025-12-04T12:34:56Z`) | Sort key |
| `temperature`, `humidity`, `pressure` | Number | Payload |
| `metadata` | String (JSON) | Optional payload |
| `ttl` | Number (epoch seconds) | TTL attribute for expiration |
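A minimal sketch of creating this table with boto3 (the table name `SensorReadings` is illustrative, not part of the requirements):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Sketch: composite primary key (deviceId, timestamp). The payload attributes
# (temperature, humidity, pressure, metadata, ttl) are schemaless and need no
# up-front declaration.
dynamodb.create_table(
    TableName="SensorReadings",
    AttributeDefinitions=[
        {"AttributeName": "deviceId", "AttributeType": "S"},
        {"AttributeName": "timestamp", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "deviceId", "KeyType": "HASH"},    # partition key
        {"AttributeName": "timestamp", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand; see section 3
)

# TTL (section 5) can only be enabled once the table is ACTIVE.
dynamodb.get_waiter("table_exists").wait(TableName="SensorReadings")
dynamodb.update_time_to_live(
    TableName="SensorReadings",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)
```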
Why this PK?
- Guarantees all readings for a device are stored together, enabling efficient range queries (`deviceId = X AND timestamp BETWEEN …`).
- Allows a single‑item query for the latest reading by using `ScanIndexForward=false` and `Limit=1` (see the sketch below).
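Both access patterns map onto a single `Query` call each; a minimal sketch with boto3, reusing the illustrative `SensorReadings` table name:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")

# Latest reading for one device: sort descending, take the first item.
latest = table.query(
    KeyConditionExpression=Key("deviceId").eq("sensor-42"),
    ScanIndexForward=False,  # newest first
    Limit=1,
)["Items"]

# All readings in a time window: ISO-8601 strings sort chronologically,
# so BETWEEN on the sort key is a contiguous range read.
window = table.query(
    KeyConditionExpression=Key("deviceId").eq("sensor-42")
    & Key("timestamp").between("2025-12-03T12:00:00Z", "2025-12-04T12:00:00Z"),
)["Items"]
```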
2. Indexing Strategy
| Index | Partition Key | Sort Key | Use‑case |
|---|---|---|---|
| Primary table | `deviceId` | `timestamp` | Point lookup & range queries per device |
| Global Secondary Index (GSI) – `DeviceLatestGSI` | `deviceId` | `timestamp` (read in descending order at query time; DynamoDB indexes have no fixed sort direction) | Direct query for the latest reading without scanning the whole partition (use `Limit=1`, `ScanIndexForward=false`). |
| Optional GSI – `MetricGSI` | `metricType` (e.g., the constant `"temperature"`) | `timestamp` | Cross‑device time‑range queries for a single metric (rare). |
Note: The primary table already supports the latest‑reading query; the GSI is optional and only adds cost if you anticipate many concurrent “latest” reads that could cause hot‑partition reads on the same deviceId. In most cases the primary table with Limit=1 suffices.
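If you do decide the GSI is worth the extra write cost, it can be added to a live table. A minimal sketch with boto3; the table and index names follow the tables above, and the narrow projection anticipates section 6:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Sketch: add the optional DeviceLatestGSI. Sort order is not part of the
# index definition; descending reads are requested at query time via
# ScanIndexForward=False.
dynamodb.update_table(
    TableName="SensorReadings",  # illustrative name
    AttributeDefinitions=[
        {"AttributeName": "deviceId", "AttributeType": "S"},
        {"AttributeName": "timestamp", "AttributeType": "S"},
    ],
    GlobalSecondaryIndexUpdates=[{
        "Create": {
            "IndexName": "DeviceLatestGSI",
            "KeySchema": [
                {"AttributeName": "deviceId", "KeyType": "HASH"},
                {"AttributeName": "timestamp", "KeyType": "RANGE"},
            ],
            # Narrow projection keeps index storage and read cost down.
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["temperature", "humidity", "pressure"],
            },
        }
    }],
)
```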
3. Capacity Mode & Scaling
| Mode | When to use | Configuration |
|---|---|---|
| On‑Demand | Unpredictable spikes, easy start‑up, no need to manage capacity. | Handles 10 k writes/sec automatically; pay per request. |
| Provisioned + Auto Scaling | Predictable traffic, want to control cost. | Provision ~10,000 WCUs to sustain 10 k writes/s (each write of ≤ 1 KB consumes 1 WCU), plus RCUs sized to the read workload. Enable auto‑scaling with a 70 % utilization target. |
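For provisioned mode, target tracking is configured through Application Auto Scaling rather than on the table itself. A minimal sketch, assuming the illustrative `SensorReadings` table name:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the table's write capacity as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/SensorReadings",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    MinCapacity=10_000,   # baseline for 10 k writes/s
    MaxCapacity=20_000,   # headroom for bursts
)

# Track 70 % write-capacity utilization, as recommended above.
autoscaling.put_scaling_policy(
    PolicyName="SensorReadingsWriteScaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/SensorReadings",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
        },
    },
)
```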
Cost comparison (approx., US East 1, Dec 2025):
- On‑Demand writes: $1.25 per million write request units → ~$32 k/month at a sustained 10 k writes/s (≈ 864 M writes/day, ≈ 26 B writes/month).
- Provisioned 10,000 WCUs ≈ $0.00065 per WCU‑hour → ~$4.7 k/month plus an auto‑scaling buffer. On‑Demand is simpler; provisioned is markedly cheaper when traffic is stable. The arithmetic is sketched below.
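The back‑of‑the‑envelope arithmetic behind those figures, using the approximate rates quoted above:

```python
# Sustained 10k writes/s, each <= 1 KB (1 write request unit / 1 WCU per write).
writes_per_month = 10_000 * 86_400 * 30           # ~25.9 billion writes

on_demand = writes_per_month / 1e6 * 1.25         # $1.25 per million WRUs
provisioned = 10_000 * 0.00065 * 730              # WCUs * $/WCU-hour * hours/month

print(f"on-demand:   ${on_demand:,.0f}/month")    # ~ $32,400
print(f"provisioned: ${provisioned:,.0f}/month")  # ~ $4,745
```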
4. Mitigating Hot‑Partition Risk
- Uniform `deviceId` distribution: Ensure device IDs are random (e.g., UUID or hashed).
- If a few devices dominate traffic: Use sharding – append a random shard suffix to `deviceId` (e.g., `deviceId#shard01`). Store the shard count in a small config table; the application queries all shards and merges results. This spreads write capacity across partitions (a sketch follows below).
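A minimal sketch of both halves – the write‑side shard suffix and the read‑side fan‑out/merge. The shard count, table name, and helper names are illustrative:

```python
import random
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("SensorReadings")  # illustrative name
SHARD_COUNT = 4  # in practice, read this from the config table

def sharded_key(device_id: str) -> str:
    """Spread one hot device's writes across SHARD_COUNT partitions."""
    return f"{device_id}#shard{random.randrange(SHARD_COUNT):02d}"

def query_device(device_id: str, start: str, end: str) -> list:
    """Fan out over all shards and merge results back into one time series."""
    items = []
    for shard in range(SHARD_COUNT):
        resp = table.query(
            KeyConditionExpression=Key("deviceId").eq(f"{device_id}#shard{shard:02d}")
            & Key("timestamp").between(start, end)
        )
        items.extend(resp["Items"])
    return sorted(items, key=lambda item: item["timestamp"])
```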
5. Data Retention (TTL)
- Add a numeric attribute `ttl = timestampEpoch + 30 days`.
- Enable DynamoDB TTL on this attribute; DynamoDB automatically deletes expired items (typically within 48 h of expiration).
- No additional Lambda needed, keeping cost low. A write‑time sketch follows below.
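Stamping the TTL at write time is a one‑liner in the ingest path; a minimal sketch (table name and sample values are illustrative):

```python
import time
import boto3

table = boto3.resource("dynamodb").Table("SensorReadings")
THIRTY_DAYS = 30 * 24 * 3600

# DynamoDB's TTL process deletes the item (eventually, typically within
# ~48 h) once `ttl` is in the past.
table.put_item(Item={
    "deviceId": "sensor-42",
    "timestamp": "2025-12-04T12:34:56Z",
    "temperature": 21,  # boto3 resources need Decimal for non-integer numbers
    "ttl": int(time.time()) + THIRTY_DAYS,
})
```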
6. Read Performance Optimizations
- Projection: Keep only needed attributes in the GSI (e.g., `temperature`, `humidity`, `pressure`, `timestamp`). This reduces read size and cost.
- Consistent vs. eventual reads: Use eventual consistency for most queries (cheaper, 0.5 RCU per 4 KB). For the "latest reading" where freshness is critical, use a strongly consistent read (1 RCU per 4 KB) – note that GSIs only support eventually consistent reads, so strong reads must hit the base table.
- `BatchGetItem` for fetching multiple readings across devices in a single call. It requires the full primary key, so it suits items whose timestamps are already known; "latest per device" still needs per‑device `Query` calls (see the sketch below).
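A sketch of the batch read; it assumes the exact timestamps are already known (e.g., cached from an earlier query):

```python
import boto3

dynamodb = boto3.resource("dynamodb")

# Fetch several specific readings in one round trip (keys must be complete).
resp = dynamodb.batch_get_item(
    RequestItems={
        "SensorReadings": {  # illustrative table name
            "Keys": [
                {"deviceId": "sensor-42", "timestamp": "2025-12-04T12:34:56Z"},
                {"deviceId": "sensor-43", "timestamp": "2025-12-04T12:35:10Z"},
            ],
            "ProjectionExpression": "deviceId, #ts, temperature",
            "ExpressionAttributeNames": {"#ts": "timestamp"},  # reserved word
        }
    }
)
items = resp["Responses"]["SensorReadings"]
```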
7. Auxiliary Services (optional)
| Service | Purpose |
|---|---|
| AWS Kinesis Data Streams | Buffer inbound sensor data, smooth bursty writes, and feed DynamoDB via a Lambda consumer. |
| AWS Lambda (TTL cleanup) | If you need deterministic deletion exactly at 30 days, a scheduled Lambda can query items with ttl nearing expiration and delete them, but DynamoDB TTL is usually sufficient. |
| Amazon CloudWatch Alarms | Monitor ConsumedWriteCapacityUnits, ThrottledRequests, and SystemErrors to trigger scaling or alerts. |
| AWS Glue / Athena | For ad‑hoc analytics on historical data exported to S3 (via DynamoDB Streams → Lambda → S3). |
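As an illustration of the first row, a minimal Kinesis‑triggered Lambda handler that drains each batch into the table. The event shape is the standard Kinesis record format; the table name is illustrative, and records are assumed to be JSON objects already carrying `deviceId` and `timestamp`:

```python
import base64
import json
import time
from decimal import Decimal

import boto3

table = boto3.resource("dynamodb").Table("SensorReadings")
THIRTY_DAYS = 30 * 24 * 3600

def handler(event, context):
    """Drain a Kinesis batch into DynamoDB; batch_writer chunks and retries."""
    with table.batch_writer() as batch:
        for record in event["Records"]:
            payload = base64.b64decode(record["kinesis"]["data"])
            # parse_float=Decimal: DynamoDB rejects Python floats.
            reading = json.loads(payload, parse_float=Decimal)
            reading["ttl"] = int(time.time()) + THIRTY_DAYS  # section 5
            batch.put_item(Item=reading)
```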
8. Trade‑offs Summary
| Trade‑off | Impact |
|---|---|
| On‑Demand vs. Provisioned | On‑Demand simplifies ops but can be ~30 % more expensive at steady 10 k writes/s. Provisioned requires capacity planning but can be cheaper with auto‑scaling. |
| Sharding vs. Simplicity | Sharding eliminates hot‑partition risk for skewed device traffic but adds complexity in query logic (multiple shards per device). |
| TTL vs. Lambda cleanup | TTL is low‑cost, eventual deletion (up to 48 h delay). Lambda gives precise control but adds compute cost. |
| GSI for latest reading | Can isolate "latest" read traffic and carry a narrow projection, but incurs extra write cost (each write is replicated to the GSI). Often unnecessary when Limit=1 on the primary table suffices. |
| Strong vs. eventual consistency | Strong reads double read cost; use only where immediate freshness is required. |
With this design you achieve:
- Fast point‑lookup (`Query` with `deviceId` + `Limit=1`, `ScanIndexForward=false`).
- Efficient time‑range queries (`Query` with `deviceId` and `timestamp BETWEEN …`).
- Automatic 30‑day expiration via DynamoDB TTL.
- Cost‑effective high‑throughput writes using on‑demand or provisioned capacity with auto‑scaling, plus optional sharding to avoid hot partitions.