DEV Community

Joud Awad

33/60 Days System Design Questions

Your order service takes 200 writes/sec at peak.
You audit 6 months of data. Something's off — two orders show the same ID, different totals.

You have the current state. You don't have how it got there.

Your DB is a graveyard of overwritten rows.

Here's the system:

• OrderService → Postgres (current state only)
• Events: placed, updated, cancelled, refunded
• Every UPDATE overwrites the previous row
• No audit log. No event history. No replay.

A billing dispute just landed. You need to reconstruct exactly what happened to Order #8471. You can't.

That's the problem Event Sourcing solves.

Instead of storing the current state, you store the sequence of events that produced it.

What's your approach when redesigning this service?

A) Event Sourcing — append-only event log as the source of truth, current state derived from replaying events.
B) Change Data Capture (CDC) — keep Postgres as-is, but stream all row changes to Kafka for an audit trail.
C) Add an audit_log table — trigger-based shadow writes on every INSERT/UPDATE/DELETE.
D) Dual-write — write to both the current-state table and a separate events table on every operation.

One of these gives you full replay, projection flexibility, and a real source of truth. The others are patches.

Pick one — A, B, C, or D — and tell me why. I'll drop the full breakdown in the comments.

If your team is arguing about audit trails or event-driven redesigns, tag someone who needs to see this.

Drop your answer 👇

#30DaysOfSystemDesign #SystemDesign #EventSourcing #SoftwareArchitecture

Top comments (4)

Joud Awad

Answer: A — Event Sourcing ✅

Here's why, and why the other three look reasonable but miss the point:

Why A wins (Event Sourcing):
Event Sourcing flips the model entirely. Instead of storing the latest state and losing how you got there, you store every event that ever happened. The current state is a projection — computed by replaying events from the beginning (or from a snapshot).

For Order #8471: you don't query a row. You replay OrderPlaced, OrderUpdated, PaymentCaptured, RefundInitiated. Every mutation is preserved, timestamped, immutable. You can reconstruct state at any point in time. You can build new projections (e.g., "total revenue by SKU last 90 days") without changing the core model — just replay and project differently.

This is the model behind Axon Framework and EventStoreDB, and behind the event-sourced layers in many fintech and e-commerce systems. The event log isn't a side-channel audit log; it's the primary data model.
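
To make the replay idea concrete, here is a minimal in-memory sketch. The `Event` class, the `apply` and `replay` helpers, the payload fields, and the timestamps are all invented for illustration; a real event store would persist the log and usually add snapshots.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Event:
    order_id: int
    type: str        # e.g. "OrderPlaced"
    data: dict       # event payload (hypothetical fields)
    at: datetime     # when the event was recorded


def apply(state: dict, event: Event) -> dict:
    """Pure function: previous state + one event -> next state."""
    if event.type == "OrderPlaced":
        return {"status": "placed", "total": event.data["total"], "refunded": Decimal("0")}
    if event.type == "OrderUpdated":
        return {**state, "total": event.data["total"]}
    if event.type == "PaymentCaptured":
        return {**state, "status": "paid"}
    if event.type == "RefundInitiated":
        return {**state, "status": "refund_pending", "refunded": event.data["amount"]}
    return state  # unknown event types are ignored in this sketch


def replay(log: list[Event], order_id: int, as_of: Optional[datetime] = None) -> dict:
    """Derive an order's state by folding its events, optionally up to a point in time."""
    state: dict = {}
    for event in log:  # assumes the log is append-only and ordered by time
        if event.order_id != order_id:
            continue
        if as_of is not None and event.at > as_of:
            break
        state = apply(state, event)
    return state


# Invented sample history for Order #8471.
LOG = [
    Event(8471, "OrderPlaced", {"total": Decimal("89.99")}, datetime(2024, 1, 5, 10, 0)),
    Event(8471, "OrderUpdated", {"total": Decimal("79.99"), "reason": "coupon SAVE10"},
          datetime(2024, 1, 5, 10, 5)),
    Event(8471, "PaymentCaptured", {"amount": Decimal("79.99")}, datetime(2024, 1, 5, 10, 6)),
    Event(8471, "RefundInitiated", {"amount": Decimal("79.99")}, datetime(2024, 1, 9, 14, 30)),
]

current = replay(LOG, 8471)
at_checkout = replay(LOG, 8471, as_of=datetime(2024, 1, 5, 10, 1))
```

Because `apply` is a pure function, a new projection is just a different fold over the same log; the stored events never change.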

Why B is the trap (CDC):
CDC is powerful and often underrated. Tools like Debezium + Kafka let you stream every Postgres row change downstream without touching your app. It looks like Event Sourcing from the outside — you have a sequence of changes.

But here's the trap: CDC captures state deltas, not business events. You get "total changed from 89.99 to 79.99", not "DiscountApplied by coupon SAVE10". The semantic meaning is lost. You can reconstruct what changed, not why it changed. That distinction kills you during disputes and compliance audits.

CDC is a great complement to Event Sourcing for propagation. It's not a substitute for it.
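
A small sketch of that loss of meaning, under loud assumptions: `row_delta` is a toy stand-in for what a row-level change feed can observe (it is not Debezium's actual message format), and the `ItemRemoved` event and `SKU-123` are hypothetical.

```python
from decimal import Decimal


def row_delta(before: dict, after: dict) -> dict:
    """What a row-level change feed can see: columns whose values differ."""
    return {k: (before.get(k), after[k]) for k in after if before.get(k) != after[k]}


before = {"id": 8471, "total": Decimal("89.99")}

# Two different business events (hypothetical names and payloads)...
coupon = {"type": "DiscountApplied", "coupon": "SAVE10", "new_total": Decimal("79.99")}
removal = {"type": "ItemRemoved", "sku": "SKU-123", "new_total": Decimal("79.99")}

# ...that both end in the same UPDATE on the orders row.
after_coupon = {**before, "total": coupon["new_total"]}
after_removal = {**before, "total": removal["new_total"]}

delta_coupon = row_delta(before, after_coupon)
delta_removal = row_delta(before, after_removal)
```

Both deltas come out identical, so anything consuming only the row changes cannot tell a coupon from a removed item.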

Why C is wrong (audit_log table):
Trigger-based audit logs are the classic band-aid. The problem: you're treating the row as source of truth and logging changes as a side effect.

Side effects break. Triggers get disabled during migrations. Bulk loads often run with triggers switched off. Schema changes orphan the trigger. Six months later, the audit_log has gaps, exactly when you need it most. It's a shadow, not a foundation.
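
The gap is easy to reproduce. The sketch below uses Python's built-in `sqlite3` as a stand-in for Postgres (an assumption made only to keep the example self-contained; trigger syntax differs in Postgres), with hypothetical table and trigger names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total TEXT NOT NULL);
    CREATE TABLE audit_log (order_id INTEGER, old_total TEXT, new_total TEXT);

    -- Shadow write: every UPDATE on orders is copied into audit_log.
    CREATE TRIGGER orders_audit AFTER UPDATE ON orders
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.total, NEW.total);
    END;

    INSERT INTO orders VALUES (8471, '89.99');
""")

# Normal operation: the trigger records the change.
conn.execute("UPDATE orders SET total = '79.99' WHERE id = 8471")

# A migration or bulk load removes (or disables) the trigger...
conn.execute("DROP TRIGGER orders_audit")

# ...and this change leaves no trace in audit_log.
conn.execute("UPDATE orders SET total = '59.99' WHERE id = 8471")

audit_rows = conn.execute("SELECT old_total, new_total FROM audit_log").fetchall()
current_total = conn.execute("SELECT total FROM orders WHERE id = 8471").fetchone()[0]
```

The row now says 59.99, but the audit log only knows about the move to 79.99. Nothing in the data itself signals that a step is missing.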

Why D is wrong (Dual-write):
Dual-write seems reasonable. But unless both writes commit in a single database transaction, you've created a distributed consistency problem inside your own service. If the state write succeeds and the events write fails (network hiccup, crash, OOM), you've got state with no corresponding event. The audit trail is unreliable by design.

This is the exact problem the Outbox Pattern was invented to solve. But if you're going that far, you're halfway to proper Event Sourcing anyway.
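
The core idea the Outbox Pattern leans on is that the state row and the event row commit in one local transaction. A minimal sketch, again using `sqlite3` as a stand-in for Postgres; the `update_total` helper, the `order_events` table, and the `fail_after_state` flag are hypothetical, and relaying events to a broker is left out.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total TEXT NOT NULL);
    CREATE TABLE order_events (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id INTEGER NOT NULL,
        type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
    INSERT INTO orders VALUES (8471, '89.99');
""")


def update_total(conn, order_id, new_total, event_type, fail_after_state=False):
    """Write state and event in ONE transaction: both commit or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE orders SET total = ? WHERE id = ?", (new_total, order_id))
        if fail_after_state:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "INSERT INTO order_events (order_id, type, payload) VALUES (?, ?, ?)",
            (order_id, event_type, json.dumps({"total": new_total})),
        )


update_total(conn, 8471, "79.99", "DiscountApplied")

try:
    update_total(conn, 8471, "0.00", "OrderRefunded", fail_after_state=True)
except RuntimeError:
    pass  # the state change was rolled back along with the missing event

total = conn.execute("SELECT total FROM orders WHERE id = 8471").fetchone()[0]
event_count = conn.execute("SELECT COUNT(*) FROM order_events").fetchone()[0]
```

After the simulated crash the order still shows 79.99 with exactly one event, so state and history never disagree. This only works while both tables live in the same database; once the events go to a separate store, the failure described above returns.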