Your bank doesn't show you just your current balance. It shows you every deposit, every withdrawal, every transfer — in order, forever. Not because storage is cheap, but because the history IS the data.
Now imagine your bank deleted all transactions and just stored "Current Balance: $500." When a dispute occurs, how do you prove what happened? You can't. The journey is gone.
That's what traditional databases do to your application state. Every UPDATE overwrites the past. Every DELETE erases it permanently.
Event Sourcing says: stop throwing away the journey.
The Way Memory Actually Works
You can't change the past. But you can replay it.
That's event sourcing. Your application's history works exactly like memory and documentation combined:
- Memory = immutable. What happened, happened. You append new experiences; you don't rewrite old ones.
- Documentation = replayable. Start at the beginning, read every entry in order, and you can reconstruct what was true at any point in time.
What Event Sourcing IS
Store every state change as an immutable event. Never update. Never delete. Rebuild current state by replaying all events.
Traditional approach:
Patient record: medication = "Y", dosage = 100mg
(Everything before this moment: gone)
Event Sourcing approach:
Event 1 - PrescriptionAdded
{ medication: "X", dosage: 50mg, doctor: "Smith" }
Event 2 - PrescriptionChanged
{ medication: "Y", dosage: 100mg, doctor: "Jones", reason: "allergy to X" }
Current state = replay events = medication "Y", 100mg
The events are the source of truth. The current state is derived from them.
What Event Sourcing is NOT
❌ Audit logging — Audit logs are secondary. Event sourcing IS the primary storage.
❌ Event-driven architecture — Related, but different. You can have one without the other.
❌ Time travel / undo — You can rebuild past states. You cannot retroactively change events.
❌ Simple to implement — This is significant complexity. Choose deliberately.
When NOT to Use Event Sourcing
❌ Simple CRUD applications — If your app is forms-to-database with no history requirements, this is genuine overkill.
❌ High write volume, low read value — IoT sensor data writing 10,000 events/second where you only care about current readings.
❌ Strong immediate consistency required — Projections are eventually consistent. If every read must reflect the latest write, a simpler transactional model (a plain append-only ledger with double-entry bookkeeping, for example) is a better fit.
❌ No team familiarity — The learning curve is real. Budget time for it.
The test: If you can't name at least two concrete scenarios where replaying your event history would save you — in debugging, compliance, or feature development — you probably don't need it yet.
The Real-World Pain: Healthcare Records
Current Architecture (Traditional)
A patient record system stores only the latest state:
┌──────────────────────────┐
│ Patient Table │
│ │
│ id: 12345 │
│ medication: "Y" │ ← Previous value: gone
│ dosage: 100mg │ ← Previous value: gone
│ updated_at: 10:00 AM │ ← When it changed, not why
└──────────────────────────┘
The damage:
- Regulatory violation — HIPAA requires a complete audit trail
- Legal liability — no proof of original prescription
- Patient safety risk — no way to track the chain of medication decisions
- No debugging capability — the system shows "Y" but you don't know how it got there
The Refactoring Journey
Attempt 1: Add updated_at Timestamps (15%)
ALTER TABLE patients ADD COLUMN updated_at TIMESTAMP;
What it does: You know when the record last changed.
What it still can't tell you:
- What the value was before the change
- Who made the change, or why
- Any change before the last one
One UPDATE wipes all timestamp context. You have a clock but no history.
Attempt 2: Audit Log Table (25%)
CREATE TABLE audit_log (
  change_id  SERIAL PRIMARY KEY,
  table_name TEXT,
  field_name TEXT,
  old_value  TEXT,
  new_value  TEXT,
  changed_by UUID,
  changed_at TIMESTAMP
);
What it does: Separate table captures every field-level change.
What still breaks:
- The audit log is secondary — the patients table stays the source of truth, and the two can silently drift apart
- Easy to forget logging in some code paths, leaving gaps nobody notices
- You can't reliably rebuild current state from audit entries alone
The audit log is a receipt printer, not a ledger.
Attempt 3: Versioned Records (50%)
CREATE TABLE patient_history (
  version    INT,
  patient_id UUID,
  medication TEXT,
  dosage     TEXT,
  active     BOOLEAN,
  created_at TIMESTAMP
);
-- Never UPDATE. Only INSERT new versions.
What it does: Full history preserved. Never overwrites.
What still breaks:
- Stores full snapshots — change one field and you copy the entire record
- No semantic meaning: was this a prescription change or a data correction?
- Concurrent writers can race to claim the same next version number in distributed setups
You have history, but it's dumb history.
Attempt 4: Event Log + Projections (75%)
const events = [
  {
    type: "PrescriptionAdded",
    data: { medication: "X", dosage: "50mg", doctor: "Smith" }
  },
  {
    type: "PrescriptionChanged",
    data: { medication: "Y", dosage: "100mg", doctor: "Jones", reason: "allergy to X" }
  }
];

function rebuildPatientState(events) {
  return events.reduce((state, event) => {
    switch (event.type) {
      case "PrescriptionAdded":
      case "PrescriptionChanged":
        return { ...state, medication: event.data.medication, dosage: event.data.dosage };
      default:
        return state; // unknown event types are ignored (forward compatibility)
    }
  }, {});
}
What it does: Events carry meaning. History is semantic, not just snapshots.
What breaks at scale:
- 1 million events per patient = slow rebuild on every read
- No snapshots — must replay from event 1 every time
- Can't run SQL queries against events directly
You have the right idea but it collapses under production load.
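Before moving on, here is a self-contained run of that reducer, with one extra event type thrown in. The default branch is what makes this safe to evolve: events the reducer doesn't know about simply leave the state untouched.

```javascript
// Rebuild current state by folding events, ignoring event types
// this particular read model doesn't care about.
function rebuildPatientState(events) {
  return events.reduce((state, event) => {
    switch (event.type) {
      case "PrescriptionAdded":
      case "PrescriptionChanged":
        return { ...state, medication: event.data.medication, dosage: event.data.dosage };
      default:
        return state; // unknown event types are ignored
    }
  }, {});
}

const state = rebuildPatientState([
  { type: "PrescriptionAdded", data: { medication: "X", dosage: "50mg" } },
  { type: "AllergyRecorded", data: { allergen: "X" } }, // no effect on this model
  { type: "PrescriptionChanged", data: { medication: "Y", dosage: "100mg" } }
]);
// state → { medication: "Y", dosage: "100mg" }
```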
Attempt 5: Production-Ready Event Sourcing (100%)
The complete solution: immutable event store + snapshots + projections + CQRS separation.
┌──────────────────┐
│ Command Handler │ (Write: AddPrescription)
└────────┬─────────┘
│
▼
┌─────────────────────────────┐
│ Event Store │
│ (Immutable, append-only) │
│ │
│ PrescriptionAdded │
│ PrescriptionChanged │
│ DosageAdjusted │
│ AllergyRecorded │
└──────────────┬──────────────┘
│
├──► Projection 1: Current Patient State (read model)
├──► Projection 2: Medication History View (read model)
└──► Projection 3: Compliance Audit Report (read model)
Event Store Schema:
CREATE TABLE events (
  event_id       UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  aggregate_id   TEXT NOT NULL,    -- "patient-12345"
  aggregate_type TEXT NOT NULL,    -- "Patient"
  event_type     TEXT NOT NULL,    -- "PrescriptionChanged"
  event_data     JSONB NOT NULL,   -- { medication, dosage, doctor, reason }
  metadata       JSONB NOT NULL,   -- { user_id, ip, timestamp, causation_id }
  version        INT NOT NULL,     -- optimistic concurrency control
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Append-only: no UPDATE, no DELETE ever touches this table
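The version column is what makes optimistic concurrency work: each writer states the version it last saw, and the append fails if someone else got there first. A minimal in-memory sketch (the store shape and function names are illustrative, not a specific event-store API):

```javascript
// In-memory event store keyed by aggregate id. Each append declares the
// version the writer last read, so two concurrent writers can't both win.
const store = new Map();

function appendEvent(aggregateId, expectedVersion, event) {
  const stream = store.get(aggregateId) ?? [];
  if (stream.length !== expectedVersion) {
    // Someone else appended since this writer read the stream
    throw new Error(
      `Concurrency conflict: expected version ${expectedVersion}, found ${stream.length}`
    );
  }
  stream.push({ ...event, version: stream.length + 1 });
  store.set(aggregateId, stream);
}

appendEvent("patient-12345", 0, { type: "PrescriptionAdded", data: { medication: "X" } });
appendEvent("patient-12345", 1, { type: "PrescriptionChanged", data: { medication: "Y" } });
// A stale writer that still thinks the stream has 1 event now fails:
// appendEvent("patient-12345", 1, ...) throws a concurrency conflict
```

In a real database the same check is enforced with a unique constraint on (aggregate_id, version).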
Snapshot Table:
CREATE TABLE snapshots (
  aggregate_id TEXT PRIMARY KEY,
  version      INT NOT NULL,     -- event version at snapshot time
  state        JSONB NOT NULL,   -- full reconstructed state
  created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);
Snapshots: The Cache That Fixes Replay Performance
Replaying 600 events is fast. Replaying 600,000 is not.
Think of snapshots as a manually managed materialized view — a checkpoint that says "here's the known-good state at version N, don't bother replaying anything before this."
Rule of thumb: create a snapshot every N events (common values: 50, 100, 500 — tune to your read latency requirements).
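The read path then becomes: load the snapshot, replay only the events recorded after it. A sketch under the patient example above (function names are illustrative):

```javascript
// Simplified event applier for the patient examples in this article
function apply(state, event) {
  switch (event.type) {
    case "PrescriptionAdded":
    case "PrescriptionChanged":
      return { ...state, medication: event.data.medication, dosage: event.data.dosage };
    default:
      return state;
  }
}

function loadAggregate(snapshot, events) {
  // Skip everything the snapshot already covers; this is the whole win
  const tail = events.filter((e) => e.version > snapshot.version);
  return tail.reduce(apply, snapshot.state);
}

const snapshot = { version: 2, state: { medication: "Y", dosage: "100mg" } };
const events = [
  { version: 1, type: "PrescriptionAdded", data: { medication: "X", dosage: "50mg" } },
  { version: 2, type: "PrescriptionChanged", data: { medication: "Y", dosage: "100mg" } },
  { version: 3, type: "PrescriptionChanged", data: { medication: "Z", dosage: "25mg" } }
];
const current = loadAggregate(snapshot, events); // only event 3 is replayed
```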
Event Versioning: When Your Schema Evolves
// Version 1 (old)
{ type: "PrescriptionAdded", version: 1, data: { medication, dosage } }
// Version 2 (new)
{ type: "PrescriptionAdded", version: 2, data: { medication, dosage, prescriber_license } }
// Upcaster — converts old events to new schema at read time
function upcastPrescriptionAdded(event) {
  if (event.version === 1) {
    return {
      ...event,
      version: 2,
      data: {
        ...event.data,
        // lookupLicense is a domain-specific helper, shown for illustration
        prescriber_license: lookupLicense(event.metadata.user_id)
      }
    };
  }
  return event;
}
All application code works with version 2. Old events are transparently upgraded at read time. No migrations. No data loss.
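In practice upcasters are registered per event type and applied as the stream is read. A self-contained sketch (the registry shape and the "UNKNOWN" fallback are illustrative assumptions, not a library API):

```javascript
// Registry of upcasters keyed by event type; each maps an old event
// to the current schema version.
const upcasters = {
  PrescriptionAdded: (event) => {
    if (event.version === 1) {
      // Hypothetical fallback when the old event lacks the new field
      return { ...event, version: 2, data: { ...event.data, prescriber_license: "UNKNOWN" } };
    }
    return event;
  }
};

function upcast(event) {
  const fn = upcasters[event.type];
  return fn ? fn(event) : event;
}

// Old and new events flow through one code path
const stream = [
  { type: "PrescriptionAdded", version: 1, data: { medication: "X", dosage: "50mg" } },
  { type: "PrescriptionAdded", version: 2,
    data: { medication: "Y", dosage: "100mg", prescriber_license: "MD-123" } }
];
const upgraded = stream.map(upcast);
// Every event in `upgraded` is now version 2
```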
Indexing: Write Fast, Query Smart
Keep the Event Store Lean
The event store is not a query surface. It's a ledger. You only need two indexes:
-- Fetch and replay a single aggregate's history
CREATE INDEX idx_events_aggregate ON events (aggregate_id, version);
-- Stream all events in order (for projection rebuilds)
CREATE INDEX idx_events_created ON events (created_at);
That's it. No index on event_type. No index on event_data fields.
If you find yourself writing WHERE event_data->>'medication' = 'Y' against the event store — stop. That's a projection's job.
The Write Path: Intentionally Simple
One row. One append. No joins. No cascading index updates.
Compare to a traditional schema under heavy writes:
Write to patients table:
→ Update 3 indexes on patients
→ Write to audit_log table
→ Update 2 indexes on audit_log
→ Update foreign key indexes
Event sourcing flips this:
Writes are cheap. Reads are someone else's problem — specifically, the projection's problem.
Projections: Where All the Indexing Lives
Event Store (lean, 2 indexes)
├──► Current Patient State Projection
│ Index: patient_id
│ Index: (medication, dosage)
├──► Medication History Projection
│ Index: (patient_id, changed_at)
└──► Compliance Audit Projection
Index: (changed_by, changed_at)
Index: event_type
Adding a new reporting requirement? Add a new projection. The event store never changes.
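A projection is just an event handler that maintains its own read-optimized store. A minimal in-memory sketch of the medication-history view (in production each projection writes to its own table, with its own indexes):

```javascript
// Builds the "Medication History View" read model by consuming events in order.
function medicationHistoryProjection(events) {
  const rows = [];
  for (const event of events) {
    if (event.type === "PrescriptionAdded" || event.type === "PrescriptionChanged") {
      rows.push({
        medication: event.data.medication,
        dosage: event.data.dosage,
        doctor: event.data.doctor,
        reason: event.data.reason ?? null
      });
    }
    // All other event types are irrelevant to this view and skipped
  }
  return rows;
}

const history = medicationHistoryProjection([
  { type: "PrescriptionAdded",
    data: { medication: "X", dosage: "50mg", doctor: "Smith" } },
  { type: "PrescriptionChanged",
    data: { medication: "Y", dosage: "100mg", doctor: "Jones", reason: "allergy to X" } }
]);
// One row per prescription event, newest last
```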
Debugging with Event Sourcing
Patient 12345 is showing the wrong medication. Here's your workflow:
-- 1. Query the event store
SELECT * FROM events WHERE aggregate_id = 'patient-12345' ORDER BY version;
-- 2. Replay events in sequence — see exact state at each step
-- 3. Identify which event caused the incorrect state
-- 4. Check metadata: who triggered it, from which IP, at what time
-- 5. Fix the projection logic and replay to verify
-- 6. Deploy — no data migration needed
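Step 2 of that workflow can be automated: replay the stream and capture the state after every event, so the exact point of divergence is visible. A sketch using the reducer pattern from earlier:

```javascript
// Returns the intermediate state after each event, so you can see
// exactly where the aggregate went wrong.
function replayWithTrace(events) {
  const trace = [];
  let state = {};
  for (const event of events) {
    switch (event.type) {
      case "PrescriptionAdded":
      case "PrescriptionChanged":
        state = { ...state, medication: event.data.medication, dosage: event.data.dosage };
        break;
      default:
        break; // unknown events leave state untouched
    }
    trace.push({ after: event.type, state });
  }
  return trace;
}

const trace = replayWithTrace([
  { type: "PrescriptionAdded", data: { medication: "X", dosage: "50mg" } },
  { type: "PrescriptionChanged", data: { medication: "Y", dosage: "100mg" } }
]);
// trace[0].state → { medication: "X", dosage: "50mg" }
// trace[1].state → { medication: "Y", dosage: "100mg" }
```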
Quick Reference
| Concept | What it does |
|---|---|
| Event Store | Immutable, append-only log of everything that happened |
| Aggregate | The entity whose state you're rebuilding (e.g. Patient) |
| Projection | A read model derived by replaying events |
| Snapshot | A checkpoint to avoid replaying from event 1 every time |
| Upcaster | Converts old event schemas to new ones at read time |
| CQRS | Separates the write path (commands) from the read path (queries) |
Next: CQRS — because storing every event is only half the solution. Reading them efficiently is the other half.
Building scalable systems? I write about architecture patterns and clean code.
Follow me on LinkedIn | Twitter | GitHub