DEV Community

Aishwarya B R

Posted on • Originally published at Medium

Event Sourcing: The Record Statement You Never Throw Away

Your bank doesn't show you just your current balance. It shows you every deposit, every withdrawal, every transfer — in order, forever. Not because storage is cheap, but because the history IS the data.

Now imagine your bank deleted all transactions and just stored "Current Balance: $500." When a dispute occurs, how do you prove what happened? You can't. The journey is gone.

That's what traditional databases do to your application state. Every UPDATE overwrites the past. Every DELETE erases it permanently.

Event Sourcing says: stop throwing away the journey.


The Way Memory Actually Works

You can't change the past. But you can replay it.

That's event sourcing. Your application's history works exactly like memory and documentation combined:

  • Memory = immutable. What happened, happened. You append new experiences; you don't rewrite old ones.
  • Documentation = replayable. Re-read the record from the start and you arrive back at the current state.

What Event Sourcing IS

Store every state change as an immutable event. Never update. Never delete. Rebuild current state by replaying all events.

Traditional approach:

Patient record: medication = "Y", dosage = 100mg

(Everything before this moment: gone)

Event Sourcing approach:

Event 1 - PrescriptionAdded
{ medication: "X", dosage: 50mg, doctor: "Smith" }

Event 2 - PrescriptionChanged
{ medication: "Y", dosage: 100mg, doctor: "Jones", reason: "allergy to X" }

Current state = replay events = medication "Y", 100mg

The events are the source of truth. The current state is derived from them.


What Event Sourcing is NOT

Audit logging — an audit log is a secondary record bolted onto your real data. In event sourcing, the event log IS the primary storage.

Event-driven architecture — Related, but different. You can have one without the other.

Time travel / undo — You can rebuild past states. You cannot retroactively change events.

Simple to implement — This is significant complexity. Choose deliberately.


When NOT to Use Event Sourcing

Simple CRUD applications — If your app is forms-to-database with no history requirements, this is genuine overkill.

High write volume, low read value — IoT sensor data writing 10,000 events/second where you only care about current readings.

Strong immediate consistency required — projections are eventually consistent, so reads can briefly lag writes. For balances that must always be exact at read time, use an append-only ledger with double-entry bookkeeping instead.

No team familiarity — The learning curve is real. Budget time for it.

The test: If you can't name at least two concrete scenarios where replaying your event history would save you — in debugging, compliance, or feature development — you probably don't need it yet.


The Real-World Pain: Healthcare Records

Current Architecture (Traditional)

A patient record system stores only the latest state:

┌──────────────────────────┐
│ Patient Table            │
│                          │
│ id:         12345        │
│ medication: "Y"          │  ← Previous value: gone
│ dosage:     100mg        │  ← Previous value: gone
│ updated_at: 10:00 AM     │  ← When it changed, not why
└──────────────────────────┘

The damage:

  • Regulatory violation — HIPAA requires a complete audit trail
  • Legal liability — no proof of original prescription
  • Patient safety risk — no way to track the chain of medication decisions
  • No debugging capability — the system shows "Y" but you don't know how it got there

The Refactoring Journey

Attempt 1: Add updated_at Timestamps (15%)

ALTER TABLE patients ADD COLUMN updated_at TIMESTAMP;

What it does: You know when the record last changed.

What it still can't tell you:

  • What the value was before the change
  • Who made the change, or why
  • Any change before the last one

One UPDATE wipes all timestamp context. You have a clock but no history.


Attempt 2: Audit Log Table (25%)

CREATE TABLE audit_log (
  change_id   SERIAL PRIMARY KEY,
  table_name  TEXT,
  field_name  TEXT,
  old_value   TEXT,
  new_value   TEXT,
  changed_by  UUID,
  changed_at  TIMESTAMP
);

What it does: Separate table captures every field-level change.

Why it still falls short:

  • The audit log is secondary — not the source of truth
  • Easy to forget logging in some code paths
  • You can't rebuild current state from audit entries alone

The audit log is a receipt printer, not a ledger.


Attempt 3: Versioned Records (50%)

CREATE TABLE patient_history (
  version     INT,
  patient_id  UUID,
  medication  TEXT,
  dosage      TEXT,
  active      BOOLEAN,
  created_at  TIMESTAMP
);
-- Never UPDATE. Only INSERT new versions.

What it does: Full history preserved. Never overwrites.

Why it still falls short:

  • Stores full snapshots — if one field changes, you copy the entire record
  • No semantic meaning: was this a prescription change or a data correction?
  • Version conflicts in distributed systems

You have history, but it's dumb history.


Attempt 4: Event Log + Projections (75%)

const events = [
  {
    type: "PrescriptionAdded",
    data: { medication: "X", dosage: "50mg", doctor: "Smith" }
  },
  {
    type: "PrescriptionChanged",
    data: { medication: "Y", dosage: "100mg", doctor: "Jones", reason: "allergy to X" }
  }
];

function rebuildPatientState(events) {
  return events.reduce((state, event) => {
    switch (event.type) {
      case "PrescriptionAdded":
        return { ...state, medication: event.data.medication, dosage: event.data.dosage };
      case "PrescriptionChanged":
        return { ...state, medication: event.data.medication, dosage: event.data.dosage };
      default:
        return state;
    }
  }, {});
}

What it does: Events carry meaning. History is semantic, not just snapshots.

What breaks at scale:

  • 1 million events per patient = slow rebuild on every read
  • No snapshots — must replay from event 1 every time
  • Can't run SQL queries against events directly

You have the right idea, but it collapses under production load.


Attempt 5: Production-Ready Event Sourcing (100%)

The complete solution: immutable event store + snapshots + projections + CQRS separation.

┌──────────────────┐
│ Command Handler  │  (Write: AddPrescription)
└────────┬─────────┘
         │
         ▼
┌─────────────────────────────┐
│ Event Store                 │
│ (Immutable, append-only)    │
│                             │
│ PrescriptionAdded           │
│ PrescriptionChanged         │
│ DosageAdjusted              │
│ AllergyRecorded             │
└──────────────┬──────────────┘
               │
               ├──► Projection 1: Current Patient State   (read model)
               ├──► Projection 2: Medication History View (read model)
               └──► Projection 3: Compliance Audit Report (read model)

Event Store Schema:

CREATE TABLE events (
  event_id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  aggregate_id    TEXT NOT NULL,         -- "patient-12345"
  aggregate_type  TEXT NOT NULL,         -- "Patient"
  event_type      TEXT NOT NULL,         -- "PrescriptionChanged"
  event_data      JSONB NOT NULL,        -- { medication, dosage, doctor, reason }
  metadata        JSONB NOT NULL,        -- { user_id, ip, timestamp, causation_id }
  version         INT NOT NULL,          -- optimistic concurrency control
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Append-only: no UPDATE, no DELETE ever touches this table
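The version column is what makes optimistic concurrency work: an append succeeds only if the aggregate is still at the version the writer loaded. Here is a minimal in-memory sketch of that check; the array stands in for the events table, and a real store would enforce the same rule with a unique constraint on (aggregate_id, version).

```javascript
// In-memory stand-in for the events table.
const eventStore = [];

function appendEvent(aggregateId, expectedVersion, type, data) {
  const current = eventStore.filter(e => e.aggregateId === aggregateId);
  const actualVersion = current.length; // versions are 1-based, so count = latest version
  if (actualVersion !== expectedVersion) {
    // Someone else appended since we loaded the aggregate: reject and let the
    // caller reload and retry, rather than silently overwriting their intent.
    throw new Error(`Concurrency conflict: expected v${expectedVersion}, found v${actualVersion}`);
  }
  const event = { aggregateId, version: expectedVersion + 1, type, data };
  eventStore.push(event); // append-only: never UPDATE, never DELETE
  return event;
}
```

The retry-on-conflict loop lives in the caller; the store itself only ever appends or rejects.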

Snapshot Table:

CREATE TABLE snapshots (
  aggregate_id  TEXT PRIMARY KEY,
  version       INT NOT NULL,   -- event version at snapshot time
  state         JSONB NOT NULL, -- full reconstructed state
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

Snapshots: The Cache That Fixes Replay Performance

Replaying 600 events is fast. Replaying 600,000 is not.

Think of snapshots as a manually managed materialized view — a checkpoint that says "here's the known-good state at version N, don't bother replaying anything before this."

Rule of thumb: create a snapshot every N events (common values: 50, 100, 500 — tune to your read latency requirements).
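A sketch of how a snapshot-aware load might look, using the in-memory event shape from the examples above. The names loadAggregate and maybeSnapshot are hypothetical, not a library API, and the reducer simply merges event data into state.

```javascript
const SNAPSHOT_EVERY = 100; // tune to your read latency requirements

// Fold one event into state; same reducer logic the projection uses.
function applyEvent(state, event) {
  return { ...state, ...event.data };
}

function loadAggregate(snapshot, eventsAfterSnapshot) {
  // Start from the checkpoint instead of event 1...
  let state = snapshot ? { ...snapshot.state } : {};
  let version = snapshot ? snapshot.version : 0;
  // ...then replay only the tail of the log.
  for (const event of eventsAfterSnapshot) {
    state = applyEvent(state, event);
    version = event.version;
  }
  return { state, version };
}

function maybeSnapshot(aggregate) {
  // Snapshot every N events. A snapshot is a cache: losing it is safe,
  // because it can always be rebuilt from the events.
  if (aggregate.version % SNAPSHOT_EVERY === 0) {
    return { version: aggregate.version, state: aggregate.state };
  }
  return null;
}
```

Note that loadAggregate degrades gracefully: passing no snapshot just means replaying from the beginning.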


Event Versioning: When Your Schema Evolves

// Version 1 (old)
{ type: "PrescriptionAdded", version: 1, data: { medication, dosage } }

// Version 2 (new)
{ type: "PrescriptionAdded", version: 2, data: { medication, dosage, prescriber_license } }

// Upcaster — converts old events to new schema at read time
function upcastPrescriptionAdded(event) {
  if (event.version === 1) {
    return {
      ...event,
      version: 2,
      data: {
        ...event.data,
        prescriber_license: lookupLicense(event.metadata.user_id)
      }
    };
  }
  return event;
}

All application code works with version 2. Old events are transparently upgraded at read time. No migrations. No data loss.
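One possible shape for wiring upcasters into the read path is a per-event-type registry applied to every event as it leaves the store. To keep this sketch self-contained, the v1-to-v2 rule defaults the new field to null instead of doing a lookup.

```javascript
// Hypothetical upcaster registry: event type -> ordered list of upgrade steps.
const UPCASTERS = {
  PrescriptionAdded: [
    e => (e.version === 1
      ? { ...e, version: 2, data: { ...e.data, prescriber_license: null } }
      : e),
  ],
};

function upcast(event) {
  // Apply each registered upcaster in order. Events already at the latest
  // version, or with no registered upcasters, pass through unchanged.
  const chain = UPCASTERS[event.type] || [];
  return chain.reduce((e, fn) => fn(e), event);
}
```

Replay code then becomes `events.map(upcast)` followed by the usual reduce, and never has to branch on schema version.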


Indexing: Write Fast, Query Smart

Keep the Event Store Lean

The event store is not a query surface. It's a ledger. You only need two indexes:

-- Fetch and replay a single aggregate's history
CREATE INDEX idx_events_aggregate ON events (aggregate_id, version);

-- Stream all events in order (for projection rebuilds)
CREATE INDEX idx_events_created ON events (created_at);

That's it. No index on event_type. No index on event_data fields.

If you find yourself writing WHERE event_data->>'medication' = 'Y' against the event store — stop. That's a projection's job.

The Write Path: Intentionally Simple

One row. One append. No joins. No cascading index updates.

Compare to a traditional schema under heavy writes:

Write to patients table:
→ Update 3 indexes on patients
→ Write to audit_log table
→ Update 2 indexes on audit_log
→ Update foreign key indexes

Event sourcing flips this: a write is one append to the events table and two index updates, nothing more.

Writes are cheap. Reads are someone else's problem — specifically, the projection's problem.

Projections: Where All the Indexing Lives

Event Store (lean, 2 indexes)
├──► Current Patient State Projection
│      Index: patient_id
│      Index: (medication, dosage)
├──► Medication History Projection
│      Index: (patient_id, changed_at)
└──► Compliance Audit Projection
       Index: (changed_by, changed_at)
       Index: event_type

Adding a new reporting requirement? Add a new projection. The event store never changes.
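To make the fan-out concrete, here is a minimal sketch of two projections fed from the same stream, assuming events shaped like { aggregateId, type, data } as in the earlier examples. Each projection owns its own shape (and, in a real database, its own indexes); the event store is only ever read.

```javascript
// Read model 1: latest state per patient (what the UI shows).
function projectCurrentState(events) {
  const byPatient = new Map();
  for (const e of events) {
    const prev = byPatient.get(e.aggregateId) || {};
    byPatient.set(e.aggregateId, { ...prev, ...e.data });
  }
  return byPatient;
}

// Read model 2: one row per medication change (what compliance reads).
function projectMedicationHistory(events) {
  return events
    .filter(e => e.type === "PrescriptionAdded" || e.type === "PrescriptionChanged")
    .map(e => ({
      patient: e.aggregateId,
      medication: e.data.medication,
      reason: e.data.reason || null,
    }));
}
```

A third reporting requirement means a third function over the same array; nothing upstream changes.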


Debugging with Event Sourcing

Patient 12345 is showing the wrong medication. Here's your workflow:

-- 1. Query the event store
SELECT * FROM events WHERE aggregate_id = 'patient-12345' ORDER BY version;

-- 2. Replay events in sequence — see exact state at each step
-- 3. Identify which event caused the incorrect state
-- 4. Check metadata: who triggered it, from which IP, at what time
-- 5. Fix the projection logic and replay to verify
-- 6. Deploy — no data migration needed

Quick Reference

| Concept | What it does |
| --- | --- |
| Event Store | Immutable, append-only log of everything that happened |
| Aggregate | The entity whose state you're rebuilding (e.g. Patient) |
| Projection | A read model derived by replaying events |
| Snapshot | A checkpoint to avoid replaying from event 1 every time |
| Upcaster | Converts old event schemas to new ones at read time |
| CQRS | Separates the write path (commands) from the read path (queries) |

Next: CQRS — because storing every event is only half the solution. Reading them efficiently is the other half.

Building scalable systems? I write about architecture patterns and clean code.
Follow me on LinkedIn | Twitter | GitHub
