DEV Community

Scale
Scale

Posted on

How dbt-Style Snapshots Mimic Slowly Changing Dimensions in Data Warehousing

In modern analytics engineering, tracking data changes over time is essential. Tools like dbt (data build tool) implement this using a concept called snapshots.

Snapshots allow you to preserve historical states of data without overwriting previous records.

What Is a dbt Snapshot?

A dbt snapshot is a SQL-based mechanism that:

  • Runs a query on a dataset
  • Compares results to previous runs
  • Stores changes as new records

This creates a full history of how data evolves over time.

The Core Idea: Type 2 Slowly Changing Dimensions

dbt snapshots implement a classic data warehousing pattern:

Type 2 Slowly Changing Dimension (SCD2)

This means:

  • Each change creates a new row
  • Old data is preserved
  • Validity periods are tracked

Example structure:

order_id status dbt_valid_from dbt_valid_to
1 pending 2024-01-01 2024-01-02
1 shipped 2024-01-02 NULL

Why This Matters

Snapshots solve a key analytics problem:

👉 “How did this data look last week, last month, or last year?”

Use cases include:

  • Customer state tracking
  • Order lifecycle analysis
  • Pricing history
  • Marketing attribution

Under the Hood

When dbt runs a snapshot:

  1. It executes a SELECT query
  2. Compares results with stored snapshot table
  3. Detects changes using configured keys
  4. Inserts new records for changed rows

This ensures full historical traceability.

Real-World Impact

With snapshots, analysts can:

  • Reconstruct historical dashboards
  • Audit data changes
  • Build time-aware ML features

Top comments (0)