DEV Community

Abdelrahman Adnan
Abdelrahman Adnan

Posted on

Part 11 - Dimensions and Fact Table 📊

Part 11 - Dimensions and Fact Table 📊

This part continues from the base model and explains the mart layer in dags/air_quality_dbt/models/marts/.

Why the mart layer exists

The mart layer is where the analytics shape becomes obvious. Instead of keeping everything in one large staging table, the project splits the data into a star-schema-style layout.

That makes downstream querying simpler and more efficient.

The dimension tables

The project creates two dimensions:

These are deduplicated reference tables. Each one extracts the stable descriptive fields that are useful for analysis and dashboard filtering.

The fact table

The main analytical table is fact_air_quality, defined in fact_air_quality.sql.

This table keeps the actual readings and the relevant weather context in one place. That is why the dashboard queries can stay straightforward.

Why the star schema is useful here

The warehouse design intentionally denormalizes some location fields into the fact table. That reduces the number of joins needed for common dashboard questions while still keeping clean dimension tables for reference and filtering.

For a tutorial project, this is a solid middle ground between realism and simplicity.

Tests on the mart layer

The mart schema tests check the key constraints:

  • station_id should be unique and not null in dim_station,
  • sensor_id should be unique and not null in dim_sensor,
  • station_id and sensor_id should not be null in fact_air_quality.

That keeps the model graph honest.

Continue

The next part moves from warehouse modeling into the visualization layer and explains how Superset is seeded with datasets, charts, and a dashboard layout.

Continue to Part 12: Superset Seeding and Dashboards.

Tag: #dataengineeringzoomcamp

Top comments (0)