Part 9 - dbt Project Setup and Contracts 🧱
This part continues from the warehouse load and looks at the dbt project under dags/air_quality_dbt/.
What dbt is doing here
In this repository, dbt is the modeling layer that turns the loaded staging table into structured analytics tables.
The main project file, dbt_project.yml, defines the model folders and default materializations:
- base models become views,
- mart models become tables.
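That split is intentional: views keep the base layer cheap and always in sync with the staging table, while tables give the marts precomputed results that are fast to query.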
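To make the split concrete, here is a minimal sketch of what the relevant part of dbt_project.yml could look like. The project name, the profile name, and the folder names `base` and `marts` are assumptions for illustration; check the actual file in the repository for the real values.

```yaml
# dbt_project.yml (illustrative sketch, not the repository's actual file)
name: air_quality_dbt          # assumed project name, taken from the folder name
version: "1.0.0"
config-version: 2

profile: air_quality_dbt       # assumed; must match a profile name in profiles.yml

model-paths: ["models"]

models:
  air_quality_dbt:             # must match the project name above
    base:                      # assumed folder: models/base/
      +materialized: view      # base models become views
    marts:                     # assumed folder: models/marts/
      +materialized: table     # mart models become tables
```

Because the defaults are set per folder, a new model picks up the right materialization just by being placed in the right directory.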
The dbt profile
The profile in profiles.yml connects dbt to PostgreSQL using environment variables. That means the same project works in a containerized local environment and in a cloud runtime where the connection values are injected differently.
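As a rough sketch, a PostgreSQL profile driven by environment variables might look like the following. The profile name, the variable names (`POSTGRES_HOST` and so on), and the default values are assumptions, not the repository's actual settings; `env_var` and the `as_number` filter are standard dbt Jinja features.

```yaml
# profiles.yml (illustrative sketch; names and defaults are assumed)
air_quality_dbt:               # assumed profile name
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST', 'localhost') }}"
      port: "{{ env_var('POSTGRES_PORT', '5432') | as_number }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      password: "{{ env_var('POSTGRES_PASSWORD') }}"
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: airquality_dwh   # assumed target schema
      threads: 4
```

Variables without a default, such as the password here, make dbt fail fast when the value is missing, which is usually what you want for credentials.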
The source contract
The base model starts from the source airquality_dwh.stg_air_quality. That source declaration creates a clear contract: dbt expects the warehouse load step to create and populate the staging table before modeling begins.
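This is a useful teaching point because it shows how data contracts are formed in practice: nothing formal is signed, but if the load step does not create the staging table, the dbt run fails, so the dependency between loading and modeling is explicit.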
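A source declaration for the staging table could be sketched like this. It assumes that `airquality_dwh` is both the source name and the PostgreSQL schema; the file name and the description text are also assumptions.

```yaml
# models/base/sources.yml (illustrative sketch; file location is assumed)
version: 2

sources:
  - name: airquality_dwh
    schema: airquality_dwh     # assumed: schema has the same name as the source
    tables:
      - name: stg_air_quality
        description: "Staging table created and populated by the warehouse load step."
```

With a declaration like this in place, a base model would read the table through `{{ source('airquality_dwh', 'stg_air_quality') }}` instead of hard-coding the schema and table name.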
Model layers
The project is split into:
- base models that clean or standardize the source,
- mart models that reshape the data for analysis,
- and schema tests that validate important fields.
That organization matches the way many production dbt projects are structured.
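The schema tests mentioned above are declared in YAML next to the models. The sketch below shows the general shape only; the model name and the column names are hypothetical placeholders, not the fields used in this project. `not_null` and `accepted_values` are built-in dbt tests.

```yaml
# models/base/schema.yml (illustrative sketch; model and column names are hypothetical)
version: 2

models:
  - name: base_air_quality     # hypothetical model name
    columns:
      - name: measured_at      # hypothetical column
        tests:
          - not_null
      - name: parameter        # hypothetical column
        tests:
          - not_null
          - accepted_values:
              values: ["pm25", "pm10"]   # hypothetical allowed values
```

Newer dbt versions also accept `data_tests:` as the key name; `tests:` is used here because it is the older, widely supported spelling.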
Continue
In the next part, I will walk through the base model and schema tests so you can see how the loaded warehouse table becomes a clean dbt source for the marts.
Continue to Part 10: Base Model and Data Quality.
Tag: #dataengineeringzoomcamp