If you've built data pipelines for any length of time, you know the drill: the pipeline runs fine, the table gets created, and three days later someone discovers that half the rows have null IDs. The transformation was correct, the data just wasn't what you assumed.
Both Bruin and dbt have built-in systems for catching these problems. They solve the same problem, but in genuinely different ways. dbt treats tests as separate nodes in the DAG. Bruin embeds quality checks directly into asset definitions. Both approaches work, and the trade-offs between them are worth understanding regardless of which tool you use.
What we mean by "data quality testing"
Before comparing the tools, let me define the scope. I'm talking about checks that answer: "Does the data this pipeline just produced actually look right?" Common examples:
- Is this column unique? Are there nulls where there shouldn't be?
- Are all values in a column within an expected set?
- Does a custom business rule hold? (e.g., total debits = total credits)
Both tools handle these. They just wire it up differently.
dbt: tests as first-class DAG nodes
dbt pioneered the idea of bringing software engineering testing practices to data. In dbt, a test is a SQL query that returns failing rows: if it returns zero rows, the test passes; if it returns rows, those are the failures.
There are two flavors of tests in dbt:
Generic tests are declared in `schema.yml` and apply to specific columns or models. The four built-in ones (`unique`, `not_null`, `accepted_values`, and `relationships`) cover the basics:
```yaml
# schema.yml
models:
  - name: orders
    columns:
      - name: order_id
        data_tests:
          - unique
          - not_null
      - name: status
        data_tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```
Each of these gets compiled into a separate DAG node. Under the hood, unique becomes something like:
```sql
select order_id
from orders
group by order_id
having count(*) > 1
```
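The other built-ins follow the same zero-rows convention. As a rough sketch (these are simplified approximations of the compiled output, not dbt's actual Jinja macros):

```python
# Approximate SQL that dbt's built-in generic tests compile to.
# Simplified sketches, not dbt's actual macro output.

def unique_sql(table, column):
    return (f"select {column} from {table} "
            f"group by {column} having count(*) > 1")

def not_null_sql(table, column):
    # Any row returned is a null value, i.e. a failure.
    return f"select {column} from {table} where {column} is null"

def accepted_values_sql(table, column, values):
    # Rows outside the allowed set are the failures.
    quoted = ", ".join(f"'{v}'" for v in values)
    return f"select {column} from {table} where {column} not in ({quoted})"

print(accepted_values_sql("orders", "status", ["placed", "shipped"]))
# select status from orders where status not in ('placed', 'shipped')
```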
You can see this in the codebase: `GenericTestNode` in core/dbt/contracts/graph/nodes.py (line 1083) inherits from `CompiledNode`, meaning each test is a fully compiled SQL statement with its own node ID, configs, and execution context.
Singular tests are standalone SQL files you drop in the tests/ directory. They're just queries that should return zero rows:
```sql
-- tests/assert_total_payments_positive.sql
select order_id, total_amount
from {{ ref('orders') }}
where total_amount < 0
```
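The pass/fail convention is easy to demonstrate outside dbt: run the query and treat every returned row as a failure. A minimal sketch against an in-memory SQLite table (table and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (order_id int, total_amount real)")
conn.executemany("insert into orders values (?, ?)",
                 [(1, 10.0), (2, -5.0), (3, 3.5)])

# The singular test: rows returned here are the failures.
failures = conn.execute(
    "select order_id, total_amount from orders where total_amount < 0"
).fetchall()

print("PASS" if not failures else f"FAIL: {failures}")
# FAIL: [(2, -5.0)]
```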
The test configuration system has real depth to it. Looking at TestConfig in core/dbt/artifacts/resources/v1/config.py:
```python
@dataclass
class TestConfig(NodeAndTestConfig):
    severity: Severity = Severity("ERROR")
    store_failures: Optional[bool] = None
    store_failures_as: Optional[str] = None
    where: Optional[str] = None
    limit: Optional[int] = None
    fail_calc: str = "count(*)"
    warn_if: str = "!= 0"
    error_if: str = "!= 0"
```
That severity field with warn/error options is genuinely useful. You can say "I want to know about this problem, but don't block the pipeline":
```yaml
data_tests:
  - not_null:
      severity: warn
      warn_if: "> 10"  # only warn if more than 10 nulls
```
And store_failures: true materializes the failing rows into a table so you can actually go look at what failed. When you're debugging data quality issues at 2am, being able to SELECT * FROM dbt_test__audit.not_null_orders_order_id is a lifesaver.
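The `warn_if`/`error_if` strings are simple threshold conditions evaluated against the `fail_calc` result. A hedged Python sketch of that evaluation logic (simplified: real dbt also consults the `severity` field and supports arbitrary `fail_calc` expressions):

```python
import operator

# Supported comparison operators for threshold conditions like "!= 0" or "> 10".
OPS = {"!=": operator.ne, "==": operator.eq, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def parse(condition):
    # "> 10" -> (operator.gt, 10)
    op, threshold = condition.split()
    return OPS[op], int(threshold)

def evaluate(failures, warn_if="!= 0", error_if="!= 0"):
    """Classify a test result from its failure count. Error wins over warn."""
    op, thr = parse(error_if)
    if op(failures, thr):
        return "error"
    op, thr = parse(warn_if)
    if op(failures, thr):
        return "warn"
    return "pass"

print(evaluate(0))                                     # pass
print(evaluate(12, warn_if="> 10", error_if="> 100"))  # warn
```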
The thing to understand about dbt's approach: tests are nodes in the DAG, run via dbt test (or dbt build, which runs models and tests together). The TestTask class in core/dbt/task/test.py inherits from RunTask, so tests go through the same execution machinery as models. The TestRunner.execute_data_test method renders the test SQL through the materialization macro system, executes it, and expects back exactly one row with three columns (failures, should_warn, should_error).
Bruin: checks embedded in the asset
Bruin takes a different approach. Quality checks aren't separate nodes; they're declared inside the asset definition itself, right next to the SQL that produces the data:
```sql
/* @bruin
name: analytics.orders
type: sf.sql

materialization:
  type: table

columns:
  - name: order_id
    type: integer
    checks:
      - name: unique
      - name: not_null
  - name: amount
    type: float
    checks:
      - name: positive
      - name: min
        value: 0.01

custom_checks:
  - name: row count above threshold
    query: SELECT count(*) > 100 FROM analytics.orders
    value: 1
@bruin */

SELECT
    order_id,
    amount
FROM raw.orders
WHERE status != 'cancelled'
```
Everything about the data quality expectations lives in the same file as the transformation. The column definitions, their types, and the checks they should pass are all declared together.
Bruin ships with nine built-in check types: unique, not_null, positive, negative, non_negative, accepted_values, pattern, min, and max. That's five more than dbt's four built-in generic tests. The positive, negative, non_negative, min, and max checks don't have out-of-the-box equivalents in dbt; you'd write custom generic tests or singular test files.
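A hedged sketch of how the value-based checks map to SQL failure predicates (the WHERE clauses are my approximation of the pattern in pkg/ansisql/checks.go, not Bruin's exact generated SQL):

```python
# Approximate failure predicates for Bruin's value-based built-in checks.
# Each becomes "SELECT count(*) FROM table WHERE <predicate>" and the
# count is expected to be zero. Illustrative, not Bruin's exact output.
FAILURE_PREDICATES = {
    "not_null":     "{col} IS NULL",
    "positive":     "{col} <= 0",
    "negative":     "{col} >= 0",
    "non_negative": "{col} < 0",
    "min":          "{col} < {value}",
    "max":          "{col} > {value}",
}

def check_sql(table, col, check, value=None):
    predicate = FAILURE_PREDICATES[check].format(col=col, value=value)
    return f"SELECT count(*) FROM {table} WHERE {predicate}"

print(check_sql("analytics.orders", "amount", "min", 0.01))
# SELECT count(*) FROM analytics.orders WHERE amount < 0.01
```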
Looking at the implementation, the check SQL generation is straightforward. From pkg/ansisql/checks.go:
```go
// NotNullCheck generates: SELECT count(*) FROM {table} WHERE {column} IS NULL
func (c *NotNullCheck) Check(ctx context.Context, ti *scheduler.ColumnCheckInstance) error {
	qq := fmt.Sprintf("SELECT count(*) FROM %s WHERE %s IS NULL",
		ti.GetAsset().Name, ti.Column.Name)

	return (&CountableQueryCheck{
		conn:                c.conn,
		expectedQueryResult: 0,
		queryInstance:       &query.Query{Query: qq},
		checkName:           "not_null",
		customError: func(count int64) error {
			return errors.Errorf("column '%s' has %d null values", ti.Column.Name, count)
		},
	}).Check(ctx, ti)
}
```
Each check type generates a SELECT count(*) query and expects a specific result (usually zero). The CountableQueryCheck pattern runs the query against the actual database connection, parses the integer result, and compares it. Simple.
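The pattern is easy to reproduce in miniature: run a count query, read back a single integer, compare it to the expectation. A sketch over an in-memory SQLite database (function and table names are mine, not Bruin's):

```python
import sqlite3

def countable_check(conn, sql, expected, name):
    """Run a count query and compare its single integer result."""
    (actual,) = conn.execute(sql).fetchone()
    if actual != expected:
        raise ValueError(f"check '{name}' failed: expected {expected}, got {actual}")

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (order_id int)")
conn.executemany("insert into orders values (?)", [(1,), (2,), (None,)])

try:
    countable_check(conn,
                    "SELECT count(*) FROM orders WHERE order_id IS NULL",
                    expected=0, name="not_null")
except ValueError as e:
    print(e)  # check 'not_null' failed: expected 0, got 1
```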
For custom checks, Bruin supports inline SQL queries directly in the asset YAML. The CustomCheck type in the same file renders the query through Jinja (so you can use template variables), then runs it:
```go
func (c *CustomCheck) Check(ctx context.Context, ti *scheduler.CustomCheckInstance) error {
	qq := ti.Check.Query
	// Jinja rendering happens here...

	expected := ti.Check.Value
	if ti.Check.Count != nil {
		expected = *ti.Check.Count
		qq = fmt.Sprintf("SELECT count(*) FROM (%s) AS t", qq)
	}

	return NewCountableQueryCheck(c.conn, expected, &query.Query{Query: qq}, ti.Check.Name, ...)
}
```
How they wire into the execution graph
This is where the design philosophy really diverges.
In dbt, tests are independent nodes. When you run dbt build, the DAG might look like:
```
stg_orders (model) → orders (model) → test: unique_orders_order_id
                                    → test: not_null_orders_order_id
                                    → customers (model, depends on orders)
```
Tests and downstream models run after the model they test. But tests don't block downstream execution by default: dbt build runs them in DAG order, but a test failure on orders doesn't automatically prevent customers from running. You'd need to rely on the DAG structure or use dbt build with --fail-fast to get blocking behavior.
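To make that concrete, a toy model of the dependency edges (my simplification, not dbt's graph code): the test node hangs off the model as a leaf, so it never appears in the downstream model's ancestry.

```python
# Toy dependency graph mirroring the DAG sketched above.
dag = {
    "stg_orders": [],
    "orders": ["stg_orders"],
    "test_unique_orders_order_id": ["orders"],
    "customers": ["orders"],  # depends on the model, not on its tests
}

def ancestors(node):
    """All transitive dependencies of a node."""
    seen, stack = set(), list(dag[node])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(dag[n])
    return seen

# The test is not in customers' dependency path, so a test failure on
# orders does not, by itself, stop customers from running.
print("test_unique_orders_order_id" in ancestors("customers"))  # False
```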
Bruin does something different. Quality checks are scheduled as ColumnCheckInstance and CustomCheckInstance objects, sub-tasks of the asset they belong to. In pkg/scheduler/scheduler.go (line 668), the scheduler explicitly wires them:
```go
// add the upstream-downstream relationships for the main task to its quality checks
s.taskNameMap[assetName].AddUpstreamByType(TaskInstanceTypeColumnCheck, ti)
s.taskNameMap[assetName].AddUpstreamByType(TaskInstanceTypeCustomCheck, ti)
```
So the execution graph looks like:
```
raw_orders (asset) → [quality checks: unique, not_null, positive] → downstream_asset
```
Quality checks run after their asset completes, and they block downstream assets by default. The blocking field on each check controls this:
```go
func (t *ColumnCheckInstance) Blocking() bool {
	return t.Check.Blocking.Bool() // defaults to true
}
```
When constructInstanceRelationships builds the DAG, it considers blocking status. A downstream asset won't start until all blocking checks on its upstream assets have passed. Non-blocking checks still run, still report failures, but they don't hold up the pipeline.
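A sketch of that gating rule (names and structures are mine, illustrating the blocking-by-default behavior described above):

```python
from dataclasses import dataclass, field

@dataclass
class Check:
    name: str
    passed: bool
    blocking: bool = True  # Bruin's default

@dataclass
class Asset:
    name: str
    checks: list = field(default_factory=list)

def downstream_may_run(upstream):
    """Downstream assets wait for every *blocking* upstream check to pass.
    Non-blocking checks still run and report, but don't gate anything."""
    return all(c.passed for c in upstream.checks if c.blocking)

orders = Asset("orders", [
    Check("unique", passed=True),
    Check("not_null", passed=False),                    # blocking, failed
    Check("row_count", passed=False, blocking=False),   # reported, not gating
])
print(downstream_may_run(orders))  # False: a blocking check failed
orders.checks[1].passed = True
print(downstream_may_run(orders))  # True: the non-blocking failure is ignored
```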
You can also run checks independently without re-running the asset:
```
bruin run --only checks assets/orders.sql
```
This is useful when you want to re-check data quality without re-materializing.
Where each approach shines
dbt's "tests as separate nodes" design has some real advantages:
- Reusability through macros. dbt's generic test system lets you write a test macro once and use it across your whole project. Packages like `dbt-utils` and `dbt-expectations` add dozens of test types. The macro system is genuinely powerful for this.
- `store_failures` gives you debugging data. When a test fails, you can query the actual failing rows. Bruin tells you how many rows failed; dbt can show you which rows.
- Granular severity. The `warn_if`/`error_if` system with thresholds is more nuanced than a binary blocking/not-blocking toggle. "Warn if more than 5% of rows fail" is a useful middle ground.
Bruin's "checks in the asset" design has different strengths:
- Co-location. When I open a SQL file, I see the transformation AND the quality expectations in one place. I don't need to cross-reference between a SQL file and a `schema.yml` to understand what an asset does and what constraints it should satisfy. For onboarding engineers to a project, this is a real benefit.
- Checks as pipeline gates. Blocking checks are wired into the DAG by default. If the `orders` table has null IDs, downstream assets won't even try to run. You don't need to think about test ordering or `--fail-fast` flags; it's the default behavior.
- More built-in check types. Nine built-in checks vs. four means less custom work for common validations. `positive`, `min`, and `max` come up all the time in financial and analytics data, and having them built in saves you from writing (and maintaining) custom test macros.
- Custom checks without extra files. Need a business-specific check? Add a `custom_checks` entry with a SQL query. In dbt, you'd create a separate SQL file in `tests/` or write a generic test macro in `macros/`. Bruin keeps it in the asset.
The deeper difference
These two approaches reflect a broader design question: should data quality be something you add next to your transformations, or something you define inside them?
dbt inherits from the software testing tradition: tests live in separate files, run as a separate step. There's a clean separation of concerns. The transformation does one thing, the test does another. This is familiar if you come from application development where src/ and tests/ are separate directories.
Bruin treats an asset as a complete unit: here's the data I produce, here are the columns it should have, and here are the constraints those columns must satisfy. It's closer to how a database schema with CHECK constraints works: the expectations are part of the definition, not a separate layer.
I find Bruin's approach particularly practical for teams where the person writing the transformation is also the person responsible for its quality. You define what the data should look like in the same breath as defining how to produce it. There's no friction of switching files or remembering to update a separate YAML when you change a column name.
That said, dbt's ecosystem of test packages is something Bruin hasn't matched yet. If you need 50 different test types, dbt-expectations has them ready to go.
For me, the blocking-by-default behavior is the strongest argument for Bruin's design. Quality gates should be the default, not something you have to opt into. When a data quality check fails, the pipeline should stop and you shouldn't have to remember to configure that.