137Foundry

Posted on Jun 15

How to Run Contract Tests Against a SaaS Source on a 15-Minute Schedule

#api #integration #python

If your pipeline pulls from a SaaS API and you have not had a silent-data incident yet, you have the vendor's good behavior to thank, not your engineering. SaaS vendors often push schema changes through minor releases without coordinating with downstream consumers. Contract tests are the cheapest defense.

Here is a step-by-step for running them on a 15-minute schedule, with the tradeoffs that matter at small scale.

Step 1: Pick the surface area to contract

You do not contract every endpoint. You contract the endpoints whose drift would cause silent data corruption downstream. For most integrations, that is two to five endpoints out of however many the SaaS exposes.

Pull the list of endpoints your pipeline actually reads. For each one, ask whether a renamed field or a new enum value would silently break a downstream table. If yes, it goes on the contract list. If no (you do not consume the affected field), skip it.

The output is a short list. Keep it under ten. A contract suite that asserts everything is unmaintainable.
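One way to make that selection explicit is to keep the endpoint inventory in code and filter it. The sketch below is only an illustration: the `Endpoint` class, the `contract_list` helper, and the example paths and field names are all hypothetical, and the two questions from the paragraph above are answered by hand when you fill in the inventory.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    path: str
    # Fields the pipeline actually reads from this endpoint's responses.
    consumed_fields: set = field(default_factory=set)
    # True if a renamed field or new enum value would silently break a downstream table.
    silent_break_risk: bool = False


def contract_list(inventory, limit=10):
    """Return the paths worth contracting: consumed, risky, and fewer than `limit`."""
    selected = [e.path for e in inventory if e.consumed_fields and e.silent_break_risk]
    if len(selected) >= limit:
        raise ValueError(f"{len(selected)} endpoints selected; keep the list under {limit}")
    return selected


# Hypothetical inventory for illustration only.
inventory = [
    Endpoint("/customers", {"id", "email", "plan"}, silent_break_risk=True),
    Endpoint("/invoices", {"id", "amount", "status"}, silent_break_risk=True),
    Endpoint("/webhooks", set(), silent_break_risk=False),  # never read by the pipeline
]
selected = contract_list(inventory)
print(selected)
```

Keeping the inventory in the repo means the contract list gets reviewed whenever the pipeline starts reading a new endpoint.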

Step 2: Capture the current schema as a fixture

For each endpoint on the contract list, capture a representative response from the SaaS as a JSON fixture. Strip personally identifiable data. Commit the fixture to the repo. Annotate it with the date of capture and the SaaS API version.

The fixture is the ground-truth schema. The contract tests assert that future responses still conform to it. When the schema changes legitimately, you regenerate the fixture and commit a new version.
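A minimal sketch of the capture step, assuming the response is already parsed JSON. The `PII_KEYS` set, the `redact` and `build_fixture` helpers, and the example version string are hypothetical. Note that this redaction replaces every PII value with a string, so hand-check the fixture if a PII field was originally an object or a number.

```python
import json
from datetime import date

# Hypothetical set of keys treated as personally identifiable in this sketch.
PII_KEYS = {"email", "name", "phone", "address"}


def redact(value):
    """Recursively replace PII values with a placeholder, keeping keys and nesting."""
    if isinstance(value, dict):
        return {k: "REDACTED" if k in PII_KEYS else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def build_fixture(endpoint, api_version, response_body, captured_on=None):
    """Wrap a redacted response with the capture date and the SaaS API version."""
    return {
        "endpoint": endpoint,
        "api_version": api_version,
        "captured_on": (captured_on or date.today()).isoformat(),
        "response": redact(response_body),
    }


# Made-up response body and version string, for illustration only.
body = {"data": [{"id": 1, "email": "a@example.com", "plan": "pro",
                  "contacts": [{"phone": "555-0100"}]}]}
fixture = build_fixture("/customers", "2024-01", body, captured_on=date(2024, 6, 15))
print(json.dumps(fixture, indent=2, sort_keys=True))
```

Writing the result with `json.dump(..., sort_keys=True)` keeps later fixture diffs stable, because key order no longer depends on the order the API happened to return.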

Pydantic is a fast way to turn the fixture into a typed model. Generate the Pydantic model from the JSON fixture, hand-edit the field annotations where the inferred types are wrong, and commit the model next to the fixture.
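As a rough illustration of what the hand-edited model might look like, here is a sketch with made-up field names. The real `CustomerRecord` depends on what your SaaS returns and which fields your pipeline reads. The comments mark the kind of annotation a generator typically gets wrong and you would fix by hand.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    country: str
    postal_code: Optional[str] = None  # hand-edited: the sample had a value, but it can be null


class CustomerRecord(BaseModel):
    id: int
    plan: Literal["free", "pro", "enterprise"]  # hand-edited: inferred as plain str
    created_at: str
    address: Optional[Address] = None


# A record that matches the contract parses cleanly.
ok = CustomerRecord(**{"id": 1, "plan": "pro", "created_at": "2024-06-15T00:00:00Z",
                       "address": {"country": "US"}})

# A new enum value is exactly the kind of drift the model should reject.
try:
    CustomerRecord(**{"id": 2, "plan": "trial", "created_at": "2024-06-15T00:00:00Z"})
    drifted = False
except ValidationError:
    drifted = True
```

By default Pydantic ignores keys that are not declared on the model, so a model like this only asserts on the fields you list.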

Step 3: Write the contract assertion

For each endpoint, the contract assertion is the same shape: fetch a small sample of records, parse them through the Pydantic model, fail the test if parsing raises.

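import pydantic

from your_pipeline.client import saas_client  # hypothetical module; use your own authenticated API client
from your_pipeline.models import CustomerRecord  # the typed contract

def test_customer_endpoint_contract():
    # Fetch a small live sample; ten records keeps the call cheap.
    sample = saas_client.get("/customers", limit=10).json()
    for record in sample["data"]:
        try:
            CustomerRecord(**record)
        except pydantic.ValidationError as e:
            # Chain the original error so the field-level detail survives in the traceback.
            raise AssertionError(f"Schema drift in /customers: {e}") from e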

Run this against the SaaS production API, not a mock or fixture. The point of the contract test is to detect drift in the live source. A test that runs against a captured fixture will pass forever - the fixture never drifts.

Step 4: Schedule the test

A 15-minute interval is the right starting cadence for most SaaS sources. Drift detection in 15 minutes versus 24 hours is a meaningful difference for time-to-remediation. Drift detection in 1 minute versus 15 minutes usually is not, and you start running into API rate limits.
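The rate-limit side of that tradeoff is simple arithmetic. The sketch below assumes five contracted endpoints and one request per endpoint per run. Both numbers are illustrative, and `daily_requests` is a hypothetical helper.

```python
def daily_requests(endpoints, interval_minutes, requests_per_run=1):
    """Requests per day spent on contract tests at a given schedule interval."""
    runs_per_day = (24 * 60) // interval_minutes
    return endpoints * requests_per_run * runs_per_day


for interval in (1, 15, 24 * 60):
    print(f"every {interval} min: {daily_requests(5, interval)} requests/day")
```

With these assumptions, a 15-minute schedule costs 480 requests a day and a 1-minute schedule costs 7,200. Compare both against the quota your vendor documents.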

Run the tests as a separate job from the main pipeline. A drift failure should not block the pipeline - it should alert and let the pipeline keep running until the team decides what to do. The pipeline's behavior under drift is a separate design question.

Common schedulers: cron in a small container, GitHub Actions on a 15-minute schedule, a dedicated Airflow DAG, or whatever your orchestration layer supports.
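Whichever scheduler you pick, the job itself can stay small. Here is a minimal sketch of a runner that alerts on failure instead of raising, so a drift failure never blocks anything else. The names `run_checks` and `run_forever` and the `alert` callback are hypothetical. In practice cron or your orchestrator would replace the loop and call `run_checks` once per invocation.

```python
import time


def run_checks(checks, alert):
    """Run every contract check; report failures through alert() instead of raising."""
    failures = []
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:  # drift or a failed request: both deserve an alert
            failures.append(name)
            alert(name, exc)
    return failures


def run_forever(checks, alert, interval_seconds=15 * 60, sleep=time.sleep):
    """Minimal in-process scheduler; cron or an orchestrator would replace this loop."""
    while True:
        run_checks(checks, alert)
        sleep(interval_seconds)


# Stand-in checks for illustration; real ones would call the live API.
def invoices_check():
    pass


def customers_check():
    raise AssertionError("Schema drift in /customers")


sent = []
failed = run_checks(
    {"customers": customers_check, "invoices": invoices_check},
    alert=lambda name, exc: sent.append((name, str(exc))),
)
print(failed, sent)
```

Running each check in its own `try` block means one drifting endpoint does not hide the status of the others.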

Step 5: Wire the alert to a real channel

The trap is the alert that goes to a no-op channel. Schema drift alerts should go to the same channel the on-call engineer watches for pipeline failures. If they go to a Slack channel nobody reads, the test buys you nothing.

The alert should include: the endpoint that drifted, the validation error, a link to the fixture, and a one-line guide on what to do next. The minimum viable triage instruction is "regenerate the fixture if the change is expected; otherwise pause the integration and check downstream tables."
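A sketch of an alert body carrying those four pieces. The `format_drift_alert` function and the example fixture URL are hypothetical. How the text reaches your on-call channel (Slack webhook, pager, email) is left to whatever client you already use.

```python
TRIAGE = ("Regenerate the fixture if the change is expected; "
          "otherwise pause the integration and check downstream tables.")


def format_drift_alert(endpoint, error, fixture_url):
    """Build a plain-text alert with the endpoint, error, fixture link, and next step."""
    return "\n".join([
        f"Schema drift detected: {endpoint}",
        f"Validation error: {error}",
        f"Fixture: {fixture_url}",
        f"Next step: {TRIAGE}",
    ])


message = format_drift_alert(
    "/customers",
    "plan: unexpected value 'trial'",
    "https://example.com/repo/fixtures/customers.json",  # placeholder URL
)
print(message)
```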

Step 6: Make fixture regeneration a first-class command

When drift is legitimate (for example, the SaaS added a field you expected and should consume), the engineer regenerating the fixture needs a fast path. A make regenerate-fixtures or python -m pipeline.fixtures.regenerate command that pulls a fresh sample, regenerates the Pydantic model, and writes the diff for review beats forcing the engineer to do it by hand.

The slow path is what kills drift detection over time. If regenerating a fixture takes an hour of manual work, engineers will silently disable the failing test instead. Make the legitimate-drift path cheap, and the test stays useful.
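The "writes the diff for review" part of such a command can be sketched with the standard library. The version below diffs the shape of two fixtures (keys and type names) rather than their values, so the review shows structural change only. The helper names are hypothetical, and pulling the fresh sample and regenerating the model are left out.

```python
import difflib
import json


def shape_of(value):
    """Reduce a JSON value to its structure: keys and type names, no data."""
    if isinstance(value, dict):
        return {k: shape_of(v) for k, v in value.items()}
    if isinstance(value, list):
        # Assumes list items share one shape; only the first item is inspected.
        return [shape_of(value[0])] if value else []
    return type(value).__name__


def fixture_diff(old, new):
    """Unified diff of two fixtures' shapes, suitable for pasting into a review."""
    old_lines = json.dumps(shape_of(old), indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(shape_of(new), indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="fixture (committed)", tofile="fixture (fresh)", lineterm="",
    ))


# Made-up fixtures: the fresh sample has one extra field.
committed = {"data": [{"id": 1, "plan": "pro"}]}
fresh = {"data": [{"id": 7, "plan": "free", "region": "eu"}]}
diff = fixture_diff(committed, fresh)
print("\n".join(diff))
```

Because values are reduced to type names, two samples with different records but the same structure produce an empty diff.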

Step 7: Plan for the cases contract tests will not catch

Contract tests catch structural drift in the fields you wrote a contract for. They miss:

New fields the source added that you did not contract on
Drift in nested objects or arrays that your model leaves loosely typed (for example, as a plain dict or list)
Semantic drift where the type is unchanged but the meaning changed
Drift in fields you do not pull but that affect related queries

For the first, layer in distributional monitoring on the warehouse side using Great Expectations or dbt's data tests to catch null rate and cardinality shifts. For the second, deepen the Pydantic model or write a separate test for the nested structures. For the third, run a periodic record-level diff against the source. For the fourth, accept the blind spot and document it.
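To make "null rate and cardinality shifts" concrete, here is a plain-Python stand-in for the kind of check those tools run. It does not use the Great Expectations or dbt APIs. The helper names, the 5-point tolerance, and the sample rows are assumptions, and the distinct-count comparison only makes sense for low-cardinality, enum-like columns.

```python
def column_profile(rows, column):
    """Null rate and distinct non-null count for one column of a batch of rows."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "null_rate": (1 - len(non_null) / len(values)) if values else 0.0,
        "distinct": len(set(non_null)),
    }


def shifted(baseline, current, null_rate_tolerance=0.05):
    """True if the null rate moved past the tolerance or the distinct count changed."""
    return (abs(current["null_rate"] - baseline["null_rate"]) > null_rate_tolerance
            or current["distinct"] != baseline["distinct"])


# Made-up batches: yesterday's load versus today's.
yesterday = [{"plan": "pro"}, {"plan": "free"}, {"plan": "pro"}, {"plan": "free"}]
today = [{"plan": "pro"}, {"plan": None}, {"plan": None}, {"plan": "trial"}]

baseline = column_profile(yesterday, "plan")
current = column_profile(today, "plan")
print(baseline, current, shifted(baseline, current))
```

In this example the structure of every row is unchanged, which is why a contract test would pass while the null rate jumps from 0 to 50 percent.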

The longer piece on combining contract tests with the other detection patterns is at https://137foundry.com/articles/how-to-detect-schema-drift-data-integration-before-silent-drops. The 137Foundry data integration service page has more on how this fits into a broader observability layer.

What this earns you in practice

Once the contract suite is running on a 15-minute schedule with real alerts, schema drift goes from "silent for weeks" to "alerted within an hour." That is the entire value proposition. The fix for any given drift event is still engineering work, but it is hours of work instead of days of remediation, and it happens before the downstream tables have weeks of partial data.

For most pipelines, this is the highest-leverage detection layer you can build in a single sprint. The next-highest is column-level statistical monitoring on the warehouse, which catches the silent drift that contract tests miss.

The Wikipedia entry on schema evolution is a good background read if you want the academic frame for what drift detection is actually solving. The practical frame is simpler: every external source you integrate with will drift on you eventually, and contract tests are the cheapest way to find out fast.

Common mistakes that erode contract tests over time

Two mistakes tend to show up in teams that have run contract tests for more than a year:

Disabling failing tests under deadline pressure. A contract test fails the day before a release. The engineer disables it. The pipeline ships. The drift was real and the disabled test never gets reenabled. Six weeks later, downstream tables are wrong. Make this socially expensive in code review.

Contracting too many fields. A contract that asserts every field, including the ones you do not consume, fails on every legitimate change to a field you never read. Engineers learn to ignore the alerts. Contract the fields your pipeline actually depends on - usually 10 to 30 percent of the available surface area.

Both failure modes look like discipline problems. They are usually design problems. Build the contract test for what your pipeline actually needs, make the fixture regeneration path fast, and the tests stay useful for years.
