DEV Community

Jonny

Contract-First vs Assertion-First: LLM Agent Reliability

When an agent pipeline fails in production, the question is: where was correctness being enforced, and when did it break down?
Two approaches. Different tradeoffs.

Assertion-First

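async def score_company(state: dict) -> dict:
    # Precondition: an upstream stage must have enriched the state.
    assert state.get("enriched"), "must be enriched before scoring"

    # `llm` and `score_prompt` are assumed to be defined elsewhere in the pipeline.
    result = await llm.run(score_prompt, state)

    # Postcondition: the model must return a score.
    assert result.get("score") is not None, "score missing from LLM result"
    return {**state, **result}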

Fast to write. Familiar. Works fine for small pipelines.
Problems at scale: checks scatter across every executor. When a pipeline crashes mid-run, you restart from the beginning and repeat any external API calls that already ran. When an assertion fires, you get a stack trace, not the state that caused it.

Contract-First
Correctness lives outside the executor, in a separate spec:

agent score_agent
  policy
    cap budget_tokens <= 3000
    deny score_company if region == "restricted"

  contract score_contract
    pre  enriched
    post scored

The executor just executes. The runtime evaluates policies and preconditions before the LLM call, and postconditions after it.
If region == "restricted", the model never runs and no tokens are consumed. The runtime writes a dead-letter queue (DLQ) entry with the full context.
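
To make the order of checks concrete, here is a minimal Python sketch of a runtime that wraps an executor. It is not the deed runtime's actual API: `Runtime`, `Contract`, `ContractViolation`, and the in-memory `dlq` list are hypothetical names invented for illustration, and the token budget cap is not modeled.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable

State = dict[str, Any]
Predicate = Callable[[State], bool]
Executor = Callable[[State], Awaitable[State]]


class ContractViolation(Exception):
    """Raised when a deny rule, precondition, or postcondition fails."""


@dataclass
class Contract:  # hypothetical stand-in for the spec in the article
    pre: dict[str, Predicate]
    post: dict[str, Predicate]
    deny: dict[str, Predicate] = field(default_factory=dict)


@dataclass
class Runtime:  # hypothetical; the DLQ is just an in-memory list here
    dlq: list[dict] = field(default_factory=list)

    def _fail(self, stage: str, kind: str, name: str, state: State) -> None:
        # Record the stage, the failed predicate, and a shallow snapshot of the state.
        self.dlq.append({"stage": stage, "predicate": f"{kind}: {name}", "state": dict(state)})
        raise ContractViolation(f"[{stage}] {kind}: {name}")

    async def run(self, stage: str, executor: Executor, contract: Contract, state: State) -> State:
        # Deny rules and preconditions run before the executor, so a
        # violation means the LLM call never happens.
        for name, denied in contract.deny.items():
            if denied(state):
                self._fail(stage, "deny", name, state)
        for name, holds in contract.pre.items():
            if not holds(state):
                self._fail(stage, "pre", name, state)
        new_state = await executor(state)
        # Postconditions run on the executor's output.
        for name, holds in contract.post.items():
            if not holds(new_state):
                self._fail(stage, "post", name, new_state)
        return new_state


calls: list[str] = []


async def score_company(state: State) -> State:
    calls.append(state["name"])  # stands in for the LLM call
    return {**state, "score": 0.8}


score_contract = Contract(
    pre={"enriched": lambda s: bool(s.get("enriched"))},
    post={"scored": lambda s: s.get("score") is not None},
    deny={"restricted_region": lambda s: s.get("region") == "restricted"},
)

runtime = Runtime()
try:
    asyncio.run(runtime.run(
        "score_company", score_company, score_contract,
        {"name": "acme", "enriched": True, "region": "restricted"},
    ))
except ContractViolation as err:
    print(err, "| LLM calls:", len(calls), "| DLQ entries:", len(runtime.dlq))
```

The executor contains no checks at all. Everything that decides whether it may run, and whether its output is acceptable, sits in `score_contract`.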

What You Get on Failure
Assertion-first:


AssertionError: must be enriched before scoring
  File "pipeline.py", line 34

Stack trace. Restart from scratch.
Contract-first:

[taxonomy]  ✕ post: species_identified  →  ContractViolation
             → state preserved
             → DLQ entry: stage, predicate, full context snapshot
             → replay available from: taxonomy

Fix the issue. Replay from the failure point. Completed steps don't re-execute; idempotency keys handle that.
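
One way replay with idempotency keys could work is sketched below, in synchronous Python for brevity. This is an assumption about the general technique, not a description of how deed implements it: `idempotency_key`, `run_pipeline`, the `completed` store, and the stage functions are all hypothetical, and the species value is made up.

```python
import hashlib
import json
from typing import Any, Callable

State = dict[str, Any]
Stage = tuple[str, Callable[[State], State]]


def idempotency_key(run_id: str, stage: str, state: State) -> str:
    # The same run, stage, and input state always produce the same key.
    payload = json.dumps(state, sort_keys=True, default=str)
    return hashlib.sha256(f"{run_id}:{stage}:{payload}".encode()).hexdigest()


def run_pipeline(run_id: str, stages: list[Stage], state: State,
                 completed: dict[str, State]) -> State:
    """Run stages in order, reusing stored results for keys already in `completed`."""
    for name, executor in stages:
        key = idempotency_key(run_id, name, state)
        if key in completed:
            state = completed[key]  # replay: reuse the stored result
            continue
        state = executor(state)     # may raise; earlier results stay stored
        completed[key] = state
    return state


calls = {"enrich": 0}


def enrich(state: State) -> State:
    calls["enrich"] += 1  # stands in for an external API call
    return {**state, "enriched": True}


def broken_taxonomy(state: State) -> State:
    raise RuntimeError("species not identified")


def fixed_taxonomy(state: State) -> State:
    return {**state, "species": "Quercus robur"}  # illustrative value


completed: dict[str, State] = {}

# First attempt: enrich succeeds, taxonomy fails.
try:
    run_pipeline("run-1", [("enrich", enrich), ("taxonomy", broken_taxonomy)],
                 {"id": 7}, completed)
except RuntimeError:
    pass

# Replay after the fix: enrich is skipped, taxonomy runs.
result = run_pipeline("run-1", [("enrich", enrich), ("taxonomy", fixed_taxonomy)],
                      {"id": 7}, completed)
print(result, "| enrich calls:", calls["enrich"])
```

Because the key is derived from the run, the stage, and the input state, the replay finds the stored `enrich` result and picks up at `taxonomy` without repeating the external call.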

When to Use Each
Assertion-first — short pipelines, fast iteration, no replay requirements.
Contract-first — multi-stage pipelines, external side effects, auditability requirements, policy enforcement before execution.
They're not mutually exclusive. Contracts handle pipeline-level invariants. Assertions handle local logic inside executors.

pip install deed-runtime

GitHub: github.com/Deadly-Reiter/deed
