Hui Supat Lertsirigorn

Correctness by Construction: The Inverted Pyramid of Integrity

Stop hiring people to find bugs that your architecture should have made impossible to create.

Architect for correctness-by-construction to make invalid states unrepresentable and you will not have to pay for a "search and rescue" operation every time you release. High-stakes engineering requires structural laws, not post-implementation checklists. If your release cycle depends on a "Risk-Based" trade-off to survive entropy, your design has already failed. True efficiency comes from engineering out the possibility of error at the root so that illegal outcomes are unrepresentable at your decision and persistence boundaries.

Flip the traditional testing model and adopt an Inverted Test Pyramid to ensure every line of code is a direct response to a business promise and you will not have the "specification gaming" issue of vanity unit-test metrics. In the legacy pyramid, thousands of brittle unit tests attempt to catch mistakes after they are made; in the Inverted Pyramid of Integrity, you focus on 100% line coverage derived exclusively from Acceptance Criteria (The Right Cucumber). When you enforce these invariants at the persistence boundary via a Rich Domain Model [1], you move the enforcement from a reactive check to a structural law. If you model a balance as a standard number, you are building a house with no doors and hiring a guard to watch the opening; instead, make the forbidden state unrepresentable so the code cannot even compile a reality your business rules forbid.

// Encode the invariant into the type: callers can only hold "NonNegative" balances.
type NonNegative = number & { readonly __brand: unique symbol };

function asNonNegative(n: number): NonNegative {
  if (!Number.isFinite(n) || n < 0) {
    throw new Error("balance must be >= 0");
  }
  return n as NonNegative;
}

type Account = { balance: NonNegative };

function applyDebit(account: Account, debit: NonNegative): Account {
  // Any attempt to create an invalid state must fail at the boundary.
  const next = account.balance - debit;
  return { balance: asNonNegative(next) };
}

Separate Code Correctness from Contextual Correctness using a Dual-Agent loop to stop AI agents from gaming your Definition of Done. In the era of LLMs, models will readily produce code that looks correct enough to pass a technical test, often finding shortcuts that violate the actual business promise [2]. Use one agent to push for Code Correctness (implementation) and a second, independent agent to enforce Contextual Correctness (intent), and you will not have the specification gaming [3] issue. By forcing the second agent to challenge the implementation using RAG over your global context—Git history, PR discussions, and architectural docs—you ensure the solution isn't just "passing" a check but is valid against the system's total historical intent.
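A minimal sketch of the loop's shape, with hypothetical `ImplementerAgent` and `ReviewerAgent` interfaces standing in for real LLM-backed agents (a production reviewer would retrieve context via RAG before objecting):

```typescript
// Hypothetical interfaces; real implementations would wrap LLM calls
// and a RAG index over Git history, PR discussions, and design docs.
interface ImplementerAgent {
  // Produces a candidate implementation, incorporating prior objections.
  implement(ticket: string, feedback: string[]): string;
}

interface ReviewerAgent {
  // Challenges the candidate against the system's historical intent;
  // returns an objection, or null when intent is satisfied.
  review(ticket: string, candidate: string): string | null;
}

function dualAgentLoop(
  ticket: string,
  implementer: ImplementerAgent,
  reviewer: ReviewerAgent,
  maxRounds = 3
): { candidate: string; accepted: boolean; objections: string[] } {
  const objections: string[] = [];
  let candidate = "";
  for (let round = 0; round < maxRounds; round++) {
    candidate = implementer.implement(ticket, objections);
    const objection = reviewer.review(ticket, candidate);
    if (objection === null) {
      return { candidate, accepted: true, objections };
    }
    objections.push(objection); // feed the objection back into the next round
  }
  // The reviewer never signed off: surface the failure instead of shipping.
  return { candidate, accepted: false, objections };
}
```

The key design choice is that the reviewer cannot be overruled: a candidate that exhausts `maxRounds` without sign-off is rejected, never merged by default.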

Replace the "Quality Theater" of expensive E2E environments with High-Fidelity Fakes to provide identical guardrails at near-zero cost. Traditional E2E testing is a drain on capital; it is slow, non-deterministic, and frequently hides logic errors behind infrastructure flakiness. High-fidelity fakes are in-memory implementations that manage real state transitions and honor the same Consumer-Driven Contracts as your production services. By using fakes, you enable "State-Driven" verification where you can trigger complex business preconditions instantly. This gives you the exact same guardrails as a staging environment but at unit-test speed, ensuring that your business invariants remain deterministic and indestructible without the "Infrastructure Tax."
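To make this concrete, here is a sketch of a high-fidelity fake for a hypothetical `PaymentGateway` contract (the interface, its method names, and the seeding helper are all illustrative, not a real API). The fake is in-memory, but it performs real state transitions and enforces the same invariants the production service promises:

```typescript
// Hypothetical consumer-driven contract, shared by the production
// client and the fake so both are verified against the same promises.
interface PaymentGateway {
  charge(accountId: string, amount: number): { ok: boolean; reason?: string };
  balanceOf(accountId: string): number;
}

// High-fidelity fake: in-memory, deterministic, and invariant-preserving.
class FakePaymentGateway implements PaymentGateway {
  private balances = new Map<string, number>();

  // "State-Driven" verification: establish a complex precondition instantly,
  // instead of scripting it through a slow staging environment.
  seed(accountId: string, balance: number): void {
    if (balance < 0) throw new Error("cannot seed a negative balance");
    this.balances.set(accountId, balance);
  }

  charge(accountId: string, amount: number): { ok: boolean; reason?: string } {
    const balance = this.balances.get(accountId);
    if (balance === undefined) return { ok: false, reason: "unknown account" };
    if (amount <= 0) return { ok: false, reason: "amount must be positive" };
    if (amount > balance) return { ok: false, reason: "insufficient funds" };
    this.balances.set(accountId, balance - amount); // real state transition
    return { ok: true };
  }

  balanceOf(accountId: string): number {
    return this.balances.get(accountId) ?? 0;
  }
}
```

Because the fake honors the same contract, the same contract test suite can run against both implementations, which is what keeps the fake "high-fidelity" rather than a stub that drifts from production behavior.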

Move beyond "checking if it works" to "proving it cannot fail" by replacing the "Security Theater" of outsourced QA with adversarial validation. Run separate, asynchronous sieges so you avoid the circular logic of authors with partial context validating their own flawed assumptions. Deploy Static Analysis to find structural mistakes, Fuzzing to bombard the system with randomized chaos, and Mutation Testing to intentionally break your code and see if your guardrails actually detect the lie.
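A toy fuzzing harness illustrates the idea (a sketch only; real sieges would use a dedicated fuzzer or property-based testing library). It bombards a non-negativity guard, like the one protecting the balance above, with hostile inputs and asserts the guard never leaks an illegal value. The PRNG is seeded so any failure is reproducible:

```typescript
// The invariant under siege: reject anything that is not a finite, >= 0 number.
function guardNonNegative(n: number): number {
  if (!Number.isFinite(n) || n < 0) throw new Error("balance must be >= 0");
  return n;
}

// Fuzz harness: randomized chaos with a deterministic seed (mulberry32 PRNG),
// so a failing input can be replayed exactly.
function fuzzGuard(iterations: number, seed = 42): { accepted: number; rejected: number } {
  let state = seed >>> 0;
  const rand = (): number => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = Math.imul(state ^ (state >>> 15), 1 | state);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };

  let accepted = 0;
  let rejected = 0;
  for (let i = 0; i < iterations; i++) {
    // Mix benign values with hostile edge cases: NaN, infinities, negatives, -0.
    const pool = [rand() * 1e6, -rand() * 1e6, NaN, Infinity, -Infinity, -0, 0];
    const input = pool[Math.floor(rand() * pool.length)];
    let out: number;
    try {
      out = guardNonNegative(input);
    } catch {
      rejected++; // the guard held: illegal input was refused
      continue;
    }
    // If the guard "accepted" something that is not actually >= 0, it lied.
    if (!(out >= 0)) throw new Error(`guard leaked an illegal value: ${input}`);
    accepted++;
  }
  return { accepted, rejected };
}
```

Mutation testing is the inverse exercise: a tool flips `n < 0` to `n <= 0` or deletes the `Number.isFinite` check, then reruns this siege; if no assertion fires, your guardrails never actually detected the lie.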

Enforce these systemic laws to collapse complexity and you will not have the "moving fast and breaking things" liability that plagues modern software. Leadership in the age of AI isn't about overseeing execution—it's about defining the laws of the system. For more complex runtime policies, leverage tools like Open Policy Agent (OPA) [4] to enforce invariants at the persistence layer. When those laws are sound and every line of code is exercised by a business promise, the "bug" becomes a legacy concept: a thing that used to happen when correctness was someone else's job.


References


1. Martin Fowler, "Anemic Domain Model": https://martinfowler.com/bliki/AnemicDomainModel.html
2. DeepMind, "Specification gaming: the flip side of AI ingenuity": http://deepmind.com/blog/specification-gaming-the-flip-side-of-ai-ingenuity
3. Google Research, "A Study on Overfitting in Deep Reinforcement Learning": https://research.google/pubs/a-study-on-overfitting-in-deep-reinforcement-learning/
4. Open Policy Agent, policy-based control for cloud-native environments: https://www.openpolicyagent.org/
