David Ohnstad

Posted on Jun 12 • Originally published at davidohnstad.com

Federated Data Architectures: Accountability Without Authority

#productivity #career #datascience #management

This article was originally published on davidohnstad.com. I cross-post here to reach the Dev.to community.

Why Federated Data Architectures Set PMs Up for Accountability Without Authority

Three weeks after launch, a VP asked David Ohnstad why the revenue attribution dashboard showed conflicting numbers between marketing and sales. The answer: two source systems, two definitions of "closed deal," and zero enforcement mechanism to make either team change their schema. According to Gartner's 2025 Data & Analytics Summit research, 68% of federated data initiatives fail within 18 months—not because the architecture is wrong, but because accountability lives with product managers who have no authority over the data contracts that determine success.

The Henkel case study published in CDO Magazine this April reveals the structural flaw most organizations ignore when they adopt decentralized analytics models. Henkel built governance-aligned data products across business units, celebrated the federated architecture as a win for agility, and then watched data product managers become scapegoats when cross-functional dashboards returned inconsistent results. The governance framework existed. The decentralized teams had autonomy. But nobody owned enforcement—the layer between policy and execution where data contracts either hold or break.

This is the accountability trap: data product managers are responsible for delivering trusted insights, but they don't control the upstream data quality, the engineering sprint priorities that fix schema drift, or the governance mechanisms that enforce standard definitions across teams. When a dashboard shows conflicting revenue numbers, leadership blames the PM who shipped it—not the sales engineering team that changed a field definition without documentation, or the governance board that wrote a policy with no enforcement tooling.

The Failure Pattern: Responsibility Without Remediation Rights

Most federated data architectures fail at the same point: the moment a downstream data product needs to enforce a contract with an upstream source system. The organizational design gives PMs accountability for outcomes (accurate dashboards, trusted metrics, repeatable insights) while giving them zero formal authority over the systems that produce the data. When source data changes without warning—a field gets deprecated, a calculation changes, a new ETL pipeline introduces duplicates—the PM discovers the break only after users report incorrect results.

According to McKinsey's 2024 State of Data & Analytics report, 73% of enterprises now use some form of federated or decentralized data architecture, up from 41% in 2022. The adoption curve is steep. But the same report found that only 28% of those organizations have implemented automated contract enforcement between data producers and consumers. The gap between architectural ambition and operational reality is where data product managers get crushed.

Here's the specific failure mode David Ohnstad has seen play out across three organizations: a data product team builds a multi-source dashboard, defines clear data contracts with each upstream system, documents the schema requirements, and launches successfully. Six weeks later, an upstream team makes a "minor" change to improve their own reporting—renaming a field, adjusting a timestamp format, changing how nulls are handled. That change breaks the downstream dashboard. Users see errors or, worse, silent data corruption that produces plausible but incorrect results. The PM is held accountable for the broken product, but they have no standing to block the upstream change, no automated validation to catch the break before users see it, and no organizational mandate to enforce contract compliance across teams they don't manage.

The traditional response is "better communication" or "tighter governance documentation." Both are necessary. Neither solves the enforcement gap. A Slack thread asking an upstream team to please revert their schema change is not an enforcement mechanism. A governance wiki page documenting field definitions is not a contract that prevents breaking changes. The PM is accountable, but powerless to prevent the exact failures they'll be blamed for.

The Contract Enforcement Layer Framework

Most organizations treat data contracts as documentation artifacts: wiki pages, Confluence entries, or spreadsheet tabs that define what each field means and how it should be structured. That model assumes compliance is a cultural problem solved by clarity and goodwill. It's not. Compliance is a tooling problem. If breaking a data contract doesn't trigger an automated alert and block a deployment, the contract is a suggestion, not an enforceable agreement. David Ohnstad addressed this gap with a five-layer enforcement model he calls the Contract Enforcement Layer Framework, a system that moves data quality enforcement from PMs to the pipeline itself.

Layer 1: Contract Registration. Every upstream data source must register a formal schema contract before a downstream product can depend on it. This is not documentation—this is a versioned API-style contract stored in a central registry that both producer and consumer teams reference. The contract defines field names, data types, null handling, expected ranges, update frequency, and the contact owner for each source. If a field isn't in the registered contract, the downstream pipeline rejects it. This forces upstream teams to make schema changes explicit and visible, not silent and discovered later.
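To make Layer 1 concrete, here is a minimal sketch of a versioned registry. It is an in-memory stand-in, not the author's tooling: the names `FieldSpec`, `Contract`, `ContractRegistry`, the `sales_crm` source, and the rule that versions must increase by one are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field in a contract: type, null handling, optional expected range."""
    dtype: type
    nullable: bool = False
    min_value: float | None = None
    max_value: float | None = None


@dataclass(frozen=True)
class Contract:
    source: str            # upstream system name (hypothetical)
    version: int           # assumed rule: starts at 1, increases by 1 per change
    owner: str             # contact owner for the source
    update_frequency: str  # e.g. "daily"
    fields: dict[str, FieldSpec]


class ContractRegistry:
    """In-memory stand-in for a central, versioned contract registry."""

    def __init__(self) -> None:
        self._versions: dict[str, list[Contract]] = {}

    def register(self, contract: Contract) -> None:
        history = self._versions.setdefault(contract.source, [])
        expected = len(history) + 1
        if contract.version != expected:
            raise ValueError(
                f"{contract.source}: expected version {expected}, got {contract.version}"
            )
        history.append(contract)

    def latest(self, source: str) -> Contract:
        return self._versions[source][-1]

    def unregistered_fields(self, source: str, record: dict) -> set[str]:
        """Fields present in a record but absent from the registered contract."""
        return set(record) - set(self.latest(source).fields)


registry = ContractRegistry()
registry.register(Contract(
    source="sales_crm",
    version=1,
    owner="sales-engineering",
    update_frequency="daily",
    fields={"deal_id": FieldSpec(str), "amount": FieldSpec(float, min_value=0.0)},
))

# A record carrying a field nobody registered is flagged rather than silently accepted.
extra = registry.unregistered_fields(
    "sales_crm", {"deal_id": "D-1", "amount": 10.0, "close_dt": "2025-01-01"}
)
```

In this sketch `extra` contains only `close_dt`, the field the upstream team added without registering it. A real registry would persist contracts in version control instead of a Python dict.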

Layer 2: Automated Validation Gates. Every data ingestion pipeline runs contract validation before processing begins. If the incoming data violates the registered schema—wrong data type, unexpected null values, missing required fields, values outside expected ranges—the pipeline halts and triggers an alert to both the producer and consumer teams. This is the enforcement step most organizations skip. Without automated validation, a contract is just a document someone can ignore. With validation gates, breaking a contract stops the pipeline, surfaces the issue immediately, and prevents bad data from reaching downstream products.
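A validation gate along these lines might look like the sketch below. The `CONTRACT` layout, field names, range bounds, and the `alert` callback are hypothetical; the only behavior taken from the text is that any violation alerts both teams and halts the pipeline before processing.

```python
class ContractViolation(Exception):
    """Raised to halt the pipeline when a batch breaks its contract."""


# Hypothetical contract: field -> (type, nullable, (min, max) or None)
CONTRACT = {
    "deal_id": (str, False, None),
    "amount": (float, False, (0.0, 10_000_000.0)),
    "closed_at": (str, True, None),
}


def find_violations(record: dict, contract: dict) -> list[str]:
    """Check one record for missing fields, nulls, wrong types, and range breaks."""
    problems = []
    for name, (dtype, nullable, bounds) in contract.items():
        if name not in record:
            problems.append(f"missing required field: {name}")
            continue
        value = record[name]
        if value is None:
            if not nullable:
                problems.append(f"unexpected null: {name}")
            continue
        if not isinstance(value, dtype):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
            continue
        if bounds is not None and not (bounds[0] <= value <= bounds[1]):
            problems.append(f"out of range: {name}={value}")
    return problems


def validation_gate(batch: list[dict], contract: dict, alert) -> list[dict]:
    """Return the batch if every record passes; otherwise alert and halt."""
    violations = [
        (index, problem)
        for index, record in enumerate(batch)
        for problem in find_violations(record, contract)
    ]
    if violations:
        alert(violations)  # stand-in for notifying producer and consumer teams
        raise ContractViolation(f"{len(violations)} violation(s); pipeline halted")
    return batch


alerts = []
good = [{"deal_id": "D-1", "amount": 1200.0, "closed_at": None}]
bad = [
    {"deal_id": "D-2", "amount": -5.0, "closed_at": "2025-03-01"},
    {"deal_id": None, "amount": 10.0},
]

passed = validation_gate(good, CONTRACT, alerts.append)
try:
    validation_gate(bad, CONTRACT, alerts.append)
    halted = False
except ContractViolation:
    halted = True
```

The clean batch passes through untouched, while the second batch raises before any downstream step runs. Note that this sketch treats every field in the contract as required to be present, and an integer `amount` would be reported as a wrong type because the check is a strict `isinstance` test.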

Layer 3: Version-Controlled Schema Changes. When an upstream team needs to change a field definition, they must submit a schema change request through the contract registry. The request triggers notifications to every downstream consumer that depends on that field, includes a mandatory migration window (typically 30 days), and requires explicit acknowledgment from each consumer before the change can be deployed. This is the step that feels bureaucratic to agile-minded teams, but it's the step that prevents the "I didn't know that would break your dashboard" failure mode. Schema changes are treated like breaking API changes in a microservices architecture—documented, versioned, communicated, and coordinated.
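The change-request flow can be sketched as a small state check. Everything named here (`SchemaChangeRequest`, `consumers_of`, the dependency map, the team names) is hypothetical, and the sketch assumes a change is deployable only once the 30-day window has fully elapsed and every dependent consumer has acknowledged it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, timedelta

MIGRATION_WINDOW = timedelta(days=30)  # the "typically 30 days" window


@dataclass
class SchemaChangeRequest:
    source: str
    field_name: str
    description: str
    submitted_on: date
    consumers: frozenset[str]  # downstream teams that depend on the field
    acknowledged: set[str] = field(default_factory=set)

    def acknowledge(self, consumer: str) -> None:
        if consumer not in self.consumers:
            raise ValueError(f"{consumer} does not depend on {self.field_name}")
        self.acknowledged.add(consumer)

    def can_deploy(self, today: date) -> bool:
        """True only after the window has elapsed and all consumers acknowledged."""
        window_elapsed = today >= self.submitted_on + MIGRATION_WINDOW
        return window_elapsed and self.acknowledged == set(self.consumers)


def consumers_of(dependencies: dict[str, set[str]], source: str, field_name: str) -> frozenset[str]:
    """Look up which consumers declared a dependency on source.field_name."""
    return frozenset(dependencies.get(f"{source}.{field_name}", set()))


# Hypothetical dependency map maintained alongside the contract registry.
deps = {"sales_crm.renewal_date": {"arr_dashboard", "finance_analytics"}}

request = SchemaChangeRequest(
    source="sales_crm",
    field_name="renewal_date",
    description="store renewal dates in UTC",
    submitted_on=date(2025, 1, 6),
    consumers=consumers_of(deps, "sales_crm", "renewal_date"),
)

request.acknowledge("arr_dashboard")
missing_ack = request.can_deploy(date(2025, 3, 1))     # window elapsed, one ack missing
request.acknowledge("finance_analytics")
inside_window = request.can_deploy(date(2025, 1, 20))  # all acks, window still open
after_window = request.can_deploy(date(2025, 2, 5))    # all acks, window elapsed
```

Only the last check allows deployment. Whether the window should be waivable once every consumer has acknowledged is a policy choice the text does not settle; this sketch takes the stricter reading.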

Layer 4: Consumer-Side Escape Hatches. Even with contracts and validation, production systems sometimes need flexibility. The framework includes a temporary override mechanism: a downstream PM can accept schema violations for a defined period (maximum 7 days) to avoid blocking critical reporting, but the override triggers daily escalation alerts to leadership and must include a documented remediation plan. This prevents the contract from becoming a bureaucratic bottleneck while maintaining visibility and urgency around the compliance gap. The override is not a permanent workaround—it's a bridge to a fix, and the escalation ensures it doesn't become permanent technical debt.
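One way to keep the escape hatch time-bound is to make the limits part of the object itself. The sketch below assumes a hypothetical `Override` record; the 7-day cap, the required remediation plan, and the daily escalation come from the text, while the field names and the example contract are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_OVERRIDE_DAYS = 7  # the maximum override period described above


@dataclass(frozen=True)
class Override:
    contract: str
    requested_by: str
    remediation_plan: str
    starts_on: date
    days: int

    def __post_init__(self) -> None:
        # Reject overrides that are too long or that have no plan attached.
        if not 1 <= self.days <= MAX_OVERRIDE_DAYS:
            raise ValueError(f"override must last 1 to {MAX_OVERRIDE_DAYS} days")
        if not self.remediation_plan.strip():
            raise ValueError("a documented remediation plan is required")

    @property
    def expires_on(self) -> date:
        return self.starts_on + timedelta(days=self.days)

    def is_active(self, today: date) -> bool:
        return self.starts_on <= today < self.expires_on

    def escalation_dates(self) -> list[date]:
        """One escalation alert per day while the override is active."""
        return [self.starts_on + timedelta(days=n) for n in range(self.days)]


override = Override(
    contract="crm_export",
    requested_by="data-product-pm",
    remediation_plan="CRM team restores the agreed timestamp format",
    starts_on=date(2025, 3, 28),
    days=7,
)

active_on_last_day = override.is_active(date(2025, 4, 3))
active_after_expiry = override.is_active(date(2025, 4, 4))
escalations = override.escalation_dates()

try:
    Override("crm_export", "data-product-pm", "fix later", date(2025, 3, 28), days=8)
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
```

Because the validation lives in `__post_init__`, an 8-day override or one without a remediation plan cannot be constructed at all, which keeps the hatch from quietly turning into a permanent workaround.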

Layer 5: Enforcement Dashboards and SLA Tracking. The final layer is observability. Every contract violation, schema change request, override activation, and pipeline halt gets logged in a centralized enforcement dashboard that tracks compliance by team, source system, and time period. This dashboard becomes the accountability mechanism leadership actually needs: instead of asking why a PM's dashboard broke, they can see which upstream team violated a contract and how long the violation persisted. The PM is no longer the scapegoat—the enforcement layer makes accountability transparent and data-driven. According to Forrester's 2025 Data Governance Trends report, organizations that implement automated contract enforcement reduce data quality incidents by 64% within the first year and cut mean time to resolution by 52%.
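The aggregation behind such a dashboard can be quite small. This sketch assumes a hypothetical `EnforcementEvent` log record and invented teams and timestamps; it shows only two of the rollups mentioned above, violations per team and mean time to resolution.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EnforcementEvent:
    kind: str      # "violation", "change_request", "override", or "pipeline_halt"
    team: str      # team responsible for the source system
    source: str
    opened_at: datetime
    resolved_at: datetime | None = None  # None while still open


def violations_by_team(events: list[EnforcementEvent]) -> Counter:
    """Count contract violations per responsible team."""
    return Counter(e.team for e in events if e.kind == "violation")


def mean_hours_to_resolution(events: list[EnforcementEvent]) -> float | None:
    """Average hours from open to resolve, over resolved violations only."""
    durations = [
        (e.resolved_at - e.opened_at).total_seconds() / 3600
        for e in events
        if e.kind == "violation" and e.resolved_at is not None
    ]
    return sum(durations) / len(durations) if durations else None


# Illustrative log entries, not real data.
events = [
    EnforcementEvent("violation", "sales-eng", "sales_crm",
                     datetime(2025, 3, 1, 9), datetime(2025, 3, 1, 15)),
    EnforcementEvent("violation", "sales-eng", "sales_crm",
                     datetime(2025, 3, 3, 9), datetime(2025, 3, 3, 11)),
    EnforcementEvent("violation", "crm-team", "crm_export",
                     datetime(2025, 3, 4, 9)),
    EnforcementEvent("override", "crm-team", "crm_export",
                     datetime(2025, 3, 4, 10)),
]

per_team = violations_by_team(events)
mttr_hours = mean_hours_to_resolution(events)
```

With this shape, the question leadership asks changes from "whose dashboard broke" to a lookup in `per_team`. Open violations are excluded from the mean here, which flatters the number; a production version would probably report them separately.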

The counterintuitive step here is Layer 4—the escape hatch. Most governance frameworks try to make contracts rigid and absolute, which leads to teams bypassing the system entirely when they face production pressure. The escape hatch acknowledges reality: sometimes you need to ship a critical report even when upstream data is imperfect. But it makes the tradeoff visible, time-bound, and escalated so it doesn't become a permanent workaround that erodes the entire contract model.

How David Ohnstad Built This at Scale Using arr-guardian

When David Ohnstad joined Veeam's data product organization, the team was running a classic federated architecture: multiple business units owned their own data pipelines, a central analytics team provided shared infrastructure, and data product managers were responsible for building cross-functional dashboards that stitched together insights from sales, marketing, customer success, and product usage data. The model worked well for localized reporting within a single business unit. It failed catastrophically for any dashboard that needed consistent definitions across teams.

The breaking point came during a quarterly business review when the CEO asked why ARR (annual recurring revenue) numbers differed between the sales dashboard, the finance dashboard, and the customer success dashboard. Three teams, three source systems, three slightly different definitions of what constituted a "closed deal" and when revenue should be recognized. The discrepancies were small—2-3% variance—but the credibility damage was massive. Leadership questioned whether any of the data products could be trusted. The data product managers took the heat, even though they had documented the definitional differences and escalated the issue months earlier. Documentation without enforcement meant the issue persisted until it became a crisis.

David Ohnstad responded by building an enforcement layer using a tool he called arr-guardian, a contract validation system that sat between upstream source systems and downstream analytics pipelines. The tool implemented all five layers of the Contract Enforcement Layer Framework. Every source system that fed ARR data had to register a schema contract that defined exactly which fields contributed to the ARR calculation, how nulls were handled, what timestamp formats were required, and who owned the data quality for that source. The contract wasn't a wiki page; it was a JSON schema stored in a version-controlled repository that both the source system and the analytics pipeline referenced.
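The article does not show arr-guardian's actual contract format, so the following is only a guess at what a JSON contract of this kind could contain. Every key (`in_arr`, `format`, `nullable`), field name, and value is hypothetical; the point is that a machine-readable file lets both sides answer "which fields feed ARR" and "is this timestamp in the agreed format" from the same source.

```python
import json
from datetime import datetime

# Hypothetical contract document, as it might sit in a version-controlled repo.
CONTRACT_JSON = """
{
  "source": "sales_crm",
  "version": 3,
  "owner": "sales-engineering",
  "fields": {
    "deal_id":      {"type": "string",    "nullable": false, "in_arr": false},
    "annual_value": {"type": "number",    "nullable": false, "in_arr": true},
    "renewal_date": {"type": "timestamp", "nullable": true,  "in_arr": false,
                     "format": "%Y-%m-%d"}
  }
}
"""

contract = json.loads(CONTRACT_JSON)


def timestamp_matches(value: str, fmt: str) -> bool:
    """True if the value parses under the strptime format the contract requires."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Which fields contribute to the ARR calculation, according to the contract.
arr_fields = [name for name, spec in contract["fields"].items() if spec["in_arr"]]

# A timestamp in the agreed format passes; a drifted format does not.
required_format = contract["fields"]["renewal_date"]["format"]
agreed = timestamp_matches("2025-06-30", required_format)
drifted = timestamp_matches("06/30/2025", required_format)
```

Because the producer and the consumer both read the same file, a timestamp format change shows up as a failed check at ingestion instead of a silently wrong renewal date in a dashboard.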

When the sales engineering team needed to change how they tracked renewal dates—a reasonable operational improvement—they submitted a schema change request through arr-guardian. The system automatically identified that the downstream ARR dashboard depended on that field, triggered notifications to the data product team and the finance analytics team, and required explicit acknowledgment before the change could deploy. The 30-day migration window gave downstream teams time to adjust their pipelines, test the new logic, and validate that the change wouldn't break existing reports. The change still happened, but it happened with coordination and visibility instead of as a surprise that broke production dashboards.

The escape hatch layer proved essential during an end-of-quarter reporting crunch when a customer success data pipeline failed validation because an upstream CRM export changed a timestamp format. The pipeline would normally halt and block the report. The data product manager activated a 7-day override, allowing the report to proceed with a documented caveat, and escalated the timestamp issue to the CRM team with daily reminders until it was fixed. The override prevented a reporting crisis, the escalation ensured the issue didn't get ignored, and the enforcement dashboard gave leadership full visibility into both the problem and the remediation plan. Three days later, the CRM team fixed the timestamp format, the override was deactivated, and the contract was back in compliance.

Within six months of deploying arr-guardian, the data product team reduced schema-related dashboard failures by 71%, cut mean time to detect data quality issues from 11 days to 4 hours, and shifted accountability conversations from "why did your dashboard break" to "which team violated the contract and when will it be fixed." The tool didn't eliminate all data quality issues—upstream systems still had bugs, requirements still changed, and edge cases still surfaced. But it moved enforcement from the PM's Slack DMs to an automated system with clear accountability, documented exceptions, and transparent tracking. Leadership stopped blaming data product managers for breaks they didn't cause and couldn't prevent. Instead, they started holding source system owners accountable for maintaining the contracts their downstream consumers depended on.

Stop Treating Data Contracts as Documentation

Most organizations treat data contracts as a governance documentation exercise—something to fill out during planning sessions and reference when things break. That model assumes the problem is awareness: if everyone knows what the contract says, compliance will follow. That assumption is wrong. Compliance doesn't fail because teams don't know the rules. It fails because breaking the rules has no immediate consequence, and following the rules has no immediate reward. A downstream PM discovering a broken dashboard three weeks after an upstream schema change is not an enforcement mechanism—it's a delayed failure signal that punishes the wrong person.

The contrarian claim David Ohnstad makes is this: data contracts without automated enforcement are worse than no contracts at all, because they create the illusion of accountability while systematically setting data product managers up to fail. A documented schema that nobody validates is a lie the organization tells itself—a promise of data quality with no mechanism to keep that promise. When the dashboard breaks, leadership points to the contract and asks the PM why they didn't enforce it. But the PM has no enforcement authority. They can't block an upstream deployment. They can't require schema change notifications. They can't automatically halt a pipeline when validation fails. The contract gave them accountability without giving them the tools to deliver on it.

According to IDC's 2025 Data Trust and Quality Survey, 82% of enterprises report having documented data contracts or schema definitions, but only 19% have automated systems that validate those contracts before data enters production pipelines. The 63-point gap between documentation and enforcement is where data product managers get trapped. They're responsible for delivering trusted insights using data they don't control, from systems they don't manage, with contracts they can't enforce. When the inevitable failure happens, the documented contract becomes evidence of the PM's negligence rather than evidence of a broken enforcement model.

The solution is not better documentation. The solution is treating data contracts like API contracts in a microservices architecture: versioned, validated, and enforced automatically. When a microservice tries to call another service with an incompatible request format, the API gateway rejects the call immediately—before bad data enters the system. The same model applies to data pipelines. When an upstream system sends data that violates the registered schema, the pipeline should reject it immediately and alert both teams. That's enforcement. Everything else is just paperwork.

What is the difference between a data contract and a schema definition?

A schema definition describes the structure of a dataset—field names, data types, and formats. A data contract is a versioned, enforceable agreement between a data producer and consumer that includes the schema plus validation rules, update frequency, ownership, and what happens when violations occur. Contracts add accountability; schemas just document structure.

How do you enforce data contracts in a federated architecture?

Enforcement requires automated validation gates that run before data enters downstream pipelines. When incoming data violates the registered contract, the pipeline halts and triggers alerts to both producer and consumer teams. This prevents bad data from reaching production and makes contract violations immediately visible rather than discovered weeks later through broken dashboards.

Why do data product managers get blamed when upstream data changes break dashboards?

Because most organizations give PMs accountability for data quality outcomes without authority over upstream systems or enforcement mechanisms. When source data changes without coordination, PMs discover the break only after users report it. Without automated contract validation, the PM has no way to prevent or catch the failure before it impacts production reporting.

For more on how AI agents complicate this enforcement layer when they autonomously generate insights without validation checkpoints, see David Ohnstad on AI and enterprise SaaS. For perspectives on building leadership structures that support federated teams without requiring managers to be technical experts, explore David Ohnstad's woodworking and making, where similar principles of clear contracts and enforcement mechanisms apply to physical builds.

Two Takeaways and One Question

For practitioners: If you're a data product manager in a federated architecture, your first priority is not building dashboards—it's building the enforcement layer that makes data contracts real. Document the schema, yes. But also implement automated validation gates, version-controlled change management, and observability dashboards that track contract compliance by team. Without enforcement tooling, you're accountable for failures you can't prevent. Build the tooling or escalate the gap to leadership as a blocker to trusted data products.

For leaders: Stop holding data product managers accountable for data quality issues caused by upstream teams that violate undocumented or unenforced contracts. If your organization has adopted a federated data architecture, you must invest in the enforcement layer—automated validation, schema registries, change notification systems, and compliance tracking. Accountability without authority is a recipe for scapegoating. Either give PMs the enforcement tools they need, or restructure accountability to include the source system owners who control the data quality you're demanding.

When did you last audit whether your data contracts are actually enforced, or just documented in a wiki that nobody checks until a dashboard breaks in production?

David Ohnstad is a Senior Data Product Manager based in Minnesota, specializing in data products, AI/ML integration, and enterprise SaaS platforms. Follow his work at github.com/davidohnstad40-netizen.
