In any sufficiently large distributed system, data reconciliation is the dark matter of engineering — invisible, pervasive, and holding everything together through mechanisms nobody fully understands.
Rule-based reconciliation works until it doesn't. Rule engines break on ambiguity, cannot handle semantic equivalence across schema versions, and generate false positives at scale that overwhelm operations teams. AI — specifically embedding-based similarity and LLM classification — fills the gap. Not as a replacement, but as a layer that handles what rules cannot.
Where Traditional Reconciliation Breaks Down
Eventual consistency windows — a naive reconciliation job that diffs at a point in time generates thousands of false positives that are self-healing within seconds. The rule engine cannot distinguish transient inconsistency from legitimate divergence.
Cross-service schema drift — Service A stores an address as { street, city, state, zip }. Service B stores it as { addressLine1, municipality, postalCode }. Semantically equivalent. A field-level comparator flags every record as mismatched.
Semantic equivalence in free-text — "Acme Corporation" vs "ACME Corp." vs "Acme Corp (formerly Roadrunner Supplies)". Rule-based systems cannot reason about semantic identity at scale.
Volume-driven false positive fatigue — at millions of records per day, even 0.1% false positives generate thousands of alerts. Real issues get buried. The reconciliation system becomes theater.
The Architecture: Rules First, AI at the Boundary
The pattern is not AI-first. It is rules-first, AI at the boundary:
- Deterministic mismatch detection — checksums, field comparisons, primary key matching
- High-confidence matches/mismatches — auto-resolve or route to correction (no AI needed)
-
Ambiguous cases → AI classification layer:
- Embedding similarity — detect semantic equivalence across schema variations
- LLM classification — reason about why a mismatch exists
Embedding-Based Similarity
Serialize records into schema-agnostic text representations before embedding. Compute cosine similarity. Calibrate thresholds against labeled data:
- ≥ 0.95 → auto-resolve as equivalent
- 0.80 – 0.95 → route to LLM classification
- < 0.80 → high-confidence mismatch, route to correction
The thresholds are not universal — calibrate against 500–1000 manually classified record pairs from your actual data.
LLM Classification for the Ambiguous Band
For the 5–15% of mismatches that fall in the ambiguous range, an LLM reasons about context that vector distance cannot capture. Classifications: equivalent, stale_copy, legitimate_divergence, data_corruption.
Cost management: route only the ambiguous band to the LLM. Batch where latency allows. Cache results for record pairs re-evaluated in subsequent cycles.
Where AI Should Never Be Trusted
- Financial and compliance records — dollar amount disagreements are correctness errors, not semantic questions
- Primary key and identity resolution — AI suggestions acceptable; auto-resolution without human sign-off is not
- Any decision that must be explainable to a regulator — "87% confidence" is not an audit-compliant explanation
The Key Insight
AI in reconciliation is a judgment layer, not a trust layer. It handles ambiguous cases that rules cannot, reduces volume reaching human review, and provides structured reasoning. The deterministic foundation must remain intact.
A reconciliation system you cannot audit is worse than one that generates false positives. Build the observability before you build the AI.
Read the Full Article
This is a summary of my deep dive into AI-assisted data reconciliation. The full article covers the complete architecture with implementation examples:
👉 AI-Assisted Data Reconciliation at Scale — Full Article
The full article includes:
- Where traditional reconciliation breaks down (4 failure modes)
- Full architecture diagram with rules-first, AI-at-boundary pattern
- Embedding-based similarity implementation (Python, OpenAI embeddings)
- LLM classification prompt pattern with structured JSON output
- Observation window pattern for filtering eventual consistency false positives
- Hard boundaries where AI should never auto-resolve
- Observability patterns with structured logging
- Production deployment checklist
Top comments (0)