A pattern for AI-generated, human-verified project intelligence in real time
Kiprono Ngetich
Ontology is the difference between data and intelligence. A static schema is a prison. A living ontology, one that evolves, respects human override, and traces provenance, is the only way to make AI useful in operational environments.
This document describes an ontological pattern built for project status intelligence. But the pattern generalizes.
Any domain where:
unstructured human communication contains structured intent
AI can propose, but humans must verify
real-time collaboration is non-negotiable
auditability is a compliance requirement
...can use this pattern.
The core tension
Every operational ontology faces a fundamental tension:
The AI can extract structure from chaos, but the AI is often wrong. The human knows the truth, but the human won't maintain the ontology.
Traditional solutions fail in one of two directions:
AI-first: The ontology is generated automatically. Humans are viewers. The system drifts. Trust erodes.
Human-first: The ontology is manually maintained. AI is ignored. The system is accurate but stale. Humans burn out.
A third way exists.
The pattern: provenance as the primary axis
Most ontologies organize around what things are. This one organizes around how we know what we know.
Every entity carries a provenance flag: isManuallyEdited.
This single boolean transforms the ontology from a static classification into a dynamic trust layer.
The ontology in minimal form:
WeekData
- contains → Task (AI-extracted)
- generates → StatusRow (human-editable)
- has flag → isManuallyEdited (boolean)
- has lineage → sourceTask (reference)
The critical insight: AI-generated and human-verified are not separate entity types. They are the same entity type with different provenance states.
This means:
The query layer can filter by provenance (show me only human-verified rows)
The AI layer can learn from provenance (rows where isManuallyEdited=true are training data)
The compliance layer can audit provenance (every field has a traceable source: AI or human)
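As a rough illustration of the "same entity type, different provenance state" idea, here is a minimal Python sketch. The class layout, the snake_case names (`is_manually_edited` mirrors `isManuallyEdited`), the single `content` dict standing in for the content fields, and the sample values are all assumptions made for the example, not the original implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StatusRow:
    # Content fields are collapsed into one dict for brevity (assumption).
    content: dict
    source_task: Optional[str] = None   # lineage back to the AI-extracted task
    is_manually_edited: bool = False    # provenance flag: False = AI, True = human


@dataclass
class WeekData:
    week_number: int
    year: int
    raw_message: str
    rows: list = field(default_factory=list)


def human_verified(week: WeekData) -> list:
    """Return only the rows a human has overridden."""
    return [row for row in week.rows if row.is_manually_edited]


# Made-up sample data, for illustration only.
week = WeekData(
    week_number=12,
    year=2025,
    raw_message="API migration is done, billing export slipped to next week",
    rows=[
        StatusRow({"item": "API migration", "status": "done"}, source_task="task-1"),
        StatusRow({"item": "Billing export", "status": "delayed"},
                  source_task="task-2", is_manually_edited=True),
    ],
)

verified = human_verified(week)
print([row.content["item"] for row in verified])  # ['Billing export']
```

Because both rows are the same `StatusRow` type, the provenance filter is a one-line predicate rather than a join across separate AI and human tables.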
The three ontological invariants
Invariant 1: The POST-once constraint
An AI generation occurs at most once per unique domain entity (in this case, per week).
Why this matters for ontology: Most systems treat AI as an on-demand service. Every view triggers generation. This makes provenance meaningless — there is no "original" AI state to compare against.
This invariant creates a fixed point. The first AI output is frozen as the baseline. All subsequent human edits are deltas against that baseline.
This is the difference between streaming intelligence (ephemeral, unrepeatable) and settled intelligence (auditable, comparable, learnable).
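One way to picture the POST-once constraint is a store that runs the AI on the first request for a week and returns the frozen baseline on every later request. The sketch below is an in-memory stand-in; `GenerationStore`, `get_or_generate`, and `fake_extract` are hypothetical names, and the comma-splitting "extractor" is only a placeholder for a real model call.

```python
class GenerationStore:
    """In-memory stand-in for the persistence layer (assumption)."""

    def __init__(self, extract):
        self._extract = extract   # hypothetical AI extraction callable
        self._baselines = {}      # (week_number, year) -> frozen AI output
        self.calls = 0            # how many times the AI actually ran

    def get_or_generate(self, week_number, year, raw_message):
        key = (week_number, year)
        if key not in self._baselines:
            # First request for this week: run the AI once and freeze the result.
            self.calls += 1
            self._baselines[key] = tuple(self._extract(raw_message))
        # Every later request returns the same baseline without calling the AI.
        return self._baselines[key]


def fake_extract(raw_message):
    # Placeholder for a real model call: one "task" per comma-separated clause.
    return [part.strip() for part in raw_message.split(",")]


store = GenerationStore(fake_extract)
first = store.get_or_generate(12, 2025, "API migration done, billing export slipped")
second = store.get_or_generate(12, 2025, "a different message that is ignored")
print(first == second, store.calls)  # True 1
```

Storing the baseline as a tuple is a small design choice: it makes accidental in-place mutation of the "original" AI state harder.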
Invariant 2: The override-is-sacred rule
When isManuallyEdited transitions from false to true, the system records the original AI value but stops surfacing it in the default view.
Why this matters: In most AI systems, human feedback is treated as a training signal — something that eventually improves the model. In operational environments, that latency is unacceptable. The human correction must be immediately authoritative.
The ontology enforces this at the storage layer, not just the application layer. Even if the AI reruns, it cannot overwrite a field where isManuallyEdited is true.
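A minimal sketch of the override rule, assuming a row-level flag and a single `status` field for brevity. The function names and the `edit_log` structure are illustrative; in a real system the guard in `apply_ai_update` would presumably live in the storage layer (for example, as a conditional write) rather than in application code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StatusRow:
    status: str
    is_manually_edited: bool = False
    edit_log: list = field(default_factory=list)   # append-only history


def apply_human_edit(row, new_status, username):
    # Record the previous value before overwriting, then set the flag.
    row.edit_log.append({
        "old_value": row.status,
        "new_value": new_status,
        "username": username,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    row.status = new_status
    row.is_manually_edited = True


def apply_ai_update(row, new_status):
    """Return True if the AI value was written, False if it was rejected."""
    if row.is_manually_edited:
        return False          # human override wins; the AI write is dropped
    row.status = new_status
    return True


row = StatusRow(status="on track")
apply_human_edit(row, "delayed", username="alice")
accepted = apply_ai_update(row, "on track")   # a later AI rerun
print(accepted, row.status, row.edit_log[0]["old_value"])  # False delayed on track
```

The human value is authoritative the moment it is written, and the original AI value survives in the log for audit.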
Invariant 3: The ephemeral presence boundary
User presence — who is looking at what right now — is explicitly excluded from persistence.
Why this matters: Ontologies have a tendency to capture everything, including ephemeral state. This creates two problems:
Storage grows with activity, not with domain complexity
Queries become cluttered with irrelevant "current state"
Declaring presence out-of-ontology forces it into a separate channel (WebSocket presence messages). The ontology remains focused on settled truth.
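The boundary can be sketched as two code paths that never meet: an in-memory registry for presence and a serializer that writes only settled fields. `PresenceRegistry`, `serialize_for_storage`, and the week key format are hypothetical; the WebSocket transport that would feed `join` and `leave` is left out.

```python
import json


class PresenceRegistry:
    """Tracks who is viewing which week. Lives in memory only."""

    def __init__(self):
        self._viewers = {}     # week key -> set of usernames

    def join(self, week_key, username):
        self._viewers.setdefault(week_key, set()).add(username)

    def leave(self, week_key, username):
        self._viewers.get(week_key, set()).discard(username)

    def viewers(self, week_key):
        return sorted(self._viewers.get(week_key, set()))


def serialize_for_storage(week):
    # Only settled fields are written; presence is never part of the payload.
    persisted_fields = ("week_number", "year", "raw_message", "rows")
    return json.dumps({name: week[name] for name in persisted_fields})


presence = PresenceRegistry()
presence.join("2025-W12", "alice")

week = {"week_number": 12, "year": 2025,
        "raw_message": "API migration is done", "rows": []}
stored = serialize_for_storage(week)

print(presence.viewers("2025-W12"))  # ['alice']
print("alice" in stored)             # False
```

If the process restarts, presence is simply rebuilt from live connections, while the stored week is unaffected.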
What this enables
For AI operations
The provenance flag creates a natural training loop:
AI generates initial state
Human corrects specific fields
System compares original vs. edited
Corrections become labeled training data
The ontology preserves the alignment between input (raw message) and corrected output (edited row). This is significantly more valuable than generic human feedback — the model can learn why it was wrong by re-examining the original message in light of the correction.
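The loop above can be sketched as a function that turns an append-only edit log into labeled examples. The `row_id` and `field` keys are assumptions added so that repeated edits to the same field can be grouped; the first edit's old value is taken as the AI baseline and the last edit's new value as the correction.

```python
def build_training_examples(week):
    """Pair the raw message with the AI's original value and the final correction."""
    first_old = {}
    last_new = {}
    for edit in week["edits"]:            # append-only, so already chronological
        key = (edit["row_id"], edit["field"])
        first_old.setdefault(key, edit["old_value"])   # AI baseline
        last_new[key] = edit["new_value"]              # latest human value
    return [
        {
            "input": week["raw_message"],
            "field": key[1],
            "ai_output": first_old[key],
            "corrected_output": last_new[key],
        }
        for key in first_old
    ]


# Made-up sample data: the same field was corrected twice.
week = {
    "raw_message": "billing export slipped to next week",
    "edits": [
        {"row_id": 0, "field": "status", "old_value": "on track", "new_value": "at risk"},
        {"row_id": 0, "field": "status", "old_value": "at risk", "new_value": "delayed"},
    ],
}

examples = build_training_examples(week)
print(len(examples), examples[0]["ai_output"], examples[0]["corrected_output"])
# 1 on track delayed
```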
For human operations
By default, users see the corrected truth, not the AI's mistakes. They can still see the AI's original output if they choose: the audit trail is there, just behind a toggle.
This creates psychological safety. The AI is allowed to be wrong. No one is blamed for a bad extraction. The system simply learns and improves.
For compliance
Every field has a provenance chain:
If isManuallyEdited = false, the value came from the AI, which was operating on the rawMessage (stored, immutable, timestamped)
If isManuallyEdited = true, the value came from a specific agent (username) at a specific time, with the original AI value retained
This is sufficient for regulated environments where documentation must meet evidentiary standards.
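The two branches of the provenance chain can be expressed as a single lookup. This is a sketch under assumptions: rows and weeks are plain dicts, each row carries its own oldest-first `edits` list, and `created_at` is a hypothetical name for the raw message's timestamp.

```python
def provenance_of(row, week):
    """Describe where a row's current value came from."""
    if not row["is_manually_edited"]:
        return {
            "source": "AI",
            "derived_from": week["raw_message"],
            "generated_at": week["created_at"],
        }
    edits = row["edits"]                 # append-only, oldest first
    return {
        "source": "HUMAN",
        "username": edits[-1]["username"],
        "edited_at": edits[-1]["timestamp"],
        "original_ai_value": edits[0]["old_value"],
    }


# Made-up sample data, for illustration only.
week = {"raw_message": "billing export slipped to next week",
        "created_at": "2025-03-17T09:00:00Z"}
ai_row = {"status": "on track", "is_manually_edited": False, "edits": []}
human_row = {
    "status": "delayed",
    "is_manually_edited": True,
    "edits": [{"old_value": "on track", "new_value": "delayed",
               "username": "alice", "timestamp": "2025-03-17T10:30:00Z"}],
}

print(provenance_of(ai_row, week)["source"])       # AI
print(provenance_of(human_row, week)["username"])  # alice
```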
Comparison with conventional approaches
| Concern | Conventional ontology | This pattern |
| --- | --- | --- |
| AI output | Ephemeral, regenerated on each view | Fixed point, auditable baseline |
| Human correction | Overwrites AI output, no trace | Preserves original, flags override |
| Training data | Separate collection pipeline | Natural byproduct of use |
| Real-time sync | Out of scope or bolted on | First-class via separate channel |
| Storage scaling | O(views × entities) | O(weeks × projects) |
The generalization
This pattern applies to any domain where:
Unstructured inputs (emails, transcripts, logs, messages) contain latent structure
AI can propose that structure with acceptable accuracy (70-90%)
Humans must verify or correct that structure with high confidence
Real-time visibility is required across multiple actors
Auditability is a compliance requirement
Example domains:
Intelligence analysis: AI extracts entities and relationships from raw intelligence; analysts correct and enrich; the ontology tracks provenance
Incident response: AI proposes timeline and impact from alert streams; responders correct in real time; command sees live status
Supply chain visibility: AI extracts shipment status from carrier messages; logistics teams correct exceptions; stakeholders see authoritative truth
Clinical trials: AI extracts patient status from site reports; monitors verify; regulators audit the provenance chain
In every case, the core pattern is the same:
AI proposes. Human disposes. The ontology remembers the difference.
The open question
This ontology is one-way: AI → human → (frozen). There is no closed loop where human corrections flow back into the AI's world model for future extractions, beyond being training data.
The instinct is that the solution lives in the ontology itself: treat isManuallyEdited as a signal that the original extraction should be re-weighted in the model's latent space. But that remains unbuilt.
Closing
A small ontology for a small problem (project status reporting). But the pattern — provenance as primary axis, POST-once generation, sacred human override, ephemeral presence separation — feels general.
It solves the core tension of AI in operational environments: the machine proposes, the human disposes, and the system never confuses the two.
That is not just a data model. That is a trust model.
Appendix: Ontology specification (condensed)
Entities:
| Entity | Key fields | Provenance |
| --- | --- | --- |
| WeekData | weekNumber, year, rawMessage | Immutable after creation |
| StatusRow | 9 content fields, isManuallyEdited, sourceTask | Flag toggles once (false→true) |
| Agent | agentId, agentType (HUMAN/AI/SYSTEM) | Immutable |
| EditOperation | oldValue, newValue, timestamp, username | Immutable, append-only |
Invariants:
(weekNumber, year) unique
AI generation ≤1 per (weekNumber, year)
isManuallyEdited = true → AI never overwrites
StatusSheet.rows indices contiguous
presentIn relationships never persisted