Posted on Jun 1

The Living Ontology

#ai #architecture #nlp #systemdesign

A pattern for AI-generated, human-verified project intelligence in real time

Kiprono Ngetich

Ontology is the difference between data and intelligence. A static schema is a prison. A living ontology (one that evolves, respects human override, and traces provenance) is the only way to make AI useful in operational environments.

This document describes an ontological pattern built for project status intelligence. But the pattern generalizes.

Any domain where:

unstructured human communication contains structured intent

AI can propose, but humans must verify

real-time collaboration is non-negotiable

auditability is a compliance requirement

...can use this pattern.

The core tension

Every operational ontology faces a fundamental tension:

The AI can extract structure from chaos, but the AI is often wrong. The human knows the truth, but the human won't maintain the ontology.

Traditional solutions fail in one of two directions:

AI-first: The ontology is generated automatically. Humans are viewers. The system drifts. Trust erodes.

Human-first: The ontology is manually maintained. AI is ignored. The system is accurate but stale. Humans burn out.

A third way exists.

The pattern: provenance as the primary axis
Most ontologies organize around what things are. This one organizes around how we know what we know.

Every entity carries a provenance flag: isManuallyEdited.

This single boolean transforms the ontology from a static classification into a dynamic trust layer.

The ontology in minimal form:

```text
WeekData
  - contains → Task (AI-extracted)
  - generates → StatusRow (human-editable)
      - has flag → isManuallyEdited (boolean)
      - has lineage → sourceTask (reference)
```

The critical insight: AI-generated and human-verified are not separate entity types. They are the same entity type with different provenance states.
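As a rough illustration, the minimal ontology could be modeled with plain Python dataclasses. This is only a sketch: the snake_case names mirror the fields in the outline (`is_manually_edited` for `isManuallyEdited`, `source_task` for `sourceTask`), while `task_id`, `summary`, `content`, and the sample values are hypothetical placeholders that do not come from the original system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """A unit of work the AI extracted from the raw message."""
    task_id: str   # hypothetical identifier
    summary: str   # hypothetical field


@dataclass
class StatusRow:
    """One editable row. AI-generated and human-verified rows share this type."""
    content: str                          # stands in for the row's content fields
    source_task: Optional[Task] = None    # lineage back to the extracted task
    is_manually_edited: bool = False      # provenance state, not a separate type


@dataclass
class WeekData:
    week_number: int
    year: int
    raw_message: str
    tasks: list[Task] = field(default_factory=list)
    rows: list[StatusRow] = field(default_factory=list)


task = Task("t1", "Migrate billing database")
ai_row = StatusRow("Migration in progress", source_task=task)
human_row = StatusRow("Migration blocked on vendor", source_task=task,
                      is_manually_edited=True)
week = WeekData(22, 2025, "raw standup notes", [task], [ai_row, human_row])
```

Both rows are instances of the same class; only the flag distinguishes an AI proposal from a human-verified value.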

This means:

The query layer can filter by provenance (show me only human-verified rows)

The AI layer can learn from provenance (rows where isManuallyEdited=true are training data)

The compliance layer can audit provenance (every field has a traceable source: AI or human)
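The three uses listed above can be sketched as simple functions over rows. This assumes a row shape with just a `content` string and the provenance flag; the function names are illustrative, not part of any existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusRow:
    content: str
    is_manually_edited: bool = False


def human_verified(rows: list[StatusRow]) -> list[StatusRow]:
    """Query layer: keep only rows a human has touched."""
    return [r for r in rows if r.is_manually_edited]


def training_candidates(rows: list[StatusRow]) -> list[StatusRow]:
    """AI layer: edited rows are the ones worth learning from."""
    return human_verified(rows)


def source_of(row: StatusRow) -> str:
    """Compliance layer: name the source of a row's current value."""
    return "human" if row.is_manually_edited else "ai"


rows = [
    StatusRow("API on track"),
    StatusRow("UI blocked on design review", is_manually_edited=True),
]
verified = human_verified(rows)
sources = [source_of(r) for r in rows]
```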

The three ontological invariants
Invariant 1: The POST-once constraint
An AI generation occurs at most once per unique domain entity (in this case, per week).

Why this matters for ontology: Most systems treat AI as an on-demand service. Every view triggers generation. This makes provenance meaningless — there is no "original" AI state to compare against.

This invariant creates a fixed point. The first AI output is frozen as the baseline. All subsequent human edits are deltas against that baseline.

This is the difference between streaming intelligence (ephemeral, unrepeatable) and settled intelligence (auditable, comparable, learnable).
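One way to picture the POST-once constraint is a store that runs the extractor only the first time it sees a `(week_number, year)` key and returns the frozen baseline afterwards. The sketch below is in-memory and not thread-safe; `BaselineStore`, `generate_once`, and `fake_extract` are hypothetical names, and a real system would presumably enforce the same rule with a uniqueness constraint in its database.

```python
from typing import Callable


class BaselineStore:
    """Keeps at most one AI generation per (week_number, year)."""

    def __init__(self, extract: Callable[[str], list[str]]) -> None:
        self._extract = extract
        self._baselines: dict[tuple[int, int], tuple[str, ...]] = {}

    def generate_once(self, week_number: int, year: int,
                      raw_message: str) -> tuple[str, ...]:
        key = (week_number, year)
        if key not in self._baselines:
            # First request: run the extractor and freeze its output.
            self._baselines[key] = tuple(self._extract(raw_message))
        # Every later request gets the same frozen baseline back.
        return self._baselines[key]


calls: list[str] = []


def fake_extract(raw_message: str) -> list[str]:
    """Stand-in for the AI extractor: one row per non-empty line."""
    calls.append(raw_message)
    return [line.strip() for line in raw_message.splitlines() if line.strip()]


store = BaselineStore(fake_extract)
first = store.generate_once(22, 2025, "API done\nUI blocked")
second = store.generate_once(22, 2025, "a later, different message")
```

The second call returns the original baseline and never reaches the extractor, which is what keeps later human edits comparable against a fixed point.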

Invariant 2: The override-is-sacred rule
When isManuallyEdited transitions from false to true, the system records the original AI value but stops surfacing it in the default view. The original remains available only through the audit trail.

Why this matters: In most AI systems, human feedback is treated as a training signal — something that eventually improves the model. In operational environments, that latency is unacceptable. The human correction must be immediately authoritative.

The ontology enforces this at the storage layer, not just the application layer. Even if the AI reruns, it cannot overwrite a field where isManuallyEdited is true.
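A minimal sketch of that guard follows, assuming the flag lives on the row as in the outline above and that the original AI value is kept in a hypothetical `original_ai_value` field. It is written in memory for brevity; in a real deployment the same check would sit in the storage layer, for example as a conditional update.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusRow:
    value: str
    is_manually_edited: bool = False
    original_ai_value: Optional[str] = None  # hypothetical field for the AI baseline


def apply_human_edit(row: StatusRow, new_value: str) -> None:
    """A human edit is authoritative immediately."""
    if not row.is_manually_edited:
        # First edit: retain the AI baseline and flip the flag exactly once.
        row.original_ai_value = row.value
        row.is_manually_edited = True
    row.value = new_value


def apply_ai_rerun(row: StatusRow, ai_value: str) -> bool:
    """Return True if the AI value was written, False if the row is protected."""
    if row.is_manually_edited:
        return False
    row.value = ai_value
    return True


row = StatusRow("UI on track")
apply_human_edit(row, "UI blocked on design review")
written = apply_ai_rerun(row, "UI on track")
```

After the human edit, the rerun is rejected and the row keeps the corrected value, with the AI's original still recorded alongside it.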

Invariant 3: The ephemeral presence boundary
User presence — who is looking at what right now — is explicitly excluded from persistence.

Why this matters: Ontologies have a tendency to capture everything, including ephemeral state. This creates two problems:

Storage grows with activity, not with domain complexity

Queries become cluttered with irrelevant "current state"

By declaring presence as out-of-ontology, it is forced into a separate channel (WebSocket presence messages). The ontology remains focused on settled truth.
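The separation can be illustrated with two unrelated structures: an in-memory presence registry and a serializer that only ever sees settled rows. This is a sketch under assumptions: `PresenceChannel` and `persist` are hypothetical names, and the registry stands in for whatever a WebSocket server would hold per connection.

```python
import json
from collections import defaultdict


class PresenceChannel:
    """In-memory only: who is viewing which row right now. Never persisted."""

    def __init__(self) -> None:
        self._viewers: dict[str, set[str]] = defaultdict(set)

    def join(self, row_id: str, username: str) -> None:
        self._viewers[row_id].add(username)

    def leave(self, row_id: str, username: str) -> None:
        self._viewers[row_id].discard(username)

    def viewers(self, row_id: str) -> set[str]:
        # Return a copy so callers cannot mutate the registry.
        return set(self._viewers.get(row_id, set()))


def persist(rows: dict[str, dict]) -> str:
    """Serialize settled state only; presence has no field to land in."""
    return json.dumps(rows, sort_keys=True)


presence = PresenceChannel()
presence.join("row-1", "alice")
rows = {"row-1": {"content": "On track", "isManuallyEdited": False}}
snapshot = persist(rows)
```

Because presence lives in its own object, storage size depends only on the rows, however many people are watching them.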

What this enables
For AI operations
The provenance flag creates a natural training loop:

AI generates initial state

Human corrects specific fields

System compares original vs. edited

Corrections become labeled training data

The ontology preserves the alignment between input (raw message) and corrected output (edited row). This is significantly more valuable than generic human feedback — the model can learn why it was wrong by re-examining the original message in light of the correction.
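The loop above can be sketched as a function that turns edited rows into labeled examples, each pairing the raw message with the AI's output and the human correction. The `TrainingExample` type and the `original_ai_value` field are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusRow:
    value: str
    is_manually_edited: bool = False
    original_ai_value: Optional[str] = None  # hypothetical field for the AI baseline


@dataclass(frozen=True)
class TrainingExample:
    raw_message: str        # the input the AI saw
    ai_output: str          # what the AI proposed
    corrected_output: str   # what the human settled on


def build_training_examples(raw_message: str,
                            rows: list[StatusRow]) -> list[TrainingExample]:
    """Only rows with a recorded correction yield a labeled example."""
    return [
        TrainingExample(raw_message, r.original_ai_value, r.value)
        for r in rows
        if r.is_manually_edited and r.original_ai_value is not None
    ]


rows = [
    StatusRow("API done"),
    StatusRow("UI blocked on design review", True, "UI on track"),
]
examples = build_training_examples("raw standup notes", rows)
```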

For human operations
By default, users see the corrected truth rather than the AI's mistakes. They can still view the AI's original output if they choose: the audit trail is there, just behind a toggle.

This creates psychological safety. The AI is allowed to be wrong. No one is blamed for a bad extraction. The system simply learns and improves.

For compliance
Every field has a provenance chain:

If isManuallyEdited = false, the value came from the AI, which was operating on the rawMessage (stored, immutable, timestamped)

If isManuallyEdited = true, the value came from a specific agent (username) at a specific time, with the original AI value retained
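The two cases above can be expressed as one lookup that returns a small provenance record per field. This is a sketch that assumes at most one edit per field; `EditOperation` follows the fields listed in the appendix, while `describe_provenance`, the dictionary keys, and the sample timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EditOperation:
    """Append-only record of one human edit."""
    old_value: str
    new_value: str
    timestamp: datetime
    username: str


def describe_provenance(value: str, raw_message_at: datetime,
                        edit: Optional[EditOperation]) -> dict:
    """Return the provenance record for one field value."""
    if edit is None:
        # isManuallyEdited = false: the value traces to the AI and the stored raw message.
        return {"value": value, "source": "AI", "derived_from": "rawMessage",
                "at": raw_message_at.isoformat()}
    # isManuallyEdited = true: the value traces to a named agent, AI original retained.
    return {"value": edit.new_value, "source": "HUMAN", "by": edit.username,
            "at": edit.timestamp.isoformat(), "original_ai_value": edit.old_value}


received = datetime(2025, 6, 2, 9, 0, tzinfo=timezone.utc)
edited = datetime(2025, 6, 2, 10, 30, tzinfo=timezone.utc)
edit = EditOperation("UI on track", "UI blocked on design review", edited, "alice")

ai_record = describe_provenance("API done", received, None)
human_record = describe_provenance("UI blocked on design review", received, edit)
```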

This is sufficient for regulated environments where documentation must meet evidentiary standards.

Comparison with conventional approaches
| Concern | Conventional ontology | This pattern |
| --- | --- | --- |
| AI output | Ephemeral, regenerated on each view | Fixed point, auditable baseline |
| Human correction | Overwrites AI output, no trace | Preserves original, flags override |
| Training data | Separate collection pipeline | Natural byproduct of use |
| Real-time sync | Out of scope or bolted on | First-class via separate channel |
| Storage scaling | O(views × entities) | O(weeks × projects) |

The generalization
This pattern applies to any domain where:

Unstructured inputs (emails, transcripts, logs, messages) contain latent structure

AI can propose that structure with acceptable accuracy (70-90%)

Humans must verify or correct that structure with high confidence

Real-time visibility is required across multiple actors

Auditability is a compliance requirement

Example domains:

Intelligence analysis: AI extracts entities and relationships from raw intelligence; analysts correct and enrich; the ontology tracks provenance

Incident response: AI proposes timeline and impact from alert streams; responders correct in real time; command sees live status

Supply chain visibility: AI extracts shipment status from carrier messages; logistics teams correct exceptions; stakeholders see authoritative truth

Clinical trials: AI extracts patient status from site reports; monitors verify; regulators audit the provenance chain

In every case, the core pattern is the same:

AI proposes. Human disposes. The ontology remembers the difference.

The open question
This ontology is one-way: AI → human → (frozen). There is no closed loop where human corrections flow back into the AI's world model for future extractions, beyond being training data.

The instinct is that the solution lives in the ontology itself: treat isManuallyEdited as a signal that the original extraction should be re-weighted in the model's latent space. But that remains unbuilt.

Closing
A small ontology for a small problem (project status reporting). But the pattern — provenance as primary axis, POST-once generation, sacred human override, ephemeral presence separation — feels general.

It solves the core tension of AI in operational environments: the machine proposes, the human disposes, and the system never confuses the two.

That is not just a data model. That is a trust model.

Kiprono Ngetich

Appendix: Ontology specification (condensed)
Entities:

| Entity | Key fields | Provenance |
| --- | --- | --- |
| WeekData | weekNumber, year, rawMessage | Immutable after creation |
| StatusRow | 9 content fields, isManuallyEdited, sourceTask | Flag toggles once (false→true) |
| Agent | agentId, agentType (HUMAN/AI/SYSTEM) | Immutable |
| EditOperation | oldValue, newValue, timestamp, username | Immutable, append-only |

Invariants:

(weekNumber, year) unique

AI generation ≤1 per (weekNumber, year)

isManuallyEdited = true → AI never overwrites

StatusSheet.rows indices contiguous

presentIn relationships never persisted
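Two of these invariants, week uniqueness and contiguous row indices, are easy to check mechanically. The sketch below is one possible validator; `check_invariants` is a hypothetical name, and it assumes row indices are zero-based, which the specification does not state.

```python
from collections import Counter


def check_invariants(weeks: list[tuple[int, int]],
                     row_indices: list[int]) -> list[str]:
    """Return human-readable violations; an empty list means both checks pass."""
    violations = []

    # (weekNumber, year) must be unique.
    for key, count in Counter(weeks).items():
        if count > 1:
            violations.append(f"duplicate week {key}")

    # Row indices must be contiguous (assumed here to start at 0).
    if sorted(row_indices) != list(range(len(row_indices))):
        violations.append("row indices not contiguous")

    return violations


ok = check_invariants([(21, 2025), (22, 2025)], [0, 1, 2])
bad = check_invariants([(22, 2025), (22, 2025)], [0, 2])
```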
