There's a category of problem that only shows up after you've been running an automated knowledge system for a while. The first month feels like magic - pages compile themselves, citations appear, everything is fresh. Three months later, you open a page about a library that shipped three breaking versions since the source was last ingested. The page looks perfectly healthy. The confidence is "high." The lint passed. And yet, everything in it is quietly wrong.
Static knowledge bases have no vocabulary for "this was true." Synthadoc v0.6.0 gives your wiki one.
Synthadoc release v0.6.0 ships two features that change how a wiki ages: a five-state page lifecycle machine that tracks content freshness with a permanent audit trail, and a wiki export system that serializes not just content but provenance, history, and cost, in four machine-readable formats, with zero additional LLM calls.
The 5-State Page Lifecycle
The core idea is simple: every page has a status that reflects what the system knows about it right now, not just what it says. That status moves through five states based on signals from ingest, lint, and the source files themselves.
Automatic transitions (system-triggered):
| Transition | Trigger | Who |
|---|---|---|
| → draft | New page created via ingest | IngestAgent |
| draft → active | Lint passes all structural and consistency checks | LintAgent |
| active → stale | SHA-256 hash of source file has changed since last ingest | LintAgent |
| stale → draft | Source re-ingested with --force; page updated | IngestAgent |
| draft / active / stale → contradicted | New source conflicts with this page; status set directly, bypasses transition API | IngestAgent |
Manual transitions (user CLI commands):
| Command | Transition | Description |
|---|---|---|
| lifecycle activate <slug> | draft → active | Promote without waiting for the next lint run |
| lifecycle archive <slug> | draft / active / contradicted / stale → archived | Retire the page; it's kept for reference |
| lifecycle restore <slug> | archived → draft | Re-admit the page; re-enters the lint queue |
Note: active → stale and stale → draft have no user-facing CLI command; they are exclusively system-triggered by lint and re-ingest respectively. The only path out of contradicted is archiving the page; you cannot promote a contradicted page directly to active.
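The two tables can be read as a small lookup structure. The following is a minimal sketch of that reading, not Synthadoc's actual implementation: the names `Status`, `SYSTEM_TRANSITIONS`, `USER_TRANSITIONS`, and `is_allowed` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    STALE = "stale"
    CONTRADICTED = "contradicted"
    ARCHIVED = "archived"


# System-triggered moves, keyed by (from, to).
# None as the source state stands for "page does not exist yet".
SYSTEM_TRANSITIONS = {
    (None, Status.DRAFT),                  # new page created via ingest
    (Status.DRAFT, Status.ACTIVE),         # lint passed
    (Status.ACTIVE, Status.STALE),         # source hash changed
    (Status.STALE, Status.DRAFT),          # forced re-ingest
    (Status.DRAFT, Status.CONTRADICTED),   # new source conflicts with page
    (Status.ACTIVE, Status.CONTRADICTED),
    (Status.STALE, Status.CONTRADICTED),
}

# Moves a user can request from the CLI.
USER_TRANSITIONS = {
    (Status.DRAFT, Status.ACTIVE),          # lifecycle activate
    (Status.DRAFT, Status.ARCHIVED),        # lifecycle archive
    (Status.ACTIVE, Status.ARCHIVED),
    (Status.CONTRADICTED, Status.ARCHIVED),
    (Status.STALE, Status.ARCHIVED),
    (Status.ARCHIVED, Status.DRAFT),        # lifecycle restore
}


def is_allowed(src, dst, by_user):
    """Return True if the move appears in the relevant table."""
    table = USER_TRANSITIONS if by_user else SYSTEM_TRANSITIONS
    return (src, dst) in table


print(is_allowed(Status.CONTRADICTED, Status.ACTIVE, by_user=True))    # False
print(is_allowed(Status.CONTRADICTED, Status.ARCHIVED, by_user=True))  # True
```

Keeping the two sets separate mirrors the note above: a move such as active → stale is present for the system but absent for users.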
Every single transition - automatic or manual - is permanently written to the audit database with a timestamp, the triggering agent or user, and a reason string. That's the part that actually matters. The state tells you where a page is. The log tells you how it got there and when someone last looked at it.
# Check the full history of a page
synthadoc lifecycle log alan-turing
Slug From To By Timestamp Reason
----------------------------------------------------------------------------------------------------
alan-turing null draft ingest 2026-04-12T09:14:22 initial ingest
alan-turing draft active lint 2026-04-12T09:31:07 all checks passed
alan-turing active stale lint 2026-05-03T02:00:11 source hash mismatch
alan-turing stale draft ingest 2026-05-03T08:22:55 re-ingest of stale page
alan-turing draft active lint 2026-05-03T08:45:02 all checks passed
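To make the shape of such a log concrete, here is a minimal sketch of an append-only transition table using Python's built-in `sqlite3` module. The table name `lifecycle_log`, its columns, and the helpers `record_transition` and `history` are assumptions for illustration; Synthadoc's real audit schema may look quite different.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory database for the sketch; a real audit store would live on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE lifecycle_log (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        slug TEXT NOT NULL,
        from_state TEXT,
        to_state TEXT NOT NULL,
        actor TEXT NOT NULL,
        ts TEXT NOT NULL,
        reason TEXT NOT NULL
    )
    """
)


def record_transition(conn, slug, from_state, to_state, actor, reason, ts=None):
    """Append one row; rows are never updated or deleted in this sketch."""
    ts = ts or datetime.now(timezone.utc).isoformat(timespec="seconds")
    conn.execute(
        "INSERT INTO lifecycle_log (slug, from_state, to_state, actor, ts, reason) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (slug, from_state, to_state, actor, ts, reason),
    )
    conn.commit()


def history(conn, slug):
    """Return every transition for a slug, oldest first."""
    rows = conn.execute(
        "SELECT from_state, to_state, actor, ts, reason FROM lifecycle_log "
        "WHERE slug = ? ORDER BY id",
        (slug,),
    )
    return rows.fetchall()


record_transition(conn, "alan-turing", None, "draft", "ingest",
                  "initial ingest", "2026-04-12T09:14:22")
record_transition(conn, "alan-turing", "draft", "active", "lint",
                  "all checks passed", "2026-04-12T09:31:07")
for row in history(conn, "alan-turing"):
    print(row)
```

Because the only write path is an insert, the current state of a page can always be cross-checked against the last row of its history.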
If you prefer a visual view, the same full cross-wiki audit trail is available in Obsidian under Synthadoc: Manage Page Lifecycle → Audit Log. Every transition shows colour-coded From/To state badges, the triggering agent or user, the timestamp, and the reason string, and the view is searchable by slug, filterable by state, and paginated.
For a fleet-level view, synthadoc status gives a live summary across all five states, including pages sitting in candidates, along with an action hint for anything that needs attention:
synthadoc status
Wiki: history-of-computing
Pages: 42
Jobs pending: 0
Jobs total: 187
Page lifecycle:
active 38
draft 2 <- run `synthadoc lint run` to promote
draft (staged) 1 <- promote from candidates/ first, then lint
stale 1 <- re-ingest needed
contradicted 0
archived 1
Two things are worth knowing about what these numbers mean. First, Pages: 42 at the top counts only pages that have been admitted into wiki/; pages still quarantined in wiki/candidates/ are excluded from that total. Second, draft and draft (staged) are distinct rows: draft counts pages already inside wiki/ waiting for a lint pass, while draft (staged) counts pages physically quarantined in wiki/candidates/. Staged pages haven't been promoted yet, have no lifecycle state, and are invisible to every other part of the system until a human explicitly promotes them. The lifecycle section only shows draft (staged) when the count is greater than zero, so on a wiki with staging turned off you'll never see that row. The action hints tell you exactly what to do next for each group: run lint for drafts, re-ingest for stale pages, review and archive for contradictions.
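The counting rules can be sketched in a few lines. This is an illustrative model only: the function `summarize` and its inputs are hypothetical, and the real command presumably reads these numbers from the wiki's own storage.

```python
from collections import Counter

LIFECYCLE_STATES = ("active", "draft", "stale", "contradicted", "archived")


def summarize(page_statuses, staged_count):
    """Build a status summary from admitted page states plus a staged count.

    page_statuses: one state string per page inside wiki/.
    staged_count: number of files sitting in wiki/candidates/.
    """
    counts = Counter(page_statuses)
    # The page total covers admitted pages only; staged pages are excluded.
    summary = {"pages": len(page_statuses)}
    for state in LIFECYCLE_STATES:
        summary[state] = counts.get(state, 0)
    # The staged row is shown only when there is something to report.
    if staged_count > 0:
        summary["draft (staged)"] = staged_count
    return summary


statuses = ["active"] * 38 + ["draft"] * 2 + ["stale"] + ["archived"]
print(summarize(statuses, staged_count=1))
```

With the numbers from the sample output, the five lifecycle rows add up to the page total of 42, and the single staged page sits outside that sum.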
If you prefer to manage lifecycle states visually, the Obsidian plugin surfaces the same data in Synthadoc: Manage Page Lifecycle → Current States. The table is sortable and filterable by state, shows the last transition timestamp and who triggered it, and gives you a one-click archive button per page. The contradicted chip makes it easy to find the pages that need attention first.
Candidates Staging: A Quality Gate Before Lifecycle Begins
The lifecycle machine handles what happens after a page enters the wiki. Candidates staging handles whether it enters at all.
Where a new page lands depends entirely on the staging policy configured for that wiki. There are three options:
- off (default): every new page goes straight into wiki/ as draft. Staging is not involved.
- all: every new page goes to wiki/candidates/ regardless of confidence. Nothing is admitted automatically - you review and promote everything.
- threshold: IngestAgent checks the page's confidence rating against your configured minimum. Pages that meet or exceed it go directly into wiki/; pages that fall below it go to wiki/candidates/ for review.
wiki/candidates/ is a holding area excluded from search, context packs, and export. No downstream consumer sees a candidate. The page exists on disk, but it hasn't been admitted into the lifecycle yet: it has no audit log entry and isn't counted in the Pages total of synthadoc status.
Here's how the three paths look end-to-end under the threshold policy:
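The routing decision itself can be modelled as a single function. The sketch below is an assumption-laden illustration, not Synthadoc's code: the function `destination` is hypothetical, and the ordering low < medium < high is inferred from the confidence values that appear in this post.

```python
# Assumed ordering of confidence ratings, lowest to highest.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def destination(policy, confidence, min_confidence="high"):
    """Return the directory a newly compiled page would land in."""
    if policy == "off":
        # Staging is not involved; the page is admitted as draft.
        return "wiki/"
    if policy == "all":
        # Everything waits for human review.
        return "wiki/candidates/"
    if policy == "threshold":
        meets = CONFIDENCE_RANK[confidence] >= CONFIDENCE_RANK[min_confidence]
        return "wiki/" if meets else "wiki/candidates/"
    raise ValueError(f"unknown staging policy: {policy!r}")


print(destination("threshold", "high"))    # wiki/
print(destination("threshold", "medium"))  # wiki/candidates/
print(destination("all", "high"))          # wiki/candidates/
```

Note that the function says nothing about lifecycle state; it only answers where the page lands.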
The key design decision here: staging and lifecycle are orthogonal systems that compose cleanly. Staging decides admission. Lifecycle decides state after admission. A page in wiki/candidates/ has no lifecycle state yet: it's not in the audit log, it doesn't count toward the Pages total in synthadoc status, and it doesn't appear in any export. The moment you promote it, it enters the lifecycle as draft and the lint queue picks it up on the next run.
This matters for teams that need a human gate on automated ingestion. Nightly ingest jobs run at 2 AM, pull new sources, and compile pages. With the all policy, every one of those pages lands in candidates. A person reviews the list in the morning, promotes what looks right, and discards what doesn't. The wiki only grows with reviewed content.
# Enable threshold staging: auto-promote high-confidence, hold everything else
synthadoc staging policy threshold --min-confidence high
# Morning review
synthadoc candidates list
Candidates (3):
machine-learning-fundamentals confidence: medium ingested: 2026-05-31
attention-mechanism confidence: low ingested: 2026-05-31
transformer-architecture confidence: medium ingested: 2026-05-31
synthadoc candidates promote transformer-architecture
synthadoc candidates discard attention-mechanism
Wiki Export: Four Formats, Zero LLM Calls
Export was designed around one constraint: once your wiki is compiled, you shouldn't need to spend more API budget to serialize it. All four formats are computed entirely from the stored wiki state - no prompts, no completions, no waiting.
The --status flag is what makes export practically useful. When you're feeding a downstream LLM, you probably only want active pages — the ones that passed lint and haven't gone stale:
synthadoc export --format llms.txt --status active
synthadoc export --format json --status active --output exports/wiki.json
Passing --status contradicted is useful for forensics: you can export just the pages with conflicts and analyse them without touching the rest of the wiki.
The JSON format deserves specific attention. Most wiki exports give you a flat document dump. This one gives you claim-level provenance (claims[] maps each claim to the exact source file and line range that generated it), the complete state transition history (lifecycle_history[]), and the per-page API cost to compile it (ingest_cost_usd). If you're building downstream tooling or reporting on knowledge quality, these three fields eliminate an entire layer of instrumentation you'd otherwise have to build yourself.
{
"slug": "alan-turing",
"status": "active",
"ingest_cost_usd": 0.0012,
"claims": [
{
"text": "Turing proposed the imitation game in 1950...",
"source": "raw_sources/turing-biography.md",
"lines": [42, 48]
}
],
"lifecycle_history": [
{ "from": null, "to": "draft", "by": "ingest", "ts": "2026-04-12T09:14:22" },
{ "from": "draft", "to": "active", "by": "lint", "ts": "2026-04-12T09:31:07" }
]
}
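As a rough idea of what downstream tooling might do with a record like this, the sketch below parses one page object and prints its provenance. It assumes the record has exactly the fields shown above; how the real export wraps multiple pages (a list, an object keyed by slug, or something else) is not covered here, and the helpers `provenance_lines` and `last_transition` are hypothetical.

```python
import json

# A single page record, copied from the sample above.
record = json.loads("""
{
  "slug": "alan-turing",
  "status": "active",
  "ingest_cost_usd": 0.0012,
  "claims": [
    {
      "text": "Turing proposed the imitation game in 1950...",
      "source": "raw_sources/turing-biography.md",
      "lines": [42, 48]
    }
  ],
  "lifecycle_history": [
    {"from": null, "to": "draft", "by": "ingest", "ts": "2026-04-12T09:14:22"},
    {"from": "draft", "to": "active", "by": "lint", "ts": "2026-04-12T09:31:07"}
  ]
}
""")


def provenance_lines(page):
    """Format each claim as 'source:start-end  text'."""
    out = []
    for claim in page["claims"]:
        start, end = claim["lines"]
        out.append(f'{claim["source"]}:{start}-{end}  {claim["text"]}')
    return out


def last_transition(page):
    """Return the most recent entry in the page's lifecycle history."""
    return page["lifecycle_history"][-1]


for line in provenance_lines(record):
    print(line)
print("last change:", last_transition(record)["ts"])
print("cost (USD):", record["ingest_cost_usd"])
```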
What Makes This Different
Most LLM wiki tools treat knowledge as append-only. You ingest, you query. There's no concept of a page going stale, no audit trail of who reviewed what and when, and no way to know that the page you're reading was compiled from a source that changed three months ago. They're effectively write-once databases with a chat interface on top.
Synthadoc's lifecycle machine makes the wiki temporally aware. A SHA-256 hash is stored for every source at ingest time. When lint runs (nightly, typically), it compares current hashes against stored ones. A changed source triggers an automatic active → stale transition with a timestamp. You know exactly which pages need attention and when they last didn't.
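The hash comparison is easy to sketch with Python's standard `hashlib`. This is a minimal illustration of the technique, assuming the stored hashes are available as a plain dictionary; the helper names `sha256_of` and `find_stale` are hypothetical, and the sketch does not handle sources that have been deleted.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_stale(stored_hashes):
    """Return the source paths whose current hash differs from the stored one.

    stored_hashes maps a source path to the hex digest recorded at ingest.
    """
    return [
        path
        for path, recorded in stored_hashes.items()
        if sha256_of(path) != recorded
    ]
```

A lint run in this model would call `find_stale` with the hashes recorded at ingest and mark every page built from a returned path as stale.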
The other thing that separates Synthadoc architecturally is that it's not a retrieval pipeline with a generation step, it's a compilation pipeline. Every page is a synthesized artifact, not a retrieved chunk. That's why the JSON export can include ingest_cost_usd per page: because each page has a discrete compilation history, not a query-time cost that varies every time someone asks a question.
The combination of lifecycle tracking and export also enables something practical for teams: you can run synthadoc export --format llms.txt --status active as the input to a downstream agent, and you know exactly what you're giving it. No stale content. No contradicted pages. Just the subset of the wiki that the system has marked as reviewed and consistent.
Quick Demo
The quickest way to see the lifecycle machine in action is step 8 of the quick-start guide: Step 8 — Manage Page Lifecycle.
Export is step 21: Step 21 — Export Your Wiki.
Both steps work with the history-of-computing demo wiki, so you can run the full thing locally in about ten minutes against your selected LLM provider:
git clone https://github.com/axoviq-ai/synthadoc.git
cd synthadoc
pip3 install -e ".[dev]"
synthadoc install history-of-computing --target ~/wikis --demo
synthadoc plugin install history-of-computing
If you find Synthadoc useful, a star ⭐ on GitHub goes a long way toward keeping this project visible: https://github.com/axoviq-ai/synthadoc.




