<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Paul Chen</title>
    <description>The latest articles on DEV Community by Paul Chen (@paul_chen_90371fe7426cb44).</description>
    <link>https://dev.to/paul_chen_90371fe7426cb44</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3898190%2F38f30ba0-280f-4a52-8c7d-c773315b8da8.jpg</url>
      <title>DEV Community: Paul Chen</title>
      <link>https://dev.to/paul_chen_90371fe7426cb44</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/paul_chen_90371fe7426cb44"/>
    <language>en</language>
    <item>
      <title>Synthadoc: Streaming Queries, Local Web Chat, and a Self-Invalidating Cache</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Mon, 08 Jun 2026 21:02:24 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-streaming-queries-local-web-chat-and-a-self-invalidating-cache-1hgb</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-streaming-queries-local-web-chat-and-a-self-invalidating-cache-1hgb</guid>
      <description>&lt;p&gt;There's a moment every Synthadoc user hits eventually. You've got forty or fifty compiled pages, a nightly ingest schedule running, lint keeping everything healthy. And then you open a terminal, type &lt;code&gt;synthadoc query "..."&lt;/code&gt;, and wait. The BM25 retrieval is instant. But then the cursor blinks. The LLM is thinking. You wait four seconds, six seconds, eight seconds. The answer eventually appears, all at once, like a curtain dropping.&lt;/p&gt;

&lt;p&gt;That wait is fine the first time. It gets annoying on the tenth query when you're in a research session and you already know the answer is coming - you just want to read it as it forms, not stare at a blinking cursor.&lt;/p&gt;

&lt;p&gt;v0.7.0 improves that. It adds streaming query output across all three query surfaces, a local web chat UI that understands the health of your wiki, and a query cache that eliminates the LLM call entirely when nothing in your wiki has changed. The architecture behind each of these turned out to be more interesting than I expected when we started building them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Diagram 1: What Changed in v0.7.0: The Architecture at a Glance
&lt;/h2&gt;

&lt;p&gt;The diagram below maps the full Synthadoc architecture as it stands after v0.7.0. Items marked &lt;strong&gt;[NEW]&lt;/strong&gt; are additions in this release; everything else was already present. The three features in this post - Web Chat UI, streaming query, and query cache - touch three separate layers: the access layer gains a new client, the engine gains new agents, and the core gains a cache component tied to a new &lt;code&gt;wiki_epoch&lt;/code&gt; counter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tq2mjmw89xrlvabve24.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tq2mjmw89xrlvabve24.png" alt=" " width="800" height="860"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three additions connect the three features in this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Query Web UI&lt;/strong&gt; (Access Layer): the new &lt;code&gt;synthadoc web&lt;/code&gt; browser client using HTTP + SSE&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query (stream) · Action · Hint Engine&lt;/strong&gt; (Agents): streaming query pipeline, live command execution, deterministic hint generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Cache + wiki_epoch&lt;/strong&gt; (Core): shared cache with epoch-based automatic invalidation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything flows through the same server process per wiki. The CLI, Obsidian plugin, and Web Chat UI are all thin clients talking to the same HTTP + SSE endpoint. There's no separate service for the web UI. The MCP Server (shown as optional with a dashed border) is a fourth access path for AI tools like Claude Desktop or Cursor. It exposes the same wiki operations over the Model Context Protocol and requires opt-in setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three Ways to Query Your Wiki and When to Use Each
&lt;/h2&gt;

&lt;p&gt;Before getting into the streaming and caching mechanics, it's worth laying out the three query surfaces Synthadoc now supports. All three can answer the same question; the difference is workflow fit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1flol9w7xcvf100v4by.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1flol9w7xcvf100v4by.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  CLI: when the query is part of a larger workflow
&lt;/h3&gt;

&lt;p&gt;The CLI is where you go when a query isn't just a question but a step in something automated. The obvious case is CI/CD: a post-ingest job that queries the wiki to verify a newly compiled page before promoting it to active. Less obvious is using it as part of an agent integration, where an external orchestrator issues queries and parses the structured JSON output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Stream to terminal - tokens appear as the LLM generates them&lt;/span&gt;
synthadoc query &lt;span class="s2"&gt;"What were the main causes of the 2008 financial crisis?"&lt;/span&gt;

&lt;span class="c"&gt;# Script mode - waits for full response, stdout is clean for piping&lt;/span&gt;
synthadoc query &lt;span class="s2"&gt;"Summarize page: moore's-law"&lt;/span&gt; &lt;span class="nt"&gt;--no-stream&lt;/span&gt; | jq &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Force LLM call even if cache has a result - useful when wiki just changed&lt;/span&gt;
synthadoc query &lt;span class="s2"&gt;"What changed in the latest ingest?"&lt;/span&gt; &lt;span class="nt"&gt;--no-cache&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--no-stream&lt;/code&gt; flag is specifically for automation. Streaming output is beautiful on a terminal and disruptive in a pipeline. A script that parses &lt;code&gt;stdout&lt;/code&gt; doesn't want token-by-token delivery; it wants a complete JSON blob when the query is done. &lt;code&gt;--no-stream&lt;/code&gt; gives it that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Obsidian Plugin: when you're in a research session
&lt;/h3&gt;

&lt;p&gt;The Obsidian plugin exists for a different moment: you're writing a note, you need to check a claim against your wiki, and you don't want to leave Obsidian. The query modal (&lt;code&gt;Ctrl/Cmd+P → Synthadoc: Query: ask the wiki...&lt;/code&gt;) is the right tool here. It renders &lt;code&gt;[[wikilinks]]&lt;/code&gt; as clickable links, which means an answer that references related pages becomes navigable instantly.&lt;/p&gt;

&lt;p&gt;The streaming behaviour in the Obsidian plugin mirrors the CLI: tokens appear as they arrive, and citations follow at the end. The bypass cache checkbox is visible in the modal, unchecked by default. For researchers doing active ingest sessions, checking it once gets you fresh output without reaching for the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Web Chat UI: when you want a session, not a one-shot query
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;synthadoc web&lt;/code&gt; is the new entry. It opens a local chat interface in your browser: nothing leaves your machine, and there is no cloud service and no authentication. It's designed for the kind of session that's too exploratory for the CLI and too long for the Obsidian modal.&lt;/p&gt;

&lt;p&gt;Each turn is an independent query - the same cache applies here as in the CLI and Obsidian plugin. The chat history is displayed in the browser, but prior messages are not yet injected into the LLM prompt; multi-turn context injection is planned for a future release.&lt;/p&gt;

&lt;p&gt;What the web UI adds over the other surfaces: operational commands. You can type "run lint", "show wiki status", "what pages are orphan pages?" or "schedule ingest every night at 9 PM" directly in the chat, and the Action Agent parses those and executes them live against your wiki, with results shown inline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytddsarsce76raaqaia6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytddsarsce76raaqaia6.png" alt=" " width="800" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The screenshot above shows a live session against the history-of-computing demo wiki. The response to "What changed in the wiki this week?" includes a date-indexed ingest table, current lifecycle counts (Active: 80, Draft/Stale/Contradicted/Archived: all zero), and three action chips - "Activate a draft page", "Archive a stale page", "Restore an archived page to draft" - rendered inline as clickable buttons. The left panel shows prior session queries, allowing you to jump back into an earlier thread.&lt;/p&gt;




&lt;h2&gt;
  
  
  Streaming: The Architecture Behind a Two-Phase Response
&lt;/h2&gt;

&lt;p&gt;Every Synthadoc query goes through two phases. Phase 1 is retrieval: BM25 search, routing, sub-question decomposition if needed. This is synchronous and fast, typically 100–200ms. Phase 2 is synthesis: the LLM generates an answer from the retrieved pages. This is where the latency lives.&lt;/p&gt;

&lt;p&gt;The decision to stream only Phase 2 was deliberate. Phase 1 finishes before the first LLM token could possibly arrive, so there's no partial retrieval state worth exposing. That keeps the SSE protocol clean:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlkbvullfvzkob50f18i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlkbvullfvzkob50f18i.png" alt=" " width="800" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;status&lt;/code&gt; events let the UI give immediate feedback. The user knows within roughly 150ms, before any LLM latency has accumulated, whether the wiki found relevant pages. "sources: 3" in the synthesizing event tells them the answer is backed by three pages before they've read a single word of it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;gap&lt;/code&gt; event fires only when the wiki doesn't have enough to answer confidently. Instead of a vague "I don't know," it returns &lt;code&gt;suggested_searches&lt;/code&gt; - concrete ingest strings the user can use to fill the gap. These are generated by a secondary LLM call that decomposes the original question into targeted search queries - the same decomposition that drives sub-question retrieval, reused here to produce actionable ingest suggestions.&lt;/p&gt;
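
&lt;p&gt;As a rough illustration of that event order, here is a minimal Python sketch of a generator that emits SSE frames for one query. The &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;gap&lt;/code&gt;, and &lt;code&gt;done&lt;/code&gt; event names and the &lt;code&gt;suggested_searches&lt;/code&gt; and &lt;code&gt;next_hints&lt;/code&gt; fields come from this post. The &lt;code&gt;token&lt;/code&gt; event name, the payload shapes, and the choice to end the stream right after a &lt;code&gt;gap&lt;/code&gt; are assumptions made for the example, not Synthadoc's actual wire format.&lt;/p&gt;

```python
import json


def sse_frame(event, payload):
    # Standard SSE framing: an event line, a data line, then a blank line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_answer(pages, tokens, suggested_searches):
    # Phase 1 (retrieval) has already finished by the time this generator runs.
    if not pages:
        # Assumed behaviour: a gap ends the stream with ingest suggestions.
        yield sse_frame("gap", {"suggested_searches": list(suggested_searches)})
        yield sse_frame("done", {"next_hints": list(suggested_searches)[:3]})
        return
    yield sse_frame("status", {"stage": "synthesizing", "sources": len(pages)})
    for tok in tokens:
        # Phase 2: one frame per LLM token (hypothetical "token" event name).
        yield sse_frame("token", {"text": tok})
    yield sse_frame("done", {"next_hints": []})


frames = list(stream_answer(["moores-law", "eniac", "transistor"],
                            ["Moore", "'s", " law"], []))
print("".join(frames))
```

&lt;p&gt;Because the source count arrives in its own frame before any token, a client can render "3 sources" while the first token is still in flight.&lt;/p&gt;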

&lt;h3&gt;
  
  
  Provider Streaming Behavior
&lt;/h3&gt;

&lt;p&gt;Not all providers stream in the same sense. API-based providers - OpenAI, Anthropic, Gemini, Ollama - emit tokens as they are generated, so the CLI and web UI render them incrementally in real time. The cadence shown in the SSE sequence above (one token every ~20ms) is what these providers deliver.&lt;/p&gt;

&lt;p&gt;CLI subprocess providers - &lt;strong&gt;Claude Code&lt;/strong&gt; (&lt;code&gt;claude-code&lt;/code&gt;) and &lt;strong&gt;Opencode&lt;/strong&gt; (&lt;code&gt;opencode&lt;/code&gt;) - work differently. They run as child processes and write their output only when the process exits, so there is no per-token stream to intercept. Synthadoc runs the subprocess to completion, then emits the result word-by-word through the same SSE pipe. The words arrive in a rapid burst rather than a gradual flow. The total wait is the same, but what the user sees is a pause while the subprocess runs, followed by the full answer appearing almost at once.&lt;/p&gt;
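
&lt;p&gt;The run-to-completion-then-replay pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Synthadoc's provider code: &lt;code&gt;run_then_emit_words&lt;/code&gt; is a hypothetical helper, and a child Python process stands in for a real CLI provider.&lt;/p&gt;

```python
import re
import subprocess
import sys


def run_then_emit_words(cmd, timeout_s=60):
    # The child writes its output only on exit, so wait for it to finish first.
    done = subprocess.run(cmd, capture_output=True, text=True,
                          timeout=timeout_s, check=True)
    # Then replay the finished answer as word-sized chunks.
    for chunk in re.findall(r"\S+\s*", done.stdout):
        yield chunk


# Stand-in for a CLI provider: a child Python process that prints one answer.
fake_provider = [sys.executable, "-c",
                 "print('Transistor counts double every two years.')"]
chunks = list(run_then_emit_words(fake_provider))
print(chunks)
```

&lt;p&gt;The &lt;code&gt;timeout_s&lt;/code&gt; parameter covers the whole subprocess run, which is why a slow subprocess provider hits the timeout before any word has been emitted.&lt;/p&gt;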

&lt;p&gt;If you are using a CLI subprocess provider and queries are timing out, increase the default timeout:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc query &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--timeout&lt;/span&gt; 180
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default is 60 seconds, which is sufficient for API providers but may be short for subprocess providers on complex queries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Session Management
&lt;/h3&gt;

&lt;p&gt;Sessions live server-side in &lt;code&gt;audit.db&lt;/code&gt;, in two tables: &lt;code&gt;chat_sessions&lt;/code&gt; and &lt;code&gt;chat_messages&lt;/code&gt;. The React UI stores only the &lt;code&gt;session_id&lt;/code&gt; in memory - it's React state, not localStorage. This means sessions don't survive page reload, and every new browser tab starts fresh. This is a deliberate design choice: a session is tied to one exploratory thread, not your entire browsing history.&lt;/p&gt;

&lt;p&gt;Chat messages are stored to &lt;code&gt;audit.db&lt;/code&gt; after each turn, but prior messages are not yet injected into the LLM prompt; each query is answered independently. The session record is used for mode persistence and hint rotation, not for conversational context. Multi-turn prompt injection is planned for a future release.&lt;/p&gt;
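
&lt;p&gt;To make the storage model concrete, here is a minimal sketch using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. Only the table names &lt;code&gt;chat_sessions&lt;/code&gt; and &lt;code&gt;chat_messages&lt;/code&gt; come from this post. The columns, the helper functions, and the in-memory database are assumptions for illustration.&lt;/p&gt;

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")  # stand-in for audit.db on disk
db.executescript("""
CREATE TABLE chat_sessions (session_id TEXT PRIMARY KEY, mode TEXT NOT NULL);
CREATE TABLE chat_messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES chat_sessions(session_id),
    role TEXT NOT NULL,
    content TEXT NOT NULL
);
""")


def create_session(mode):
    # The mode is fixed when the session is created and stored with it.
    session_id = uuid.uuid4().hex
    db.execute("INSERT INTO chat_sessions VALUES (?, ?)", (session_id, mode))
    return session_id


def record_turn(session_id, question, answer):
    # Both sides of the turn are persisted after the response completes.
    db.executemany(
        "INSERT INTO chat_messages (session_id, role, content) VALUES (?, ?, ?)",
        [(session_id, "user", question), (session_id, "assistant", answer)],
    )


def build_prompt(session_id, question):
    # Stored history is deliberately not read back: each query stands alone.
    return question


sid = create_session("EXPLORER")
record_turn(sid, "What is ENIAC?", "ENIAC was an early electronic computer.")
stored = db.execute("SELECT COUNT(*) FROM chat_messages WHERE session_id = ?",
                    (sid,)).fetchone()[0]
print(stored, build_prompt(sid, "Who built it?"))
```

&lt;p&gt;The follow-up "Who built it?" reaches the LLM without the earlier turn, which is the limitation the planned multi-turn injection would remove.&lt;/p&gt;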

&lt;h3&gt;
  
  
  Diagram 2: Web Query Flow: Client to Server, Session to Stream
&lt;/h3&gt;

&lt;p&gt;The diagram below traces a complete web UI query round-trip, from the user typing a question to the hint chips updating after the response. The left column is the browser; the right column is the server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5pnjtkgom6f0hjl83ek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5pnjtkgom6f0hjl83ek.png" alt=" " width="800" height="950"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A few things are worth highlighting in this flow. The &lt;code&gt;session_id&lt;/code&gt; lives only in React state - close the tab and it's gone. The mode determined at &lt;code&gt;POST /sessions&lt;/code&gt; (step 1) persists for the lifetime of that tab and shapes hint generation at every &lt;code&gt;done&lt;/code&gt; event (steps 3 and 4). The HintEngine never calls the LLM - it reads the answer content and the session mode and applies deterministic rules to generate the three chips.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adaptive Hints: No LLM Required
&lt;/h3&gt;

&lt;p&gt;The hint chips - three clickable suggestions rendered below the chat input - update after every response. They're generated by a deterministic &lt;code&gt;HintEngine&lt;/code&gt;, not an LLM. No API call, no extra cost.&lt;/p&gt;

&lt;p&gt;The engine first classifies the wiki's health state when the session is created:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Condition&lt;/th&gt;
&lt;th&gt;Initial hints&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;NEW_WIKI&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fewer than 5 pages&lt;/td&gt;
&lt;td&gt;Guide user toward first ingest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EXPLORER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;First session, healthy wiki&lt;/td&gt;
&lt;td&gt;Offer tour queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HEALTH_CHECK&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stale or contradicted pages exist&lt;/td&gt;
&lt;td&gt;Surface lint and lifecycle actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POWER_USER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Returning user, healthy wiki&lt;/td&gt;
&lt;td&gt;Context-sensitive topic suggestions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
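
&lt;p&gt;The table can be read as a small decision function. The sketch below is one plausible encoding in Python. The mode names and conditions come from the table, while the precedence order, the function name &lt;code&gt;classify_mode&lt;/code&gt;, and its parameters are assumptions.&lt;/p&gt;

```python
from enum import Enum

MIN_PAGES = 5  # "Fewer than 5 pages" from the table


class Mode(Enum):
    NEW_WIKI = "NEW_WIKI"
    EXPLORER = "EXPLORER"
    HEALTH_CHECK = "HEALTH_CHECK"
    POWER_USER = "POWER_USER"


def classify_mode(page_count, stale, contradicted, prior_sessions):
    # Assumed precedence: tiny wiki first, then health problems, then history.
    if MIN_PAGES > page_count:
        return Mode.NEW_WIKI
    if stale + contradicted > 0:
        return Mode.HEALTH_CHECK
    if prior_sessions == 0:
        return Mode.EXPLORER
    return Mode.POWER_USER


print(classify_mode(page_count=80, stale=0, contradicted=0, prior_sessions=0))
print(classify_mode(page_count=80, stale=2, contradicted=0, prior_sessions=5))
```

&lt;p&gt;Every input here is a count the server already holds, so classification needs no LLM call.&lt;/p&gt;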

&lt;p&gt;After each assistant response, the &lt;code&gt;done&lt;/code&gt; SSE event carries a &lt;code&gt;next_hints&lt;/code&gt; array: three suggestions computed from the answer content and session mode. If the answer mentioned a specific page, the hints might suggest a follow-up on a related page. If the answer triggered a knowledge gap, the hints offer the &lt;code&gt;suggested_searches&lt;/code&gt; as clickable options.&lt;/p&gt;

&lt;p&gt;The design principle here is that hints should reflect where you are in the conversation, not where you were when you opened the browser. A user on a HEALTH_CHECK session who just asked about contradicted pages shouldn't see generic "try querying about X" chips; they should see "run lint", "list orphan pages", "archive contradicted page". The mode carries through the session, shaping every hint update.&lt;/p&gt;
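
&lt;p&gt;A deterministic rule set of this kind might look like the following Python sketch. The three HEALTH_CHECK chips and the use of &lt;code&gt;suggested_searches&lt;/code&gt; are taken from this post. The rule order, the &lt;code&gt;next_hints&lt;/code&gt; function signature, and the follow-up wording are assumptions, and this simplified version can return fewer than three chips.&lt;/p&gt;

```python
import re

HEALTH_HINTS = ["run lint", "list orphan pages", "archive contradicted page"]


def next_hints(answer, mode, suggested_searches=()):
    # Rule 1: a knowledge gap turns its suggested searches into chips.
    if suggested_searches:
        return list(suggested_searches)[:3]
    # Rule 2: a HEALTH_CHECK session keeps surfacing maintenance actions.
    if mode == "HEALTH_CHECK":
        return list(HEALTH_HINTS)
    # Rule 3: otherwise follow up on pages the answer linked to, without repeats.
    pages = re.findall(r"\[\[([^\]]+)\]\]", answer)
    return [f"Tell me more about {p}" for p in dict.fromkeys(pages)][:3]


print(next_hints("See [[ENIAC]] and [[Transistor]], then [[ENIAC]] again.", "POWER_USER"))
print(next_hints("Two pages are contradicted.", "HEALTH_CHECK"))
```

&lt;p&gt;Pattern matching and list slicing are the only operations involved, which is what keeps hint generation free of API cost.&lt;/p&gt;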




&lt;h2&gt;
  
  
  Query Caching: When Not to Call the LLM
&lt;/h2&gt;

&lt;p&gt;The cache design started from a specific observation: most queries against a domain wiki are repeated. A team maintaining a knowledge base about their software architecture will ask the same questions dozens of times - during onboarding, during incident reviews, during planning sessions. Without a cache, every one of those calls hits the LLM and incurs both latency and cost.&lt;/p&gt;

&lt;p&gt;The cache eliminates that. But the tricky part is knowing when to invalidate it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cache Key Design
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;key = SHA-256(normalized_question + "|" + wiki_epoch + "|" + provider_model)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Normalized question&lt;/strong&gt; - lowercased, whitespace-collapsed. "What is Moore's Law?" and "what is moore's law?" hit the same cache entry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wiki epoch&lt;/strong&gt; - an integer counter on the server instance. It starts at 0 on startup and increments on every ingest job completion and every lifecycle state transition. When the epoch changes, the cache key for every question changes. Prior entries don't get deleted immediately; they just become unreachable. Old entries are cleaned up in a background sweep (entries more than 5 epochs behind current, or older than 7 days).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provider/model&lt;/strong&gt; - &lt;code&gt;"openai/gpt-4o-mini"&lt;/code&gt; or &lt;code&gt;"anthropic/claude-sonnet-4-6"&lt;/code&gt;. Switching models invalidates the cache. A cached answer from a smaller model shouldn't surface when you've upgraded to a better one.&lt;/p&gt;
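
&lt;p&gt;Put together, the key derivation can be sketched in Python with the standard &lt;code&gt;hashlib&lt;/code&gt; module. The formula and the normalization rules are from this post. The function names, the UTF-8 encoding, and the hex digest output are assumptions.&lt;/p&gt;

```python
import hashlib


def normalize_question(question):
    # Lowercase and collapse runs of whitespace.
    return " ".join(question.lower().split())


def cache_key(question, wiki_epoch, provider_model):
    # Join the three components with "|" and hash the result with SHA-256.
    raw = f"{normalize_question(question)}|{wiki_epoch}|{provider_model}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


k1 = cache_key("What is Moore's Law?", 3, "openai/gpt-4o-mini")
k2 = cache_key("what   is moore's law?", 3, "openai/gpt-4o-mini")
k3 = cache_key("What is Moore's Law?", 4, "openai/gpt-4o-mini")
k4 = cache_key("What is Moore's Law?", 3, "anthropic/claude-sonnet-4-6")
print(k1 == k2)  # same question after normalization, same epoch and model
print(k1 == k3)  # epoch bumped, so the key differs
print(k1 == k4)  # model switched, so the key differs
```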

&lt;p&gt;The epoch approach is what makes invalidation automatic. You don't call "invalidate cache" after an ingest; the epoch bump does it implicitly. Any query after a wiki change computes a new key that has never been seen, misses the cache, and calls the LLM fresh. The previous answer doesn't need to be deleted; it simply ceases to be looked up.&lt;/p&gt;
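
&lt;p&gt;The following self-contained Python sketch shows that behaviour end to end against an in-memory SQLite table: a hit at the current epoch, a miss after the epoch moves, and a sweep using the thresholds quoted above (more than 5 epochs behind, or older than 7 days). The table name &lt;code&gt;query_cache&lt;/code&gt;, its columns other than &lt;code&gt;result_json&lt;/code&gt;, and the helper functions are hypothetical.&lt;/p&gt;

```python
import hashlib
import sqlite3
import time

db = sqlite3.connect(":memory:")  # stand-in for cache.db
db.execute("CREATE TABLE query_cache "
           "(key TEXT PRIMARY KEY, epoch INTEGER, created_at REAL, result_json TEXT)")


def key_for(question, epoch, model):
    raw = f"{' '.join(question.lower().split())}|{epoch}|{model}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def put(question, epoch, model, result_json, now):
    db.execute("INSERT OR REPLACE INTO query_cache VALUES (?, ?, ?, ?)",
               (key_for(question, epoch, model), epoch, now, result_json))


def get(question, epoch, model):
    row = db.execute("SELECT result_json FROM query_cache WHERE key = ?",
                     (key_for(question, epoch, model),)).fetchone()
    return row[0] if row else None


def sweep(current_epoch, now, max_lag=5, max_age_s=7 * 24 * 3600):
    # Delete entries more than max_lag epochs behind, or older than max_age_s.
    cur = db.execute(
        "DELETE FROM query_cache WHERE (? - epoch) > ? OR (? - created_at) > ?",
        (current_epoch, max_lag, now, max_age_s))
    return cur.rowcount


now = time.time()
model = "openai/gpt-4o-mini"
put("What is Moore's Law?", 0, model, '{"answer": "cached"}', now)
hit = get("what is moore's law?", 0, model)    # same epoch: found
miss = get("what is moore's law?", 1, model)   # epoch bumped: unreachable
removed = sweep(current_epoch=6, now=now)      # entry is now 6 epochs behind
print(hit, miss, removed)
```

&lt;p&gt;Nothing in the lookup path ever deletes a row. The stale entry stays in the table until the sweep removes it.&lt;/p&gt;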

&lt;h3&gt;
  
  
  Diagram 3: Cache Lookup, Hit, Miss, and Epoch Invalidation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmcpzmxgyza7mukezuquy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmcpzmxgyza7mukezuquy.png" alt=" " width="800" height="860"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Measured Results
&lt;/h3&gt;

&lt;p&gt;Rather than leaving the latency claims as estimates, we wrote a full performance test suite against the cache layer. All numbers below come from running &lt;code&gt;pytest tests/performance/test_query_cache_perf.py&lt;/code&gt; locally on a Windows development machine with an SSD. Linux bare-metal numbers are consistently 30–40% better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chart 1 - Cache read latency distribution (500 reads, 200 cached entries)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuk33uv3b4ftjzpsbe2m3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuk33uv3b4ftjzpsbe2m3.png" alt="Cache read latency distribution - P50=0.26ms P95=0.34ms P99=0.41ms" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;P50 = &lt;strong&gt;0.26ms&lt;/strong&gt;, P95 = &lt;strong&gt;0.34ms&lt;/strong&gt;, P99 = &lt;strong&gt;0.41ms&lt;/strong&gt; against a 10ms SLO. The distribution is extremely tight - the persistent connection eliminates the per-call connection-open overhead that was the main source of outliers. Every percentile sits well inside the budget with headroom to spare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chart 2 - Cache hit vs miss latency at varying LLM speeds&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw4ay8u34xm193ag4a2z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw4ay8u34xm193ag4a2z.png" alt="Cache hit vs miss latency and speedup factor at 50ms / 200ms / 500ms / 2000ms simulated LLM" width="800" height="308"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The left panel uses a log scale because the gap is so large it can't be shown linearly. Cache hit P50 stays essentially flat at 0.24–0.32ms regardless of LLM speed - one shared persistent connection makes the hit path a pure queue-and-execute SQLite read with no file-open cost. The miss path scales directly with LLM latency. The right panel shows the resulting speedup factor:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Simulated LLM speed&lt;/th&gt;
&lt;th&gt;Cache miss P50&lt;/th&gt;
&lt;th&gt;Cache hit P50&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;50ms (fast provider)&lt;/td&gt;
&lt;td&gt;95ms&lt;/td&gt;
&lt;td&gt;0.29ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~330×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200ms (mid provider)&lt;/td&gt;
&lt;td&gt;235ms&lt;/td&gt;
&lt;td&gt;0.32ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~730×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;500ms (slow provider)&lt;/td&gt;
&lt;td&gt;544ms&lt;/td&gt;
&lt;td&gt;0.24ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~2270×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2000ms (reasoning model)&lt;/td&gt;
&lt;td&gt;2055ms&lt;/td&gt;
&lt;td&gt;0.26ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~7900×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The cache hit time is so small relative to any real LLM that the ratio is dominated entirely by provider latency. With reasoning models (o3-mini, MiniMax M2), a single saved round-trip reclaims 15–30 seconds of wall time.&lt;/p&gt;
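
&lt;p&gt;The speedup column is simply the miss P50 divided by the hit P50. A few lines of Python reproduce it from the table values. The variable names are illustrative only.&lt;/p&gt;

```python
# (simulated LLM speed, cache miss P50 in ms, cache hit P50 in ms), from the table
rows = [
    ("50ms", 95.0, 0.29),
    ("200ms", 235.0, 0.32),
    ("500ms", 544.0, 0.24),
    ("2000ms", 2055.0, 0.26),
]

speedups = {label: miss_ms / hit_ms for label, miss_ms, hit_ms in rows}
for label, factor in speedups.items():
    print(f"{label}: {factor:.0f}x")
```

&lt;p&gt;The unrounded ratios are roughly 328, 734, 2267, and 7904, which the table rounds to ~330×, ~730×, ~2270×, and ~7900×.&lt;/p&gt;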

&lt;p&gt;&lt;strong&gt;Chart 3 - Concurrent readers: persistent connection scaling curve&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ov6fhi498vrdojk7x8y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ov6fhi498vrdojk7x8y.png" alt="Concurrent cache readers P95 latency vs concurrency - smooth monotonic scaling" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Single reader: &lt;strong&gt;0.5ms P95&lt;/strong&gt;. Ten readers: &lt;strong&gt;2.0ms&lt;/strong&gt;. Twenty-five: &lt;strong&gt;3.8ms&lt;/strong&gt;. Fifty: &lt;strong&gt;7.8ms&lt;/strong&gt;. One hundred: &lt;strong&gt;14.9ms&lt;/strong&gt;. The curve is smooth and monotonically increasing - no Windows spikes, no non-monotonic jitter. All concurrent reads queue through one shared aiosqlite background thread; the connection-open overhead that caused the old instability is simply not there. For a local single-user tool the realistic ceiling is n=5–10 concurrent reads, where P95 is 2ms or less. Even at n=100 the tail is well inside a 50ms budget.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chart 4 - Cache vs no-cache throughput (queries/second)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi77o76kkyli94a180vqf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi77o76kkyli94a180vqf.png" alt="Cache vs no-cache throughput and advantage ratio at concurrency 1 to 100" width="800" height="308"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The throughput advantage starts at &lt;strong&gt;75.7×&lt;/strong&gt; at n=1 and compresses to &lt;strong&gt;4.1×&lt;/strong&gt; at n=100. The compression is expected: &lt;code&gt;asyncio.gather()&lt;/code&gt; parallelises the simulated LLM calls so the no-cache path scales nearly linearly with concurrency. The cache path, sharing one connection, serializes through the aiosqlite queue and grows sublinearly. But critically, the cache always wins by a wide margin - 4.1× at n=100 is far better than the 1.3× seen before the persistent connection fix. At realistic single-user concurrency (n=1–5), the advantage is 33–76×.&lt;/p&gt;
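
&lt;p&gt;The difference between the two scaling behaviours can be seen in a toy model. In the Python sketch below, &lt;code&gt;asyncio.gather()&lt;/code&gt; lets simulated LLM calls overlap, while an &lt;code&gt;asyncio.Lock&lt;/code&gt; stands in for the single aiosqlite queue that cache reads share. The lock is an analogy chosen for this example, not how aiosqlite is implemented, and &lt;code&gt;run_batch&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
import asyncio


async def run_batch(n, delay_s, lock=None):
    active = 0
    peak = 0

    async def one_call():
        nonlocal active, peak
        active += 1
        peak = max(peak, active)  # record how many calls are in flight at once
        await asyncio.sleep(delay_s)
        active -= 1

    async def guarded_call():
        # With a lock, calls queue up one at a time, like reads on one connection.
        async with lock:
            await one_call()

    worker = guarded_call if lock is not None else one_call
    await asyncio.gather(*(worker() for _ in range(n)))
    return peak


async def main():
    llm_peak = await run_batch(20, 0.01)                     # calls overlap
    cache_peak = await run_batch(20, 0.001, asyncio.Lock())  # calls serialize
    return llm_peak, cache_peak


llm_peak, cache_peak = asyncio.run(main())
print(llm_peak, cache_peak)
```

&lt;p&gt;All twenty simulated LLM calls are in flight together, while the guarded reads never exceed one at a time. That is why the no-cache path gains throughput with concurrency faster than the cache path does.&lt;/p&gt;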

&lt;h3&gt;
  
  
  Estimated Latency Gains
&lt;/h3&gt;

&lt;p&gt;A typical Synthadoc query against a mid-size wiki has two latency components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Phase 1 (BM25 retrieval):&lt;/strong&gt; 100–200ms. This runs regardless of cache.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 2 (LLM synthesis):&lt;/strong&gt; 2–10 seconds depending on provider, model, and answer length.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A cache hit skips Phase 2 entirely. The server reads the cached &lt;code&gt;result_json&lt;/code&gt; from SQLite (~0.26ms P50 on SSD via a persistent connection), then emits a synthetic SSE burst at full network speed. The client receives what looks like a live streamed response, but the entire burst completes in under 100ms instead of waiting 2–10 seconds for the LLM. With a reasoning model provider, that gap widens to 15–30 seconds per query, and the cache makes those queries feel instant.&lt;/p&gt;
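
&lt;p&gt;In outline, the hit and miss paths differ only in where the answer text comes from. The Python sketch below uses a plain dictionary in place of SQLite and a short sleep in place of the LLM. The &lt;code&gt;answer&lt;/code&gt; and &lt;code&gt;fake_llm&lt;/code&gt; functions and the &lt;code&gt;use_cache&lt;/code&gt; parameter (mirroring the CLI's &lt;code&gt;--no-cache&lt;/code&gt; flag) are assumptions for illustration.&lt;/p&gt;

```python
import asyncio
import json

cache = {}     # stand-in for the SQLite-backed cache
llm_calls = 0  # counts how often Phase 2 actually runs


async def fake_llm(question):
    # Stand-in for Phase 2 synthesis; a real provider would take seconds.
    global llm_calls
    llm_calls += 1
    await asyncio.sleep(0.05)
    return f"Answer to: {question}"


async def answer(question, key, use_cache=True):
    if use_cache and key in cache:
        # Hit: skip the LLM and replay the stored answer.
        text = json.loads(cache[key])["answer"]
    else:
        # Miss or bypass: call the LLM, then store the result for next time.
        text = await fake_llm(question)
        cache[key] = json.dumps({"answer": text})
    # Either way the client receives a sequence of word-sized chunks.
    return [word + " " for word in text.split()]


async def main():
    first = await answer("What is ENIAC?", "k1")
    second = await answer("What is ENIAC?", "k1")                  # from cache
    third = await answer("What is ENIAC?", "k1", use_cache=False)  # bypass
    return first, second, third


first, second, third = asyncio.run(main())
print(llm_calls, first == second)
```

&lt;p&gt;Because the replayed chunks have the same shape as a live stream, the client needs no separate code path for cached answers.&lt;/p&gt;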

&lt;h3&gt;
  
  
  The Cache Is Shared Across All Three Surfaces
&lt;/h3&gt;

&lt;p&gt;CLI, Obsidian plugin, and Web Chat UI all share the same &lt;code&gt;cache.db&lt;/code&gt;. If you ran &lt;code&gt;synthadoc query "..."&lt;/code&gt; from the CLI this morning and the wiki hasn't changed, opening the Obsidian modal and asking the same question will hit the cache. The key is identical - same normalized question, same epoch, same model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Drop the entire cache - both LLM response cache and query cache&lt;/span&gt;
synthadoc cache clear
Cache cleared: 47 entries removed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What Makes This Architecturally Different
&lt;/h2&gt;

&lt;p&gt;Most streaming chat interfaces work the same way: user sends a message, server calls the LLM, tokens stream back. There's no retrieval, no structured knowledge, no notion of whether the answer is backed by reviewed sources.&lt;/p&gt;

&lt;p&gt;Synthadoc's streaming pipeline is a two-phase system where the first phase is a structured knowledge retrieval against a compiled, lifecycle-tracked wiki. The tokens you receive as they arrive are not hallucinated filler - they're synthesized from pages that passed lint, have known provenance, and carry a lifecycle state that tells you when they were last reviewed. The &lt;code&gt;sources: N&lt;/code&gt; in the status event isn't decorative. It tells you before the first word of the answer how much of your wiki was relevant.&lt;/p&gt;

&lt;p&gt;The session mode detection adds something I haven't seen elsewhere: the server classifies your wiki's health state when you open a session and uses that classification to shape every hint update for the rest of the session. A HEALTH_CHECK session doesn't give you generic "explore your wiki" prompts; it gives you "these pages need attention." The hints aren't cosmetic. They're a live triage system for wiki health.&lt;/p&gt;

&lt;p&gt;The caching architecture also differs from the typical approach of setting an explicit TTL (cache for 24 hours, or cache for one week). TTL-based caches are almost always wrong at the edges: they're either too short (you evict answers that are still valid) or too long (you serve stale content after a wiki update). Epoch-based invalidation is event-driven: the cache stays valid exactly until something in the wiki changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Demo
&lt;/h2&gt;

&lt;p&gt;All three query surfaces are covered in the quick-start guide against the history-of-computing demo wiki:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CLI streaming and caching:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-5--query-the-pre-built-wiki-cli--obsidian" rel="noopener noreferrer"&gt;Step 5 - Query the pre-built wiki&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Chat UI:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-22--use-the-web-chat-ui" rel="noopener noreferrer"&gt;Step 22 - Use the web chat UI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query caching:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-23--query-caching" rel="noopener noreferrer"&gt;Step 23 - Query caching&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full thing runs locally in about ten minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/axoviq-ai/synthadoc.git
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; ~/wikis &lt;span class="nt"&gt;--demo&lt;/span&gt;
synthadoc plugin &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing
synthadoc web   &lt;span class="c"&gt;# opens browser&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you find Synthadoc useful, a ⭐ on GitHub helps the project reach more people: &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>architecture</category>
      <category>automation</category>
    </item>
    <item>
      <title>Synthadoc: Staleness Detection, Full Audit Trails, and Four Export Formats - No Extra LLM Calls</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Mon, 01 Jun 2026 20:53:18 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-staleness-detection-full-audit-trails-and-four-export-formats-no-extra-llm-calls-2ild</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-staleness-detection-full-audit-trails-and-four-export-formats-no-extra-llm-calls-2ild</guid>
      <description>&lt;p&gt;There's a category of problem that only shows up after you've been running an automated knowledge system for a while. The first month feels like magic - pages compile themselves, citations appear, everything is fresh. Three months later, you open a page about a library that shipped three breaking versions since the source was last ingested. The page looks perfectly healthy. The confidence is "high." The lint passed. And yet, everything in it is quietly wrong.&lt;/p&gt;

&lt;p&gt;Static knowledge bases have no vocabulary for "this was true." Synthadoc v0.6.0 gives your wiki one.&lt;/p&gt;

&lt;p&gt;Synthadoc release v0.6.0 ships two features that change how a wiki ages: a &lt;strong&gt;five-state page lifecycle machine&lt;/strong&gt; that tracks content freshness with a permanent audit trail, and a &lt;strong&gt;wiki export system&lt;/strong&gt; that serializes not just content but provenance, history, and cost, in four machine-readable formats, with zero additional LLM calls.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 5-State Page Lifecycle
&lt;/h2&gt;

&lt;p&gt;The core idea is simple: every page has a status that reflects what the system &lt;em&gt;knows about it&lt;/em&gt; right now, not just what it &lt;em&gt;says&lt;/em&gt;. That status moves through five states based on signals from ingest, lint, and the source files themselves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic transitions (system-triggered):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgn1xj15a9hwzn85p4ww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgn1xj15a9hwzn85p4ww.png" alt=" " width="800" height="1202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Transition&lt;/th&gt;
&lt;th&gt;Trigger&lt;/th&gt;
&lt;th&gt;Who&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;→ draft&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;New page created via ingest&lt;/td&gt;
&lt;td&gt;IngestAgent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;draft → active&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lint passes all structural and consistency checks&lt;/td&gt;
&lt;td&gt;LintAgent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;active → stale&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SHA-256 hash of source file has changed since last ingest&lt;/td&gt;
&lt;td&gt;LintAgent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stale → draft&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Source re-ingested with &lt;code&gt;--force&lt;/code&gt;; page updated&lt;/td&gt;
&lt;td&gt;IngestAgent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;draft / active / stale → contradicted&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;New source conflicts with this page; status set directly, bypasses transition API&lt;/td&gt;
&lt;td&gt;IngestAgent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Manual transitions (user CLI commands):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Transition&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lifecycle activate &amp;lt;slug&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;draft → active&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Promote without waiting for the next lint run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lifecycle archive &amp;lt;slug&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;draft / active / contradicted / stale → archived&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retire the page; it's kept for reference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lifecycle restore &amp;lt;slug&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;archived → draft&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Re-admit the page; re-enters the lint queue&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note: &lt;code&gt;active → stale&lt;/code&gt; and &lt;code&gt;stale → draft&lt;/code&gt; have no user-facing CLI command; they are triggered exclusively by the system, by lint and re-ingest respectively. The only path out of &lt;code&gt;contradicted&lt;/code&gt; is archiving the page; you cannot promote a contradicted page directly to active.&lt;/p&gt;

&lt;p&gt;Every single transition - automatic or manual - is permanently written to the audit database with a timestamp, the triggering agent or user, and a reason string. That's the part that actually matters. The state tells you &lt;em&gt;where&lt;/em&gt; a page is. The log tells you &lt;em&gt;how it got there&lt;/em&gt; and &lt;em&gt;when someone last looked at it&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check the full history of a page&lt;/span&gt;
synthadoc lifecycle log alan-turing

Slug                      From           To             By           Timestamp              Reason
&lt;span class="nt"&gt;----------------------------------------------------------------------------------------------------&lt;/span&gt;
alan-turing               null           draft          ingest       2026-04-12T09:14:22    initial ingest
alan-turing               draft          active         lint         2026-04-12T09:31:07    all checks passed
alan-turing               active         stale          lint         2026-05-03T02:00:11    &lt;span class="nb"&gt;source hash &lt;/span&gt;mismatch
alan-turing               stale          draft          ingest       2026-05-03T08:22:55    re-ingest of stale page
alan-turing               draft          active         lint         2026-05-03T08:45:02    all checks passed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
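
&lt;p&gt;The transition rules and the audit log can be sketched together in a few lines of Python. This is a simplified model under assumptions, not Synthadoc's code: the &lt;code&gt;Page&lt;/code&gt; class and &lt;code&gt;ALLOWED&lt;/code&gt; set are hypothetical names, the log lives in memory rather than in the audit database, and the &lt;code&gt;contradicted&lt;/code&gt; transition goes through the same check here even though the real system sets it directly.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed (from, to) pairs, transcribed from the two tables above.
ALLOWED = {
    (None, "draft"),
    ("draft", "active"),
    ("active", "stale"),
    ("stale", "draft"),
    ("draft", "contradicted"),
    ("active", "contradicted"),
    ("stale", "contradicted"),
    ("draft", "archived"),
    ("active", "archived"),
    ("contradicted", "archived"),
    ("stale", "archived"),
    ("archived", "draft"),
}


@dataclass
class Page:
    slug: str
    status: object = None                        # None until first ingest
    history: list = field(default_factory=list)  # in-memory audit log

    def transition(self, to, by, reason):
        """Apply one transition and record it, or raise if it is illegal."""
        if (self.status, to) not in ALLOWED:
            raise ValueError(f"illegal transition: {self.status} to {to}")
        self.history.append({
            "from": self.status,
            "to": to,
            "by": by,
            "reason": reason,
            "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        })
        self.status = to
```

&lt;p&gt;Because every state change goes through &lt;code&gt;transition&lt;/code&gt;, the history can never drift from the current status, and an attempt such as &lt;code&gt;contradicted → active&lt;/code&gt; fails loudly instead of silently succeeding.&lt;/p&gt;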



&lt;p&gt;If you prefer a visual view, the same full cross-wiki audit trail is available in Obsidian under &lt;strong&gt;Synthadoc: Manage Page Lifecycle → Audit Log&lt;/strong&gt;. Every transition shows colour-coded From/To state badges, the triggering agent or user, the timestamp, and the reason string - searchable by slug, filterable by state, paginated:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fam7w7364a0mbkegx7foo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fam7w7364a0mbkegx7foo.png" alt=" " width="800" height="695"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a fleet-level view, &lt;code&gt;synthadoc status&lt;/code&gt; gives a live summary across all five states, including pages sitting in candidates, along with an action hint for anything that needs attention:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc status

Wiki:         history-of-computing
Pages:        42
Jobs pending: 0
Jobs total:   187

Page lifecycle:
  active         38
  draft           2  &amp;lt;- run &lt;span class="sb"&gt;`&lt;/span&gt;synthadoc lint run&lt;span class="sb"&gt;`&lt;/span&gt; to promote
  draft &lt;span class="o"&gt;(&lt;/span&gt;staged&lt;span class="o"&gt;)&lt;/span&gt;  1  &amp;lt;- promote from candidates/ first, &lt;span class="k"&gt;then &lt;/span&gt;lint
  stale           1  &amp;lt;- re-ingest needed
  contradicted    0
  archived        1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things are worth knowing about what these numbers mean. First, &lt;code&gt;Pages: 42&lt;/code&gt; at the top counts only pages that have been admitted into &lt;code&gt;wiki/&lt;/code&gt; - pages still quarantined in &lt;code&gt;wiki/candidates/&lt;/code&gt; are excluded from that total. Second, &lt;code&gt;draft&lt;/code&gt; and &lt;code&gt;draft (staged)&lt;/code&gt; are distinct rows: &lt;code&gt;draft&lt;/code&gt; is pages already inside &lt;code&gt;wiki/&lt;/code&gt; waiting for their first lint pass; &lt;code&gt;draft (staged)&lt;/code&gt; is pages physically quarantined in &lt;code&gt;wiki/candidates/&lt;/code&gt;. Staged pages haven't been promoted yet, have no lifecycle state, and stay hidden from search, context packs, and export until a human explicitly promotes them. The lifecycle section only shows &lt;code&gt;draft (staged)&lt;/code&gt; when the count is greater than zero, so on a wiki with staging turned off you'll never see that row. The action hints tell you exactly what to do next for each group: run lint for drafts, re-ingest for stale pages, review and archive for contradictions.&lt;/p&gt;

&lt;p&gt;If you prefer to manage lifecycle states visually, the Obsidian plugin surfaces the same data in &lt;strong&gt;Synthadoc: Manage Page Lifecycle → Current States&lt;/strong&gt;. The table is sortable and filterable by state, shows the last transition timestamp and who triggered it, and gives you a one-click archive button per page. The &lt;code&gt;contradicted&lt;/code&gt; chip makes it easy to find the pages that need attention first:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6mrmv0m9jupvf9m7uyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6mrmv0m9jupvf9m7uyj.png" alt=" " width="800" height="701"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Candidates Staging: A Quality Gate Before Lifecycle Begins
&lt;/h2&gt;

&lt;p&gt;The lifecycle machine handles what happens &lt;em&gt;after&lt;/em&gt; a page enters the wiki. Candidates staging handles &lt;em&gt;whether&lt;/em&gt; it enters at all.&lt;/p&gt;

&lt;p&gt;Where a new page lands depends entirely on the staging policy configured for that wiki. There are three options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;off&lt;/code&gt;&lt;/strong&gt; (default): every new page goes straight into &lt;code&gt;wiki/&lt;/code&gt; as &lt;code&gt;draft&lt;/code&gt;. Staging is not involved.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;all&lt;/code&gt;&lt;/strong&gt;: every new page goes to &lt;code&gt;wiki/candidates/&lt;/code&gt; regardless of confidence. Nothing is admitted automatically - you review and promote everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;threshold&lt;/code&gt;&lt;/strong&gt;: IngestAgent checks the page's confidence rating against your configured minimum. Pages that meet or exceed it go directly into &lt;code&gt;wiki/&lt;/code&gt;; pages that fall below it go to &lt;code&gt;wiki/candidates/&lt;/code&gt; for review.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;wiki/candidates/&lt;/code&gt; is a holding area excluded from search, context packs, and export. No downstream consumer sees a candidate. The page exists on disk, but it hasn't been admitted into the lifecycle yet: it has no audit log entry and isn't counted in the &lt;code&gt;Pages&lt;/code&gt; total of &lt;code&gt;synthadoc status&lt;/code&gt;.&lt;/p&gt;
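
&lt;p&gt;The routing decision behind the three policies is small enough to sketch in Python. This is an assumption-laden illustration: the function name &lt;code&gt;destination&lt;/code&gt; is hypothetical, and the ordering &lt;code&gt;low&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, &lt;code&gt;high&lt;/code&gt; is inferred from the confidence labels the CLI prints rather than taken from Synthadoc's source.&lt;/p&gt;

```python
# Assumed confidence levels, ordered from weakest to strongest.
LEVELS = ["low", "medium", "high"]


def destination(policy, confidence, min_confidence="high"):
    """Return the directory a newly compiled page would land in."""
    if policy == "off":
        return "wiki/"             # staging is not involved
    if policy == "all":
        return "wiki/candidates/"  # everything waits for review
    if policy == "threshold":
        # Levels at or above the configured minimum are admitted.
        admitted = LEVELS[LEVELS.index(min_confidence):]
        return "wiki/" if confidence in admitted else "wiki/candidates/"
    raise ValueError(f"unknown staging policy: {policy}")
```

&lt;p&gt;With the minimum set to &lt;code&gt;high&lt;/code&gt;, a &lt;code&gt;medium&lt;/code&gt; page is routed to &lt;code&gt;wiki/candidates/&lt;/code&gt; and a &lt;code&gt;high&lt;/code&gt; page goes straight into &lt;code&gt;wiki/&lt;/code&gt;.&lt;/p&gt;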

&lt;p&gt;Here's how the three paths look end-to-end under the &lt;code&gt;threshold&lt;/code&gt; policy:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8innuccjs5sy3to2cxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8innuccjs5sy3to2cxx.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The key design decision here: staging and lifecycle are orthogonal systems that compose cleanly. Staging decides &lt;em&gt;admission&lt;/em&gt;. Lifecycle decides &lt;em&gt;state&lt;/em&gt; after admission. A page in &lt;code&gt;wiki/candidates/&lt;/code&gt; has no lifecycle state yet: it's not in the audit log, it doesn't count toward the &lt;code&gt;Pages&lt;/code&gt; total in &lt;code&gt;synthadoc status&lt;/code&gt;, and it doesn't appear in any export. The moment you promote it, it enters the lifecycle as &lt;code&gt;draft&lt;/code&gt; and the lint queue picks it up on the next run.&lt;/p&gt;

&lt;p&gt;This matters for teams that need a human gate on automated ingestion. Nightly ingest jobs run at 2 AM, pull new sources, and compile pages. Anything below the confidence threshold lands in candidates. A person reviews the list in the morning, promotes what looks right, and discards what doesn't. The wiki only grows with content that either cleared the threshold or was reviewed by a person.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enable threshold staging: auto-promote high-confidence, hold everything else&lt;/span&gt;
synthadoc staging policy threshold &lt;span class="nt"&gt;--min-confidence&lt;/span&gt; high

&lt;span class="c"&gt;# Morning review&lt;/span&gt;
synthadoc candidates list

Candidates &lt;span class="o"&gt;(&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;:
  machine-learning-fundamentals    confidence: medium   ingested: 2026-05-31
  attention-mechanism              confidence: low      ingested: 2026-05-31
  transformer-architecture         confidence: medium   ingested: 2026-05-31

synthadoc candidates promote transformer-architecture
synthadoc candidates discard attention-mechanism
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Wiki Export: Four Formats, Zero LLM Calls
&lt;/h2&gt;

&lt;p&gt;Export was designed around one constraint: once your wiki is compiled, you shouldn't need to spend more API budget to serialize it. All four formats are computed entirely from the stored wiki state - no prompts, no completions, no waiting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv1hcbju8am52pshy3e6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv1hcbju8am52pshy3e6.png" alt=" " width="800" height="1000"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;--status&lt;/code&gt; flag is what makes export practically useful. When you're feeding a downstream LLM, you probably only want &lt;code&gt;active&lt;/code&gt; pages - the ones that passed lint and haven't gone stale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc &lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; llms.txt &lt;span class="nt"&gt;--status&lt;/span&gt; active
synthadoc &lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; json &lt;span class="nt"&gt;--status&lt;/span&gt; active &lt;span class="nt"&gt;--output&lt;/span&gt; exports/wiki.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--status contradicted&lt;/code&gt; flag is genuinely useful for forensics - you can export just the pages with conflicts and analyse them without touching the rest of the wiki.&lt;/p&gt;

&lt;p&gt;The JSON format is the one worth drawing attention to specifically. Most wiki exports give you a flat document dump. This one gives you claim-level provenance (&lt;code&gt;claims[]&lt;/code&gt; maps each claim to the exact source file and line range that generated it), the complete state transition history (&lt;code&gt;lifecycle_history[]&lt;/code&gt;), and the per-page API cost to compile it (&lt;code&gt;ingest_cost_usd&lt;/code&gt;). If you're building downstream tooling or reporting on knowledge quality, these three fields eliminate an entire layer of instrumentation you'd otherwise have to build yourself.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"slug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alan-turing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ingest_cost_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0012&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"claims"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Turing proposed the imitation game in 1950..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"raw_sources/turing-biography.md"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"lines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"lifecycle_history"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"by"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ingest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-12T09:14:22"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"by"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lint"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-12T09:31:07"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
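
&lt;p&gt;As an example of the downstream tooling this enables, here is a short Python sketch that totals compile cost and groups claims by source file. It assumes the export file is a JSON array of page objects shaped like the sample above; that top-level layout is a guess, and &lt;code&gt;load_pages&lt;/code&gt; and &lt;code&gt;summarize&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```python
import json


def load_pages(path):
    """Read an export file, assumed to be a JSON array of page objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(pages):
    """Return (total ingest cost, claims grouped by source file)."""
    total_cost = sum(page.get("ingest_cost_usd", 0.0) for page in pages)
    claims_by_source = {}
    for page in pages:
        for claim in page.get("claims", []):
            # Record which page cites which line range of each source.
            claims_by_source.setdefault(claim["source"], []).append(
                (page["slug"], tuple(claim["lines"]))
            )
    return total_cost, claims_by_source
```

&lt;p&gt;Grouping by source answers a practical question directly: if &lt;code&gt;raw_sources/turing-biography.md&lt;/code&gt; changes, which pages and which line ranges are affected?&lt;/p&gt;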






&lt;h2&gt;
  
  
  What Makes This Different
&lt;/h2&gt;

&lt;p&gt;Most LLM wiki tools treat knowledge as append-only. You ingest, you query. There's no concept of a page going stale, no audit trail of who reviewed what and when, and no way to know that the page you're reading was compiled from a source that changed three months ago. They're effectively write-once databases with a chat interface on top.&lt;/p&gt;

&lt;p&gt;Synthadoc's lifecycle machine makes the wiki &lt;em&gt;temporally aware&lt;/em&gt;. A SHA-256 hash is stored for every source at ingest time. When lint runs (nightly, typically), it compares current hashes against stored ones. A changed source triggers an automatic &lt;code&gt;active → stale&lt;/code&gt; transition with a timestamp. You know exactly which pages need attention and when they last didn't.&lt;/p&gt;
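
&lt;p&gt;The hash comparison itself is simple. The Python sketch below shows the idea with the standard library's &lt;code&gt;hashlib&lt;/code&gt;; the function names and the shape of &lt;code&gt;stored_hashes&lt;/code&gt; are assumptions made for illustration, not Synthadoc's internals.&lt;/p&gt;

```python
import hashlib
from pathlib import Path


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def read_source(path):
    """Default reader: load a source file from disk as bytes."""
    return Path(path).read_bytes()


def find_stale(stored_hashes, read_bytes=read_source):
    """Return sources whose current hash differs from the stored one.

    stored_hashes maps a source path to the digest recorded at ingest.
    """
    stale = []
    for source, recorded in stored_hashes.items():
        if sha256_hex(read_bytes(source)) != recorded:
            stale.append(source)
    return stale
```

&lt;p&gt;Each source returned by &lt;code&gt;find_stale&lt;/code&gt; would correspond to an &lt;code&gt;active → stale&lt;/code&gt; transition for the pages compiled from it. Passing the reader in as a parameter keeps the comparison testable without touching the filesystem.&lt;/p&gt;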

&lt;p&gt;The other thing that separates Synthadoc architecturally is that it's a compilation pipeline, not a retrieval pipeline with a generation step. Every page is a synthesized artifact, not a retrieved chunk. That's why the JSON export can include &lt;code&gt;ingest_cost_usd&lt;/code&gt; per page: each page has a discrete compilation history, not a query-time cost that varies every time someone asks a question.&lt;/p&gt;

&lt;p&gt;The combination of lifecycle tracking and export also enables something practical for teams: you can run &lt;code&gt;synthadoc export --format llms.txt --status active&lt;/code&gt; as the input to a downstream agent, and you know exactly what you're giving it. No stale content. No contradicted pages. Just the subset of the wiki that the system has marked as reviewed and consistent.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Demo
&lt;/h2&gt;

&lt;p&gt;The quickest way to see the lifecycle machine in action is step 8 of the quick-start guide: &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-8--manage-page-lifecycle" rel="noopener noreferrer"&gt;Step 8 — Manage Page Lifecycle&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Export is step 21: &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-21--export-your-wiki" rel="noopener noreferrer"&gt;Step 21 — Export Your Wiki&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Both steps work with the history-of-computing demo wiki, so you can run the full thing locally in about ten minutes against your selected LLM provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/axoviq-ai/synthadoc.git
pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; ~/wikis &lt;span class="nt"&gt;--demo&lt;/span&gt;
synthadoc plugin &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you find Synthadoc useful, a star ⭐ on GitHub goes a long way toward keeping this project visible: &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>automation</category>
      <category>knowledgemanagement</category>
      <category>llm</category>
    </item>
    <item>
      <title>Synthadoc: Built an AI Judge for Our LLM Wiki Compiler - Here's What We Learned</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Sat, 23 May 2026 17:19:15 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-we-built-an-ai-judge-for-our-ai-wiki-compiler-heres-what-we-learned-13k3</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-we-built-an-ai-judge-for-our-ai-wiki-compiler-heres-what-we-learned-13k3</guid>
      <description>&lt;p&gt;There's a particular kind of anxiety that comes from reading an LLM-compiled document you wrote six months ago. The prose is clean. The structure is coherent. And then you spot a sentence like "this approach became the industry standard by the late 1990s" - and you have no idea whether that came from your source material, or whether the model just... said it.&lt;/p&gt;

&lt;p&gt;That's the problem we've been chipping away at with &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;Synthadoc&lt;/a&gt;, an open-source LLM knowledge compiler. You feed it raw source files - PDFs, text docs, YouTube transcripts, web pages - and it synthesises them into a structured, queryable wiki. We shipped structural lint early on: contradiction detection, orphan page checks, broken link validation. But structural checks don't tell you whether the &lt;em&gt;content&lt;/em&gt; is trustworthy. They tell you the wiring is correct, not whether the building is sound.&lt;/p&gt;

&lt;p&gt;In v0.5.0 we tackled this directly with two features. Here's how they work and why we think they matter.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Adversarial Judge
&lt;/h2&gt;

&lt;p&gt;The first instinct when you want to validate LLM output is to ask the same model to review itself. This doesn't work well. A model that just synthesised a page with a particular framing will tend to confirm that framing when asked to check it. It's not hallucinating - it's just consistent in the way humans are consistent with their own reasoning.&lt;/p&gt;

&lt;p&gt;The fix is simple in principle: use a &lt;em&gt;different&lt;/em&gt; model to review the output. A judge with different training data, different inductive biases, different tendencies to hedge or assert. In practice, that means configuring a second model - ideally from a different provider entirely - to act as a sceptical editor after each lint run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# config.toml&lt;/span&gt;
&lt;span class="nn"&gt;[agents]&lt;/span&gt;
&lt;span class="py"&gt;lint&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"minimax"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="py"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"MiniMax-M2.5"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="py"&gt;adversarial&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;model&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"claude-sonnet-4-6"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The adversarial pass runs after structural checks complete. Each page is sent to the judge with a single brief: find claims that are overstated, unsupported, or contradicted elsewhere in the source material. Results come back as &lt;code&gt;{claim, concern}&lt;/code&gt; pairs, capped at a configurable limit per page (default: 2 - enough signal without drowning the author in noise).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uiwp3uoxb8luhoi7r0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uiwp3uoxb8luhoi7r0f.png" alt=" " width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Warnings are written directly into each page's YAML frontmatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;lint_warnings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;claim&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;over&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;fourteen&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;million&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;lives."&lt;/span&gt;
    &lt;span class="na"&gt;concern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;figure&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;lacks&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;scholarly&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;consensus&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;—&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;historians&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;dispute&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;both&lt;/span&gt;
              &lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;precision&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;the&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;causal&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;attribution&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Turing's&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cryptanalysis&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;alone."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The field is absent when no warnings exist, and it is cleared automatically if you run with &lt;code&gt;--no-adversarial&lt;/code&gt;, so stale warnings never persist past the last lint run that produced them.&lt;/p&gt;

&lt;p&gt;When we ran this on the history-of-computing demo wiki for the first time, the judge flagged a paragraph about the second AI winter that described symbolic approaches as having been "largely abandoned." The compilation model had written that confidently. The judge noted it was contested - hybrid approaches continued in some research communities throughout the period. That's exactly the kind of subtle overstatement that's invisible to a format checker, and invisible to the model that wrote it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why this matters for teams
&lt;/h3&gt;

&lt;p&gt;The adversarial pass produces documented evidence that a second, independent reviewer assessed each page. For enterprise knowledge bases - compliance documentation, research synthesis, internal policy wikis - that audit trail has real value. "The LLM wrote it and we ran a second LLM over it with a different model" is a meaningfully stronger claim than "the LLM wrote it." Not a guarantee of accuracy. A documented process.&lt;/p&gt;




&lt;h2&gt;
  
  
  Claim-Level Provenance
&lt;/h2&gt;

&lt;p&gt;The adversarial pass tells you a claim &lt;em&gt;might&lt;/em&gt; be wrong. Provenance tells you exactly where it came from so you can check yourself.&lt;/p&gt;

&lt;p&gt;The core mechanism is a citation annotation pass that runs during ingest, immediately after each page section is written. The model receives the numbered source text alongside the compiled paragraph and returns the paragraph with a &lt;code&gt;^[filename:L–L]&lt;/code&gt; marker appended:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Alan Turing proposed the Turing Test in 1950.^[turing-biography.txt:12-24]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The marker encodes the source filename and the exact line range in the raw document that supports the claim. These markers are stored in the page body, recorded in the audit database, validated by lint, and - in Obsidian - rendered as interactive chips in Reading View.&lt;/p&gt;
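&lt;p&gt;To make the marker format concrete, here is a minimal Python sketch that pulls markers out of a paragraph. It is not Synthadoc's own parser: the regular expression is an assumption inferred from the single example above, and &lt;code&gt;parse_markers&lt;/code&gt; and &lt;code&gt;strip_markers&lt;/code&gt; are hypothetical helper names.&lt;/p&gt;

```python
import re

# Assumed marker shape, inferred from the example above: ^[file:start-end].
# A hyphen or an en dash is accepted between the two line numbers.
MARKER = re.compile(r"\^\[([^\]:]+):(\d+)[-\u2013](\d+)\]")


def parse_markers(text):
    """Return (filename, first_line, last_line) for every marker in text."""
    return [(name, int(start), int(end)) for name, start, end in MARKER.findall(text)]


def strip_markers(text):
    """Return the text with all citation markers removed."""
    return MARKER.sub("", text)


paragraph = "Alan Turing proposed the Turing Test in 1950.^[turing-biography.txt:12-24]"
print(parse_markers(paragraph))
print(strip_markers(paragraph))
```

&lt;p&gt;Keeping the marker in the page body as plain text means any tool that can run a regular expression can recover the citation, which is presumably what lets lint and the Obsidian plugin share one format.&lt;/p&gt;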

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9zdj4rhz4pyw8vywnky2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9zdj4rhz4pyw8vywnky2.png" alt=" " width="800" height="946"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click a chip and you get the Source Viewer: the referenced lines, highlighted, with surrounding context. For PDF sources, a pagemap sidecar resolves line numbers to the correct PDF page so the viewer offers an "Open PDF at page N →" button that navigates directly to the passage.&lt;/p&gt;

&lt;p&gt;The annotation pass has a fallback: if the LLM fails, returns unparseable output, or references line numbers that don't exist, the original un-annotated section is used and the failure is recorded as an audit event. Ingest always completes. Results are also cached by section SHA-256, so re-ingesting an unchanged file doesn't incur an extra LLM call just to re-annotate the same paragraphs.&lt;/p&gt;
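&lt;p&gt;The fallback and the cache described above could be combined roughly as follows. This is a sketch under stated assumptions, not the project's code: &lt;code&gt;annotate&lt;/code&gt; stands in for the LLM call, the cache is a plain dict, the audit log is a plain list, and the choice not to cache failed attempts is a guess.&lt;/p&gt;

```python
import hashlib
import re

# Assumed marker shape: ^[file:start-end]
MARKER = re.compile(r"\^\[([^\]:]+):(\d+)-(\d+)\]")


def annotate_section(section, source_lines, annotate, cache, audit_log):
    """Annotate one section, falling back to the original text on any failure.

    annotate is a hypothetical callable wrapping the LLM call.
    cache is a dict keyed by the SHA-256 of the section text.
    """
    key = hashlib.sha256(section.encode("utf-8")).hexdigest()
    if key in cache:
        return cache[key]  # unchanged section: no LLM call

    valid = range(1, len(source_lines) + 1)
    try:
        result = annotate(section, source_lines)
        spans = MARKER.findall(result)
        if not spans:
            raise ValueError("no parseable citation marker")
        for _name, start, end in spans:
            if int(start) not in valid or int(end) not in valid:
                raise ValueError("cited lines do not exist in the source")
    except Exception as exc:
        # Record the failure and keep the un-annotated text (assumption:
        # failures are not cached, so a later ingest can retry).
        audit_log.append(("annotation_failed", key, str(exc)))
        return section

    cache[key] = result
    return result


source = ["line"] * 30
calls = []


def fake_llm(section, lines):
    calls.append(section)
    return section + "^[turing-biography.txt:12-24]"


cache, audit = {}, []
first = annotate_section("Turing proposed the test.", source, fake_llm, cache, audit)
second = annotate_section("Turing proposed the test.", source, fake_llm, cache, audit)
bad = annotate_section("Short section.", source, lambda s, l: s + "^[x.txt:40-50]", cache, audit)
```

&lt;p&gt;The second call is served from the cache without touching the stand-in model, and the third returns the original text because lines 40 to 50 do not exist in a 30-line source.&lt;/p&gt;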

&lt;h3&gt;
  
  
  The full citation record
&lt;/h3&gt;

&lt;p&gt;Every citation is stored in &lt;code&gt;claim_citations&lt;/code&gt; in the audit database - page slug, source file, line range, and a 100-character excerpt of the annotated paragraph. You can query it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc audit citations &lt;span class="nt"&gt;-w&lt;/span&gt; history-of-computing &lt;span class="nt"&gt;--page&lt;/span&gt; alan-turing
synthadoc audit citations &lt;span class="nt"&gt;-w&lt;/span&gt; history-of-computing &lt;span class="nt"&gt;--broken&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or browse the whole wiki via &lt;strong&gt;Synthadoc: View Page Provenance&lt;/strong&gt; in the Obsidian command palette - a sortable, paginated table where every row opens the Source Viewer for that citation's exact line range.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why this matters for teams
&lt;/h3&gt;

&lt;p&gt;Traceability is one of the biggest blockers to enterprise AI adoption. "The model synthesised this from your documents" is not a citation. A recorded &lt;code&gt;^[turing-biography.txt:12-24]&lt;/code&gt; that links back to the primary source and surfaces that passage on click - that's closer. It doesn't eliminate the need for human review, but it makes human review fast enough to actually happen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Layers of Trust
&lt;/h2&gt;

&lt;p&gt;These features are complementary and designed to be used together. Provenance tells you &lt;em&gt;where a claim came from&lt;/em&gt;. Adversarial review tells you &lt;em&gt;whether you should trust it&lt;/em&gt;. Neither is sufficient alone.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmh2v6q01ihk9ngo7p69e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmh2v6q01ihk9ngo7p69e.png" alt=" " width="800" height="947"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A page that has full citation coverage and a second-model lint review is not a guaranteed-accurate page. But it's a page where you know where every claim came from, and where an independent model has registered any concerns it had. That feels like the right foundation for knowledge systems that get used in high-stakes contexts, as opposed to knowledge systems that sit in a folder and get ignored because nobody trusts them.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Few Honest Notes
&lt;/h2&gt;

&lt;p&gt;The annotation pass is itself an LLM call, which means it can make mistakes: misidentifying the supporting line range, over-annotating trivial sentences, or under-annotating genuinely sourced claims. We've found it to be reasonably accurate in practice, and the fallback-to-unannotated behaviour means a bad annotation result doesn't corrupt the page. But it's not ground truth. Treat it as strong evidence, not proof.&lt;/p&gt;

&lt;p&gt;Similarly, the adversarial judge is only as good as the model you point it at. A judge that's too agreeable produces noise. A judge that's too aggressive produces fatigue. The &lt;code&gt;adversarial_max_per_page&lt;/code&gt; cap (configurable, default 2) helps, but choosing the right model for the right domain still takes some experimentation.&lt;/p&gt;




&lt;p&gt;If any of this is interesting or you're thinking through similar problems in your own knowledge tooling, feedback is genuinely welcome. The project is at &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;, and stars are appreciated.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>claude</category>
      <category>agents</category>
    </item>
    <item>
      <title>Synthadoc: From Raw Documents to Domain Intelligence (Youtube)</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Sun, 17 May 2026 13:33:15 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-raw-documents-to-domain-intelligence-p7f</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-raw-documents-to-domain-intelligence-p7f</guid>
      <description>&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/rIGO6zi9XQE"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synthadoc&lt;/strong&gt; is an open-source AI knowledge engine that turns your raw documents - PDFs, research papers, web pages, spreadsheets - into a living, queryable wiki your team can actually trust.&lt;/p&gt;

&lt;p&gt;In this video, we walk through how Synthadoc ingests sources, detects contradictions between them, keeps your knowledge base clean with lint and routing tools, and lets you query it with cited answers grounded in your own domain content. We also cover the Obsidian plugin, context packs for LLM grounding, staging for quality control, and the built-in audit trail for enterprise accountability.&lt;/p&gt;

&lt;p&gt;Whether you're a researcher, a team lead managing a growing knowledge base, or a developer building AI-powered workflows, Synthadoc gives you the infrastructure to go from scattered documents to reliable domain intelligence.&lt;/p&gt;

&lt;p&gt;🔗 GitHub: &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;&lt;br&gt;
  ⭐ Star the project if you find it useful!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>rag</category>
    </item>
    <item>
      <title>Synthadoc: From Raw Documents to Domain Intelligence (Youtube)

https://youtu.be/rIGO6zi9XQE?si=nuhdnvdl9pcZ4NV_</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Sun, 17 May 2026 04:22:14 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-raw-documents-to-domain-intelligence-youtube-2ha2</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-raw-documents-to-domain-intelligence-youtube-2ha2</guid>
      <description>&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
      &lt;div class="c-embed__body flex items-center justify-between"&gt;
        &lt;a href="https://youtu.be/rIGO6zi9XQE?si=nuhdnvdl9pcZ4NV_" rel="noopener noreferrer" class="c-link fw-bold flex items-center"&gt;
          &lt;span class="mr-2"&gt;youtu.be&lt;/span&gt;
          

        &lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>productivity</category>
      <category>automation</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Synthadoc: Routing at Scale, Quality Gates, and the Knowledge Backend Pattern</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Mon, 11 May 2026 14:21:30 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-routing-at-scale-quality-gates-and-the-knowledge-backend-pattern-lil</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-routing-at-scale-quality-gates-and-the-knowledge-backend-pattern-lil</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwnufzdbrz98r7yj7i6c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwnufzdbrz98r7yj7i6c.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;When we shipped v0.1.0, Synthadoc did one thing well: it turned raw sources into a structured wiki that got smarter with every ingest. v0.2.0 made that wiki searchable with hybrid BM25 + vector retrieval. v0.3.0 opened up the source types - YouTube transcripts, web search fan-out, CLI provider integration so your existing Claude Code or Opencode subscription could power the whole thing.&lt;/p&gt;

&lt;p&gt;But a pattern kept surfacing in user feedback. Once a wiki crossed a few hundred pages, three problems appeared in a cluster: queries got slower, low-confidence pages polluted search results, and there was no clean way to pipe the wiki's structured knowledge into an agent prompt without getting back a synthesised answer when you wanted the raw evidence.&lt;/p&gt;

&lt;p&gt;v0.4.0 addresses all three. This post walks through the design decisions behind each feature, the benchmark numbers, and why we think the third one, context packs, points at something larger than a feature.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Scale Problem
&lt;/h2&gt;

&lt;p&gt;BM25 is fast. On a 100-page wiki, a query completes in single-digit milliseconds. The problem is that BM25 is also indiscriminate: it scores every page against every query, regardless of how obviously irrelevant most of those pages are.&lt;/p&gt;

&lt;p&gt;That's fine at 100 pages. At 1,000 pages with diverse topics - a personal research wiki covering ML, distributed systems, organisational theory, and management literature - you're running a full-corpus scan for every query. And because query decomposition splits one question into 3–5 sub-questions, you multiply that cost by 3–5 on every call.&lt;/p&gt;

&lt;p&gt;We benchmarked the unrouted full-corpus baseline against routed search across corpus sizes. The numbers aren't alarming until they are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Corpus size&lt;/th&gt;
&lt;th&gt;Full-corpus P95 latency&lt;/th&gt;
&lt;th&gt;Routed P95 latency (2 of 10 branches)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100 pages&lt;/td&gt;
&lt;td&gt;7 ms&lt;/td&gt;
&lt;td&gt;7 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;500 pages&lt;/td&gt;
&lt;td&gt;38 ms&lt;/td&gt;
&lt;td&gt;12 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 pages&lt;/td&gt;
&lt;td&gt;74 ms&lt;/td&gt;
&lt;td&gt;18 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5,000 pages&lt;/td&gt;
&lt;td&gt;112 ms&lt;/td&gt;
&lt;td&gt;21 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 pages&lt;/td&gt;
&lt;td&gt;191 ms&lt;/td&gt;
&lt;td&gt;24 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6jprgpopq2w23n2qn97i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6jprgpopq2w23n2qn97i.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Routed search stays near-flat as the corpus grows, because the per-branch page count doesn't change even as the total wiki does. Full-corpus search grows with the wiki - not catastrophically, but noticeably, and that growth compounds across decomposed sub-queries.&lt;/p&gt;
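&lt;p&gt;To read the table as ratios rather than raw latencies, this small Python sketch divides the full-corpus P95 by the routed P95 for each row. The numbers are copied from the table; the &lt;code&gt;speedup&lt;/code&gt; helper is only for illustration.&lt;/p&gt;

```python
# P95 latencies in milliseconds, copied from the benchmark table:
# corpus size mapped to (full-corpus, routed).
results = {
    100: (7, 7),
    500: (38, 12),
    1000: (74, 18),
    5000: (112, 21),
    10000: (191, 24),
}


def speedup(pages):
    """Ratio of full-corpus to routed P95 latency, rounded to one decimal."""
    full, routed = results[pages]
    return round(full / routed, 1)


for pages in results:
    print(f"{pages:,} pages: {speedup(pages)}x")
```

&lt;p&gt;At 100 pages the two are identical, and at 10,000 pages routed search comes out roughly eight times faster per retrieval call.&lt;/p&gt;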

&lt;p&gt;The other problem at scale is more subtle: a query about treatment protocols for hypertension shouldn't touch the pages about distributed consensus algorithms. Not just for performance reasons - also because irrelevant pages can drift into synthesis and dilute the answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feature 1: Routing Layer
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why It Matters for a Self-Growing Wiki
&lt;/h3&gt;

&lt;p&gt;The scale problem above gets worse in a specific way that's easy to underestimate: a well-configured Synthadoc wiki doesn't stay at its initial size. With nightly scheduled ingests pulling from web searches, PDFs, YouTube transcripts, and curated source lists, a wiki can double from 200 to 400 pages within a few weeks, then reach 1,000 within a few months without the user doing anything manually. That's exactly the point.&lt;/p&gt;

&lt;p&gt;But without routing, query quality degrades silently as that growth happens. Every new page increases the BM25 corpus, and the search engine has no way to know that "What are the treatment protocols for hypertension?" should not touch the 300 pages about software architecture. More pages means more irrelevant candidates competing to drift into synthesis, more false positives, and more latency - and none of it is visible until queries start returning noticeably diluted answers.&lt;/p&gt;

&lt;p&gt;Routing is what makes autonomous growth sustainable. It's the mechanism that keeps query scope bounded to what's actually relevant, regardless of how large the total wiki becomes. The branch taxonomy is defined once; IngestAgent maintains it automatically from that point forward - every new page created by ingest is auto-placed into the most relevant branch, so ROUTING.md stays accurate as the wiki grows without manual intervention.&lt;/p&gt;

&lt;p&gt;The routing layer introduces a file called &lt;code&gt;ROUTING.md&lt;/code&gt; at the wiki root. Its format is intentionally simple - the same &lt;code&gt;## H2 → [[slug]]&lt;/code&gt; structure already used in &lt;code&gt;index.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## People&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [[alan-turing]]
&lt;span class="p"&gt;-&lt;/span&gt; [[grace-hopper]]
&lt;span class="p"&gt;-&lt;/span&gt; [[ada-lovelace]]

&lt;span class="gu"&gt;## Hardware&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [[von-neumann-architecture]]
&lt;span class="p"&gt;-&lt;/span&gt; [[eniac-computer]]
&lt;span class="p"&gt;-&lt;/span&gt; [[transistor-and-microchip]]

&lt;span class="gu"&gt;## Networks&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [[internet-origins]]
&lt;span class="p"&gt;-&lt;/span&gt; [[arpanet-history]]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;User-owned. Scaffold creates it once from the current index structure and never rewrites it. The user defines the branch taxonomy; the system maintains it.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Routing Works at Query Time
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3y3wmg9ownompi9nd0se.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3y3wmg9ownompi9nd0se.png" alt=" " width="800" height="1246"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At query time, QueryAgent reads &lt;code&gt;ROUTING.md&lt;/code&gt; (cached per session, invalidated on write), passes the branch headings and the user's query to the LLM, and receives back a short JSON array of the 1–2 most relevant branch names. BM25 then runs only over the slugs listed under those branches.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;ROUTING.md&lt;/code&gt; is absent, or if no branch scores above threshold, the system falls back to full-corpus search transparently - no error, no degraded output.&lt;/p&gt;
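&lt;p&gt;The query-time flow above can be sketched in a few lines of Python. This is an illustration, not QueryAgent's actual code: &lt;code&gt;pick_branches&lt;/code&gt; is a hypothetical stand-in for the LLM call, and the fallback rule here ("nothing usable came back") is a simplification of the threshold check described above.&lt;/p&gt;

```python
import json
import re

LINK = re.compile(r"\[\[([^\]]+)\]\]")


def parse_routing(text):
    """Map each '## Branch' heading to the slugs listed beneath it."""
    branches, current = {}, None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            branches[current] = []
        elif current is not None:
            branches[current].extend(LINK.findall(line))
    return branches


def candidate_slugs(query, routing_text, all_slugs, pick_branches):
    """Return the slugs BM25 should score for this query.

    pick_branches is a hypothetical callable wrapping the LLM call; it takes
    the query and the branch names and returns a JSON array as a string.
    """
    if not routing_text:
        return list(all_slugs)  # ROUTING.md absent: full-corpus fallback
    branches = parse_routing(routing_text)
    try:
        chosen = json.loads(pick_branches(query, list(branches)))
    except (ValueError, TypeError):
        chosen = []
    if not isinstance(chosen, list):
        chosen = []
    scoped = [
        slug
        for name in chosen
        if isinstance(name, str) and name in branches
        for slug in branches[name]
    ]
    return scoped or list(all_slugs)  # nothing usable: full-corpus fallback


routing = "## People\n- [[alan-turing]]\n- [[grace-hopper]]\n\n## Hardware\n- [[eniac-computer]]\n"
everything = ["alan-turing", "grace-hopper", "eniac-computer"]

routed = candidate_slugs("who was turing", routing, everything, lambda q, names: '["People"]')
fallback = candidate_slugs("who was turing", routing, everything, lambda q, names: "not json")
```

&lt;p&gt;Because every failure path returns the full slug list, a malformed model reply costs latency but never an empty result set.&lt;/p&gt;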

&lt;p&gt;IngestAgent also uses routing: when a new page is created, it's slotted into the most relevant branch automatically. &lt;code&gt;ROUTING.md&lt;/code&gt; stays consistent without manual maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Aliases
&lt;/h3&gt;

&lt;p&gt;The other half of the routing feature is alias resolution. Anyone who maintains a personal knowledge base has personal terminology that diverges from canonical names. You might always call a concept by a shorthand, an acronym, or a translation - and BM25 will miss the connection because the strings don't match.&lt;/p&gt;

&lt;p&gt;Aliases live in each page's YAML frontmatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Alan Turing&lt;/span&gt;
&lt;span class="na"&gt;aliases&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;turing&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;the turing paper&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;incomputability guy&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At query time, before BM25 runs, QueryAgent expands any alias matches in the query to their canonical slug. The user's internal vocabulary resolves to the wiki's vocabulary, so nobody has to re-learn what the LLM decided to call things.&lt;/p&gt;
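&lt;p&gt;A minimal Python sketch of alias expansion, assuming aliases are matched as whole words, case-insensitively, with the longest alias winning. Those matching rules and the helper names are assumptions; the post does not spell out how QueryAgent actually matches.&lt;/p&gt;

```python
import re


def build_alias_index(pages):
    """Map each lowercased alias to its page slug.

    pages maps a slug to the alias list from that page's frontmatter.
    """
    index = {}
    for slug, aliases in pages.items():
        for alias in aliases:
            index[alias.lower()] = slug
    return index


def expand_aliases(query, index):
    """Replace alias phrases in the query with their canonical slug."""
    if not index:
        return query
    # Longest aliases first, so 'the turing paper' wins over 'turing'.
    ordered = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(alias) for alias in ordered) + r")\b",
        re.IGNORECASE,
    )
    # One pass over the query, so replaced text is never matched again.
    return pattern.sub(lambda match: index[match.group(0).lower()], query)


index = build_alias_index(
    {"alan-turing": ["turing", "the turing paper", "incomputability guy"]}
)
print(expand_aliases("what did the turing paper prove", index))
```

&lt;p&gt;Doing the substitution in a single pass matters: replacing aliases one at a time would let the short alias match inside a slug that an earlier replacement had just inserted.&lt;/p&gt;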

&lt;p&gt;ScaffoldAgent suggests initial aliases when generating a page. Users refine them in Obsidian's Properties panel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Protected Scaffold Zone
&lt;/h3&gt;

&lt;p&gt;One problem that appeared as wikis grew: users would hand-edit &lt;code&gt;index.md&lt;/code&gt; to add an introduction, a personal note, a custom link - and the next scaffold run would erase it. Scaffold owned the whole file.&lt;/p&gt;

&lt;p&gt;v0.4.0 introduces a marker line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# My Research Wiki&lt;/span&gt;

My custom introduction and notes here.
Scaffold never touches anything above this line.

&lt;span class="c"&gt;&amp;lt;!-- synthadoc:scaffold --&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;## People&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [[alan-turing]] — Theoretical foundations of computation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Scaffold only regenerates content below &lt;code&gt;&amp;lt;!-- synthadoc:scaffold --&amp;gt;&lt;/code&gt;. Everything above is preserved verbatim across every scaffold run. The marker is inserted automatically on the first scaffold run if absent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Routing CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc routing init           &lt;span class="c"&gt;# generate ROUTING.md from current index (one-time)&lt;/span&gt;
synthadoc routing validate       &lt;span class="c"&gt;# report dangling slugs — dry run, no changes&lt;/span&gt;
synthadoc routing clean          &lt;span class="c"&gt;# auto-remove dangling entries&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;routing validate&lt;/code&gt; is worth running after bulk ingests or manual page deletions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Dangling slugs in ROUTING.md (3):
  [Hardware]  [[eniac-computer]]
  [People]    [[konrad-zuse]]
  [Networks]  [[arpanet-history]]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scheduling for a Self-Growing Wiki
&lt;/h3&gt;

&lt;p&gt;There are two distinct scheduling patterns worth setting up: nightly growth and weekly housekeeping.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nightly growth&lt;/strong&gt; — ingest pulls in new sources automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc schedule add &lt;span class="nt"&gt;--op&lt;/span&gt; &lt;span class="s2"&gt;"ingest --batch raw_sources/"&lt;/span&gt; &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 2 * * *"&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As each ingest job completes, IngestAgent writes the new page to &lt;code&gt;wiki/&lt;/code&gt; and appends its slug to ROUTING.md under the most relevant branch. No manual step needed — the routing index grows alongside the wiki.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weekly housekeeping&lt;/strong&gt; - three operations, run in sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc schedule add &lt;span class="nt"&gt;--op&lt;/span&gt; &lt;span class="s2"&gt;"lint run"&lt;/span&gt;      &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 3 * * 0"&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki   &lt;span class="c"&gt;# Sunday 3 AM&lt;/span&gt;
synthadoc schedule add &lt;span class="nt"&gt;--op&lt;/span&gt; &lt;span class="s2"&gt;"scaffold"&lt;/span&gt;      &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 4 * * 0"&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki   &lt;span class="c"&gt;# Sunday 4 AM&lt;/span&gt;
synthadoc schedule add &lt;span class="nt"&gt;--op&lt;/span&gt; &lt;span class="s2"&gt;"routing clean"&lt;/span&gt; &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 5 * * 0"&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki   &lt;span class="c"&gt;# Sunday 5 AM&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The order matters. Lint runs first and removes dead wikilinks left behind by deleted pages. Scaffold runs next and regenerates &lt;code&gt;index.md&lt;/code&gt; to reflect the current page set - new categories get added, empty ones get removed. Routing clean runs last and prunes any dangling slug entries from ROUTING.md that no longer have a corresponding wiki page. After all three, the index and routing table are consistent with the actual state of the wiki.&lt;/p&gt;

&lt;p&gt;One thing to be clear about: &lt;code&gt;routing init&lt;/code&gt; is a one-time setup command, not something to schedule. Running it again would overwrite ROUTING.md and erase any branch customisations you've made since the initial setup. &lt;code&gt;routing clean&lt;/code&gt; is the recurring maintenance command - it only removes entries for missing pages and never touches branch structure.&lt;/p&gt;

&lt;p&gt;If you prefer to declare the schedule in config rather than via the CLI, add a &lt;code&gt;[schedule]&lt;/code&gt; block to &lt;code&gt;.synthadoc/config.toml&lt;/code&gt; and register everything in one step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[schedule]&lt;/span&gt;
&lt;span class="py"&gt;jobs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;op&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ingest --batch raw_sources/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;cron&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0 2 * * *"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;op&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"lint run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="py"&gt;cron&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0 3 * * 0"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;op&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"scaffold"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="py"&gt;cron&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0 4 * * 0"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;op&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"routing clean"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;               &lt;span class="py"&gt;cron&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0 5 * * 0"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc schedule apply &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With nightly ingests and a weekly maintenance trio in place, the wiki grows, stays accurate, and self-corrects - without requiring manual intervention.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feature 2: Candidates Staging
&lt;/h2&gt;

&lt;p&gt;The second feature addresses a different scale problem: quality at the write path.&lt;/p&gt;

&lt;p&gt;Before v0.4.0, IngestAgent wrote new pages directly to &lt;code&gt;wiki/&lt;/code&gt; with no review step. A high-confidence page about a well-structured source landed right next to a speculative page inferred from a thin web article. Both entered BM25, both appeared in orphan detection and contradiction checks, and both showed up in synthesis - with no signal to distinguish them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Staging Concept
&lt;/h3&gt;

&lt;p&gt;Candidates staging introduces a fork in the write path. Pages go to &lt;code&gt;wiki/candidates/&lt;/code&gt; instead of &lt;code&gt;wiki/&lt;/code&gt; when they don't meet a configurable confidence threshold. They're excluded from BM25, orphan detection, and contradiction checks until explicitly promoted.&lt;/p&gt;

&lt;p&gt;The policy is configured in &lt;code&gt;.synthadoc/config.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[ingest]&lt;/span&gt;
&lt;span class="py"&gt;staging_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"threshold"&lt;/span&gt;      &lt;span class="c"&gt;# "off" | "all" | "threshold"&lt;/span&gt;
&lt;span class="py"&gt;staging_confidence_min&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"high"&lt;/span&gt;   &lt;span class="c"&gt;# "high" | "medium" | "low"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three policies:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Behaviour&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;"off"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All new pages go directly to &lt;code&gt;wiki/&lt;/code&gt; - the pre-v0.4.0 behaviour, and the default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;"threshold"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pages meeting &lt;code&gt;staging_confidence_min&lt;/code&gt; auto-promote; lower confidence → &lt;code&gt;wiki/candidates/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;"all"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Every new page requires explicit promotion, regardless of confidence&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The config is hot-reloaded - a policy change takes effect on the next ingest job with no server restart.&lt;/p&gt;
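&lt;p&gt;The policy table can be read as a small routing function. The sketch below is an illustration under stated assumptions, not Synthadoc's actual code: the &lt;code&gt;Confidence&lt;/code&gt; enum and the &lt;code&gt;destination()&lt;/code&gt; helper are hypothetical names, and the ordering low, medium, high is inferred from the config comment above.&lt;/p&gt;

```python
from enum import IntEnum


class Confidence(IntEnum):
    # Assumption: confidence levels are ordered low, then medium, then high.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def destination(policy: str, confidence: str, confidence_min: str = "high") -> str:
    """Return the directory a newly ingested page would be written to."""
    if policy == "off":
        return "wiki/"
    if policy == "all":
        return "wiki/candidates/"
    if policy == "threshold":
        # Pages at or above the minimum auto-promote; the rest are staged.
        meets = Confidence[confidence.upper()] >= Confidence[confidence_min.upper()]
        return "wiki/" if meets else "wiki/candidates/"
    raise ValueError(f"unknown staging_policy: {policy!r}")
```

&lt;p&gt;With &lt;code&gt;staging_confidence_min = "high"&lt;/code&gt;, a medium-confidence page is staged; lowering the minimum to &lt;code&gt;"medium"&lt;/code&gt; lets the same page through.&lt;/p&gt;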

&lt;h3&gt;
  
  
  The Staging Workflow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5z61xmxn00k3lfhu3si.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5z61xmxn00k3lfhu3si.png" alt=" " width="800" height="1477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Reviewing Candidates
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc candidates list

Candidates &lt;span class="o"&gt;(&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;:
  eniac-computer    confidence: low     ingested: 2026-05-05
  konrad-zuse       confidence: medium  ingested: 2026-05-05
  arpanet-history   confidence: high    ingested: 2026-05-04

synthadoc candidates promote arpanet-history
synthadoc candidates discard eniac-computer
synthadoc candidates promote &lt;span class="nt"&gt;--all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Promotion does four things atomically: moves the file from &lt;code&gt;wiki/candidates/&lt;/code&gt; to &lt;code&gt;wiki/&lt;/code&gt;, appends the slug to &lt;code&gt;index.md&lt;/code&gt; under the best-matching category, appends it to &lt;code&gt;ROUTING.md&lt;/code&gt; under the best-matching branch, and records the promotion in the audit trail. Discard deletes the file and records the reason.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scheduling and Staging
&lt;/h3&gt;

&lt;p&gt;Unlike routing and lint, candidates staging doesn't have a useful scheduled action. The two bulk commands - &lt;code&gt;candidates promote --all&lt;/code&gt; and &lt;code&gt;candidates discard --all&lt;/code&gt; - are too blunt to schedule safely. Scheduling &lt;code&gt;promote --all&lt;/code&gt; on a timer would auto-approve exactly the pages the quality gate held back. Scheduling &lt;code&gt;discard --all&lt;/code&gt; would silently delete pages you haven't reviewed yet. Neither command has a &lt;code&gt;--older-than&lt;/code&gt; or &lt;code&gt;--confidence&lt;/code&gt; filter that would make scheduled execution sensible.&lt;/p&gt;

&lt;p&gt;The right integration is manual: run &lt;code&gt;candidates list&lt;/code&gt; after a large batch ingest, or once a week if the wiki is growing quickly. It takes seconds. The review step is the intentional human checkpoint in an otherwise automated pipeline: it's where you decide what enters the wiki's searchable knowledge base.&lt;/p&gt;

&lt;p&gt;If candidates are accumulating faster than you can review them, the right response is to adjust the policy rather than schedule a cleanup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Widen the auto-promote threshold - fewer pages need review&lt;/span&gt;
synthadoc staging policy threshold &lt;span class="nt"&gt;--min-confidence&lt;/span&gt; medium

&lt;span class="c"&gt;# Or turn staging off entirely - if you trust all your sources&lt;/span&gt;
synthadoc staging policy off
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The staging policy is the dial; the CLI review commands are for the remainder that needs a human decision.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;The practical effect of &lt;code&gt;staging_policy = "threshold"&lt;/code&gt; on a high-volume wiki is significant. After ingesting 30 web articles from varied sources, typical results look like: 22 high-confidence pages auto-promote and enter the main wiki immediately; 8 lower-confidence pages wait in candidates. Those 8 include speculative pages, ambiguous slug assignments, and pages where the source was thin enough that the LLM wasn't confident what it was describing.&lt;/p&gt;

&lt;p&gt;In the full-corpus approach, those 8 pages would be silently diluting every query that touched their topic. In the staged approach, they're visible and actionable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feature 3: Context Packs
&lt;/h2&gt;

&lt;p&gt;The third feature came from a different direction. Users weren't asking "how do I get a better answer?" They were asking "how do I get the raw evidence so I can do something with it myself?"&lt;/p&gt;

&lt;p&gt;QueryAgent synthesises. That's its job. But synthesis isn't always what you want. Sometimes you want the actual page excerpts - cited, bounded, ranked by relevance - to paste into an agent prompt, to attach to a report, to review before a meeting, or to feed into an automated pipeline.&lt;/p&gt;

&lt;p&gt;Context packs are the answer.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Context Packs Work
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpm6nku6q0jxx8nxg02zk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpm6nku6q0jxx8nxg02zk.png" alt=" " width="800" height="1268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ContextAgent&lt;/code&gt; reuses &lt;code&gt;QueryAgent.decompose()&lt;/code&gt; and &lt;code&gt;HybridSearch&lt;/code&gt;. It doesn't synthesise - it packs. Each page excerpt is included verbatim (up to its per-page limit), with attribution: slug, relevance score, confidence level, tags, source path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Output Format
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Context Pack: history of early computing pioneers&lt;/span&gt;
Generated: 2026-05-09T14:22:01
Token budget: 4000 | Used: 3847 | Omitted: 2 pages (budget exceeded)
&lt;span class="p"&gt;
---
&lt;/span&gt;
&lt;span class="gu"&gt;## [[alan-turing]] - relevance: 0.92&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Alan Turing developed the theoretical foundations of computation with his 1936&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; paper "On Computable Numbers." The paper introduced the abstract Turing machine&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; and proved the existence of undecidable problems...&lt;/span&gt;
Source: &lt;span class="sb"&gt;`wiki/alan-turing.md`&lt;/span&gt; | Confidence: high | Tags: people, mathematics

&lt;span class="gu"&gt;## [[grace-hopper]] - relevance: 0.87&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Grace Hopper pioneered compiler development and made programming accessible to&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; humans. She developed the first compiler (A-0) in 1952 and later led the team&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; that created COBOL...&lt;/span&gt;
Source: &lt;span class="sb"&gt;`wiki/grace-hopper.md`&lt;/span&gt; | Confidence: high | Tags: people, software
&lt;span class="p"&gt;
---
&lt;/span&gt;
&lt;span class="gu"&gt;## Omitted — token budget exceeded&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [[von-neumann-architecture]] — ~820 tokens
&lt;span class="p"&gt;-&lt;/span&gt; [[eniac-computer]] — ~650 tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CLI Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc context build &lt;span class="s2"&gt;"microservices patterns"&lt;/span&gt;
synthadoc context build &lt;span class="s2"&gt;"microservices patterns"&lt;/span&gt; &lt;span class="nt"&gt;--tokens&lt;/span&gt; 8000
synthadoc context build &lt;span class="s2"&gt;"microservices patterns"&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; context.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Token Budget Control
&lt;/h3&gt;

&lt;p&gt;This is where context packs differ from simply querying the wiki. An external agent doesn't want an unbounded blob of text - it has its own context window to manage, its own prompt already taking up tokens, and its own cost constraints. The &lt;code&gt;token_budget&lt;/code&gt; parameter gives the caller precise control over how much of the wiki's knowledge gets included.&lt;/p&gt;

&lt;p&gt;Synthadoc packs pages greedily by relevance score until the budget is exhausted, then lists everything that didn't fit with estimated token counts. The calling agent knows exactly what it got and what it didn't - and can decide whether to request a larger budget, run a second more focused query, or proceed with what's there. There are no surprises about how much context the call will consume.&lt;/p&gt;

&lt;p&gt;This predictability is what makes context packs suitable for production agentic pipelines. An agent orchestrator can reserve a fixed token slice for domain knowledge, call &lt;code&gt;context/build&lt;/code&gt; with that exact budget, and get back a response that fits - every time, regardless of how large the underlying wiki has grown.&lt;/p&gt;
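&lt;p&gt;The greedy packing described above can be sketched in a few lines. This is a minimal illustration, not the &lt;code&gt;ContextAgent&lt;/code&gt; implementation: &lt;code&gt;Page&lt;/code&gt; and &lt;code&gt;pack()&lt;/code&gt; are hypothetical names, token counts are taken as given estimates, and the sketch assumes the packer keeps scanning after a page fails to fit (the real packer may instead stop at the first miss).&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Page:
    slug: str
    relevance: float
    tokens: int  # estimated token count of the excerpt


def pack(pages: list[Page], token_budget: int):
    """Greedily pack pages by descending relevance within token_budget."""
    included, omitted, used = [], [], 0
    for page in sorted(pages, key=lambda p: p.relevance, reverse=True):
        if token_budget - used >= page.tokens:
            included.append(page)
            used += page.tokens
        else:
            # Assumption: a page that does not fit is reported, never truncated.
            omitted.append(page)
    return included, omitted, used
```

&lt;p&gt;The caller gets three things back: what was included, what was left out along with its estimated size, and how much of the budget was spent.&lt;/p&gt;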

&lt;h3&gt;
  
  
  The REST API - Synthadoc as a Knowledge Backend
&lt;/h3&gt;

&lt;p&gt;Context packs expose a &lt;code&gt;POST /context/build&lt;/code&gt; endpoint that returns the same data as structured JSON. This is where the pattern becomes interesting.&lt;/p&gt;

&lt;p&gt;An agent that needs grounding context before reasoning can call Synthadoc's REST API directly, get back a bounded, cited, ranked set of page excerpts, and inject them into its own prompt - without going through a synthesis step. Synthadoc becomes the knowledge layer; the agent provides the reasoning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/context/build&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"goal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"microservices patterns for high-throughput event processing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"token_budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;OK&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"goal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"token_budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tokens_used"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3847&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"slug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"event-driven-architecture"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"relevance"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"excerpt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Event-driven architecture decouples producers from consumers..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wiki/event-driven-architecture.md"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"architecture"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"distributed-systems"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"omitted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"slug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kafka-internals"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"estimated_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;980&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
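&lt;p&gt;A caller might wrap that exchange as follows. This is a sketch using only the Python standard library; the base URL and port are placeholders (an assumption, since the server address is not stated here), and the helper names are hypothetical. Only the &lt;code&gt;/context/build&lt;/code&gt; path and the JSON field names come from the example above.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: placeholder address for wherever `synthadoc serve` is listening.
BASE_URL = "http://localhost:8000"


def build_request(goal: str, token_budget: int, base_url: str = BASE_URL):
    """Build the POST /context/build request without sending it."""
    body = json.dumps({"goal": goal, "token_budget": token_budget}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/context/build",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def to_prompt_block(pack: dict) -> str:
    """Render a context pack response as cited text for an agent prompt."""
    lines = []
    for page in pack["pages"]:
        lines.append(
            f"[{page['slug']}] (relevance {page['relevance']}, "
            f"confidence {page['confidence']})"
        )
        lines.append(page["excerpt"])
        lines.append(f"Source: {page['source']}")
        lines.append("")
    for item in pack.get("omitted", []):
        lines.append(f"Omitted: {item['slug']} (~{item['estimated_tokens']} tokens)")
    return "\n".join(lines)


def fetch_context(goal: str, token_budget: int) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(goal, token_budget)) as response:
        return json.load(response)
```

&lt;p&gt;An orchestrator would call &lt;code&gt;fetch_context()&lt;/code&gt; with its reserved token slice and paste the output of &lt;code&gt;to_prompt_block()&lt;/code&gt; ahead of its own instructions.&lt;/p&gt;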



&lt;p&gt;The existing MCP server exposes this as a native tool call. An agent running in any MCP-compatible host can call &lt;code&gt;context/build&lt;/code&gt; before it reasons, get structured evidence back, and proceed with grounding it wouldn't otherwise have.&lt;/p&gt;

&lt;p&gt;This is what we mean by the "knowledge backend pattern": Synthadoc manages accumulation, deduplication, contradiction detection, and retrieval. The calling agent manages reasoning and action. The division of labour is clean, the token envelope is caller-controlled, and the knowledge layer is persistent across agent sessions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Other v0.4.0 Changes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Plugin Install CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc plugin &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In earlier versions, installing the Obsidian plugin required locating the plugins directory manually, copying files, and restarting Obsidian. v0.4.0 adds a single CLI command that installs the plugin directly into the active Obsidian vault. That's the entire CLI step - the rest is done in Obsidian.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Research Demo: Contradiction Detection End-to-End
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;demos/ai-research/&lt;/code&gt; demo now includes a PDF source (&lt;code&gt;llm-benchmarks-q1-2026.pdf&lt;/code&gt;) that explicitly disputes a claim in the existing wiki. Running the demo shows the full contradiction detection and flagging lifecycle: source ingested, existing page status updated to &lt;code&gt;contradicted&lt;/code&gt;, audit event recorded. The demo now covers all five IngestAgent decision paths - create, update, skip, flag, and contradiction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision Cache Prompt-Awareness
&lt;/h3&gt;

&lt;p&gt;A quiet but consequential fix: the LLM decision cache key previously included only the content hash and existing slugs. This meant that changes to &lt;code&gt;purpose.md&lt;/code&gt; - the file that scopes what belongs in the wiki - were invisible to the cache. An ingest run after a purpose change would serve stale decisions for any source whose content hadn't changed. v0.4.0 includes the decision prompt itself in the cache key, so any change to &lt;code&gt;purpose.md&lt;/code&gt; automatically busts the cache for all affected sources.&lt;/p&gt;
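&lt;p&gt;The idea is easy to see in code. The sketch below is an assumption about how such a key could be built, not Synthadoc's implementation: &lt;code&gt;decision_cache_key()&lt;/code&gt; is a hypothetical helper, SHA-256 is an arbitrary choice, and sorting the slugs so that their order does not matter is a design choice of this sketch.&lt;/p&gt;

```python
import hashlib


def decision_cache_key(content: str, existing_slugs: list[str], decision_prompt: str) -> str:
    """Hash everything the decision depends on, including the prompt."""
    digest = hashlib.sha256()
    parts = [content, "\n".join(sorted(existing_slugs)), decision_prompt]
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so adjacent fields cannot blur together.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

&lt;p&gt;Because the prompt is built from &lt;code&gt;purpose.md&lt;/code&gt;, editing that file changes the third input and therefore every key derived from it, while unchanged sources under an unchanged prompt still hit the cache.&lt;/p&gt;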




&lt;h2&gt;
  
  
  v0.1 to v0.4: What Changed
&lt;/h2&gt;

&lt;p&gt;It's worth stepping back to see what the four versions have built:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Core addition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;v0.1.0&lt;/td&gt;
&lt;td&gt;Ingest-time synthesis: sources become a structured wiki, not raw chunks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.2.0&lt;/td&gt;
&lt;td&gt;Hybrid BM25 + vector search; query decomposition; knowledge gap detection; full audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.3.0&lt;/td&gt;
&lt;td&gt;YouTube and web search ingestion; CLI provider integration (Claude Code &amp;amp; Opencode support); CJK support; contradiction detection improvements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.4.0&lt;/td&gt;
&lt;td&gt;Routing at scale; candidates staging for quality control; context packs as a knowledge backend API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The first two versions established the ingest-then-query model and made it trustworthy - every action audited, every contradiction surfaced. The third version expanded what you could ingest and who could afford to run it. The fourth version addresses what happens when the wiki grows to a size where the original flat model starts to show cracks.&lt;/p&gt;

&lt;p&gt;The routing benchmarks are honest about where we are: 191ms for a full-corpus query across 10,000 pages is fine for interactive use, but it compounds across sub-questions and becomes a real cost in high-throughput agentic pipelines. The routed 24ms figure at the same corpus size is where we want the system to be. The benchmark-gated release process we introduced in v0.4.0 is the mechanism that ensures it stays there as the codebase evolves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try Synthadoc v0.4.0
&lt;/h2&gt;

&lt;p&gt;Synthadoc v0.4.0 is available now on GitHub under the AGPL-3.0 licence. The quickest path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/axoviq-ai/synthadoc.git
&lt;span class="nb"&gt;cd &lt;/span&gt;synthadoc
pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; ~/wikis &lt;span class="nt"&gt;--demo&lt;/span&gt;
synthadoc plugin &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing
synthadoc use history-of-computing
synthadoc serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open Obsidian, open &lt;code&gt;~/wikis/history-of-computing&lt;/code&gt; as a vault, install the Dataview community plugin, and enable the Synthadoc plugin. The demo wiki runs against Gemini 2.0 Flash, which is free-tier eligible, so a full ingest-query-lint cycle costs nothing to run.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quick-start guide:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design document:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Release notes:&lt;/strong&gt; &lt;a href="https://github.com/axoviq-ai/synthadoc/releases/tag/v0.4.0" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/releases/tag/v0.4.0&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feedback is welcome. The routing taxonomy and context pack output format are both early - if your use case pushes against their current shape, we want to know.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>rag</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Synthadoc: Your Coding Tool Is Now Your Wiki Brain</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Tue, 05 May 2026 16:02:35 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-your-coding-tool-is-now-your-wiki-brain-41ib</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-your-coding-tool-is-now-your-wiki-brain-41ib</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk58cng7aldg2hl5s9c8p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk58cng7aldg2hl5s9c8p.png" alt=" " width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you use Claude Code or Opencode, you are already paying for an LLM subscription.&lt;/p&gt;

&lt;p&gt;Before v0.3.0, running Synthadoc also required a separate API key - Anthropic, OpenAI, Gemini, or one of the others.&lt;/p&gt;

&lt;p&gt;v0.3.0 removes that requirement. Set &lt;code&gt;provider = "claude-code"&lt;/code&gt; in one config file and your coding tool subscription becomes the brain of your personal wiki. No additional API key. No additional cost.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters Beyond Convenience
&lt;/h2&gt;

&lt;p&gt;The obvious win is billing consolidation. But there is a more interesting point underneath it.&lt;/p&gt;

&lt;p&gt;Coding tools like Claude Code and Opencode started as pair programmers. They answer questions about your codebase. They write functions, fix bugs, explain unfamiliar code. That is what the marketing says.&lt;/p&gt;

&lt;p&gt;What they actually are is a general-purpose LLM with a subscription model - capable of anything the underlying model can do, not just code. The coding framing is a UI convention, not a capability limit.&lt;/p&gt;

&lt;p&gt;Synthadoc v0.3.0 makes that explicit. The same Claude subscription you use to navigate a TypeScript monorepo can now synthesize your research documents, detect contradictions in your notes, and answer structured questions about your domain knowledge. The subscription becomes infrastructure, not a point tool.&lt;/p&gt;

&lt;p&gt;This is the same pattern that made cloud storage interesting once it stopped being "backup for photos" and became "filesystem for any application." The value is in the generality.&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Synthadoc's LLM providers are abstracted behind a single &lt;code&gt;LLMProvider&lt;/code&gt; interface. Every agent - IngestAgent, QueryAgent, LintAgent - calls &lt;code&gt;provider.complete()&lt;/code&gt; and receives a structured response. The provider implementation handles everything underneath: API calls, authentication, response parsing, error handling, quota detection.&lt;/p&gt;

&lt;p&gt;For cloud APIs (Anthropic, OpenAI, Gemini), the provider sends an HTTP request. For Claude Code and Opencode, a new &lt;code&gt;CodingToolCLIProvider&lt;/code&gt; class takes a different route: it launches the CLI tool as a subprocess, passes the prompt via stdin, reads the response from stdout, and parses the output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# .synthadoc/config.toml&lt;/span&gt;
&lt;span class="nn"&gt;[agents]&lt;/span&gt;
&lt;span class="py"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"claude-code"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="py"&gt;lint&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"claude-code"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the entire configuration change. No API keys to set. No environment variables.&lt;/p&gt;

&lt;p&gt;The server also accepts a runtime override:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc serve &lt;span class="nt"&gt;-w&lt;/span&gt; my-wiki &lt;span class="nt"&gt;--provider&lt;/span&gt; claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This overrides &lt;code&gt;config.toml&lt;/code&gt; for the lifetime of the server process - useful for quickly switching providers without editing files.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrow6l186yn4q604v2r1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrow6l186yn4q604v2r1.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1 - All agents share one provider interface. CLI providers slot in without changing anything else in the stack.&lt;/em&gt;&lt;/p&gt;
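&lt;p&gt;The subprocess route can be sketched with the standard library alone. This is an illustration, not the real &lt;code&gt;CodingToolCLIProvider&lt;/code&gt;: the class name, the timeout, and the error handling are assumptions, and a tiny Python one-liner stands in for the coding tool so the example runs without Claude Code or Opencode installed.&lt;/p&gt;

```python
import subprocess
import sys


class CLIProviderSketch:
    """Hypothetical stand-in for a subprocess-backed provider."""

    def __init__(self, command: list[str], timeout: float = 120.0):
        self.command = command
        self.timeout = timeout

    def complete(self, prompt: str) -> str:
        result = subprocess.run(
            self.command,
            input=prompt,         # prompt goes in on stdin
            capture_output=True,  # response comes back on stdout
            text=True,
            timeout=self.timeout,
        )
        if result.returncode != 0:
            raise RuntimeError(
                f"CLI exited with {result.returncode}: {result.stderr.strip()}"
            )
        return result.stdout.strip()


# A fake "CLI" that upper-cases stdin; a real provider would launch the coding tool.
fake_cli = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
provider = CLIProviderSketch(fake_cli)
```

&lt;p&gt;Because agents only ever see &lt;code&gt;complete()&lt;/code&gt;, swapping the command list is the only change needed to point the same sketch at a different tool.&lt;/p&gt;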




&lt;h2&gt;
  
  
  What Works, What Does Not
&lt;/h2&gt;

&lt;p&gt;The CLI provider route is not a perfect substitute for a direct API connection. Two limitations matter:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No vector embeddings.&lt;/strong&gt; Synthadoc's optional semantic re-ranking (hybrid BM25 + vector search) requires an &lt;code&gt;embed()&lt;/code&gt; call, which the CLI tools do not expose. When using a CLI provider, search falls back to BM25-only. For most wikis up to a few hundred pages, BM25 is fast and accurate - this only matters if you have enabled vector search and are running a large corpus.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quota is shared.&lt;/strong&gt; Your CLI provider subscription has a daily or session quota. Heavy ingest operations - batch-ingesting 50 documents, for example - consume from the same budget as your coding work. The engine detects quota exhaustion, permanently fails the job with a clear message, and does not retry. You resume after the quota resets.&lt;/p&gt;

&lt;p&gt;Both are manageable trade-offs for a personal wiki. If you need vector search or are running a high-volume ingest, a direct API key is the better choice — pick from any of the eight providers based on your quality, vision, and budget requirements. If you want zero additional configuration and are comfortable with BM25 search, the CLI provider path is clean.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Provider Landscape in v0.3.0
&lt;/h2&gt;

&lt;p&gt;One of the quieter design decisions in Synthadoc is that the LLM provider is not baked into the product. You pick the one that fits your constraints:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;API Key Required&lt;/th&gt;
&lt;th&gt;Free Tier&lt;/th&gt;
&lt;th&gt;Vision&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Flash&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;1M tokens/day, no credit card&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Default; best free option&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Groq&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Rate-limited&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Fast, good for text-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Fully local&lt;/td&gt;
&lt;td&gt;Model-dependent&lt;/td&gt;
&lt;td&gt;Runs on your machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (cheap)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Very low cost per token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MiniMax&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Included with subscription&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;New in v0.3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Opencode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Included with subscription&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;New in v0.3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0db5vay76t97p8x28mpf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0db5vay76t97p8x28mpf.png" alt=" " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 2 - Per-agent model routing from a single subscription. Ingest and query use Opus for synthesis quality; lint uses Haiku for speed and lower quota consumption. Two config lines; one bill.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;DeepSeek is also new in v0.3.0. It routes through the OpenAI-compatible endpoint and costs very little per token for text-heavy ingest workloads, and the R1 reasoning model's &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; tags are stripped automatically before the response reaches Synthadoc.&lt;/p&gt;

&lt;p&gt;The point of this table is not that any one provider is the best. It is that you should not need to change your workflow to use the tool. If you are already set up with Anthropic for Claude Code, switching your wiki to the same provider takes one config line.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quota Exhaustion and Reliability
&lt;/h2&gt;

&lt;p&gt;One detail the implementation gets right: quota exhaustion is a permanent failure, not a retryable one.&lt;/p&gt;

&lt;p&gt;Most LLM API errors are transient - a timeout, a 5xx from an overloaded endpoint. Synthadoc retries those with exponential backoff. Quota exhaustion is different. Retrying a quota-exhausted call wastes the next request, then the one after that, burning whatever small remaining budget exists.&lt;/p&gt;

&lt;p&gt;When a &lt;code&gt;CodingToolCLIProvider&lt;/code&gt; detects that the subprocess output indicates quota exhaustion, it raises &lt;code&gt;CodingToolQuotaExhaustedException&lt;/code&gt;. The orchestrator catches this and calls &lt;code&gt;fail_permanent()&lt;/code&gt; on the job - no retries, no backoff. The job sits in &lt;code&gt;failed&lt;/code&gt; state with a clear message: "Claude Code quota exhausted. Resume after quota resets." You do not return to find a queue of 50 failed retries.&lt;/p&gt;

&lt;p&gt;Small detail. Meaningful when it happens.&lt;/p&gt;
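
&lt;p&gt;The retry policy described above can be sketched roughly as follows. This is a minimal illustration, not Synthadoc's actual code: &lt;code&gt;TransientError&lt;/code&gt;, &lt;code&gt;QuotaExhaustedError&lt;/code&gt;, &lt;code&gt;run_job&lt;/code&gt; and its parameters are hypothetical names standing in for the real exception types and orchestrator.&lt;/p&gt;

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or a 5xx response."""


class QuotaExhaustedError(Exception):
    """Hypothetical stand-in for CodingToolQuotaExhaustedException."""


def run_job(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); retry transient errors with exponential backoff.

    Returns a (state, detail) tuple. Quota exhaustion fails permanently
    on its first occurrence and is never retried.
    """
    for attempt in range(max_retries + 1):
        try:
            return ("done", call())
        except QuotaExhaustedError as exc:
            # Permanent failure: a retry would only burn more quota.
            return ("failed", str(exc))
        except TransientError as exc:
            if attempt == max_retries:
                return ("failed", f"gave up after {max_retries} retries: {exc}")
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
```

&lt;p&gt;The two &lt;code&gt;except&lt;/code&gt; branches are the whole point: one error class earns a backoff and another attempt, the other ends the job immediately with a readable message.&lt;/p&gt;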




&lt;h2&gt;
  
  
  Getting Started with Claude Code or Opencode Provider
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prerequisite:&lt;/strong&gt; Claude Code or Opencode must already be installed and logged in on your machine. If not, follow the &lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code setup guide&lt;/a&gt; or the &lt;a href="https://opencode.ai/docs" rel="noopener noreferrer"&gt;Opencode setup guide&lt;/a&gt; first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Clone and install&lt;/span&gt;
git clone https://github.com/axoviq-ai/synthadoc.git
&lt;span class="nb"&gt;cd &lt;/span&gt;synthadoc
pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 2. Install the demo wiki&lt;/span&gt;
&lt;span class="c"&gt;# Linux / macOS&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; ~/wikis &lt;span class="nt"&gt;--demo&lt;/span&gt;

&lt;span class="c"&gt;# Windows&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; %USERPROFILE%&lt;span class="se"&gt;\w&lt;/span&gt;ikis &lt;span class="nt"&gt;--demo&lt;/span&gt;

&lt;span class="c"&gt;# 3. Set as the active wiki (no -w needed from here on)&lt;/span&gt;
synthadoc use history-of-computing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Configure the CLI provider&lt;/strong&gt; - edit &lt;code&gt;~/wikis/history-of-computing/.synthadoc/config.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[agents]&lt;/span&gt;
&lt;span class="py"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"claude-code"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or skip editing config.toml entirely and pass &lt;code&gt;--provider&lt;/code&gt; at startup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 5. Start the engine - no API key needed&lt;/span&gt;
synthadoc serve &lt;span class="nt"&gt;--provider&lt;/span&gt; claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;6. Install the Obsidian plugin&lt;/strong&gt; - from the cloned &lt;code&gt;synthadoc/&lt;/code&gt; repo directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / macOS&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/synthadoc   &lt;span class="c"&gt;# or wherever you cloned it&lt;/span&gt;
&lt;span class="nv"&gt;vault&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/wikis/history-of-computing
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$vault&lt;/span&gt;&lt;span class="s2"&gt;/.obsidian/plugins/synthadoc"&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;obsidian-plugin/main.js obsidian-plugin/manifest.json &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$vault&lt;/span&gt;&lt;span class="s2"&gt;/.obsidian/plugins/synthadoc/"&lt;/span&gt;

&lt;span class="c"&gt;# Windows (cmd.exe)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; %USERPROFILE%&lt;span class="se"&gt;\s&lt;/span&gt;ynthadoc
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="s2"&gt;"%USERPROFILE%&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s2"&gt;ikis&lt;/span&gt;&lt;span class="se"&gt;\h&lt;/span&gt;&lt;span class="s2"&gt;istory-of-computing&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;obsidian&lt;/span&gt;&lt;span class="se"&gt;\p&lt;/span&gt;&lt;span class="s2"&gt;lugins&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;ynthadoc"&lt;/span&gt;
copy obsidian-plugin&lt;span class="se"&gt;\m&lt;/span&gt;ain.js &lt;span class="s2"&gt;"%USERPROFILE%&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s2"&gt;ikis&lt;/span&gt;&lt;span class="se"&gt;\h&lt;/span&gt;&lt;span class="s2"&gt;istory-of-computing&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;obsidian&lt;/span&gt;&lt;span class="se"&gt;\p&lt;/span&gt;&lt;span class="s2"&gt;lugins&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;ynthadoc&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;
copy obsidian-plugin&lt;/span&gt;&lt;span class="se"&gt;\m&lt;/span&gt;&lt;span class="s2"&gt;anifest.json "&lt;/span&gt;%USERPROFILE%&lt;span class="se"&gt;\w&lt;/span&gt;ikis&lt;span class="se"&gt;\h&lt;/span&gt;istory-of-computing&lt;span class="se"&gt;\.&lt;/span&gt;obsidian&lt;span class="se"&gt;\p&lt;/span&gt;lugins&lt;span class="se"&gt;\s&lt;/span&gt;ynthadoc&lt;span class="se"&gt;\"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fully quit and reopen Obsidian, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Settings → Community plugins → Synthadoc → Enable&lt;/strong&gt;, set Server URL to &lt;code&gt;http://127.0.0.1:7070&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Settings → Community plugins → Browse → search "Dataview" → Install → Enable&lt;/strong&gt; (required for the live dashboard)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The full configuration reference is in &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md#coding-tool-cli-providers--no-api-key-needed" rel="noopener noreferrer"&gt;docs/design.md - Coding tool CLI providers&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Else Is in v0.3.0
&lt;/h2&gt;

&lt;p&gt;The CLI provider work is one piece of a larger release. v0.3.0 also ships:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;YouTube video ingestion&lt;/strong&gt; - paste a URL, get a structured wiki page with executive summary and &lt;code&gt;[MM:SS]&lt;/code&gt; timestamped transcript&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web search fan-out&lt;/strong&gt; - one search query decomposes into sub-questions, ingests multiple sources, builds cross-references automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CJK multilingual support&lt;/strong&gt; - Chinese, Japanese, and Korean queries no longer trigger false knowledge-gap reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge gap detection hardening&lt;/strong&gt; - signal 5 redesigned for deterministic multi-aspect query scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DeepSeek provider&lt;/strong&gt; - eighth provider, OpenAI-compatible, very low cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full release notes: &lt;a href="https://github.com/axoviq-ai/synthadoc/releases/tag/v0.3.0" rel="noopener noreferrer"&gt;github.com/axoviq-ai/synthadoc/releases/tag/v0.3.0&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Synthadoc is open source under AGPL-3.0 at &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;github.com/axoviq-ai/synthadoc&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you are already using Claude Code or Opencode daily, the marginal cost of running a structured personal wiki on top of the same subscription is zero. The marginal benefit compounds over time. That is a trade worth making.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Synthadoc: From YouTube to Wiki: How v0.3.0 Turns Any Content into Structured Knowledge</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Mon, 04 May 2026 19:10:00 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-youtube-to-wiki-how-v030-turns-any-content-into-structured-knowledge-4l38</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/synthadoc-from-youtube-to-wiki-how-v030-turns-any-content-into-structured-knowledge-4l38</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficenc67hvtxdxuqc6yso.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficenc67hvtxdxuqc6yso.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You have a YouTube playlist of 40 conference talks. You have bookmarked 200 web articles. You have a folder of PDFs you keep meaning to read.&lt;/p&gt;

&lt;p&gt;None of that is knowledge yet. It is a queue.&lt;/p&gt;

&lt;p&gt;Synthadoc v0.3.0 drains the queue.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with "Saving" Things
&lt;/h2&gt;

&lt;p&gt;Most people have a system for collecting information. Bookmarks, Notion pages, Pocket, starred emails. The collection grows. The retrieval never quite works. You remember you saved something about transformer attention mechanisms six months ago but cannot find it. You watch a 45-minute conference talk, absorb maybe 30% of it, and have no structured record of the rest.&lt;/p&gt;

&lt;p&gt;The issue is not storage. It is synthesis. Saving a link preserves a pointer. It does not extract the claim, connect it to what you already know, or surface the contradiction with something you read last week.&lt;/p&gt;

&lt;p&gt;Synthadoc v0.1.0 solved this for documents - PDFs, Word files, spreadsheets, images. v0.2.0 added hybrid BM25 + vector search so retrieval stayed sharp as the wiki grew. v0.3.0 extends the ingest surface to the two sources where most knowledge actually lives in 2026: &lt;strong&gt;video&lt;/strong&gt; and &lt;strong&gt;the live web&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ingesting a YouTube Video
&lt;/h2&gt;

&lt;p&gt;The workflow is a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc ingest &lt;span class="s2"&gt;"https://www.youtube.com/watch?v=dQw4w9WgXcQ"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or from Obsidian: open the &lt;strong&gt;Ingest: from URL...&lt;/strong&gt; modal, paste the YouTube link, press Ingest.&lt;/p&gt;

&lt;p&gt;What happens next:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Synthadoc fetches the video's caption track - no audio download, no third-party transcription API, no API key required.&lt;/li&gt;
&lt;li&gt;The transcript is chunked with embedded &lt;code&gt;[MM:SS]&lt;/code&gt; timestamps preserved so every claim is traceable to a specific moment in the video.&lt;/li&gt;
&lt;li&gt;The LLM generates an &lt;strong&gt;executive summary&lt;/strong&gt;: what the video is about, the main topics covered, and the key takeaway - in three to five sentences.&lt;/li&gt;
&lt;li&gt;The full timestamped transcript follows the summary in the wiki page.&lt;/li&gt;
&lt;li&gt;Cross-references to existing wiki pages are built automatically during ingest. If your wiki already has a page on "attention mechanisms" and the video mentions it, a &lt;code&gt;[[attention-mechanisms]]&lt;/code&gt; wikilink appears in the new page.&lt;/li&gt;
&lt;/ol&gt;
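
&lt;p&gt;Step 2 above, chunking with &lt;code&gt;[MM:SS]&lt;/code&gt; timestamps preserved, might look something like this. It is a sketch under assumptions: the caption track is taken to be a list of &lt;code&gt;(start_seconds, text)&lt;/code&gt; pairs, and &lt;code&gt;format_timestamp&lt;/code&gt;, &lt;code&gt;chunk_transcript&lt;/code&gt; and &lt;code&gt;max_chars&lt;/code&gt; are illustrative names, not Synthadoc's API.&lt;/p&gt;

```python
def format_timestamp(seconds):
    """Render a caption offset in seconds as [MM:SS]."""
    minutes, secs = divmod(int(seconds), 60)
    return f"[{minutes:02d}:{secs:02d}]"


def chunk_transcript(segments, max_chars=200):
    """Group (start_seconds, text) caption segments into chunks.

    Each chunk is prefixed with the timestamp of its first segment, so
    any sentence in it can be traced back to a moment in the video.
    """
    chunks = []
    current_start, current_text = None, ""
    for start, text in segments:
        candidate = f"{current_text} {text}".strip()
        if current_text and len(candidate) > max_chars:
            # The chunk is full: emit it and start a new one here.
            chunks.append(f"{format_timestamp(current_start)} {current_text}")
            current_start, current_text = start, text.strip()
        else:
            if current_start is None:
                current_start = start
            current_text = candidate
    if current_text:
        chunks.append(f"{format_timestamp(current_start)} {current_text}")
    return chunks
```

&lt;p&gt;Calling &lt;code&gt;format_timestamp(195)&lt;/code&gt; gives &lt;code&gt;[03:15]&lt;/code&gt;. For simplicity the sketch has no hours field, so a two-hour talk would simply show minutes above 99.&lt;/p&gt;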




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpltn6m04pf84739x0gj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpltn6m04pf84739x0gj.png" alt=" " width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1 — The YouTube ingest pipeline. One URL in; a structured, cross-referenced wiki page out.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The result is a wiki page that looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Illustrated&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Transformer"&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;active&lt;/span&gt;
&lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;medium&lt;/span&gt;
&lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2026-05-03T14:22:01&lt;/span&gt;
&lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;youtube.com/watch&lt;/span&gt;&lt;span class="pi"&gt;?&lt;/span&gt;&lt;span class="nv"&gt;v=...&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gs"&gt;**Executive summary:**&lt;/span&gt; Jay Alammar's visual walkthrough of the Transformer
architecture. Covers self-attention, multi-head attention, positional
encoding, and the encoder-decoder structure using animated diagrams.
Key takeaway: the attention mechanism allows each token to "look at" every
other token in the sequence simultaneously, which is what enables parallelism
over RNNs.
&lt;span class="p"&gt;
---
&lt;/span&gt;
[00:42] The problem with sequence models is that they process tokens
one at a time, making parallelisation during training difficult...

[03:15] Self-attention computes a weighted sum of all values in the
sequence. The weights come from a compatibility function between
a query and all keys...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No manual work. The video is now part of your wiki.&lt;/p&gt;




&lt;h2&gt;
  
  
  Web Search Fan-Out
&lt;/h2&gt;

&lt;p&gt;The YouTube capability sits alongside a web search feature that works differently from a standard web search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc ingest &lt;span class="s2"&gt;"search for: transformer attention mechanisms 2025"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Synthadoc does not return ten blue links. It:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Decomposes the query into 3–5 sub-questions covering different facets of the topic.&lt;/li&gt;
&lt;li&gt;Searches the web for each sub-question independently.&lt;/li&gt;
&lt;li&gt;Ingests the top results for each, synthesizing each source into a wiki page.&lt;/li&gt;
&lt;li&gt;Builds cross-references across all the newly created pages.&lt;/li&gt;
&lt;/ol&gt;
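
&lt;p&gt;The first three steps above can be sketched as a small loop. Everything here is hypothetical: &lt;code&gt;decompose&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt; and &lt;code&gt;ingest&lt;/code&gt; are caller-supplied functions standing in for the LLM, the search backend and the ingest pipeline, and skipping a URL that two sub-questions both return is an added assumption. The cross-referencing step is left out.&lt;/p&gt;

```python
def fan_out(query, decompose, search, ingest, top_k=3):
    """Decompose a query, search each sub-question, ingest the top results.

    decompose(query) returns a list of sub-questions, search(sub_question)
    returns a ranked list of URLs, and ingest(url) returns a wiki page.
    """
    pages = []
    seen_urls = set()
    for sub_question in decompose(query):
        for url in search(sub_question)[:top_k]:
            if url in seen_urls:
                # Two sub-questions surfaced the same source: ingest it once.
                continue
            seen_urls.add(url)
            pages.append(ingest(url))
    return pages
```

&lt;p&gt;With four sub-questions and &lt;code&gt;top_k=3&lt;/code&gt;, a loop like this yields at most twelve pages, which sits inside the eight-to-fifteen range mentioned below.&lt;/p&gt;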

&lt;p&gt;A single web search command can add eight to fifteen structured pages to your wiki in one operation. The result is not a reading list - it is synthesized knowledge, cross-referenced and ready to query.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Timestamps Matter
&lt;/h2&gt;

&lt;p&gt;One detail worth pausing on: the &lt;code&gt;[MM:SS]&lt;/code&gt; timestamps in the transcript are not decoration.&lt;/p&gt;

&lt;p&gt;When you later ask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;synthadoc query &lt;span class="s2"&gt;"what did the transformer paper say about positional encoding?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The answer includes the source citation. Because the timestamp is embedded in the page body, the citation points not just to the video but to the &lt;em&gt;moment&lt;/em&gt; in the video. You can verify the claim in thirty seconds by jumping to that timestamp.&lt;/p&gt;

&lt;p&gt;This is the same principle that makes citations in academic papers useful. The claim is not just "somewhere in this source." It is traceable.&lt;/p&gt;
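
&lt;p&gt;To make "jumping to that timestamp" concrete, here is one way a cited passage could be turned into a link that opens the video at the right moment. This is an illustrative helper, not something Synthadoc is documented to provide: &lt;code&gt;first_timestamp_seconds&lt;/code&gt; and &lt;code&gt;deep_link&lt;/code&gt; are hypothetical names, and the video ID is assumed to be known from the page's &lt;code&gt;sources&lt;/code&gt; frontmatter.&lt;/p&gt;

```python
import re

# Matches markers such as [03:15]; minutes may run past two digits.
TIMESTAMP = re.compile(r"\[(\d{2,}):(\d{2})\]")


def first_timestamp_seconds(passage):
    """Return the first [MM:SS] marker in passage as seconds, or None."""
    match = TIMESTAMP.search(passage)
    if match is None:
        return None
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def deep_link(video_id, passage):
    """Build a youtu.be link that starts playback at the cited moment."""
    offset = first_timestamp_seconds(passage)
    base = f"https://youtu.be/{video_id}"
    return base if offset is None else f"{base}?t={offset}"
```

&lt;p&gt;A passage beginning &lt;code&gt;[03:15]&lt;/code&gt; maps to an offset of 195 seconds, so the link ends in &lt;code&gt;?t=195&lt;/code&gt;.&lt;/p&gt;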




&lt;h2&gt;
  
  
  The Ingest Surface in v0.3.0
&lt;/h2&gt;

&lt;p&gt;With v0.3.0, Synthadoc can ingest from the following source types in a single unified pipeline:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;How&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PDF, Word, XLSX, CSV, TXT&lt;/td&gt;
&lt;td&gt;Direct file extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Images (PNG, JPG, WEBP, etc.)&lt;/td&gt;
&lt;td&gt;Vision LLM extracts text and structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web pages and articles&lt;/td&gt;
&lt;td&gt;URL fetch + synthesis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YouTube videos&lt;/td&gt;
&lt;td&gt;Caption extraction + executive summary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web search results&lt;/td&gt;
&lt;td&gt;Multi-query fan-out + synthesis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PowerPoint / presentations&lt;/td&gt;
&lt;td&gt;Slide text extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every source type produces the same output: a structured Markdown wiki page with frontmatter, wikilinks to related pages, and a traceable source reference.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4d2aqxp61w30z9dl86o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4d2aqxp61w30z9dl86o.png" alt=" " width="800" height="1421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 2 — Every source type feeds the same pipeline. The wiki grows; the query quality compounds.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Wiki Looks Like After 30 Days
&lt;/h2&gt;

&lt;p&gt;The compounding effect is the real story. After 30 days of normal usage - ingesting the things you would have saved anyway - a Synthadoc wiki typically contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;50–150 pages&lt;/strong&gt; covering your domain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic cross-references&lt;/strong&gt; linking related concepts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contradiction flags&lt;/strong&gt; where two sources disagree (Synthadoc surfaces these, you resolve them)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orphan detection&lt;/strong&gt; for pages no other page links to yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full audit trail&lt;/strong&gt; of what was ingested, when, and at what cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The 50th query to this wiki is dramatically smarter than the first, because every previous ingest has built the structure the query runs against.&lt;/p&gt;

&lt;p&gt;That is the core idea from v0.1.0, still true - but now the inputs include everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Clone and install&lt;/span&gt;
git clone https://github.com/axoviq-ai/synthadoc.git
&lt;span class="nb"&gt;cd &lt;/span&gt;synthadoc
pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Set an API key&lt;/strong&gt; — Gemini Flash is the default (free tier, 1M tokens/day, no credit card).&lt;br&gt;
Get a key at &lt;a href="https://aistudio.google.com/app/apikey" rel="noopener noreferrer"&gt;aistudio.google.com/app/apikey&lt;/a&gt;, then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS / Linux&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;AIza…

&lt;span class="c"&gt;# Windows&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;AIza…
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No API key? If you already have Claude Code or Opencode, set &lt;code&gt;provider = "claude-code"&lt;/code&gt; in your wiki's &lt;code&gt;config.toml&lt;/code&gt; instead - see &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md#coding-tool-cli-providers--no-api-key-needed" rel="noopener noreferrer"&gt;docs/design.md&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 3. Install the demo wiki&lt;/span&gt;
&lt;span class="c"&gt;# Linux / macOS&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; ~/wikis &lt;span class="nt"&gt;--demo&lt;/span&gt;

&lt;span class="c"&gt;# Windows&lt;/span&gt;
synthadoc &lt;span class="nb"&gt;install &lt;/span&gt;history-of-computing &lt;span class="nt"&gt;--target&lt;/span&gt; %USERPROFILE%&lt;span class="se"&gt;\w&lt;/span&gt;ikis &lt;span class="nt"&gt;--demo&lt;/span&gt;

&lt;span class="c"&gt;# 4. Set as the active wiki (no -w needed from here on)&lt;/span&gt;
synthadoc use history-of-computing

&lt;span class="c"&gt;# 5. Start the engine&lt;/span&gt;
synthadoc serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;6. Install the Obsidian plugin&lt;/strong&gt; — from the cloned &lt;code&gt;synthadoc/&lt;/code&gt; repo directory, copy the pre-built plugin into your vault:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Linux / macOS (run from the synthadoc/ repo root)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/synthadoc   &lt;span class="c"&gt;# or wherever you cloned it&lt;/span&gt;
&lt;span class="nv"&gt;vault&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/wikis/history-of-computing
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$vault&lt;/span&gt;&lt;span class="s2"&gt;/.obsidian/plugins/synthadoc"&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;obsidian-plugin/main.js obsidian-plugin/manifest.json &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$vault&lt;/span&gt;&lt;span class="s2"&gt;/.obsidian/plugins/synthadoc/"&lt;/span&gt;

&lt;span class="c"&gt;# Windows (cmd.exe — run from the synthadoc\ repo root)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; %USERPROFILE%&lt;span class="se"&gt;\s&lt;/span&gt;ynthadoc
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="s2"&gt;"%USERPROFILE%&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s2"&gt;ikis&lt;/span&gt;&lt;span class="se"&gt;\h&lt;/span&gt;&lt;span class="s2"&gt;istory-of-computing&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;obsidian&lt;/span&gt;&lt;span class="se"&gt;\p&lt;/span&gt;&lt;span class="s2"&gt;lugins&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;ynthadoc"&lt;/span&gt;
copy obsidian-plugin&lt;span class="se"&gt;\m&lt;/span&gt;ain.js &lt;span class="s2"&gt;"%USERPROFILE%&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s2"&gt;ikis&lt;/span&gt;&lt;span class="se"&gt;\h&lt;/span&gt;&lt;span class="s2"&gt;istory-of-computing&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;obsidian&lt;/span&gt;&lt;span class="se"&gt;\p&lt;/span&gt;&lt;span class="s2"&gt;lugins&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="s2"&gt;ynthadoc&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;
copy obsidian-plugin&lt;/span&gt;&lt;span class="se"&gt;\m&lt;/span&gt;&lt;span class="s2"&gt;anifest.json "&lt;/span&gt;%USERPROFILE%&lt;span class="se"&gt;\w&lt;/span&gt;ikis&lt;span class="se"&gt;\h&lt;/span&gt;istory-of-computing&lt;span class="se"&gt;\.&lt;/span&gt;obsidian&lt;span class="se"&gt;\p&lt;/span&gt;lugins&lt;span class="se"&gt;\s&lt;/span&gt;ynthadoc&lt;span class="se"&gt;\"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fully quit and reopen Obsidian, then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Settings → Community plugins → Synthadoc → Enable&lt;/strong&gt;, set Server URL to &lt;code&gt;http://127.0.0.1:7070&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Settings → Community plugins → Browse → search "Dataview" → Install → Enable&lt;/strong&gt; (required for the live dashboard)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 7. Ingest a YouTube video&lt;/span&gt;
synthadoc ingest &lt;span class="s2"&gt;"https://www.youtube.com/watch?v=YOUR_VIDEO_ID"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Synthadoc is open source under AGPL-3.0. The full quick-start guide, architecture docs, and demo wiki are at &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;github.com/axoviq-ai/synthadoc&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;👉 README: &lt;a href="https://github.com/axoviq-ai/synthadoc#readme" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc#readme&lt;/a&gt;&lt;br&gt;
👉 Quick-start guide: &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Synthadoc v0.3.0 also ships CJK multilingual query support, knowledge gap detection hardening, a DeepSeek provider, and coding tool CLI providers (Claude Code, Opencode) - no separate API key needed if you already have a coding tool subscription. Full release notes in &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md" rel="noopener noreferrer"&gt;docs/design.md&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Synthadoc: Beyond Keyword Search - How v0.2.0 Combines BM25 and Vector Search to Build a Smarter Domain Wiki</title>
      <dc:creator>Paul Chen</dc:creator>
      <pubDate>Mon, 27 Apr 2026 00:21:34 +0000</pubDate>
      <link>https://dev.to/paul_chen_90371fe7426cb44/beyond-keyword-search-how-synthadoc-v020-combines-bm25-and-vector-search-to-build-a-smarter-43l7</link>
      <guid>https://dev.to/paul_chen_90371fe7426cb44/beyond-keyword-search-how-synthadoc-v020-combines-bm25-and-vector-search-to-build-a-smarter-43l7</guid>
      <description>&lt;h1&gt;
  
  
  What is Synthadoc?
&lt;/h1&gt;

&lt;p&gt;Synthadoc is an open-source, LLM-powered wiki engine. Point it at your organisation's documents - PDFs, PPTX, spreadsheets, DOCX, images, or web pages - and it builds a persistent, structured knowledge base your team can query, audit, and extend over time.&lt;/p&gt;

&lt;p&gt;Unlike general-purpose RAG pipelines that retrieve raw chunks at query time and discard results afterwards, Synthadoc compiles knowledge at ingest time into a living wiki that grows smarter and more consistent with every new source. The core lifecycle is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ingest: extract and synthesise facts from any source format (PDF,
XLSX, PNG, web URL)&lt;/li&gt;
&lt;li&gt;Detect: flag contradictions with existing pages and quarantine them
for review&lt;/li&gt;
&lt;li&gt;Link: connect related pages and surface knowledge gaps&lt;/li&gt;
&lt;li&gt;Query: answer questions with hybrid BM25 + optional vector search, citing the pages used&lt;/li&gt;
&lt;li&gt;Lint: resolve contradictions and surface orphan pages for human or
automated action&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Synthadoc is designed for organisations that need domain-specific, auditable knowledge management: legal teams tracking regulatory precedent, financial analysts maintaining market research, engineering groups documenting system behaviour, and research teams building institutional memory that persists beyond individual contributors.&lt;/p&gt;

&lt;p&gt;Synthadoc v0.2.0 was released last week. It scales seamlessly while maintaining accuracy through autonomous self-optimization.&lt;/p&gt;

&lt;p&gt;👉 Synthadoc GitHub: &lt;a href="https://github.com/axoviq-ai/synthadoc" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Most LLM knowledge tools take one of two approaches to retrieval: pure keyword search (fast but vocabulary-dependent) or pure vector/semantic search (flexible but resource-intensive). In practice, both have meaningful blind spots.&lt;/p&gt;

&lt;p&gt;Synthadoc v0.2.0 ships a hybrid retrieval pipeline that uses BM25 as a fast, precise first-pass filter and optional vector re-ranking as a semantic second pass. The result is a system that is accurate on exact-match queries, robust on paraphrased or conceptual queries, and fast enough to run on a laptop with no cloud dependency.&lt;/p&gt;

&lt;p&gt;This post explains how each technique works, where each one falls short alone, why the hybrid matters for a persistent domain wiki, and how Synthadoc v0.2.0 layers query decomposition and knowledge gap detection on top.&lt;/p&gt;

&lt;h1&gt;
  
  
  What Is BM25?
&lt;/h1&gt;

&lt;p&gt;BM25 (Best Match 25) is a probabilistic ranking function. It scores a page relative to a query by counting how often query terms appear in the page, discounting terms that appear in almost every page, and penalising very long pages for artificially inflated counts. BM25 is the retrieval backbone of Elasticsearch, Lucene, and most production search systems.&lt;/p&gt;
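
&lt;p&gt;For readers who prefer code to prose, the three ingredients above (term frequency, rarity discount, length penalty) fit in a few lines. This is a minimal sketch of the standard Okapi BM25 formula using naive whitespace tokenisation. It is not Synthadoc's implementation, and the defaults &lt;code&gt;k1=1.5&lt;/code&gt; and &lt;code&gt;b=0.75&lt;/code&gt; are common textbook values assumed here.&lt;/p&gt;

```python
import math
from collections import Counter


def bm25_scores(query, pages, k1=1.5, b=0.75):
    """Score each page (a plain string) against query with Okapi BM25."""
    docs = [page.lower().split() for page in pages]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many pages does each term occur?
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a high IDF; near-universal terms get a low one.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalisation: long pages need more hits for the same score.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

&lt;p&gt;A page that never mentions a query term scores exactly zero for that term. That zero is the source of the vocabulary-mismatch problem covered further down.&lt;/p&gt;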

&lt;h2&gt;
  
  
  Scoring intuition
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegaoeg5kyikmfkmz5481.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegaoeg5kyikmfkmz5481.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
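
&lt;p&gt;To make the scoring intuition concrete, here is a minimal sketch of the classic Okapi BM25 formula in plain Python. It is not Synthadoc's implementation: the function name &lt;code&gt;bm25_scores&lt;/code&gt;, the whitespace tokeniser, the sample pages, and the defaults &lt;code&gt;k1=1.5&lt;/code&gt; and &lt;code&gt;b=0.75&lt;/code&gt; are all assumptions chosen for illustration.&lt;/p&gt;

```python
import math
from collections import Counter


def bm25_scores(query, pages, k1=1.5, b=0.75):
    """Score each page against the query with the Okapi BM25 formula.

    Illustrative sketch only: whitespace tokenisation, no stemming or stop words.
    """
    docs = [page.lower().split() for page in pages]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of pages each term appears in.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue  # a term that is absent from the page contributes nothing
            # Rare terms get a high IDF; terms found in almost every page get a low one.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates (k1) and is normalised by page length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample pages, for illustration only.
pages = [
    "the bombe machine broke enigma decryption at bletchley park",
    "von neumann architecture stores program and data in memory",
    "the turing machine is a model of computation",
]
scores = bm25_scores("bombe enigma decryption", pages)
```

&lt;p&gt;Only the first page contains any of the query terms, so it is the only page with a non-zero score. A production engine would add proper tokenisation and an inverted index, but the scoring logic has the same shape.&lt;/p&gt;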

&lt;h2&gt;
  
  
  Where BM25 falls short
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vocabulary mismatch: query says "contributions", page says "pioneered". Score: near zero.&lt;/li&gt;
&lt;li&gt;Synonyms: "ML" and "machine learning" are different tokens.&lt;/li&gt;
&lt;li&gt;Conceptual distance: "reasoning under uncertainty" and "probabilistic inference" are semantically close but lexically distant.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On a domain wiki ingested from diverse sources (papers, docs, blog posts, PDFs), the same concept will be described in many different vocabularies. BM25 alone misses a meaningful fraction of relevant pages.&lt;/p&gt;

&lt;h1&gt;
  
  
  What Is Vector Search?
&lt;/h1&gt;

&lt;p&gt;Vector (semantic) search encodes text into dense numerical embeddings using a neural language model. Semantically similar texts land close together in that high-dimensional space regardless of surface wording. Similarity is measured as the cosine similarity between the query vector and each page vector.&lt;/p&gt;

&lt;h2&gt;
  
  
  Embedding intuition
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakwlpeoe9thsuu831zza.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakwlpeoe9thsuu831zza.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The two sentences share almost no keywords, but their vectors point in nearly the same direction because the model understands they describe the same concept.&lt;/p&gt;
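
&lt;p&gt;The "same direction" idea is cosine similarity. The sketch below computes it for toy three-dimensional vectors. The values are made up for illustration and are not real model output; bge-small-en-v1.5 produces 384-dimensional embeddings.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, not real embeddings).
query_vec = [0.9, 0.1, 0.3]
page_close = [0.8, 0.2, 0.4]  # paraphrase of the query: nearly the same direction
page_far = [0.1, 0.9, 0.0]    # unrelated topic: points elsewhere

sim_close = cosine_similarity(query_vec, page_close)
sim_far = cosine_similarity(query_vec, page_far)
```

&lt;p&gt;Here &lt;code&gt;sim_close&lt;/code&gt; is far higher than &lt;code&gt;sim_far&lt;/code&gt;, even though nothing in the calculation looks at the words themselves.&lt;/p&gt;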

&lt;h2&gt;
  
  
  Where vector search falls short alone
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Cold-start penalty: embedding thousands of pages takes time and compute; BM25 is instant.&lt;/li&gt;
&lt;li&gt;Exact-match dilution: specific product names or identifiers can be blurred by semantic proximity.&lt;/li&gt;
&lt;li&gt;Domain drift: general-purpose models may not distinguish highly specific domain terminology.&lt;/li&gt;
&lt;li&gt;Resource requirement: needs a model (~130 MB for bge-small-en-v1.5) and inference at query time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  BM25 vs. Vector Search: Side-by-Side
&lt;/h1&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;BM25&lt;/th&gt;
&lt;th&gt;Vector / Semantic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Matching strategy&lt;/td&gt;
&lt;td&gt;Exact term overlap (TF x IDF)&lt;/td&gt;
&lt;td&gt;Semantic similarity (cosine distance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vocabulary required&lt;/td&gt;
&lt;td&gt;Query words must appear in page&lt;/td&gt;
&lt;td&gt;Paraphrases and synonyms handled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Microseconds -- no model needed&lt;/td&gt;
&lt;td&gt;Milliseconds -- model inference required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup cost&lt;/td&gt;
&lt;td&gt;Zero -- pure algorithm&lt;/td&gt;
&lt;td&gt;~130 MB model download (one-time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exact-match queries&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Synonym / paraphrase&lt;/td&gt;
&lt;td&gt;Often misses&lt;/td&gt;
&lt;td&gt;Handles well&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain terminology&lt;/td&gt;
&lt;td&gt;Good if terms match&lt;/td&gt;
&lt;td&gt;Depends on model training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Interpretability&lt;/td&gt;
&lt;td&gt;Score is explainable&lt;/td&gt;
&lt;td&gt;Black-box similarity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Known vocabulary, structured content&lt;/td&gt;
&lt;td&gt;Conceptual queries, diverse sources&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  How Synthadoc Combines Both
&lt;/h1&gt;

&lt;p&gt;Synthadoc uses a hybrid pipeline where BM25 and vector search are not alternatives - they are sequential layers. BM25 does the heavy filtering; vector re-ranks the survivors.&lt;/p&gt;

&lt;h2&gt;
  
  
  The retrieval pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg4ydqh6gcu86b8ffu8qp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg4ydqh6gcu86b8ffu8qp.png" alt=" " width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;
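
&lt;p&gt;The two-stage idea can be sketched in a few lines of Python. This is a simplified illustration, not Synthadoc's code: the function &lt;code&gt;hybrid_rank&lt;/code&gt;, the page names, the scores, and the two-dimensional vectors are all hypothetical. The &lt;code&gt;top_candidates&lt;/code&gt; parameter plays the role of a BM25 pool size, and the sketch assumes the second pass orders the pool purely by cosine similarity.&lt;/p&gt;

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def hybrid_rank(bm25_scores, page_vectors, query_vector, top_candidates=20, top_k=3):
    """Two-stage retrieval: BM25 selects a candidate pool, cosine similarity re-ranks it.

    bm25_scores:  {page_name: BM25 score}
    page_vectors: {page_name: embedding vector}
    """
    # First pass: keep only the best top_candidates pages by BM25 score.
    pool = sorted(bm25_scores, key=bm25_scores.get, reverse=True)[:top_candidates]
    # Second pass: order the surviving pages by semantic similarity to the query.
    reranked = sorted(pool, key=lambda p: cosine(query_vector, page_vectors[p]), reverse=True)
    return reranked[:top_k]


# Hypothetical scores and toy embeddings, for illustration only.
bm25 = {"enigma-bombe": 4.2, "von-neumann": 1.1, "turing-foundations": 0.6, "cooking": 0.0}
vectors = {
    "enigma-bombe": [0.5, 0.5],
    "von-neumann": [0.2, 0.9],
    "turing-foundations": [0.9, 0.1],
    "cooking": [0.0, 1.0],
}
result = hybrid_rank(bm25, vectors, query_vector=[1.0, 0.0], top_candidates=3, top_k=2)
```

&lt;p&gt;In this toy run "turing-foundations" enters the pool with the weakest BM25 score of the three survivors, then moves to first place after re-ranking because its vector is closest to the query. Embedding only the pool, rather than the whole wiki, is what keeps the second pass cheap.&lt;/p&gt;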

&lt;h2&gt;
  
  
  Query decomposition: why it matters for retrieval
&lt;/h2&gt;

&lt;p&gt;Before any search happens, Synthadoc v0.2.0 breaks compound questions into focused sub-questions via an LLM call. Each sub-question runs its own BM25 (and vector) search in parallel. Results are merged by best score per page before synthesis. One complex query can retrieve from multiple distinct parts of the wiki simultaneously.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Example query: "Compare Turing's contributions with Von Neumann's architecture"&lt;br&gt;
-&amp;gt; Decomposed: ["Turing contributions computing"] | ["Von Neumann architecture design"]&lt;br&gt;
-&amp;gt; Two parallel BM25 searches -&amp;gt; merged candidates -&amp;gt; one synthesised answer&lt;/p&gt;
&lt;/blockquote&gt;
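
&lt;p&gt;The "merged by best score per page" step can be sketched as follows. This is an illustration under assumptions: the function name &lt;code&gt;merge_best_score&lt;/code&gt;, the page names, and the scores are hypothetical, and each sub-question's results are modelled as a simple page-to-score mapping.&lt;/p&gt;

```python
def merge_best_score(sub_results):
    """Merge per-sub-question results, keeping each page's best score.

    sub_results: list of {page_name: score} dicts, one per sub-question.
    Returns (page_name, score) pairs sorted from best to worst.
    """
    merged = {}
    for results in sub_results:
        for page, score in results.items():
            # A page found by several sub-questions keeps its highest score.
            merged[page] = max(score, merged.get(page, score))
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results for the two sub-questions in the example above.
turing_hits = {"turing-foundations": 3.1, "history-of-computing": 1.2}
von_neumann_hits = {"von-neumann-architecture": 2.7, "history-of-computing": 1.9}

candidates = merge_best_score([turing_hits, von_neumann_hits])
```

&lt;p&gt;The shared page "history-of-computing" appears once in the merged list, with its higher score of 1.9, so the synthesis step never sees the same page twice.&lt;/p&gt;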

&lt;h2&gt;
  
  
  Knowledge gap detection
&lt;/h2&gt;

&lt;p&gt;After retrieval, Synthadoc evaluates three independent signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer than 3 pages retrieved - the wiki barely covers the topic&lt;/li&gt;
&lt;li&gt;Max BM25 score below configurable threshold (default: 2.0) - weak keyword overlap&lt;/li&gt;
&lt;li&gt;Fewer than 2 candidates contain key nouns from the question - off-topic matches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a gap fires, Synthadoc generates targeted web search suggestions and surfaces them as an Obsidian callout or CLI tip, creating a feedback loop that makes the wiki progressively denser over time.&lt;/p&gt;
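
&lt;p&gt;The three signals can be sketched as a small Python function. The thresholds mirror the ones listed above, but everything else is an assumption: the name &lt;code&gt;detect_gap&lt;/code&gt;, the substring check for key nouns, and the choice to return the list of fired signals rather than decide how they combine.&lt;/p&gt;

```python
def detect_gap(candidates, key_nouns, min_pages=3, score_threshold=2.0, min_noun_hits=2):
    """Return the names of the gap signals that fire for a set of retrieved pages.

    candidates: list of (page_text, bm25_score) tuples.
    key_nouns:  nouns taken from the question (extraction is out of scope here).
    """
    signals = []
    # Signal 1: too few pages retrieved.
    if min_pages > len(candidates):
        signals.append("few_pages")
    # Signal 2: even the best BM25 score is below the threshold.
    best = max((score for _, score in candidates), default=0.0)
    if score_threshold > best:
        signals.append("weak_score")
    # Signal 3: too few candidates mention any key noun from the question.
    noun_hits = sum(
        1 for text, _ in candidates if any(noun in text.lower() for noun in key_nouns)
    )
    if min_noun_hits > noun_hits:
        signals.append("off_topic")
    return signals


nouns = ["quantitative easing", "inflation"]

# One weak, off-topic hit: all three signals fire.
sparse = [("Central banks set interest rates to manage the economy.", 0.8)]
sparse_signals = detect_gap(sparse, nouns)

# Three strong, on-topic hits: no signal fires.
dense = [
    ("Quantitative easing expands the money supply.", 3.5),
    ("Inflation rose after the stimulus programme.", 2.9),
    ("Studies link quantitative easing to asset price inflation.", 2.4),
]
dense_signals = detect_gap(dense, nouns)
```

&lt;p&gt;Returning the individual signals, rather than a single boolean, leaves room to report why a gap was flagged.&lt;/p&gt;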

&lt;h1&gt;
  
  
  Practical Examples in Synthadoc
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Example 1: BM25 exact match (no vector needed)
&lt;/h2&gt;

&lt;p&gt;Wiki page: "Alan Turing - Enigma and the Bombe Machine"&lt;/p&gt;

&lt;p&gt;Query: "Bombe machine Enigma decryption"&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;BM25 succeeds: BM25 score: HIGH - "Bombe", "machine", "Enigma", "decryption" all present.&lt;br&gt;
Result: page retrieved correctly. Vector re-ranking not required.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Example 2: BM25 misses, vector rescues
&lt;/h2&gt;

&lt;p&gt;Wiki page: "Alan Turing - Theoretical Foundations of Modern Computers"&lt;/p&gt;

&lt;p&gt;Query: "What were Turing's contributions to computing?"&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;BM25 misses, vector rescues: BM25 score: LOW - "contributions" and "computing" absent from the page.&lt;br&gt;
Vector cosine score: HIGH - embeddings are semantically close.&lt;br&gt;
Result: page retrieved correctly after re-ranking. BM25 alone would have missed it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Example 3: Knowledge gap fires, ingest suggestion generated
&lt;/h2&gt;

&lt;p&gt;Wiki: finance domain. Query: "What is the impact of quantitative easing on inflation?"&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Gap detected: BM25 returned 1 page with score 0.8 (below the 2.0 threshold).&lt;br&gt;
Synthadoc generates:&lt;br&gt;
&lt;code&gt;synthadoc ingest "search for: quantitative easing inflation impact" -w finance-wiki&lt;/code&gt;&lt;br&gt;
&lt;code&gt;synthadoc ingest "search for: central bank monetary policy effects" -w finance-wiki&lt;/code&gt;&lt;br&gt;
After ingest and re-query: 7 pages returned, fully synthesised answer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Enabling Vector Search in Synthadoc
&lt;/h1&gt;

&lt;p&gt;BM25 is the default - zero setup, zero dependencies. To add vector re-ranking:&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Install fastembed
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;pip install fastembed&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Enable in config
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;[search]&lt;br&gt;
vector = true&lt;br&gt;
vector_top_candidates = 20 # BM25 pool size before re-ranking&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Restart the server
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;synthadoc serve -w my-wiki&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;On first start, Synthadoc downloads BAAI/bge-small-en-v1.5 (~130 MB) once and embeds existing pages in the background. BM25 stays active throughout - no downtime. If the model is unavailable, the system falls back to BM25 silently.&lt;/p&gt;

&lt;h1&gt;
  
  
  Synthadoc v0.2.0: Full Feature Summary
&lt;/h1&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Query decomposition&lt;/td&gt;
&lt;td&gt;Compound questions split into parallel BM25 sub-queries, merged before synthesis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector re-ranking&lt;/td&gt;
&lt;td&gt;Opt-in semantic re-ranking (BAAI/bge-small-en-v1.5 via fastembed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge gap detection&lt;/td&gt;
&lt;td&gt;3-signal gap check; auto-generates targeted ingest suggestions as Obsidian callout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web search decomposition&lt;/td&gt;
&lt;td&gt;Broad search topics split into focused Tavily queries; URL deduplication and cap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-model cost tracking&lt;/td&gt;
&lt;td&gt;Per-token rate table; ingest + query cost in audit.db, CLI, and Obsidian&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query audit trail&lt;/td&gt;
&lt;td&gt;Full query history with sub-question count, tokens, cost, timestamp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Obsidian live web search view&lt;/td&gt;
&lt;td&gt;Real-time polling panel: phase, pages created, URL errors as fan-out completes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8 new Obsidian commands&lt;/td&gt;
&lt;td&gt;15 commands total: lint, auto-resolve, job retry/purge, audit history, scaffold&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MiniMax support&lt;/td&gt;
&lt;td&gt;M2.5/M2.7 reasoning models with reasoning_content fallback for structured output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rate-limit requeue&lt;/td&gt;
&lt;td&gt;429 responses requeue job (retry budget preserved); fail-fast on daily quota&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Job crash recovery&lt;/td&gt;
&lt;td&gt;in_progress jobs at shutdown auto-reset to pending on next startup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bulk job cancel&lt;/td&gt;
&lt;td&gt;Cancel all pending jobs in one operation via CLI or API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  How Synthadoc Compares to Alternatives
&lt;/h1&gt;

&lt;p&gt;Most LLM knowledge tools are general-purpose RAG pipelines that retrieve raw chunks at query time with no persistent synthesis. Synthadoc compiles knowledge at ingest time, maintains a living wiki, and is designed for domain-specific, auditable deployments.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Synthadoc v0.2.0&lt;/th&gt;
&lt;th&gt;LlamaIndex / LangChain&lt;/th&gt;
&lt;th&gt;Notion AI&lt;/th&gt;
&lt;th&gt;Obsidian Copilot&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ingest-time synthesis&lt;/td&gt;
&lt;td&gt;Compiled wiki&lt;/td&gt;
&lt;td&gt;Raw chunks at query time&lt;/td&gt;
&lt;td&gt;Page-level only&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain scope filtering&lt;/td&gt;
&lt;td&gt;purpose.md&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-model support&lt;/td&gt;
&lt;td&gt;6 providers&lt;/td&gt;
&lt;td&gt;Many providers&lt;/td&gt;
&lt;td&gt;OpenAI only&lt;/td&gt;
&lt;td&gt;OpenAI / Ollama&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit trail&lt;/td&gt;
&lt;td&gt;Full SQLite audit&lt;/td&gt;
&lt;td&gt;None built-in&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost tracking&lt;/td&gt;
&lt;td&gt;Per-token, per-op&lt;/td&gt;
&lt;td&gt;Manual / callback&lt;/td&gt;
&lt;td&gt;Opaque&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Offline / local&lt;/td&gt;
&lt;td&gt;Fully local&lt;/td&gt;
&lt;td&gt;Depends on provider&lt;/td&gt;
&lt;td&gt;Cloud only&lt;/td&gt;
&lt;td&gt;Ollama&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Obsidian-native output&lt;/td&gt;
&lt;td&gt;Wikilinks, Dataview&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Notion-only&lt;/td&gt;
&lt;td&gt;Read-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP API + MCP server&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Manual wiring&lt;/td&gt;
&lt;td&gt;Proprietary API&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contradiction detection&lt;/td&gt;
&lt;td&gt;Automated&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query decomposition&lt;/td&gt;
&lt;td&gt;Parallel BM25&lt;/td&gt;
&lt;td&gt;Manual chains&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge gap detection&lt;/td&gt;
&lt;td&gt;Auto-suggestions&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extensible skills&lt;/td&gt;
&lt;td&gt;Drop-in folders&lt;/td&gt;
&lt;td&gt;Custom loaders&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Licence&lt;/td&gt;
&lt;td&gt;AGPL-3.0 open source&lt;/td&gt;
&lt;td&gt;Apache-2.0&lt;/td&gt;
&lt;td&gt;Proprietary SaaS&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  Enterprise and Domain-Specific Readiness
&lt;/h1&gt;

&lt;p&gt;Synthadoc is built for organisations that need a knowledge system they control, audit, and deploy into existing infrastructure - not a SaaS black box.&lt;/p&gt;

&lt;p&gt;Concrete use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Legal - track regulatory updates, case precedents, and compliance requirements across jurisdictions. New ruling ingested, old page flagged contradicted, compliance team reviews.&lt;/li&gt;
&lt;li&gt;Finance - build a living market research wiki from analyst reports, earnings calls, and regulatory filings. Query with natural language, get cited answers with full audit trail.&lt;/li&gt;
&lt;li&gt;Engineering - maintain a persistent runbook that absorbs incident post-mortems, architecture decision records, and API docs. Contradiction detection prevents stale documentation from accumulating.&lt;/li&gt;
&lt;li&gt;Research - aggregate papers, datasets, and notes into a structured knowledge base. Knowledge gap detection surfaces what the team does not yet know and generates targeted ingest suggestions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Domain specificity
&lt;/h2&gt;

&lt;p&gt;Every wiki defines its own scope via purpose.md. The LLM reads this before every ingest decision and rejects out-of-scope sources cleanly. A legal wiki does not absorb marketing copy. A financial wiki does not absorb engineering runbooks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Auditability
&lt;/h2&gt;

&lt;p&gt;Every ingest, query, contradiction detection, and auto-resolution is written to an append-only SQLite audit trail with token counts, cost, timestamps, and page-level actions -- all queryable from the CLI or Obsidian audit commands.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;synthadoc audit history -w my-wiki # ingest records&lt;/p&gt;

&lt;p&gt;synthadoc audit cost -w my-wiki # token spend breakdown&lt;/p&gt;

&lt;p&gt;synthadoc audit events -w my-wiki # contradiction, gate, resolution events&lt;/p&gt;
&lt;/blockquote&gt;
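
&lt;p&gt;For readers curious what an append-only SQLite log can look like, here is a minimal sketch using Python's standard &lt;code&gt;sqlite3&lt;/code&gt; module. The table name, columns, and triggers are hypothetical and are not Synthadoc's actual audit.db schema. The sketch only shows one common way to reject updates and deletes at the database level.&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audit (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ts TEXT NOT NULL,
    operation TEXT NOT NULL,
    tokens INTEGER NOT NULL,
    cost_usd REAL NOT NULL
);
-- Triggers make the table append-only: rows can be inserted, never changed.
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
""")


def record(operation, tokens, cost_usd):
    """Append one audit row with a UTC timestamp."""
    conn.execute(
        "INSERT INTO audit (ts, operation, tokens, cost_usd) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), operation, tokens, cost_usd),
    )
    conn.commit()


record("ingest", 1200, 0.0018)
record("query", 450, 0.0007)

# Token totals per operation type.
totals = dict(
    conn.execute("SELECT operation, SUM(tokens) FROM audit GROUP BY operation").fetchall()
)

# Any attempt to rewrite history is rejected by the trigger.
try:
    conn.execute("DELETE FROM audit")
    tamper_blocked = False
except sqlite3.DatabaseError:
    tamper_blocked = True

row_count = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
```

&lt;p&gt;Enforcing the rule in the database, rather than in application code, means every client that opens the file is bound by it.&lt;/p&gt;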

&lt;h2&gt;
  
  
  Product and grid readiness
&lt;/h2&gt;

&lt;p&gt;Synthadoc exposes the same operations across four surfaces sharing a single agent and storage layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CLI: for operators, automation scripts, and CI pipelines&lt;/li&gt;
&lt;li&gt;HTTP REST API: for product integrations and custom front-ends&lt;/li&gt;
&lt;li&gt;MCP server: for direct agent-to-agent communication&lt;/li&gt;
&lt;li&gt;Obsidian plugin: for knowledge workers doing active research&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hook scripts fire on lifecycle events (on_ingest_complete, on_lint_complete), enabling event-driven automation: post a Slack summary when ingest completes, trigger a downstream build when a key page changes, or chain into a broader orchestration pipeline. Cron scheduling is built in, and multi-wiki isolation means each team or domain runs on its own port with its own audit trail.&lt;/p&gt;

&lt;h1&gt;
  
  
  Synthadoc in Agentic Autonomous Systems
&lt;/h1&gt;

&lt;p&gt;Synthadoc is purpose-built to serve as the persistent knowledge layer for LLM agent systems. Where an agent's context window is ephemeral and limited, Synthadoc's wiki is persistent, structured, and queryable - it gives agents a long-term memory that survives across sessions, scales to millions of tokens of accumulated knowledge, and is fully auditable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent integration via MCP
&lt;/h2&gt;

&lt;p&gt;The built-in MCP (Model Context Protocol) server exposes ingest, query, and lint as native tool calls. An agent running in any MCP-compatible host - Claude, GPT-4o, a custom LangChain pipeline - can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Call query to retrieve cited, synthesised answers from accumulated knowledge before acting&lt;/li&gt;
&lt;li&gt;Call ingest to push new findings, research results, or external documents back into the wiki&lt;/li&gt;
&lt;li&gt;Call lint to check for contradictions introduced by new data before committing to a decision&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Event-driven agent pipelines
&lt;/h2&gt;

&lt;p&gt;Hook scripts fire on lifecycle events and can trigger downstream agent actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;on_ingest_complete: a downstream agent reads newly created pages
and decides whether to trigger follow-up ingests or alert a human
reviewer&lt;/li&gt;
&lt;li&gt;on_lint_complete: an orchestrator agent receives contradiction and orphan reports and routes resolution tasks to specialised sub-agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example pipeline: a web-crawling agent ingests raw URLs; Synthadoc synthesises and deduplicates; a reporting agent queries the updated wiki and posts a daily briefing - all without human intervention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Persistent domain memory for multi-agent systems
&lt;/h2&gt;

&lt;p&gt;In multi-agent architectures, shared knowledge is a coordination bottleneck. Synthadoc solves this by acting as a shared, structured memory store:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple agents read from the same wiki via parallel HTTP queries - no shared state management required&lt;/li&gt;
&lt;li&gt;One agent's ingest results are immediately available to all agents querying the same wiki&lt;/li&gt;
&lt;li&gt;Multi-wiki isolation means separate agent clusters maintain scoped knowledge without interference&lt;/li&gt;
&lt;li&gt;The audit trail provides a complete record of which agent ingested what, when, at what cost - making multi-agent systems auditable by design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because Synthadoc is self-hosted and open-source, teams building autonomous systems retain full control over data residency, model selection, and cost - a critical requirement for enterprise agentic deployments.&lt;/p&gt;

&lt;h1&gt;
  
  
  Try Synthadoc v0.2.0
&lt;/h1&gt;

&lt;p&gt;Synthadoc v0.2.0 is available now on GitHub under the AGPL-3.0 licence. BM25 search works out of the box. Vector re-ranking is one pip install away. The Gemini free tier means you can run a full ingest-and-query cycle at zero cost.&lt;/p&gt;

&lt;p&gt;Feedback, issues, and contributions are very welcome. Open an issue on GitHub or start a discussion - the roadmap is shaped by what users need.&lt;/p&gt;

&lt;p&gt;👉 README: &lt;a href="https://github.com/axoviq-ai/synthadoc#readme" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc#readme&lt;/a&gt;&lt;br&gt;
👉 Quick-start guide: &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md&lt;/a&gt;&lt;br&gt;
👉 Design document: &lt;a href="https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/blob/main/docs/design.md&lt;/a&gt;&lt;br&gt;
👉 Release notes: &lt;a href="https://github.com/axoviq-ai/synthadoc/releases/tag/v0.2.0" rel="noopener noreferrer"&gt;https://github.com/axoviq-ai/synthadoc/releases/tag/v0.2.0&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
      <category>agentskills</category>
    </item>
  </channel>
</rss>
