You have a YouTube playlist of 40 conference talks. You have bookmarked 200 web articles. You have a folder of PDFs you keep meaning to read.
None of that is knowledge yet. It is a queue.
Synthadoc v0.3.0 drains the queue.
## The Problem with "Saving" Things
Most people have a system for collecting information. Bookmarks, Notion pages, Pocket, starred emails. The collection grows. The retrieval never quite works. You remember you saved something about transformer attention mechanisms six months ago but cannot find it. You watch a 45-minute conference talk, absorb maybe 30% of it, and have no structured record of the rest.
The issue is not storage. It is synthesis. Saving a link preserves a pointer. It does not extract the claim, connect it to what you already know, or surface the contradiction with something you read last week.
Synthadoc v0.1.0 solved this for documents - PDFs, Word files, spreadsheets, images. v0.2.0 added hybrid BM25 + vector search so retrieval stayed sharp as the wiki grew. v0.3.0 extends the ingest surface to the two sources where most knowledge actually lives in 2026: video and the live web.
## Ingesting a YouTube Video
The workflow is a single command:
```bash
synthadoc ingest "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```
Or from Obsidian: open the **Ingest: from URL...** modal, paste the YouTube link, press **Ingest**.
What happens next:
- Synthadoc fetches the video's caption track: no audio download, no third-party transcription API, no API key required.
- The transcript is chunked with embedded `[MM:SS]` timestamps preserved, so every claim is traceable to a specific moment in the video.
- The LLM generates an executive summary in three to five sentences: what the video is about, the main topics covered, and the key takeaway.
- The full timestamped transcript follows the summary in the wiki page.
- Cross-references to existing wiki pages are built automatically during ingest. If your wiki already has a page on "attention mechanisms" and the video mentions it, a `[[attention-mechanisms]]` wikilink appears in the new page.
Figure 1 — The YouTube ingest pipeline. One URL in; a structured, cross-referenced wiki page out.
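The chunking step above can be sketched in a few lines. This is an illustrative sketch, not Synthadoc's internals: `Caption` and `chunk_transcript` are hypothetical names, and the real pipeline likely chunks on semantic boundaries rather than a fixed character budget.

```python
from dataclasses import dataclass

@dataclass
class Caption:
    start: float  # seconds from the start of the video
    text: str

def fmt(seconds: float) -> str:
    """Render a second offset as the [MM:SS] marker embedded in wiki pages."""
    m, s = divmod(int(seconds), 60)
    return f"[{m:02d}:{s:02d}]"

def chunk_transcript(captions: list[Caption], max_chars: int = 600) -> list[str]:
    """Group consecutive captions into chunks, each prefixed with the
    timestamp of its first caption so every chunk stays traceable."""
    chunks, buf, start = [], [], None
    for cap in captions:
        if start is None:
            start = cap.start
        buf.append(cap.text)
        if sum(len(t) for t in buf) >= max_chars:
            chunks.append(f"{fmt(start)} {' '.join(buf)}")
            buf, start = [], None
    if buf:  # flush the trailing partial chunk
        chunks.append(f"{fmt(start)} {' '.join(buf)}")
    return chunks
```

The key property is that the timestamp travels with the text into the wiki page, so downstream citations can point back to a moment, not just a video.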
The result is a wiki page that looks like this:
```markdown
---
title: "The Illustrated Transformer"
status: active
confidence: medium
created: 2026-05-03T14:22:01
sources: [youtube.com/watch?v=...]
---
**Executive summary:** Jay Alammar's visual walkthrough of the Transformer
architecture. Covers self-attention, multi-head attention, positional
encoding, and the encoder-decoder structure using animated diagrams.
Key takeaway: the attention mechanism allows each token to "look at" every
other token in the sequence simultaneously, which is what enables parallelism
over RNNs.
---
[00:42] The problem with sequence models is that they process tokens
one at a time, making parallelisation during training difficult...
[03:15] Self-attention computes a weighted sum of all values in the
sequence. The weights come from a compatibility function between
a query and all keys...
```
No manual work. The video is now part of your wiki.
## Web Search Fan-Out
The YouTube capability sits alongside a web search feature that works nothing like a typical search engine query.
```bash
synthadoc ingest "search for: transformer attention mechanisms 2025"
```
Synthadoc does not return ten blue links. It:
- Decomposes the query into 3–5 sub-questions covering different facets of the topic.
- Searches the web for each sub-question independently.
- Ingests the top results for each, synthesizing each source into a wiki page.
- Builds cross-references across all the newly created pages.
A single web search command can add eight to fifteen structured pages to your wiki in one operation. The result is not a reading list - it is synthesized knowledge, cross-referenced and ready to query.
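The fan-out shape is easy to see in a sketch. Everything here is hypothetical: `decompose`, `web_search`, and `ingest_url` stand in for the LLM, search, and synthesis calls Synthadoc makes internally, and are passed in as plain functions so the loop structure is visible.

```python
from typing import Callable

def fan_out(query: str,
            decompose: Callable[[str], list[str]],
            web_search: Callable[[str], list[str]],
            ingest_url: Callable[[str], str],
            top_k: int = 3) -> list[str]:
    """Decompose a query into sub-questions, search each independently,
    and ingest the top results into wiki pages."""
    pages = []
    for sub_q in decompose(query):             # 3-5 facets of the topic
        for url in web_search(sub_q)[:top_k]:  # top results per facet
            pages.append(ingest_url(url))      # one synthesized page each
    return pages  # cross-referencing runs over these afterwards
```

With 3-5 sub-questions and a handful of results each, the eight-to-fifteen-page count per search falls out of the loop bounds.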
## Why Timestamps Matter
One detail worth pausing on: the `[MM:SS]` timestamps in the transcript are not decoration.
When you later ask:
```bash
synthadoc query "what did the transformer paper say about positional encoding?"
```
The answer includes the source citation. Because the timestamp is embedded in the page body, the citation points not just to the video but to the moment in the video. You can verify the claim in thirty seconds by jumping to that timestamp.
This is the same principle that makes citations in academic papers useful. The claim is not just "somewhere in this source." It is traceable.
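Because YouTube's `t=` URL parameter accepts a second offset, an embedded `[MM:SS]` marker is enough to build a deep link straight to the cited moment. A minimal sketch (the `deep_link` helper is ours, not Synthadoc's API):

```python
import re

def deep_link(video_url: str, stamp: str) -> str:
    """Convert an '[MM:SS]' marker to seconds and append it as
    YouTube's t= parameter, yielding a link to the exact moment."""
    m = re.fullmatch(r"\[(\d+):(\d{2})\]", stamp)
    if m is None:
        raise ValueError(f"not an [MM:SS] marker: {stamp!r}")
    minutes, seconds = int(m.group(1)), int(m.group(2))
    sep = "&" if "?" in video_url else "?"
    return f"{video_url}{sep}t={minutes * 60 + seconds}"
```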
## The Ingest Surface in v0.3.0
With v0.3.0, Synthadoc can ingest from the following source types in a single unified pipeline:
| Source | How |
|---|---|
| PDF, Word, XLSX, CSV, TXT | Direct file extraction |
| Images (PNG, JPG, WEBP, etc.) | Vision LLM extracts text and structure |
| Web pages and articles | URL fetch + synthesis |
| YouTube videos | Caption extraction + executive summary |
| Web search results | Multi-query fan-out + synthesis |
| PowerPoint / presentations | Slide text extraction |
Every source type produces the same output: a structured Markdown wiki page with frontmatter, wikilinks to related pages, and a traceable source reference.
Figure 2 — Every source type feeds the same pipeline. The wiki grows; the query quality compounds.
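The shared output format can be sketched as a small page builder. Field names follow the example page earlier in the post; `build_page` itself is a hypothetical helper, not Synthadoc's internals, and the real pipeline also inserts wikilinks and a created timestamp.

```python
def build_page(title: str, source: str, summary: str, body: str,
               confidence: str = "medium") -> str:
    """Assemble a Markdown wiki page: YAML frontmatter, executive
    summary, then the synthesized body."""
    frontmatter = "\n".join([
        "---",
        f'title: "{title}"',
        "status: active",
        f"confidence: {confidence}",
        f"sources: [{source}]",
        "---",
    ])
    return f"{frontmatter}\n**Executive summary:** {summary}\n---\n{body}\n"
```

Because every source type funnels into the same shape, the query side never needs to care whether a page came from a PDF, a video, or a search fan-out.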
## What the Wiki Looks Like After 30 Days
The compounding effect is the real story. After 30 days of normal usage, ingesting the things you would have saved anyway, a Synthadoc wiki typically contains:
- 50–150 pages covering your domain
- Automatic cross-references linking related concepts
- Contradiction flags where two sources disagree (Synthadoc surfaces these, you resolve them)
- Orphan detection for pages no other page links to yet
- Full audit trail of what was ingested, when, and at what cost
The 50th query to this wiki is dramatically smarter than the first, because every previous ingest has built the structure the query runs against.
That is the core idea from v0.1.0, still true - but now the inputs include everything.
## Getting Started
```bash
# 1. Clone and install
git clone https://github.com/axoviq-ai/synthadoc.git
cd synthadoc
pip3 install -e ".[dev]"
```
2. Set an API key — Gemini Flash is the default (free tier, 1M tokens/day, no credit card).
Get a key at aistudio.google.com/app/apikey, then:
```bash
# macOS / Linux
export GEMINI_API_KEY=AIza…

# Windows
set GEMINI_API_KEY=AIza…
```
No API key? If you already have Claude Code or Opencode, set `provider = "claude-code"` in your wiki's `config.toml` instead; see docs/design.md.
```bash
# 3. Install the demo wiki
# Linux / macOS
synthadoc install history-of-computing --target ~/wikis --demo
# Windows
synthadoc install history-of-computing --target %USERPROFILE%\wikis --demo

# 4. Set as the active wiki (no -w needed from here on)
synthadoc use history-of-computing

# 5. Start the engine
synthadoc serve
```
6. Install the Obsidian plugin — from the cloned synthadoc/ repo directory, copy the pre-built plugin into your vault:
```bash
# Linux / macOS (run from the synthadoc/ repo root)
cd ~/synthadoc # or wherever you cloned it
vault=~/wikis/history-of-computing
mkdir -p "$vault/.obsidian/plugins/synthadoc"
cp obsidian-plugin/main.js obsidian-plugin/manifest.json "$vault/.obsidian/plugins/synthadoc/"

# Windows (cmd.exe — run from the synthadoc\ repo root)
cd %USERPROFILE%\synthadoc
mkdir "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc"
copy obsidian-plugin\main.js "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc\"
copy obsidian-plugin\manifest.json "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc\"
```
Then fully quit and reopen Obsidian, and:

- Settings → Community plugins → Synthadoc → Enable, and set Server URL to `http://127.0.0.1:7070`
- Settings → Community plugins → Browse → search "Dataview" → Install → Enable (required for the live dashboard)
```bash
# 7. Ingest a YouTube video
synthadoc ingest "https://www.youtube.com/watch?v=YOUR_VIDEO_ID"
```
Synthadoc is open source under AGPL-3.0. The full quick-start guide, architecture docs, and demo wiki are at github.com/axoviq-ai/synthadoc.
👉 README: https://github.com/axoviq-ai/synthadoc#readme
👉 Quick-start guide: https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md
*Synthadoc v0.3.0 also ships CJK multilingual query support, knowledge gap detection hardening, a DeepSeek provider, and coding tool CLI providers (Claude Code, Opencode) - no separate API key needed if you already have a coding tool subscription. Full release notes in docs/design.md.*


