Paul Chen

Synthadoc: From YouTube to Wiki: How v0.3.0 Turns Any Content into Structured Knowledge

You have a YouTube playlist of 40 conference talks. You have bookmarked 200 web articles. You have a folder of PDFs you keep meaning to read.

None of that is knowledge yet. It is a queue.

Synthadoc v0.3.0 drains the queue.


The Problem with "Saving" Things

Most people have a system for collecting information. Bookmarks, Notion pages, Pocket, starred emails. The collection grows. The retrieval never quite works. You remember you saved something about transformer attention mechanisms six months ago but cannot find it. You watch a 45-minute conference talk, absorb maybe 30% of it, and have no structured record of the rest.

The issue is not storage. It is synthesis. Saving a link preserves a pointer. It does not extract the claim, connect it to what you already know, or surface the contradiction with something you read last week.

Synthadoc v0.1.0 solved this for documents - PDFs, Word files, spreadsheets, images. v0.2.0 added hybrid BM25 + vector search so retrieval stayed sharp as the wiki grew. v0.3.0 extends the ingest surface to the two sources where most knowledge actually lives in 2026: video and the live web.
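
For readers new to hybrid retrieval: BM25 and the vector index each produce their own ranked list, which then has to be fused into one. The post does not say which fusion scheme Synthadoc uses; reciprocal-rank fusion is a common choice, sketched below with plain document-ID lists as an illustration, not as Synthadoc's actual implementation.

def hybrid_rank(bm25_hits: list[str], vector_hits: list[str], k: int = 60) -> list[str]:
    # Reciprocal-rank fusion (illustrative): a document earns
    # 1 / (k + rank) from each ranking it appears in, so a doc that
    # either retriever likes floats to the top of the fused list.
    scores: dict[str, float] = {}
    for hits in (bm25_hits, vector_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)

print(hybrid_rank(["attention", "rnn", "cnn"], ["attention", "bert"]))
# ['attention', 'rnn', 'bert', 'cnn']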


Ingesting a YouTube Video

The workflow is a single command:

synthadoc ingest "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

Or from Obsidian: open the Ingest: from URL... modal, paste the YouTube link, press Ingest.

What happens next:

  1. Synthadoc fetches the video's caption track - no audio download, no third-party transcription API, no API key required.
  2. The transcript is chunked with embedded [MM:SS] timestamps preserved, so every claim is traceable to a specific moment in the video (a chunking sketch follows Figure 1).
  3. The LLM generates an executive summary: what the video is about, the main topics covered, and the key takeaway - in three to five sentences.
  4. The full timestamped transcript follows the summary in the wiki page.
  5. Cross-references to existing wiki pages are built automatically during ingest. If your wiki already has a page on "attention mechanisms" and the video mentions it, a [[attention-mechanisms]] wikilink appears in the new page.

Figure 1 — The YouTube ingest pipeline. One URL in; a structured, cross-referenced wiki page out.
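
To make step 2 concrete, here is a minimal sketch of timestamp-preserving chunking. The Caption type, the character budget, and chunk_captions itself are illustrative assumptions, not Synthadoc's actual internals:

from dataclasses import dataclass

@dataclass
class Caption:
    start: float  # seconds from the start of the video
    text: str

def chunk_captions(captions: list[Caption], max_chars: int = 1200) -> list[str]:
    # Hypothetical sketch: group captions into chunks, prefixing each
    # caption with its [MM:SS] stamp so chunks stay traceable.
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for cap in captions:
        stamp = f"[{int(cap.start) // 60:02d}:{int(cap.start) % 60:02d}]"
        line = f"{stamp} {cap.text}"
        if size + len(line) > max_chars and current:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

Keeping the stamp inside the chunk text means the embedding and the citation travel together, which is what makes the per-moment citations described later possible.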


The result is a wiki page that looks like this:

---
title: "The Illustrated Transformer"
status: active
confidence: medium
created: 2026-05-03T14:22:01
sources: [youtube.com/watch?v=...]
---

**Executive summary:** Jay Alammar's visual walkthrough of the Transformer
architecture. Covers self-attention, multi-head attention, positional
encoding, and the encoder-decoder structure using animated diagrams.
Key takeaway: the attention mechanism allows each token to "look at" every
other token in the sequence simultaneously, which is what enables parallelism
over RNNs.

---

[00:42] The problem with sequence models is that they process tokens
one at a time, making parallelisation during training difficult...

[03:15] Self-attention computes a weighted sum of all values in the
sequence. The weights come from a compatibility function between
a query and all keys...

No manual work. The video is now part of your wiki.


Web Search Fan-Out

The YouTube capability sits alongside a web search feature that behaves differently from a standard search engine query.

synthadoc ingest "search for: transformer attention mechanisms 2025"

Synthadoc does not return ten blue links. It:

  1. Decomposes the query into 3–5 sub-questions covering different facets of the topic.
  2. Searches the web for each sub-question independently.
  3. Ingests the top results for each, synthesizing each source into a wiki page.
  4. Builds cross-references across all the newly created pages.

A single web search command can add eight to fifteen structured pages to your wiki in one operation. The result is not a reading list - it is synthesized knowledge, cross-referenced and ready to query.
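
The shape of that fan-out is easy to sketch. The three callables below are placeholders for internals the post does not expose; treat this as the control flow, not the implementation:

from typing import Callable

def fan_out(query: str,
            decompose: Callable[[str], list[str]],
            search: Callable[[str], list[str]],
            ingest: Callable[[str], str]) -> list[str]:
    # Steps 1-3 from the list above; cross-referencing (step 4)
    # runs over the returned pages afterwards.
    pages: list[str] = []
    for sub_question in decompose(query):     # 3-5 facets of the topic
        for url in search(sub_question)[:3]:  # top hits per facet
            pages.append(ingest(url))         # one wiki page per source
    return pages

With 3-5 sub-questions and a few hits each, the eight-to-fifteen-page yield quoted above falls out of the loop bounds directly.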


Why Timestamps Matter

One detail worth pausing on: the [MM:SS] timestamps in the transcript are not decoration.

When you later ask:

synthadoc query "what did the transformer paper say about positional encoding?"

The answer includes the source citation. Because the timestamp is embedded in the page body, the citation points not just to the video but to the moment in the video. You can verify the claim in thirty seconds by jumping to that timestamp.
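
Making "jumping to that timestamp" concrete: YouTube watch URLs accept a t parameter in seconds, so a cited [MM:SS] stamp converts mechanically into a deep link. A small illustrative helper, not part of Synthadoc's CLI:

def timestamp_to_link(video_url: str, stamp: str) -> str:
    # "[03:15]" -> "...&t=195s"
    minutes, seconds = map(int, stamp.strip("[]").split(":"))
    return f"{video_url}&t={minutes * 60 + seconds}s"

print(timestamp_to_link("https://www.youtube.com/watch?v=abc123", "[03:15]"))
# https://www.youtube.com/watch?v=abc123&t=195s
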

This is the same principle that makes citations in academic papers useful. The claim is not just "somewhere in this source." It is traceable.


The Ingest Surface in v0.3.0

With v0.3.0, Synthadoc can ingest from the following source types in a single unified pipeline:

Source                         | How
PDF, Word, XLSX, CSV, TXT      | Direct file extraction
Images (PNG, JPG, WEBP, etc.)  | Vision LLM extracts text and structure
Web pages and articles         | URL fetch + synthesis
YouTube videos                 | Caption extraction + executive summary
Web search results             | Multi-query fan-out + synthesis
PowerPoint / presentations     | Slide text extraction

Every source type produces the same output: a structured Markdown wiki page with frontmatter, wikilinks to related pages, and a traceable source reference.
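
A unified pipeline of this shape typically reduces to a router in front of per-source extractors. The sketch below is a guess at that shape, with hypothetical extractor names, not Synthadoc's actual code:

# Stand-in extractors so the sketch runs; the real ones produce
# full wiki pages with frontmatter and wikilinks.
def ingest_file(path: str) -> str: return f"page<file:{path}>"
def ingest_web(url: str) -> str: return f"page<web:{url}>"
def ingest_youtube(url: str) -> str: return f"page<youtube:{url}>"
def ingest_search(query: str) -> str: return f"page<search:{query}>"

def route(source: str) -> str:
    # One router, many extractors; every branch returns the same
    # output type: a structured Markdown wiki page.
    if source.lower().startswith("search for:"):
        return ingest_search(source.split(":", 1)[1].strip())
    if "youtube.com/watch" in source or "youtu.be/" in source:
        return ingest_youtube(source)
    if source.startswith(("http://", "https://")):
        return ingest_web(source)
    return ingest_file(source)  # PDF, DOCX, XLSX, images, slides, ...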


Figure 2 — Every source type feeds the same pipeline. The wiki grows; the query quality compounds.


What the Wiki Looks Like After 30 Days

The compounding effect is the real story. After 30 days of normal usage, ingesting the things you would have saved anyway, a Synthadoc wiki typically contains:

  • 50–150 pages covering your domain
  • Automatic cross-references linking related concepts
  • Contradiction flags where two sources disagree (Synthadoc surfaces these, you resolve them)
  • Orphan detection for pages no other page links to yet (sketched after this list)
  • Full audit trail of what was ingested, when, and at what cost
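
Orphan detection is the easiest of these to picture: collect every [[wikilink]] target across the wiki, then diff against the set of page names. A rough sketch, assuming one Markdown file per page; illustrative, not Synthadoc's implementation:

import pathlib
import re

def find_orphans(wiki_dir: str) -> set[str]:
    # Pages that exist but that no other page links to via [[wikilink]].
    pages = {p.stem for p in pathlib.Path(wiki_dir).glob("*.md")}
    linked: set[str] = set()
    for page in pathlib.Path(wiki_dir).glob("*.md"):
        for target in re.findall(r"\[\[([^\]|#]+)", page.read_text(encoding="utf-8")):
            if target.strip() != page.stem:  # self-links don't count
                linked.add(target.strip())
    return pages - linked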

The 50th query to this wiki is dramatically smarter than the first, because every previous ingest has built the structure the query runs against.

That is the core idea from v0.1.0, still true - but now the inputs include everything.


Getting Started

# 1. Clone and install
git clone https://github.com/axoviq-ai/synthadoc.git
cd synthadoc
pip3 install -e ".[dev]"

2. Set an API key — Gemini Flash is the default (free tier, 1M tokens/day, no credit card).
Get a key at aistudio.google.com/app/apikey, then:

# macOS / Linux
export GEMINI_API_KEY=AIza…

# Windows
set GEMINI_API_KEY=AIza…

No API key? If you already have Claude Code or Opencode, set provider = "claude-code" in your wiki's config.toml instead - see docs/design.md.

# 3. Install the demo wiki
# Linux / macOS
synthadoc install history-of-computing --target ~/wikis --demo

# Windows
synthadoc install history-of-computing --target %USERPROFILE%\wikis --demo

# 4. Set as the active wiki (no -w needed from here on)
synthadoc use history-of-computing

# 5. Start the engine
synthadoc serve

6. Install the Obsidian plugin — from the cloned synthadoc/ repo directory, copy the pre-built plugin into your vault:

# Linux / macOS (run from the synthadoc/ repo root)
cd ~/synthadoc   # or wherever you cloned it
vault=~/wikis/history-of-computing
mkdir -p "$vault/.obsidian/plugins/synthadoc"
cp obsidian-plugin/main.js obsidian-plugin/manifest.json "$vault/.obsidian/plugins/synthadoc/"

# Windows (cmd.exe — run from the synthadoc\ repo root)
cd %USERPROFILE%\synthadoc
mkdir "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc"
copy obsidian-plugin\main.js "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc\"
copy obsidian-plugin\manifest.json "%USERPROFILE%\wikis\history-of-computing\.obsidian\plugins\synthadoc\"

Then fully quit and reopen Obsidian, and:

  1. Settings → Community plugins → Synthadoc → Enable, set Server URL to http://127.0.0.1:7070
  2. Settings → Community plugins → Browse → search "Dataview" → Install → Enable (required for the live dashboard)

# 7. Ingest a YouTube video
synthadoc ingest "https://www.youtube.com/watch?v=YOUR_VIDEO_ID"

Synthadoc is open source under AGPL-3.0. The full quick-start guide, architecture docs, and demo wiki are at github.com/axoviq-ai/synthadoc.

👉 README: https://github.com/axoviq-ai/synthadoc#readme
👉 Quick-start guide: https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md


Synthadoc v0.3.0 also ships CJK multilingual query support, knowledge gap detection hardening, a DeepSeek provider, and CLI providers for coding tools (Claude Code, Opencode) - no separate API key needed if you already subscribe to one of those tools. Full release notes in docs/design.md.
