<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nick Yeo</title>
    <description>The latest articles on DEV Community by Nick Yeo (@nickyeolk).</description>
    <link>https://dev.to/nickyeolk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3938354%2Fd8fc4775-8272-4491-af36-1e4f7448e9e0.jpeg</url>
      <title>DEV Community: Nick Yeo</title>
      <link>https://dev.to/nickyeolk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nickyeolk"/>
    <language>en</language>
    <item>
      <title>Think with your second brain: a proper Claude Code harness for Obsidian</title>
      <dc:creator>Nick Yeo</dc:creator>
      <pubDate>Wed, 20 May 2026 13:56:29 +0000</pubDate>
      <link>https://dev.to/nickyeolk/think-with-your-second-brain-a-proper-claude-code-harness-for-obsidian-2c0o</link>
      <guid>https://dev.to/nickyeolk/think-with-your-second-brain-a-proper-claude-code-harness-for-obsidian-2c0o</guid>
      <description>&lt;p&gt;An agentic doc harness is a set of Claude Code skills that turn an Obsidian vault into a structured workspace, letting an LLM walk wikilinks the way a coding agent walks imports, without vector-RAG retrieval. On a 99-note synthetic evaluation vault, its Hybrid Mode variant at t=0.65 beat a vector-RAG baseline by +0.27 on faithfulness, +0.80 on grounding, +1.00 on insight novelty, and +0.40 on answer relevancy (Claude-as-judge, 0–3 scale).&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/nickyeolk/agentic_doc_harness" rel="noopener noreferrer"&gt;github.com/nickyeolk/agentic_doc_harness&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Why I built it&lt;/h2&gt;

&lt;p&gt;I used to keep my notes mainly in OneNote, supplemented by Google Keep for quick notes. As my notes grew and OneNote's Android app withered, I ran into two problems: OneNote was far too slow, and there was no easy way for an AI agent to plug into it.&lt;br&gt;
The idea was to take a coding agent's ability to understand code and point it at notes instead.&lt;br&gt;
I consolidated years of notes from OneNote into Obsidian. It worked OK at first, but I quickly hit the limits of how Claude grepped and grokked its way through my 'notebase'. Coding agents depend on code structure to jump from object to object, and that structure did not exist in my disjointed notes.&lt;/p&gt;

&lt;h2&gt;Vector RAG vs vault harness&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftm0cor023qq3zbyece93.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftm0cor023qq3zbyece93.png" alt="Difference between RAG and an agentic harness for context retrieval" width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Same input, same model, different navigation.&lt;/p&gt;

&lt;p&gt;Vector RAG embeds the vault, retrieves k chunks by cosine similarity, hands them to the LLM. Chunks arrive as sentence-level fragments with no provenance.&lt;/p&gt;

&lt;p&gt;The harness reads &lt;code&gt;VAULT_INDEX.md&lt;/code&gt; (a generated map of the vault), routes to an entry note, walks outbound wikilinks, surfaces a few topically-similar but unlinked notes. Notes arrive whole, in their original structure.&lt;/p&gt;
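
&lt;p&gt;A minimal sketch of that walk, assuming notes are held in a plain dict rather than read from disk. The names &lt;code&gt;outbound_links&lt;/code&gt; and &lt;code&gt;walk&lt;/code&gt; and the sample note titles are hypothetical and not taken from the repo. The hop limit of 2 and the cap of 12 notes mirror the Depth Mode range and the &lt;code&gt;MAX_NOTES_PER_QUERY&lt;/code&gt; default mentioned elsewhere in this post, but the actual skill may route and bound differently.&lt;/p&gt;

```python
import re
from collections import deque

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def outbound_links(text):
    """Return wikilink targets in order of first appearance, without duplicates."""
    seen = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        if target and target not in seen:
            seen.append(target)
    return seen


def walk(vault, entry, max_hops=2, max_notes=12):
    """Breadth-first walk from an entry note, returning note names in load order.

    vault is a hypothetical dict of note name -> note text (an assumption
    made for this sketch, standing in for markdown files on disk).
    """
    loaded = []
    queue = deque([(entry, 0)])
    visited = {entry}
    while queue and max_notes > len(loaded):
        name, depth = queue.popleft()
        if name not in vault:
            continue  # dangling link: the target note does not exist
        loaded.append(name)
        if depth == max_hops:
            continue  # hop budget spent, do not expand this note's links
        for target in outbound_links(vault[name]):
            if target not in visited:
                visited.add(target)
                queue.append((target, depth + 1))
    return loaded


# Hypothetical sample vault, for illustration only.
vault = {
    "Studies MOC": "Covers [[Consumer Behaviour]] and [[Thesis Plan|the thesis]].",
    "Consumer Behaviour": "Relates to [[Pricing Notes#Anchoring]].",
    "Thesis Plan": "Draft outline. See [[Consumer Behaviour]].",
    "Pricing Notes": "Links onward to [[Client Alpha]].",
    "Client Alpha": "Out of range at two hops.",
}
print(walk(vault, "Studies MOC"))
```

&lt;p&gt;Each note is returned whole and in the order it was reached, so the model sees full notes rather than fragments. In this toy vault &lt;code&gt;Client Alpha&lt;/code&gt; sits three hops out and is never loaded.&lt;/p&gt;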

&lt;h2&gt;Harness structure&lt;/h2&gt;

&lt;p&gt;Four Claude Code skills plus a small set of generated files.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwp17vtjlfhbokqs84rt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwp17vtjlfhbokqs84rt.png" alt="The entire architecture of the harness skills" width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;/harness-init&lt;/code&gt;&lt;/strong&gt; (one-time). Walks the vault, classifies sections, detects hub-candidate notes by inbound mention count, asks 5 clarifying questions, generates &lt;code&gt;VAULT_INDEX.md&lt;/code&gt; (the map), root and per-section &lt;code&gt;CLAUDE.md&lt;/code&gt; files (orientation), and a small config.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;/vault-discover&lt;/code&gt;&lt;/strong&gt; (graph builder). Four modes. Mode 1 ranks notes by inbound mention frequency to surface hub candidates. Mode 2 finds every unlinked mention of a hub and proposes adding &lt;code&gt;[[wikilinks]]&lt;/code&gt;. Mode 3 detects orphan notes and classifies them. Mode 4 groups by shared vocabulary to find clusters that need a MOC.&lt;/p&gt;
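
&lt;p&gt;To make Mode 1 concrete, here is a rough sketch of ranking by inbound mention count. The post does not describe the skill's matching rules, so the case-insensitive whole-word title match, the function name &lt;code&gt;rank_hub_candidates&lt;/code&gt;, and the sample notes are all assumptions for illustration.&lt;/p&gt;

```python
import re
from collections import Counter


def rank_hub_candidates(vault):
    """Count how many other notes mention each note title in plain text.

    vault is a hypothetical dict of note title -> note text. Matching is
    case-insensitive and on whole words (an assumption of this sketch);
    a note never counts as mentioning itself.
    """
    counts = Counter()
    for title in vault:
        pattern = re.compile(r"\b" + re.escape(title) + r"\b", re.IGNORECASE)
        for other, text in vault.items():
            if other != title and pattern.search(text):
                counts[title] += 1
    # Most-mentioned first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample vault with no wikilinks yet, only plain-text mentions.
vault = {
    "Brand Audit": "Template for a brand audit engagement.",
    "Client Alpha": "Kickoff call. Next step is the brand audit.",
    "Client Beta": "Reused the Brand Audit deck from Client Alpha.",
    "Thesis Plan": "Chapter 2 draws on the brand audit framework.",
}
print(rank_hub_candidates(vault))
```

&lt;p&gt;Here &lt;code&gt;Brand Audit&lt;/code&gt; is mentioned by three other notes and comes out on top, which is the kind of note Mode 2 would then propose wiring in with &lt;code&gt;[[wikilinks]]&lt;/code&gt;.&lt;/p&gt;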

&lt;p&gt;&lt;strong&gt;&lt;code&gt;vault-context&lt;/code&gt;&lt;/strong&gt; (runtime navigator). Used every session. Depth Mode for directed tasks (1–2 hops). Synthesis Mode for cross-domain queries (multi-hub traversal). Hybrid Mode (opt-in) layers in embedding-aware filtering when Smart Connections is installed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;/obsidian-tooling&lt;/code&gt;&lt;/strong&gt; (optional). Installs the Smart Connections plugin and pre-configures it. The harness then reads &lt;code&gt;.smart-env/multi/*.ajson&lt;/code&gt; directly to use embeddings without any Python ML dependency.&lt;/p&gt;

&lt;p&gt;The harness instructs Claude &lt;em&gt;about&lt;/em&gt; the vault. It never prescribes what's &lt;em&gt;in&lt;/em&gt; the vault.&lt;/p&gt;

&lt;h2&gt;Eval&lt;/h2&gt;

&lt;p&gt;Synthetic vault: 99 notes representing a fictional marketing consultant with a family who is pursuing an MS Marketing degree. Five folders. The generator emits zero wikilinks, simulating a flat import state.&lt;/p&gt;

&lt;p&gt;Baseline: LlamaIndex vector RAG with &lt;code&gt;nomic-embed-text&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Harness: pure wikilink traversal first, Hybrid Mode after one iteration.&lt;/p&gt;

&lt;p&gt;15 synthesis tasks (single-domain, cross-domain, trap queries). Each response scored by a Claude Sonnet judge on faithfulness, grounding, insight novelty, answer relevancy (0–3 each).&lt;/p&gt;

&lt;p&gt;First pass, pure wikilink traversal:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Baseline&lt;/th&gt;
&lt;th&gt;Harness (pure)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Faithfulness&lt;/td&gt;
&lt;td&gt;2.067&lt;/td&gt;
&lt;td&gt;2.000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grounding&lt;/td&gt;
&lt;td&gt;2.133&lt;/td&gt;
&lt;td&gt;2.533&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Insight novelty&lt;/td&gt;
&lt;td&gt;1.533&lt;/td&gt;
&lt;td&gt;2.333&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Answer relevancy&lt;/td&gt;
&lt;td&gt;2.067&lt;/td&gt;
&lt;td&gt;2.400&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three wins, one loss. Faithfulness regressed below baseline. Diagnosis: wikilink traversal is query-agnostic. From a Studies entry note, the agent followed a link to a Clients note even when the query was strictly about coursework. Cross-domain contamination.&lt;/p&gt;

&lt;h2&gt;Hybrid Mode&lt;/h2&gt;

&lt;p&gt;Embedding the query at runtime would have required a Python ML stack. I wanted to avoid that.&lt;/p&gt;

&lt;p&gt;Smart Connections (Obsidian plugin) already maintains an embedding cache on every note save. The harness reads it.&lt;/p&gt;

&lt;p&gt;Move one: filter wikilinks by anchor similarity. For each outbound link, compute &lt;code&gt;cosine(entry_note_embedding, candidate_embedding)&lt;/code&gt;. Drop links below threshold. A Studies-to-Studies link scores 0.78–0.80. Studies-to-Clients scores 0.68. Studies-to-Family scores 0.57. Threshold becomes a tunable filter.&lt;/p&gt;
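
&lt;p&gt;A stdlib-only sketch of that filter, assuming embeddings are already loaded into a dict of note name to vector. The function names, the 2-D toy vectors, and the choice to drop links whose target has no cached embedding are assumptions of this sketch, not the repo's behavior. The toy similarities are not the scores quoted above.&lt;/p&gt;

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_links(entry, links, embeddings, threshold=0.65):
    """Keep outbound links whose target is similar enough to the entry note.

    embeddings is a hypothetical dict of note name -> vector. Links to notes
    without a cached embedding are dropped (an assumption of this sketch).
    """
    anchor = embeddings[entry]
    kept = []
    for target in links:
        vector = embeddings.get(target)
        if vector is not None and cosine(anchor, vector) >= threshold:
            kept.append(target)
    return kept


# Toy 2-D vectors chosen so the similarities are easy to check by hand.
embeddings = {
    "Studies MOC": [1.0, 0.0],
    "Thesis Plan": [4.0, 3.0],   # cosine 0.8 against the entry note
    "Reading List": [1.0, 1.0],  # cosine about 0.707
    "Client Alpha": [3.0, 4.0],  # cosine 0.6
}
links = ["Thesis Plan", "Client Alpha", "Reading List", "Missing Note"]
print(filter_links("Studies MOC", links, embeddings))
print(filter_links("Studies MOC", links, embeddings, threshold=0.75))
```

&lt;p&gt;Raising the threshold from 0.65 to 0.75 cuts &lt;code&gt;Reading List&lt;/code&gt; as well, which is the tunable trade-off the sweep explores.&lt;/p&gt;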

&lt;p&gt;Move two: orphan surfacing. After traversal, take top-k notes vault-wide by similarity to the entry note. Drop anything already loaded. Surface up to 5. These are notes the wikilink graph never reaches but the embedding flags as topical.&lt;/p&gt;
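
&lt;p&gt;Orphan surfacing can be sketched the same way. This follows the order described above (rank vault-wide, take the top k, then drop anything already loaded), so it can return fewer than k notes. The function name &lt;code&gt;surface_orphans&lt;/code&gt;, the exclusion of the entry note from the ranking, and the toy vectors are assumptions for illustration.&lt;/p&gt;

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def surface_orphans(entry, loaded, embeddings, k=5):
    """Return up to k notes similar to the entry note that traversal did not load.

    embeddings is a hypothetical dict of note name -> vector; loaded is the
    set of note names the wikilink walk already pulled in.
    """
    anchor = embeddings[entry]
    scored = [
        (cosine(anchor, vector), name)
        for name, vector in embeddings.items()
        if name != entry
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    top_k = [name for _, name in scored[:k]]
    return [name for name in top_k if name not in loaded]


# Toy 2-D vectors, not real embeddings.
embeddings = {
    "Studies MOC": [1.0, 0.0],
    "Exam Prep": [12.0, 5.0],       # cosine about 0.923, never linked
    "Thesis Plan": [4.0, 3.0],      # cosine 0.8, already loaded
    "Client Alpha": [3.0, 4.0],     # cosine 0.6
    "Family Calendar": [0.0, 1.0],  # cosine 0.0
}
print(surface_orphans("Studies MOC", {"Thesis Plan"}, embeddings, k=2))
```

&lt;p&gt;With k=2 the top two candidates are &lt;code&gt;Exam Prep&lt;/code&gt; and &lt;code&gt;Thesis Plan&lt;/code&gt;; the latter is already loaded, so only &lt;code&gt;Exam Prep&lt;/code&gt; is surfaced.&lt;/p&gt;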

&lt;p&gt;Both moves use Python stdlib only. About 150 lines to walk &lt;code&gt;.smart-env/&lt;/code&gt;'s &lt;code&gt;.ajson&lt;/code&gt; files and compute cosine similarity over embeddings that already exist.&lt;/p&gt;

&lt;h2&gt;Final results&lt;/h2&gt;

&lt;p&gt;Threshold sweep at {0.55, 0.60, 0.65, 0.70, 0.75}. Orphan surfacing on. Pareto-optimal at t=0.65, orphan-k=5.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Faith&lt;/th&gt;
&lt;th&gt;Ground&lt;/th&gt;
&lt;th&gt;Novel&lt;/th&gt;
&lt;th&gt;Relev&lt;/th&gt;
&lt;th&gt;Notes loaded&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Baseline (vector RAG)&lt;/td&gt;
&lt;td&gt;2.067&lt;/td&gt;
&lt;td&gt;2.133&lt;/td&gt;
&lt;td&gt;1.533&lt;/td&gt;
&lt;td&gt;2.067&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pure traversal&lt;/td&gt;
&lt;td&gt;2.000&lt;/td&gt;
&lt;td&gt;2.533&lt;/td&gt;
&lt;td&gt;2.333&lt;/td&gt;
&lt;td&gt;2.400&lt;/td&gt;
&lt;td&gt;8.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hybrid t=0.65 +orph5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.333&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.933&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.533&lt;/td&gt;
&lt;td&gt;2.467&lt;/td&gt;
&lt;td&gt;11.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid t=0.75 +orph5&lt;/td&gt;
&lt;td&gt;2.333&lt;/td&gt;
&lt;td&gt;2.800&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.667&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.533&lt;/td&gt;
&lt;td&gt;9.3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every Hybrid variant beats baseline AND pure-traversal on every dimension. Per-query latency is about 8 seconds, the same as pure traversal. At t=0.65 the number of notes loaded rises about 29% over pure traversal (8.9 to 11.5), still within the token budget.&lt;/p&gt;
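
&lt;p&gt;As a quick arithmetic check, the headline deltas can be recomputed from the table rows for the baseline and the Hybrid t=0.65 variant. The dictionary keys below are shorthand chosen for this sketch; the numbers are copied from the table.&lt;/p&gt;

```python
# Scores copied from the results table (0-3 scale, mean over 15 tasks).
baseline = {"faithfulness": 2.067, "grounding": 2.133, "novelty": 1.533, "relevancy": 2.067}
hybrid_065 = {"faithfulness": 2.333, "grounding": 2.933, "novelty": 2.533, "relevancy": 2.467}

# Per-dimension improvement over the vector-RAG baseline, rounded to two decimals.
deltas = {name: round(hybrid_065[name] - baseline[name], 2) for name in baseline}
print(deltas)

# Relative increase in notes loaded, Hybrid t=0.65 versus pure traversal.
notes_increase_pct = round((11.5 / 8.9 - 1) * 100)
print(notes_increase_pct)
```

&lt;p&gt;The deltas come out to +0.27, +0.80, +1.00, and +0.40, and the notes-loaded increase to roughly 29%.&lt;/p&gt;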

&lt;p&gt;Unexpected finding: stricter wikilink filtering plus orphan surfacing beats permissive filtering. At t=0.75 most wikilinks get cut, the surfaced orphans fill the gap, insight novelty peaks. Orphan surfacing is doing more work than the wikilink filter.&lt;/p&gt;

&lt;p&gt;This might change over time as your links build up, or if your vault is already highly structured.&lt;/p&gt;

&lt;h2&gt;Where this fits in the bigger picture&lt;/h2&gt;

&lt;p&gt;The approach is not entirely novel. It sits at the confluence of several trends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic dropped vector RAG from Claude Code&lt;/strong&gt; in favor of agentic search. Boris Cherny, Claude Code lead: "Early Claude Code used RAG + a local vector DB, but in the end, we found agentic search to be overwhelmingly better" (&lt;a href="https://newsletter.pragmaticengineer.com/p/building-claude-code-with-boris-cherny" rel="noopener noreferrer"&gt;Pragmatic Engineer interview&lt;/a&gt;, &lt;a href="https://salarysaiyan.com/en/blog/rag-is-dead/" rel="noopener noreferrer"&gt;HN confirmation&lt;/a&gt;). Claude Code now uses &lt;code&gt;Glob&lt;/code&gt;, &lt;code&gt;Grep&lt;/code&gt;, &lt;code&gt;Read&lt;/code&gt; to navigate the way a developer does. The harness extends this pattern from code to prose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Karpathy's LLM Wiki pattern&lt;/strong&gt; (&lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;gist&lt;/a&gt;) describes keeping knowledge as markdown and skipping retrieval infrastructure entirely. Multiple open-source implementations exist: &lt;a href="https://github.com/ussumant/llm-wiki-compiler" rel="noopener noreferrer"&gt;LLM Wiki Compiler&lt;/a&gt;, &lt;a href="https://github.com/kytmanov/obsidian-llm-wiki-local" rel="noopener noreferrer"&gt;obsidian-llm-wiki-local&lt;/a&gt;, &lt;a href="https://github.com/nashsu/llm_wiki" rel="noopener noreferrer"&gt;nashsu/llm_wiki&lt;/a&gt;, &lt;a href="https://github.com/Ar9av/obsidian-wiki" rel="noopener noreferrer"&gt;Ar9av/obsidian-wiki&lt;/a&gt;. The harness shares the anti-RAG stance but traverses progressively rather than dumping the whole corpus into context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microsoft's GraphRAG&lt;/strong&gt; (&lt;a href="https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/" rel="noopener noreferrer"&gt;blog&lt;/a&gt;, &lt;a href="https://arxiv.org/abs/2404.16130" rel="noopener noreferrer"&gt;arXiv 2404.16130&lt;/a&gt;) builds a knowledge graph from a text corpus, then uses the graph for sensemaking queries. The harness uses the graph the user already built.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart Connections&lt;/strong&gt; (&lt;a href="https://github.com/brianpetro/obsidian-smart-connections" rel="noopener noreferrer"&gt;repo&lt;/a&gt;) is the dominant Obsidian-AI plugin and does RAG over the vault. The harness uses Smart Connections' embedding cache as a secondary signal in Hybrid Mode, not as the primary retrieval mechanism.&lt;/p&gt;

&lt;p&gt;The contribution here is synthesis: applying the agentic-search pattern to personal knowledge already structured by the user, with the wikilink graph as the first-class navigation primitive, validated against a vector-RAG baseline with concrete numbers.&lt;/p&gt;

&lt;h2&gt;Limitations&lt;/h2&gt;

&lt;p&gt;Eval is on a synthetic vault. A real personal vault will surface failure modes the synthetic one does not.&lt;/p&gt;

&lt;p&gt;Anchor scoring in move one uses the entry note's embedding, not the query's. When the entry note is broad and the query is narrow, the anchor does not pick up query intent. A future iteration may add a small local query embedder.&lt;/p&gt;

&lt;p&gt;Entry-point selection runs on routing heuristics generated from the vault. Fragile on first contact with a new vault.&lt;/p&gt;

&lt;h2&gt;FAQ&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q: Why not just use Smart Connections?&lt;/strong&gt;&lt;br&gt;
A: Smart Connections is RAG over the vault — embed, retrieve k chunks, chat. Loses the structure. The harness uses the structure first, embeddings second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Does this work without Smart Connections?&lt;/strong&gt;&lt;br&gt;
A: Yes. Depth Mode and Synthesis Mode work on wikilinks alone. Hybrid Mode is the opt-in layer that adds Smart Connections.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Does the harness modify my notes?&lt;/strong&gt;&lt;br&gt;
A: &lt;code&gt;/vault-discover&lt;/code&gt; Mode 2 adds &lt;code&gt;[[wikilinks]]&lt;/code&gt; to your notes, with every change shown before writing. The runtime navigator never writes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: What if my vault has no wikilinks yet?&lt;/strong&gt;&lt;br&gt;
A: The harness handles that case. It is designed for the OneNote / Notion / Evernote migration scenario. &lt;code&gt;/vault-discover&lt;/code&gt; Mode 1 ranks hub candidates by inbound mention frequency. Mode 2 wires them in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: How does this differ from Karpathy's LLM Wiki?&lt;/strong&gt;&lt;br&gt;
A: LLM Wiki dumps the full knowledge base into the model's context and trusts long context. The harness traverses progressively from an entry point. For vaults larger than the context window, this matters; for small vaults the two converge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Why not embed the query at runtime?&lt;/strong&gt;&lt;br&gt;
A: Would require a Python ML stack. The harness is stdlib-only by design. Hybrid Mode achieves most of the win using anchor-based scoring (entry note as the anchor) instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: What if I want to use this on an existing vault with thousands of notes?&lt;/strong&gt;&lt;br&gt;
A: It should work — &lt;code&gt;vault-context&lt;/code&gt; is bounded by &lt;code&gt;MAX_NOTES_PER_QUERY&lt;/code&gt; (default 12). The harness scales with the entry-point routing, not with vault size. The eval is on 99 notes; larger vaults are untested.&lt;/p&gt;

&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Repo (clone and run): &lt;a href="https://github.com/nickyeolk/agentic_doc_harness" rel="noopener noreferrer"&gt;github.com/nickyeolk/agentic_doc_harness&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents" rel="noopener noreferrer"&gt;Anthropic on effective context engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://newsletter.pragmaticengineer.com/p/building-claude-code-with-boris-cherny" rel="noopener noreferrer"&gt;Boris Cherny on Claude Code's agentic search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;Karpathy's LLM Wiki gist&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/" rel="noopener noreferrer"&gt;Microsoft GraphRAG&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>obsidian</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
