My Obsidian Vault Was A Pile Of Disconnected Islands. Twelve Concept Hubs Turned It Into A Brain.

#obsidian #productivity #knowledgemanagement #python

I opened my Obsidian graph view yesterday and it looked like a scatter plot: a dense core of closely connected notes, and then six or seven disconnected sub-clusters floating around the edges like dead planets. 1,058 ChatGPT conversation exports here, 63 reel transcripts there, 562 sent-mail notes over there, all islands.

Obsidian's graph view uses [[wiki-links]] directly — that's the only signal it has for what connects to what. And for most casual vaults, the natural writing process produces enough links organically. But if you're dumping structured data into your vault (exports, scrapes, generated notes), you end up with thousands of files that don't reference anything else.

The fix isn't to bulk-edit every note and add random links. The fix is a small, deliberate set of concept hub nodes that everything else attaches to through a consistent pattern.

Here's the exact approach I used to take my 2,916-note vault from scatter plot to brain in about 30 minutes.

The concept hub pattern

Instead of trying to cross-link every note to every other note (N² problem), create a small number of top-level "hub" notes that capture the major concept dimensions of your vault. Each hub is an entry point to a topic, not a note about the topic itself. Think of them as tags that happen to be notes.
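To put rough numbers on the N² point, here is a back-of-the-envelope sketch. The function names are hypothetical, and the 7 links per note is an assumption taken from the upper end of the 2-7 hubs-per-note range this post reports for the finished vault.

```python
def pairwise_links(n: int) -> int:
    """Links needed if every note links to every other note exactly once."""
    return n * (n - 1) // 2


def hub_links(n: int, hubs_per_note: int) -> int:
    """Links needed if every note only links to a handful of hubs."""
    return n * hubs_per_note


notes = 2916  # vault size quoted in this post

print(pairwise_links(notes))  # 4250070
print(hub_links(notes, 7))    # 20412
```

Even at the worst case of 7 hub links per note, the hub approach needs about 20 thousand links where full cross-linking needs over 4 million.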

For my vault, I picked these hubs:

Business.md — everything commercial
Projects.md — active builds
Goals.md — what I'm working toward
Military.md — Army Reserve life
Education.md — academic path
People.md — roster of people who matter
Fitness.md — training
Tools.md — software and services
Learning.md — skill acquisition
Content.md — things I publish
VMI.md — formative institution
Relationships.md — dynamics between people

Twelve hubs. That's it. Every note in my vault should semantically fit into at least one of them.

What a hub looks like

Not a blog post. Not a knowledge dump. Just a structured index with dense [[wiki-links]] to the actual content notes. Here's a trimmed example:

---
title: "Projects"
type: concept-hub
tags: [hub, projects]
---

# Projects

Everything I'm actively building or researching.

## Robotics / autonomy / hardware

- [[DirectDrive]] — Radxa autonomous driving
- [[cybertruck-autonomous]] — Tesla Cybertruck autonomy
- [[baseball-catcher]] — Robotic vision stereo catcher
- [[spatial-intent]] — Thesis research code
- [[Sleeve]] — Wearable forearm computer

## Business software

- [[project_summer_sales_saas]] — Utah summer sales mini-SaaS
- [[cowork-btc-bot]] — Trading strategy arena

## Atlas infrastructure

- [[project_atlas_dashboard]] — localhost:4100 ops console
- [[project_shorts_pipeline]] — Video production pipeline
- [[project_sleep_channel]] — Second YouTube channel

## Related hubs

- [[Business]]
- [[Goals]]
- [[Tools]]

The hub is basically a MOC (Map of Content) but with a stricter rule: it only contains headers and wiki-links with short labels, no prose that would compete with the underlying notes.
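If you would rather generate hub files than type them, something like the following could work. It is only a sketch: `render_hub` and its parameters are hypothetical names, and it assumes the same frontmatter fields and section layout as the trimmed example above.

```python
SEP = ' \u2014 '  # same dash separator the example hub uses between link and label


def render_hub(title, intro, sections, related):
    """Build hub note text.

    sections: dict mapping a heading to a list of (note_name, label) pairs.
    related:  list of other hub names to link at the bottom.
    """
    lines = [
        '---',
        f'title: "{title}"',
        'type: concept-hub',
        f'tags: [hub, {title.lower()}]',
        '---',
        '',
        f'# {title}',
        '',
        intro,
        '',
    ]
    for heading, entries in sections.items():
        lines += [f'## {heading}', '']
        lines += [f'- [[{note}]]{SEP}{label}' for note, label in entries]
        lines.append('')
    lines += ['## Related hubs', '']
    lines += [f'- [[{hub}]]' for hub in related]
    return '\n'.join(lines) + '\n'


print(render_hub(
    'Projects',
    "Everything I'm actively building or researching.",
    {'Business software': [('cowork-btc-bot', 'Trading strategy arena')]},
    ['Business', 'Goals', 'Tools'],
))
```

Keeping the hub definition as data makes it easy to regenerate a hub when you add a project, at the cost of not hand-editing the file.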

The bulk-linking pass

The hubs alone don't fix the disconnected clusters. You still need the existing 2,000+ notes to link TO the hubs. And you can't hand-edit 2,000 files.

I wrote a ~50-line Python script that walks the vault, reads each note's YAML frontmatter to get its topic tags, and appends a "Related" section at the bottom linking to the concept hubs that match. Here's the guts of it:

import re
from pathlib import Path

VAULT = Path('path/to/your/vault')

# Map topic tags → hubs to link
TOPIC_HUBS = {
    'coding':           ['[[Tools]]', '[[Projects]]', '[[Learning]]'],
    'business-concept': ['[[Business]]', '[[Goals]]'],
    'robotics':         ['[[Projects]]', '[[Tools]]', '[[Learning]]'],
    'military':         ['[[Military]]', '[[Education]]'],
    'self-reflection':  ['[[Goals]]', '[[People]]'],
    'content-creation': ['[[Content]]', '[[Business]]'],
    # ... more
}

UNIVERSAL = ['[[You]]']  # Always add your own entity node
MARKER = '<!-- OBSIDIAN_RELATED_AUTO -->'

def extract_topics(text):
    # Expects an inline YAML list at the start of a line,
    # e.g. `topics: [coding, robotics]`.
    m = re.search(r'^topics:\s*\[(.*?)\]', text, re.MULTILINE)
    if not m:
        return []
    topics = [t.strip().strip('"').strip("'") for t in m.group(1).split(',')]
    # `topics: []` splits to [''], so drop blank entries.
    return [t for t in topics if t]

def inject_related(file_path):
    text = file_path.read_text(encoding='utf-8')
    if MARKER in text:
        return 'skipped'  # already processed, so re-runs are safe

    topics = extract_topics(text) or ['other']
    hubs = set(UNIVERSAL)
    for t in topics:
        hubs.update(TOPIC_HUBS.get(t, []))

    related = (
        f'\n\n{MARKER}\n## Related\n\n'
        + '\n'.join(f'- {h}' for h in sorted(hubs))
        + '\n'
    )
    file_path.write_text(text.rstrip() + related, encoding='utf-8')
    return 'injected'

for f in VAULT.rglob('*.md'):
    result = inject_related(f)
    print(f'{result}: {f.name}')

Run once. On my vault, it took ~3 seconds to touch 1,121 notes. Every note now has outbound links to 2-7 concept hubs.
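A quick way to check that nothing was missed is to list the notes that still contain no wiki-link at all. This is a minimal sketch, not part of the original script: `notes_without_links` is a hypothetical helper, and it assumes every note is a UTF-8 `.md` file somewhere under the vault folder.

```python
import re
from pathlib import Path

# Matches any [[...]] wiki-link, including aliased ones like [[Note|label]].
WIKILINK = re.compile(r'\[\[[^\[\]]+\]\]')


def notes_without_links(vault: Path) -> list[Path]:
    """Return every .md file under `vault` that contains no wiki-link."""
    missing = []
    for path in sorted(vault.rglob('*.md')):
        text = path.read_text(encoding='utf-8', errors='replace')
        if not WIKILINK.search(text):
            missing.append(path)
    return missing


# Usage (the path is a placeholder, as in the script above):
# for path in notes_without_links(Path('path/to/your/vault')):
#     print(path)
```

An empty result after the linking pass means every note has at least one outbound link for the graph view to draw.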

Critical details:

Use a marker comment (<!-- OBSIDIAN_RELATED_AUTO -->). This lets you re-run the script idempotently without creating duplicate Related sections, and gives you a way to grep-and-remove everything the script added if you want to undo.
Inject at the end of the file, not the top. Your existing notes have their own content — don't disrupt it. A Related section at the bottom is the least intrusive pattern and matches Obsidian's native conventions.
Always add a "universal" link. For me it's [[Will Weigeshoff]] — an entity node for myself. Every note in my vault links to my entity node. That makes the entity node the densest hub in the graph, which is semantically correct: everything in my vault is about me in some way.
Keep the topic → hub map small. Don't try to capture every possible classification. 10-12 hubs, with each topic mapping to 2-5 of them, is better than 40 hubs with one topic each.
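The undo path that the marker makes possible can be sketched like this. `strip_related` is a hypothetical helper, and it rests on one loud assumption: the auto-generated section is the last thing in the file, so anything you typed below the marker by hand would be removed too.

```python
MARKER = '<!-- OBSIDIAN_RELATED_AUTO -->'


def strip_related(text: str) -> str:
    """Remove the marker and everything after it; leave unmarked notes alone."""
    head, found, _tail = text.partition(MARKER)
    if not found:
        return text
    # The injector wrote text.rstrip() + '\n\n' + MARKER, so trim the blank
    # lines it added and finish with a single newline.
    return head.rstrip() + '\n'


note = 'Some note body.\n'
injected = note.rstrip() + f'\n\n{MARKER}\n## Related\n\n- [[Goals]]\n- [[You]]\n'
print(strip_related(injected) == note)  # True
```

To apply it to a vault, read each file, pass the text through `strip_related`, and write it back only if the result differs. A note that originally had no trailing newline comes back with one.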

What the graph looked like before and after

Before: Dense main cluster with 6-7 floating sub-clusters. ChatGPT export island over here, reel research island over there, Discord archive off in its own corner. The overall shape was "tight core plus lonely satellites."

After: Dense main cluster grew significantly. The concept hubs became secondary dense nodes (Projects, Business, People, etc). The floating sub-clusters collapsed into the main graph because every note in them now links to a hub, which links to everything else.

The graph view now looks like a brain: a primary hub (the entity node), surrounded by 12 secondary hubs (the concept nodes), with thousands of content notes attached to the relevant hubs. Pan around and it feels navigable.
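If you want a number rather than a visual impression, you can count the islands directly. This is a rough sketch under stated assumptions: `count_clusters` is a hypothetical helper, it treats every wiki-link as an undirected edge, it matches link targets by bare note name (no folder paths), and it ignores links to notes that don't exist.

```python
import re

# Capture the link target, stopping before an alias (|) or heading (#).
WIKILINK = re.compile(r'\[\[([^\[\]|#]+)')


def count_clusters(notes: dict[str, str]) -> int:
    """Count connected components. `notes` maps note name -> note text."""
    parent = {name: name for name in notes}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for name, text in notes.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target in parent:  # skip links to missing notes
                parent[find(name)] = find(target)
    return len({find(name) for name in notes})


before = {'a': 'see [[b]]', 'b': '', 'chat1': '', 'chat2': '', 'Projects': ''}
after = {
    'a': 'see [[b]] and [[Projects]]',
    'b': '',
    'chat1': '[[Projects]]',
    'chat2': '[[Projects]]',
    'Projects': '',
}
print(count_clusters(before))  # 4
print(count_clusters(after))   # 1
```

For a real vault, build the `notes` dict from the files (file stem as the key, file text as the value) and run it before and after the linking pass. A result of 1 means there are no floating clusters left.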

Why the concept hub pattern beats the alternatives

I tried three other approaches first. Here's what didn't work:

Approach 1: Cross-link everything to everything.
Fails because it's N² and produces a hairball graph where nothing is findable. The graph view just becomes noise. Don't do this.

Approach 2: MOCs at the top level.
Obsidian users often recommend "Maps of Content" — human-curated index notes that list related content. Closer to right, but MOCs are usually domain-specific ("Research MOC", "Reading MOC"). I had 6 MOCs and they weren't enough hub structure — the floating clusters still floated because no MOC felt like the "home" of a ChatGPT export.

Approach 3: Auto-tagging via LLM.
Expensive and inconsistent. I tried a pass where each note got sent to a small LLM for topic tagging, then auto-linked based on the tags. The results were fine but not better than the deterministic keyword-based approach, and cost real money on 1,000+ notes.

The concept hub pattern works because it leverages Obsidian's native graph model (wiki-links) without trying to be clever. You commit to a small set of hubs, commit to linking every note to at least one, and let the graph do the rest.

The deeper lesson

Connection density in a knowledge base is a design choice, not an emergent property. If you just dump structured data in, you get islands. If you write a 50-line script that commits to a consistent linking pattern, you get a brain.

The work isn't in the individual links; it's in picking the right hubs. Twelve well-chosen concept hubs are worth more than 1,000 random cross-links. Pick your hubs by asking: what are the 10-12 broadest concept categories that everything in my knowledge base semantically fits into? Then commit, ship the linking script once, and let every future note you add inherit the pattern by matching a hub.

My vault went from "pile of files" to "queryable knowledge graph" in less than an hour of actual design work. The 3 seconds of Python execution was the easy part.