I built an interactive healthcare knowledge graph — conditions, medications, drug interactions, diagnostics, billing codes, care pathways — and structured it as a compressed markdown file that any AI model can reason over.
Not a summary. Not a document. A traversable knowledge graph in .md format.
~3,000 tokens instead of ~500,000. Same reasoning quality. 170x more efficient.
Here's the live interactive demo: graphifymd.com/healthcare-kg-demo.html
Why this matters
85% of enterprise AI pilots fail to scale. Not because the models are bad. Because the context is.
An LLM can't reason about drug interactions if it doesn't know that metformin relates to renal function relates to GFR thresholds relates to dosing adjustments. That's not a retrieval problem. That's a relationship problem.
RAG retrieves text chunks. Knowledge graphs traverse relationships. The difference is the difference between searching a library index and having a librarian who knows which books reference each other — and why.
The pipeline
Raw clinical data (~2MB)
↓
Knowledge graph extraction (200 entities, 500+ relationships)
↓
Graph distillation (typed relationships + traversal rules)
↓
Compressed .md (~12KB, ~3,000 tokens)
↓
Deploy anywhere
What the .md looks like
Here's a fragment of the cardiology domain graph compressed to markdown:
## Entities
### Conditions
- Atrial Fibrillation | ICD: I48 | prevalence: 2.7M US
- Heart Failure | ICD: I50 | prevalence: 6.2M US
- subtypes: HFrEF (EF≤40%), HFpEF (EF≥50%)
### Medications
- Apixaban | class: DOAC | no INR monitoring
- Warfarin | class: anticoagulant | INR target: 2-3
- Amiodarone | class: antiarrhythmic | ⚠️ toxicity
## Relationships
AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
HFrEF → TREATED_BY → Metoprolol (mortality ↓35%)
Warfarin → INTERACTS_WITH → Amiodarone ⚠️
↳ RULE: ↑INR 50-70%. Reduce warfarin dose 30-50%.
Apixaban → REQUIRES → CrCl assessment
↳ RULE: Reduce dose if CrCl 15-29, avoid if <15
## Traversal Examples
Q: Patient with AFib + CKD Stage 4. Anticoagulation?
AFib → TREATED_BY → Apixaban
Apixaban → REQUIRES → CrCl
CKD Stage 4 → CrCl 15-29 → DOSE_ADJUST Apixaban
→ Answer: Apixaban 2.5mg BID (reduced dose)
The model doesn't guess. It follows the chain. Multi-hop reasoning with an audit trail.
The numbers
| Metric | Raw Data | Knowledge Graph .md |
|---|---|---|
| Size | ~2MB | ~12KB |
| Tokens | ~500,000 | ~3,000 |
| Density | 1x | 170x |
| Compression | — | 93% |
| CO₂ per query | ~0.34 kg | ~0.002 kg |
That last line matters. Fewer tokens = less compute = lower energy. 99.4% carbon reduction per query. Structured intelligence is greener intelligence.

March Madness knowledge graph — 68 teams, built live with graduate software engineers
It works everywhere
The same .md file works across every AI environment without modification:
- Claude Projects — upload as project knowledge
- Claude Code — CLAUDE.md project context
- ChatGPT — custom GPT instructions
- Cursor / Windsurf — context file
- Codex CLI — AGENTS.md
- MCP Server — serve as tool context
- API — system prompt injection
- Email — it's just text. Paste it.
No vendor lock-in. No format conversion. No special tooling. Markdown is the universal interface.
Why not just use RAG?
RAG retrieves the top-k text chunks that match your query. It's single-hop — find the most similar text, return it.
A knowledge graph traverses relationships. When you ask about a patient with AFib and kidney disease, the graph follows:
AFib → treatment options → Apixaban → renal requirements →
CrCl thresholds → CKD staging → dose adjustment rules
That's 5 hops. RAG would need to independently retrieve and stitch together 5 separate chunks and hope the model connects them. The graph has already connected them.
Microsoft's 2024 research showed knowledge graphs achieve an 83% win rate vs vector RAG. HopRAG (ACL 2025) showed 77% higher accuracy on multi-hop questions.
What I'm building
I run Graphify.md — we build domain knowledge graphs and compress them to portable .md for any industry. Healthcare is one vertical. We've also built graphs for:
- March Madness tournament — 68 teams, real-time scores, built live with grad students
- LinkedIn Groups ecosystem — 200+ groups, 15 verticals, relationship edges
- Defense, legal, construction, supply chain, GovTech, education — 12 verticals mapped
The methodology works on any domain. If your data has entities and relationships — and all data does — it can be graphed, compressed, and deployed.
Try it
The interactive demo is live. Hover over nodes to see relationship chains light up:
👉 graphifymd.com/healthcare-kg-demo.html
Built entirely with Claude Code. The whole thing — knowledge graph extraction, D3 visualization, .md compression, the site — solo, in days not months.
If you're working on a domain where AI keeps hallucinating or RAG keeps missing context, the problem might not be the model. It might be the structure.
Daniel Yarmoluk — Graphify.md — Book a call

Top comments (1)
Some comments may only be visible to logged-in visitors. Sign in to view all comments.