DEV Community

Leo KIM


I Built a RAG-like Context Engine for Claude Code — Without Vector DB

The Problem

Claude Code reads your CLAUDE.md once at session start. But here's the thing — Vercel's engineering team found that skills-based retrieval was skipped in 56% of eval cases. The model simply didn't invoke them.

I run Claude Code as my daily coding assistant across 26+ custom resources. After months of watching Claude forget rules, ignore conventions, and skip critical project knowledge, I built a system to fix it.

The Solution: Context Feeder

Context Feeder is a lightweight context injection engine that runs on Claude Code hooks. It doesn't ask the model what's relevant — it force-injects matched context on every message.

No vector database. No embeddings. No cloud API. Just JSON tags + shell scripts.

🔗 GitHub: github.com/friends0485-cyber/context-feeder

How It Works: 3-Stage Chain

Your Message
→ Parser (keyword match against tags.json)
→ Counter (track frequency, assign rank: best/normal/worst)
→ Injector (check rank threshold, read file, output to Claude's context)

Stage 1: Parser (tag_search.py)

Reads the user message from Claude Code's UserPromptSubmit hook via stdin, scans for keywords defined in tags.json, and saves matched file paths.
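The repo ships this stage as tag_search.py; here is a minimal sketch of the matching step in Python. The tags.json shape (context-file path → keyword list) and the file names are my assumptions, not necessarily the repo's exact schema:

```python
import json

def find_matches(message: str, tag_map: dict[str, list[str]]) -> list[str]:
    """Return context-file paths whose keywords appear in the message."""
    lowered = message.lower()
    return [
        path
        for path, keywords in tag_map.items()
        if any(kw.lower() in lowered for kw in keywords)
    ]

# Hypothetical tags.json content: context files mapped to trigger keywords.
tag_map = {
    "contexts/api_errors.toml": ["error", "catch", "try", "exception"],
    "contexts/git_rules.toml": ["commit", "branch", "rebase"],
}

print(find_matches("why does this try/except swallow the error?", tag_map))
# → ['contexts/api_errors.toml']
```

Deterministic substring matching means the result is reproducible for a given prompt — there is no similarity score to tune.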

Stage 2: Counter (counter.py)

Tracks how many times each tag has been called in the current session. Assigns a rank:

Rank     Injection threshold     Meaning
best     1st match → inject      Frequently used tag
normal   2nd match → inject      Standard frequency
worst    3rd match → inject      Rarely used; auto-deleted after 30 days
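One way to read the table: a tag's rank determines how many in-session keyword matches it needs before its file is injected. A sketch under that assumption — the state-file name and JSON schema are hypothetical, not the repo's:

```python
import json
from pathlib import Path

# Assumed mapping from the rank table: number of in-session matches
# a tag needs before its context file is injected.
THRESHOLD = {"best": 1, "normal": 2, "worst": 3}
STATE = Path(".counter_state.json")  # hypothetical per-session state file

def should_inject(tag: str, rank: str) -> bool:
    """Bump the session match count for `tag`; True once it hits the rank's threshold."""
    counts = json.loads(STATE.read_text()) if STATE.exists() else {}
    counts[tag] = counts.get(tag, 0) + 1
    STATE.write_text(json.dumps(counts))
    return counts[tag] >= THRESHOLD[rank]
```

A best-ranked tag fires on its first mention; a worst-ranked tag has to earn three mentions before it costs any context tokens.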

Stage 3: Injector (tag_injector.sh)

Reads the ranked results, checks a 30-minute cooldown (prevents re-injection of the same file), and outputs matched TOML content to stdout — which Claude Code injects into the conversation context.

Context File Format

Rules and knowledge are stored as .toml files:

[rule_001]
title = "API Error Handling"
tags = ["error", "catch", "try", "exception"]
content = '''
All API endpoints must:
- Wrap async handlers with error middleware
- Return structured error responses
- Never expose stack traces to clients
'''

The tags field is what the parser matches against. The content is what gets injected.

Why Not RAG?

Claude Code's own team confirmed they dropped vector DB-based RAG early on:

"We tried RAG… we tried a few different kinds of search tools. And eventually, we landed on just agentic search… One is it outperformed everything. By a lot."

Context Feeder follows the same philosophy — deterministic keyword matching instead of probabilistic similarity search. It's faster, simpler, and requires zero infrastructure.

Key Differences from Existing Tools

Tool              Approach                             Context Feeder
CLAUDE.md         Read once; model decides relevance   Force-injected on every match
RAG + Vector DB   Embeddings + infrastructure          JSON keywords + shell scripts
Claude-Mem        Session memory (past observations)   Rule injection (present context)
Skills            Model chooses to invoke              System forces delivery

Quick Start

  1. Clone the repo
  2. Add your rules as .toml files in contexts/
  3. Register keywords in config/tags.json
  4. Connect to Claude Code hooks in .claude/settings.json
  5. Done — next message triggers automatic injection
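For step 4, the hook registration in .claude/settings.json might look something like this — a sketch based on Claude Code's documented hooks schema; the command path and entry-point script are illustrative, so check the repo for the actual one:

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "bash /path/to/context-feeder/tag_injector.sh"
          }
        ]
      }
    ]
  }
}
```

Claude Code passes the prompt to the command on stdin and appends the command's stdout to the conversation context, which is the whole delivery mechanism here.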

Beyond the Core Engine

The open-source release is the core 3-stage chain. My production system has 10 interconnected modules including a logger (21 categories), watchdog (real-time dashboard), reminder (workflow violation detection), and a live console UI. The full architecture is documented in the repo.

What's Next

  • Auto-scanner that rebuilds tags.json from your TOML files (included)
  • Community-contributed context templates for popular frameworks
  • Integration patterns for monorepos

I'd love to hear how others are handling context injection with Claude Code hooks. What patterns have worked for you?


Built by Leo KIM — AI Automation Engineer
GitHub: context-feeder
