TL;DR: I spent weeks tuning DeepSeek V4 to feel native inside Claude Code. The result: a one-command setup with 9 agents, 7 rules, security hooks, OCR, and auto-backup. Clone → ./init.sh → you're shipping.
I wanted DeepSeek's 1M context window inside Claude Code's interface. What I got was two weeks of fighting API configs, debugging token limits, and discovering that most "just swap the model" advice is missing half the puzzle.
So I packaged everything into a single repo.
What's in the Box
```bash
git clone https://github.com/YuhaoLin2005/deepseek-claude-code-starter.git
cd deepseek-claude-code-starter
./init.sh
```
Three commands, and only the last one does the actual setup. Here's what lands in your ~/.claude/ directory:
| Component | What It Does |
|---|---|
| 9 Custom Agents | Code review, security audit, TDD guide, architecture, build-fix — each tuned for DeepSeek's reasoning style |
| 7 Behavior Rules | Code quality, security, testing discipline, YAGNI enforcement, commit standards |
| Security Hook | PreToolUse guard — blocks sensitive file access and dangerous commands before they execute |
| Auto-Backup | Pre-edit snapshots (keeps 5) + session-start git commit. You'll thank me the first time you roll back |
| Local OCR | RapidOCR on ONNX — lets DeepSeek "see" screenshots without sending them to a cloud API |
| Status Line | Compaction counter: warns you once a session reaches 5+ compactions that it's time to start fresh (related: my compaction quality research) |
| DuckDuckGo MCP | Plugs DeepSeek's training cutoff gap with live web search |
| Auto-Format | Prettier on Edit/Write for frontend files (opt-in) |
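
To make the Security Hook row concrete, here's a minimal sketch of what a PreToolUse guard can look like. It is not the hook shipped in the repo: the deny-list patterns and the function names `check_tool_call` and `run_hook` are my own illustrative assumptions. It relies on Claude Code handing the hook a JSON event on stdin (with `tool_name` and `tool_input` fields) and treating exit code 2 as "block this call", so check both against the current hooks documentation before trusting it.

```python
import json
import re
import sys
from typing import Optional

# Hypothetical deny-lists for illustration; the starter kit's real patterns may differ.
SENSITIVE_PATH_PATTERNS = [
    r"(^|[\\/])\.env($|\.)",       # .env, .env.local, proj/.env
    r"(^|[\\/])\.ssh([\\/]|$)",    # anything under a .ssh directory
    r"\.pem$",                     # private key files
]
DANGEROUS_COMMAND_PATTERNS = [
    r"\brm\s+-\w*r\w*f|\brm\s+-\w*f\w*r",        # rm -rf / rm -fr
    r"\bgit\s+push\b.*--force",                  # force pushes
    r"\bcurl\b[^|]*\|\s*(sudo\s+)?(ba)?sh\b",    # curl ... | sh
]


def check_tool_call(event: dict) -> Optional[str]:
    """Return a human-readable reason to block, or None to allow the call."""
    tool_input = event.get("tool_input") or {}

    # Shell commands are only checked for the Bash tool.
    if event.get("tool_name") == "Bash":
        command = tool_input.get("command", "")
        for pattern in DANGEROUS_COMMAND_PATTERNS:
            if re.search(pattern, command):
                return f"dangerous command: {command!r}"

    # File tools (Read/Edit/Write) are assumed to carry a file_path field.
    path = tool_input.get("file_path", "")
    if path:
        for pattern in SENSITIVE_PATH_PATTERNS:
            if re.search(pattern, path):
                return f"sensitive file: {path!r}"
    return None


def run_hook(raw_event: str) -> int:
    """Parse one hook event and return the process exit code (2 = block)."""
    reason = check_tool_call(json.loads(raw_event))
    if reason:
        # The message goes to stderr so it can be reported back to the model.
        print(f"Blocked by security hook: {reason}", file=sys.stderr)
        return 2
    return 0

# In a real hook script you would finish with:
#     sys.exit(run_hook(sys.stdin.read()))
```

Regex deny-lists like this are easy to bypass with creative quoting, so treat a guard of this kind as a seatbelt against accidents, not a sandbox.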
The Secret Sauce: Model Routing
The single biggest quality-of-life improvement: main agent gets the Pro model, sub-agents get Flash.
DeepSeek V4 Pro handles architecture, debugging, and complex reasoning. But reading files? Searching? Running tests? Those don't need a 1M-context reasoning beast. Flash is faster, cheaper, and completely adequate for the grunt work.
This one decision doubled my effective throughput. The main agent never waits behind a queue of file-reads.
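
The routing idea fits in a few lines. The sketch below is an illustration, not the repo's actual config: the model IDs are placeholders you must replace with whatever your DeepSeek endpoint exposes, and `pick_model` and `routing_env` are hypothetical helper names. The two environment variable names are the ones I understand Claude Code to read for the main model and the sub-agent model, so verify them against the current Claude Code settings docs.

```python
# Placeholder model IDs (assumptions): substitute the real IDs from your provider.
PRO_MODEL = "your-pro-model-id"
FLASH_MODEL = "your-flash-model-id"


def pick_model(role: str) -> str:
    """Main agent gets the Pro model; every sub-agent gets Flash."""
    if role == "main":
        return PRO_MODEL      # architecture, debugging, complex reasoning
    if role == "subagent":
        return FLASH_MODEL    # file reads, searches, test runs
    raise ValueError(f"unknown role: {role!r}")


def routing_env() -> dict:
    """Build the environment overrides that express the same split.

    The variable names are assumptions to check against the Claude Code docs.
    """
    return {
        "ANTHROPIC_MODEL": pick_model("main"),
        "CLAUDE_CODE_SUBAGENT_MODEL": pick_model("subagent"),
    }
```

Merging `routing_env()` into the environment before launching Claude Code (for example via `os.environ.update`) keeps the split in one place, so swapping model IDs later is a two-line change.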
YAGNI as Code
One rule I'm unreasonably proud of: the 6-level decision ladder.
Level 0: stdlib can do it → don't write code
Level 1: one-liner → don't write fifty lines
Level 2: existing tool → don't build a replacement
Level 3: simple script → don't build a framework
Level 4: library → don't build from scratch
Level 5: only then, build
It's YAGNI compiled into a decision tree. When every token costs money, "just in case" code is a bill you pay every session.
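
If you want the ladder as literal code, here's one way to sketch it. In the starter kit the rule is a prompt-level instruction, so this is only an illustration of the logic; the option keys (`stdlib`, `one_liner`, and so on) and the `decide` function are names I made up for the example.

```python
from typing import Iterable, Tuple

# (level, key, action) in the order the ladder is climbed.
LADDER = [
    (0, "stdlib", "use the standard library; write no new code"),
    (1, "one_liner", "write the one-liner, not fifty lines"),
    (2, "existing_tool", "use the existing tool; don't build a replacement"),
    (3, "simple_script", "write a simple script, not a framework"),
    (4, "library", "pull in a library; don't build from scratch"),
    (5, "build", "nothing simpler works; build it"),
]


def decide(available: Iterable[str]) -> Tuple[int, str]:
    """Return the lowest ladder level whose option is available.

    `available` lists the option keys that would solve the problem.
    Level 5 is the fallback when none of the cheaper options apply.
    """
    options = set(available)
    for level, key, action in LADDER[:-1]:
        if key in options:
            return level, action
    last_level, _, last_action = LADDER[-1]
    return last_level, last_action
```

For example, `decide({"library", "one_liner"})` stops at Level 1 even though a library would also work, and `decide([])` falls through to Level 5.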
What This Won't Do
- It won't magically make DeepSeek as good as Claude Opus at creative writing (different models, different strengths)
- It's tested on Windows — Mac/Linux paths may need tweaking
- You still need your own API key and base URL
- OCR speed depends on your local GPU (CPU fallback works, just slower)
Why I Open-Sourced This
My self-model protocol handles identity across sessions. But identity is useless if the tools aren't right. This starter kit is the companion piece — the "body" to the protocol's "mind."
Both are MIT licensed. Both took weeks of trial and error to get right. Both are now one git clone away.
Have you tried running non-Claude models inside Claude Code? What broke first — the API, the prompts, or your patience? Drop your war stories in the comments.
Related: LLM compaction isn't linear • Self-model protocol for AI identity persistence