Claude Code in Production: 12 Tips After 10,000+ API Calls
I've run Claude Code for thousands of tasks while building 264 AI engineering frameworks. Here's what I wish I'd known on day one: the stuff that isn't in the docs.
Context Management (Tips 1-4)
1. Front-Load Context, Not Instructions
❌ Bad:
Refactor the auth module to use JWT refresh tokens with rotation,
implement PKCE flow, add rate limiting per user...
✅ Good:
Here's the current auth module: [paste code]
Here's the failing test: [paste test]
Here's our security requirements doc: [paste doc]
The refresh token implementation has a race condition under load.
Fix it while maintaining the existing API contract.
Why: Claude Code performs dramatically better when it understands the codebase first, then gets a focused task. Long instruction lists lead to partial implementations.
2. Use CLAUDE.md as Your Session Primer
Create a CLAUDE.md at project root:
# CLAUDE.md
## Stack: Next.js 15 + TypeScript + Prisma + PostgreSQL
## Style: Functional, no classes, Result<T,E> error handling
## Testing: Vitest, test files colocated with source
## Current sprint: Payment integration (Stripe)
## Known debt: WebSocket handler leaks memory (#142)
Claude Code pulls CLAUDE.md into context automatically at the start of every session. One file eliminates repeated context-setting across sessions.
3. The 200K Context Trap
Claude's 200K context window is amazing. It's also a cost trap.
The rule: Only load files the AI needs to MODIFY or UNDERSTAND for the current task. Not "everything just in case."
# ❌ Loading everything
claude "fix the bug" --include "src/**/*"
# ✅ Loading what matters
claude "fix the auth bug" --include "src/auth/**" "src/middleware/auth.ts" "tests/auth.test.ts"
Cost impact: A 200K context call costs ~10x more than a 20K context call. Be surgical.
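Before loading a glob into context, it helps to sanity-check what it will cost. Here's a minimal sketch using the rough "4 characters per token" heuristic for source code; the per-token price is illustrative, so plug in current pricing for whichever model you're calling.

```python
import glob

CHARS_PER_TOKEN = 4        # rough heuristic for source code
PRICE_PER_MTOK = 3.00      # illustrative input price (USD per million tokens)

def estimate_context_cost(patterns):
    """Estimate input-token count and cost of loading a set of file globs."""
    total_chars = 0
    for pattern in patterns:
        for path in glob.glob(pattern, recursive=True):
            try:
                with open(path, encoding="utf-8") as f:
                    total_chars += len(f.read())
            except (IsADirectoryError, UnicodeDecodeError, OSError):
                continue  # skip directories and binary files
    tokens = total_chars // CHARS_PER_TOKEN
    return tokens, tokens / 1_000_000 * PRICE_PER_MTOK
```

Run it against `src/**/*` versus `src/auth/**` before a session and the "be surgical" advice becomes concrete dollars.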
4. The Plan-Then-Execute Pattern
For complex tasks, always split into two calls:
Call 1 (cheap — Sonnet):
Given this codebase context, create a step-by-step plan
for implementing [feature]. List files to modify,
approach for each, and potential risks.
Call 2 (powerful — Opus):
Execute this plan: [paste plan from Call 1]
Here are the files: [only the ones the plan identified]
This catches bad approaches at $0.10 instead of $5.00.
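The two-call split is easy to template. This hypothetical helper just builds the two prompts; the model IDs are placeholders (check current model names before wiring it to the API):

```python
PLANNER_MODEL = "claude-sonnet-4"   # cheap planning pass (assumed ID)
EXECUTOR_MODEL = "claude-opus-4"    # expensive execution pass (assumed ID)

def build_plan_prompt(feature, context):
    """Call 1: ask the cheap model for a step-by-step plan."""
    return (
        f"Given this codebase context, create a step-by-step plan for "
        f"implementing {feature}. List files to modify, the approach for "
        f"each, and potential risks.\n\n{context}"
    )

def build_execute_prompt(plan, files):
    """Call 2: hand the powerful model the plan plus only the files it named."""
    file_dump = "\n\n".join(
        f"--- {name} ---\n{body}" for name, body in files.items()
    )
    return f"Execute this plan:\n{plan}\n\nHere are the files:\n{file_dump}"
```

The key design choice: Call 2 never sees files the plan didn't identify, which keeps the expensive call lean.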
Cost Control (Tips 5-8)
5. Track Costs Per Task Category
After a month of tracking, here are real cost ranges:
| Task Type | Avg Cost | Model |
|---|---|---|
| Bug fix (single file) | $0.15 | Sonnet |
| Feature (multi-file) | $1.50 | Opus |
| Architecture refactor | $4.00 | Opus |
| Test generation | $0.08 | Sonnet/Haiku |
| Code review | $0.20 | Sonnet |
| Documentation | $0.05 | Haiku |
Lesson: Roughly 60% of coding tasks can run on Sonnet or Haiku. Reserve Opus for the ~20% that actually need deep reasoning.
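You can encode that routing table directly, so the model choice is a lookup instead of a per-task judgment call. The tier names here are illustrative stand-ins, not exact API model IDs:

```python
# Task-type → model-tier routing, mirroring the cost table above.
ROUTES = {
    "bugfix": "sonnet",
    "feature": "opus",
    "refactor": "opus",
    "tests": "haiku",
    "review": "sonnet",
    "docs": "haiku",
}

def pick_model(task_type):
    # Default to the cheapest tier for unknown task types; escalate manually
    # only when the cheap model visibly fails.
    return ROUTES.get(task_type, "haiku")
```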
6. Set Hard Budget Limits
# In your shell config
export CLAUDE_MAX_COST_PER_SESSION=10.00
export CLAUDE_WARN_AT=5.00
Without limits, a single runaway session (recursive debugging loop) can burn $20+.
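If your tooling doesn't support limits natively, a small tracker in your own wrapper does the job. This is a sketch of the warn-then-stop behavior, not a feature of any particular CLI:

```python
class BudgetExceeded(RuntimeError):
    pass

class SessionBudget:
    """Accumulates per-call cost; warns once, then hard-stops the session."""

    def __init__(self, max_cost=10.00, warn_at=5.00):
        self.max_cost = max_cost
        self.warn_at = warn_at
        self.spent = 0.0
        self.warned = False

    def record(self, cost):
        self.spent += cost
        if self.spent >= self.max_cost:
            raise BudgetExceeded(f"session spent ${self.spent:.2f}")
        if self.spent >= self.warn_at and not self.warned:
            self.warned = True
            print(f"warning: ${self.spent:.2f} spent this session")
```

Call `budget.record(call_cost)` after every API call; a runaway debugging loop hits the exception instead of your invoice.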
7. Cache Repeated Context
If you're running multiple tasks against the same codebase:
# Generate context once
cat src/auth/**/*.ts > /tmp/auth-context.txt
# Reuse across calls
claude "Task 1..." --context /tmp/auth-context.txt
claude "Task 2..." --context /tmp/auth-context.txt
Some API providers offer prompt caching (up to 90% savings on repeated prefixes). Use it.
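With Anthropic's prompt caching, you mark the large repeated prefix with a `cache_control` block so subsequent calls reuse it. This sketch builds the request body only; the field names follow the prompt-caching API and the model ID is illustrative, so verify both against current docs:

```python
def build_cached_request(codebase_context, task):
    """Request body with the big shared context marked as a cacheable prefix."""
    return {
        "model": "claude-sonnet-4",  # illustrative model ID
        "max_tokens": 4096,
        "system": [
            {
                "type": "text",
                "text": codebase_context,
                # Identical prefixes across calls can be served from cache.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": task}],
    }
```

The savings only apply when the prefix is byte-identical across calls, so keep the cached context block stable and put the varying task text after it.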
8. The 3-Attempt Rule
If Claude hasn't solved it in 3 attempts with different approaches, stop. Either:
- The problem needs human insight the AI lacks
- Your context is missing critical information
- The task needs decomposition into smaller pieces
Throwing more tokens at a stuck AI is the #1 cost waste.
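The 3-attempt rule is trivial to enforce in code rather than willpower. A minimal sketch, where `attempt_fn` and `verify_fn` stand in for your generate-and-check steps:

```python
def solve_with_retries(attempt_fn, verify_fn, max_attempts=3):
    """Try distinct approaches; stop after max_attempts and hand back to a human."""
    failures = []
    for i in range(max_attempts):
        # Pass prior failures so each attempt can try a *different* approach.
        result = attempt_fn(i, failures)
        if verify_fn(result):
            return result
        failures.append(result)
    # Stuck: decompose the task, add missing context, or solve it yourself.
    return None
```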
MCP Integration (Tips 9-10)
9. MCP Servers Are Game-Changers (When Used Right)
Model Context Protocol lets Claude Code access external tools. The highest-value MCP servers:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./src"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "$DATABASE_URL"]
    }
  }
}
Best practice: Start with filesystem + one integration (GitHub or DB). Don't add 10 MCP servers — each adds latency and token cost.
10. Progressive Disclosure for MCP
Don't expose all tools at once. Configure MCP servers to reveal capabilities progressively:
# Level 1: Read-only (default)
tools: [read_file, list_directory, search]
# Level 2: After confirmation
tools: [write_file, create_directory]
# Level 3: Explicit approval only
tools: [delete_file, run_command, database_write]
This prevents expensive mistakes. An AI with unrestricted write access WILL eventually corrupt something.
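A gate like this is simple to implement in whatever layer mediates tool calls. This is a hypothetical sketch of the three-level scheme above, with levels cumulative:

```python
# Trust level → tools unlocked at that level (names match the tiers above).
TOOL_LEVELS = {
    1: {"read_file", "list_directory", "search"},         # read-only default
    2: {"write_file", "create_directory"},                # after confirmation
    3: {"delete_file", "run_command", "database_write"},  # explicit approval only
}

def allowed_tools(level):
    """Tools available at a given trust level (levels are cumulative)."""
    tools = set()
    for lvl in range(1, level + 1):
        tools |= TOOL_LEVELS.get(lvl, set())
    return tools
```

Reject any tool call not in `allowed_tools(current_level)` and require an explicit human action to raise the level.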
Production Patterns (Tips 11-12)
11. The Review Gate
Never ship AI-generated code without review. But make the review efficient:
# Generate a diff summary
claude "Summarize the changes you made and flag anything
that touches security, performance, or external APIs"
Focus your human review on:
- Security boundaries (auth, input validation)
- Error handling (is every failure path covered?)
- Performance (O(n²) hiding in innocent-looking code?)
- Side effects (unexpected state mutations?)
Skip reviewing: formatting, import ordering, variable naming.
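One way to operationalize the review gate: split a diff's file list into a must-review pile and a skim pile before you open it. The path markers here are assumptions; adapt them to your repo layout:

```python
# Substrings that flag a changed file as security/performance sensitive.
RISKY_MARKERS = ("auth", "crypto", "payment", "middleware", "migration")

def flag_for_review(changed_files):
    """Split a diff's file list into must-review and skim piles."""
    must_review = [
        f for f in changed_files if any(m in f for m in RISKY_MARKERS)
    ]
    skim = [f for f in changed_files if f not in must_review]
    return must_review, skim
```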
12. Build Your Own Feedback Loop
The most valuable pattern I've found:
1. AI generates code
2. Tests run automatically
3. Test results feed back to AI
4. AI fixes failures
5. Repeat until green (max 3 cycles)
6. Human reviews final diff
This closes the loop between generation and validation. The AI learns from its own test failures within a session.
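The loop's control flow fits in a few lines. A minimal sketch, where `generate_fn` wraps your AI call and `run_tests_fn` returns a list of failures (empty means green):

```python
def generate_fix_loop(generate_fn, run_tests_fn, max_cycles=3):
    """Generate → test → feed failures back, capped at max_cycles."""
    code = generate_fn(None)  # initial generation, no failure context yet
    for _ in range(max_cycles):
        failures = run_tests_fn(code)
        if not failures:
            return code, True          # green: hand off to human review
        code = generate_fn(failures)   # regenerate with failure output as context
    return code, False                 # still red after the cap: stop burning tokens
```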
The Honest Truth
Claude Code, like all AI coding tools, is incredible for:
- Reducing boilerplate drudgery by 80%
- Exploring unfamiliar codebases fast
- Generating comprehensive test suites
- Maintaining consistency across large codebases
They're still unreliable for:
- Novel algorithm design
- Security-critical code
- Performance optimization
- Complex distributed systems reasoning
The developers getting the most value treat AI tools like a powerful junior developer: fast, eager, occasionally wrong, and always in need of review.
Want More?
These tips come from the AI Dev Toolkit — 264 production frameworks including Claude Code workflows, MCP server configs, multi-agent setups, and cost optimization pipelines. 168 samples are free on GitHub.
Which tip surprised you most? What's your Claude Code workflow? Let me know in the comments.