
dohko

Claude Code in Production: 12 Tips After 10,000+ API Calls


I've run Claude Code across thousands of tasks while building 264 AI engineering frameworks. Here's what I wish I'd known on day one: the stuff that isn't in the docs.


Context Management (Tips 1-4)

1. Front-Load Context, Not Instructions

❌ Bad:

Refactor the auth module to use JWT refresh tokens with rotation, 
implement PKCE flow, add rate limiting per user...

✅ Good:

Here's the current auth module: [paste code]
Here's the failing test: [paste test]
Here's our security requirements doc: [paste doc]

The refresh token implementation has a race condition under load. 
Fix it while maintaining the existing API contract.

Why: Claude Code performs dramatically better when it understands the codebase first, then gets a focused task. Long instruction lists lead to partial implementations.

2. Use AGENTS.md as Your Session Primer

Create an AGENTS.md at project root:

# AGENTS.md
## Stack: Next.js 15 + TypeScript + Prisma + PostgreSQL  
## Style: Functional, no classes, Result<T,E> error handling
## Testing: Vitest, test files colocated with source
## Current sprint: Payment integration (Stripe)
## Known debt: WebSocket handler leaks memory (#142)

Claude Code reads this automatically. One file eliminates repeated context-setting across sessions.

3. The 200K Context Trap

Claude's 200K context window is amazing. It's also a cost trap.

The rule: Only load files the AI needs to MODIFY or UNDERSTAND for the current task. Not "everything just in case."

# ❌ Loading everything
claude "fix the bug" --include "src/**/*"

# ✅ Loading what matters  
claude "fix the auth bug" --include "src/auth/**" "src/middleware/auth.ts" "tests/auth.test.ts"

Cost impact: A 200K context call costs ~10x more than a 20K context call. Be surgical.
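To see why, here's rough input-cost arithmetic. The per-million-token price below is an assumed placeholder for illustration, not a real published rate; substitute your model's actual pricing:

```python
# Rough input-cost comparison for two context sizes.
# PRICE_PER_MTOK is an assumed placeholder rate, not a real published price.
PRICE_PER_MTOK = 3.00  # assumed $ per million input tokens


def input_cost(context_tokens: int, price_per_mtok: float = PRICE_PER_MTOK) -> float:
    """Dollar cost of sending `context_tokens` as input at a flat per-token rate."""
    return context_tokens / 1_000_000 * price_per_mtok


surgical = input_cost(20_000)        # targeted context
kitchen_sink = input_cost(200_000)   # "everything just in case"

print(f"20K context:  ${surgical:.2f} per call")
print(f"200K context: ${kitchen_sink:.2f} per call")
print(f"Ratio: {kitchen_sink / surgical:.0f}x")  # linear input pricing -> 10x
```

With flat per-token input pricing, the cost scales linearly with context size, which is exactly where the ~10x figure comes from.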

4. The Plan-Then-Execute Pattern

For complex tasks, always split into two calls:

Call 1 (cheap — Sonnet):

Given this codebase context, create a step-by-step plan 
for implementing [feature]. List files to modify, 
approach for each, and potential risks.

Call 2 (powerful — Opus):

Execute this plan: [paste plan from Call 1]
Here are the files: [only the ones the plan identified]

This catches bad approaches at $0.10 instead of $5.00.
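A sketch of the pattern with the actual API calls stubbed out; `call_model` here is a placeholder that returns canned responses, standing in for whatever SDK you actually use:

```python
def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real API call; returns canned responses for illustration."""
    if model == "sonnet":
        return "PLAN:\n1. Modify src/auth/token.ts\n2. Add test in tests/auth.test.ts"
    return "EXECUTED: " + prompt.splitlines()[0]


def plan_then_execute(task: str, files: dict[str, str]) -> str:
    # Call 1 (cheap): get a plan, and a chance to reject a bad approach early.
    plan = call_model("sonnet", f"Create a step-by-step plan for: {task}")
    if "PLAN:" not in plan:
        raise ValueError("No usable plan; stop before spending Opus tokens")
    # Call 2 (powerful): execute with only the files the plan identified.
    context = "\n\n".join(files.values())
    return call_model("opus", f"Execute this plan: {plan}\n\nFiles:\n{context}")
```

The key design point is the early exit between the calls: a bad plan fails the cheap check instead of burning the expensive execution call.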


Cost Control (Tips 5-8)

5. Track Costs Per Task Category

After a month of tracking, here are real cost ranges:

Task Type               Avg Cost   Model
Bug fix (single file)   $0.15      Sonnet
Feature (multi-file)    $1.50      Opus
Architecture refactor   $4.00      Opus
Test generation         $0.08      Sonnet/Mini
Code review             $0.20      Sonnet
Documentation           $0.05      Mini

Lesson: 60% of coding tasks can use Sonnet or Mini. Reserve Opus for the 20% that actually need deep reasoning.
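One way to act on that lesson is a small routing table with a cheap default. The category and model names below echo the table above; adapt them to your own tracking data:

```python
# Route task categories to the cheapest model that handles them well.
# Categories and model tiers mirror the cost table above; adjust to your data.
MODEL_FOR_TASK = {
    "bug_fix": "sonnet",
    "feature": "opus",
    "refactor": "opus",
    "test_generation": "mini",
    "code_review": "sonnet",
    "documentation": "mini",
}


def pick_model(task_type: str) -> str:
    # Unknown task types fall back to the mid-tier model, not the expensive one.
    return MODEL_FOR_TASK.get(task_type, "sonnet")
```

Defaulting unknown categories to the mid-tier model keeps a typo in a task label from silently routing work to Opus.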

6. Set Hard Budget Limits

# In your shell config
export CLAUDE_MAX_COST_PER_SESSION=10.00
export CLAUDE_WARN_AT=5.00

Without limits, a single runaway session (recursive debugging loop) can burn $20+.
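Those environment variables are shown as the author uses them; if your tooling has no such knobs, a small in-process guard gives the same protection. A sketch, with the same assumed thresholds:

```python
class BudgetExceeded(RuntimeError):
    """Raised when cumulative session spend crosses the hard limit."""


class SessionBudget:
    """Tracks cumulative spend; warns at one threshold, hard-stops at another."""

    def __init__(self, max_cost: float = 10.00, warn_at: float = 5.00):
        self.max_cost = max_cost
        self.warn_at = warn_at
        self.spent = 0.0
        self.warned = False

    def record(self, cost: float) -> None:
        self.spent += cost
        if not self.warned and self.spent >= self.warn_at:
            self.warned = True
            print(f"WARNING: session spend at ${self.spent:.2f}")
        if self.spent > self.max_cost:
            raise BudgetExceeded(
                f"Spent ${self.spent:.2f}, limit ${self.max_cost:.2f}"
            )
```

Call `record(cost)` after each API response; the raised exception is your circuit breaker for the runaway-loop case.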

7. Cache Repeated Context

If you're running multiple tasks against the same codebase:

# Generate context once
cat src/auth/**/*.ts > /tmp/auth-context.txt

# Reuse across calls
claude "Task 1..." --context /tmp/auth-context.txt
claude "Task 2..." --context /tmp/auth-context.txt

Some API providers offer prompt caching (up to 90% savings on repeated prefixes). Use it.
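The arithmetic behind that savings figure, assuming an illustrative price and a 90% discount on cached prefix tokens (both numbers are placeholders; check your provider's actual caching terms):

```python
def cost_with_cache(prefix_tokens: int, task_tokens: int, calls: int,
                    price: float = 3.0, discount: float = 0.9) -> float:
    """Input cost in $ for `calls` requests sharing one cached prefix.

    `price` is an assumed $ per million input tokens; `discount` is the
    assumed cache discount on prefix tokens after the first call.
    """
    per_tok = price / 1_000_000
    full_prefix = prefix_tokens * per_tok                      # first call: full price
    cached_prefix = prefix_tokens * per_tok * (1 - discount)   # subsequent calls
    return full_prefix + cached_prefix * (calls - 1) + task_tokens * per_tok * calls


# 10 tasks against the same 50K-token codebase context, 1K tokens each of task text.
no_cache = (50_000 + 1_000) * 3.0 / 1_000_000 * 10
with_cache = cost_with_cache(50_000, 1_000, 10)
```

Under these assumptions the cached run costs roughly a fifth of the uncached one; the larger the shared prefix relative to the per-task text, the closer you get to the full discount.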

8. The 3-Attempt Rule

If Claude hasn't solved it in 3 attempts with different approaches, stop. One of three things is true:

  • The problem needs human insight the AI lacks
  • Your context is missing critical information
  • The task needs decomposition into smaller pieces

Throwing more tokens at a stuck AI is the #1 cost waste.
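The rule is easy to enforce in a harness. `attempt_fix` and `tests_pass` below are stand-ins for your own generate and verify steps:

```python
def solve_with_cap(attempt_fix, tests_pass, max_attempts: int = 3):
    """Try up to `max_attempts` approaches; return the first that passes, else None."""
    for n in range(1, max_attempts + 1):
        candidate = attempt_fix(n)  # each attempt should try a DIFFERENT approach
        if tests_pass(candidate):
            return candidate
    return None  # stop: escalate to a human instead of burning more tokens


# Stubbed example: only the third approach works.
result = solve_with_cap(
    attempt_fix=lambda n: f"patch-v{n}",
    tests_pass=lambda c: c == "patch-v3",
)
```

Returning `None` instead of looping forever is the whole point: the cap turns "throw more tokens at it" into an explicit escalation decision.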


MCP Integration (Tips 9-10)

9. MCP Servers Are Game-Changers (When Used Right)

Model Context Protocol lets Claude Code access external tools. The highest-value MCP servers:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["@anthropic/mcp-fs", "--root", "./src"]
    },
    "github": {
      "command": "npx", 
      "args": ["@anthropic/mcp-github"]
    },
    "postgres": {
      "command": "npx",
      "args": ["@anthropic/mcp-postgres", "--connection-string", "$DATABASE_URL"]
    }
  }
}

Best practice: Start with filesystem + one integration (GitHub or DB). Don't add 10 MCP servers — each adds latency and token cost.

10. Progressive Disclosure for MCP

Don't expose all tools at once. Configure MCP servers to reveal capabilities progressively:

# Level 1: Read-only (default)
tools: [read_file, list_directory, search]

# Level 2: After confirmation
tools: [write_file, create_directory]

# Level 3: Explicit approval only
tools: [delete_file, run_command, database_write]

This prevents expensive mistakes. An AI with unrestricted write access WILL eventually corrupt something.
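A minimal sketch of that gating, assuming a simple numeric trust-level scheme. The level mechanism is illustrative, not a built-in MCP feature; the tool names mirror the config above:

```python
# Map each tool to the minimum trust level that may see it.
# Levels mirror the config above: 1 = read-only, 2 = confirmed, 3 = explicit approval.
TOOL_LEVELS = {
    "read_file": 1, "list_directory": 1, "search": 1,
    "write_file": 2, "create_directory": 2,
    "delete_file": 3, "run_command": 3, "database_write": 3,
}


def allowed_tools(trust_level: int) -> set[str]:
    """Tools visible to the model at a given trust level (higher = more power)."""
    return {tool for tool, lvl in TOOL_LEVELS.items() if lvl <= trust_level}
```

A session starts at level 1 and only climbs after a human confirms; destructive tools never appear in the model's tool list until level 3.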


Production Patterns (Tips 11-12)

11. The Review Gate

Never ship AI-generated code without review. But make the review efficient:

# Generate a diff summary
claude "Summarize the changes you made and flag anything 
that touches security, performance, or external APIs"

Focus your human review on:

  1. Security boundaries (auth, input validation)
  2. Error handling (is every failure path covered?)
  3. Performance (O(n²) hiding in innocent-looking code?)
  4. Side effects (unexpected state mutations?)

Skip reviewing: formatting, import ordering, variable naming.

12. Build Your Own Feedback Loop

The most valuable pattern I've found:

1. AI generates code
2. Tests run automatically  
3. Test results feed back to AI
4. AI fixes failures
5. Repeat until green (max 3 cycles)
6. Human reviews final diff

This closes the loop between generation and validation. The AI learns from its own test failures within a session.
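The loop above can be sketched as a small driver, with the AI call and test runner stubbed out as lambdas (swap them for real API and CI calls):

```python
def feedback_loop(generate, run_tests, fix, max_cycles: int = 3):
    """Generate code, feed test failures back, stop when green or after max_cycles."""
    code = generate()
    for cycle in range(max_cycles):
        failures = run_tests(code)
        if not failures:
            return code, cycle        # green: hand off to human review
        code = fix(code, failures)    # the AI sees its own test failures
    return None, max_cycles           # never went green: needs a human


# Stubbed run: the first version fails, the fix passes on the second cycle.
code, cycles = feedback_loop(
    generate=lambda: "v1",
    run_tests=lambda c: ["test_x failed"] if c == "v1" else [],
    fix=lambda c, failures: "v2",
)
```

The `max_cycles` cap is the same 3-attempt rule from Tip 8 applied inside the loop: a red suite after three fix cycles goes to a human, not back to the model.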


The Honest Truth

Claude Code, like all AI coding tools, is incredible for:

  • Reducing boilerplate drudgery by 80%
  • Exploring unfamiliar codebases fast
  • Generating comprehensive test suites
  • Maintaining consistency across large codebases

They're still unreliable for:

  • Novel algorithm design
  • Security-critical code
  • Performance optimization
  • Complex distributed systems reasoning

The developers getting the most value treat AI tools like a powerful junior developer: fast, eager, occasionally wrong, and always in need of review.


Want More?

These tips come from the AI Dev Toolkit — 264 production frameworks including Claude Code workflows, MCP server configs, multi-agent setups, and cost optimization pipelines. 168 samples are free on GitHub.


Which tip surprised you most? What's your Claude Code workflow? Let me know in the comments.
