Chudi Nnorukam

Posted on • Edited on • Originally published at chudi.dev

How I Use Claude Code to Ship Production-Quality Code Every Session



I shipped broken code three times in one week. The AI said "should work." I believed it.

That experience led me to build a complete system for AI-assisted development—one where evidence replaces confidence, context persists across sessions, and quality gates make cutting corners impossible.

This guide covers everything I've learned building with Claude Code.

What You'll Learn

This guide is organized into four core areas:

Part 1: Quality Control That Actually Works

The biggest mistake in AI-assisted development is accepting confidence as evidence.

When Claude says "should work," that's not verification—it's a guess. The two-gate system I built makes guessing impossible by blocking all implementation tools until quality checks pass.

For the complete breakdown of gates, phrase blocking, and the 4 pillars of quality, read: I Built a Quality Control System for AI Code Generation

The Core Principle

Gate 0: Meta-Orchestration

  • Validates context budget (under 75%)
  • Loads quality gates and phrase blocking
  • Initializes the skill system

Gate 1: Auto-Skill Activation

  • Analyzes your query intent
  • Matches against 30+ defined skills
  • Activates top 5 relevant skills

Only after both gates pass can you write code. Like buttoning a shirt: misalign the first button and every button after it is wrong.
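A minimal sketch of that two-gate sequence. The skill names, trigger lists, and function shape here are hypothetical placeholders, not the actual system:

```python
# Sketch of the two-gate sequence. Skill names and triggers are
# placeholders; the real system matches 30+ skills by query intent.
SKILLS = {
    "testing": ["test", "coverage"],
    "database": ["sql", "schema", "migration"],
    "docs": ["readme", "changelog"],
}

def gates_pass(context_pct: float, query: str) -> list[str]:
    """Return the activated skills, or [] if either gate blocks."""
    # Gate 0: meta-orchestration. Context budget must be under 75%.
    if context_pct >= 75:
        return []
    # Gate 1: auto-skill activation. Match intent, keep the top 5.
    q = query.lower()
    return [name for name, triggers in SKILLS.items()
            if any(t in q for t in triggers)][:5]

gates_pass(80, "write tests")          # [] -- Gate 0 blocks
gates_pass(40, "write schema tests")   # ['testing', 'database']
```

The point is ordering: implementation is unreachable until both checks run, so skipping them requires deliberately bypassing the code path.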

Evidence Over Confidence

These phrases get blocked:

  Red Flag          Problem
  "Should work"     No verification
  "Probably fine"   Uncertainty masked as completion
  "I'm confident"   Feeling, not fact
  "Looks good"      Visual assessment, not testing

Replace with evidence:

Build completed: exit code 0, 9.51s
Tests passing: 47/47
Bundle size: 287KB

For the complete verification system including the 84% compliance protocol, see the full quality control guide.
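The phrase blocking itself reduces to a simple check. A sketch (the function name is mine; the phrase list comes from the table above):

```python
# Sketch of phrase blocking: reject completion claims that lean on
# confidence instead of evidence.
BLOCKED_PHRASES = ("should work", "probably fine", "i'm confident", "looks good")

def passes_evidence_gate(claim: str) -> bool:
    """True only if the claim avoids every blocked confidence phrase."""
    lowered = claim.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

passes_evidence_gate("Should work on staging")             # False
passes_evidence_gate("Tests passing: 47/47, exit code 0")  # True
```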


Part 2: Context Management

"We already discussed this."

I said it. Claude didn't remember. Thirty minutes of context—file locations, decisions, progress—gone after compaction.

The dev docs workflow solves this permanently.

For the complete dev docs workflow including automation hooks, read: How to Prevent Claude from Forgetting Your Task

The Three Dev Doc Files

Every non-trivial task gets a directory:

~/dev/active/[task-name]/
├── [task-name]-plan.md      # Approved blueprint
├── [task-name]-context.md   # Living state
└── [task-name]-tasks.md     # Checklist

plan.md: The implementation plan, approved before coding. Doesn't change during work.

context.md: Current progress, key findings, blockers. Updated frequently.

tasks.md: Granular work items with status. Check items as you complete them.
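Creating that skeleton is worth scripting so every task starts the same way. A sketch (the helper name and the example task name are mine):

```python
# Sketch: create the three-file dev docs skeleton for a new task.
from pathlib import Path

def init_dev_docs(task: str, root: str = "~/dev/active") -> Path:
    """Make ~/dev/active/[task]/ with empty plan, context, and tasks files."""
    task_dir = Path(root).expanduser() / task
    task_dir.mkdir(parents=True, exist_ok=True)
    for suffix in ("plan", "context", "tasks"):
        (task_dir / f"{task}-{suffix}.md").touch()
    return task_dir

# init_dev_docs("auth-refactor")
# -> ~/dev/active/auth-refactor/ containing the three .md files
```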

The Magic Moment

[Context compacted]
You: "continue"
Claude: [Reads dev docs automatically, knows exactly where you are]

No re-explaining. No lost progress. Just continuation.

When to use dev docs:

  • Any task taking more than 30 minutes
  • Multi-session work
  • Complex features with multiple files
  • Anything you'd hate to re-explain

For the complete workflow including 16 automation hooks, see the context management guide.


Part 3: Token Optimization

Most Claude configurations load everything upfront. Every skill, every rule, every example—thousands of tokens consumed before you've asked a question.

Progressive disclosure flips this.

For the complete progressive disclosure implementation, read: How to Reduce AI Token Usage by 60%

The 3-Tier System

  Tier   Content    Tokens   When Loaded
  1      Metadata   ~200     Immediately
  2      Schema     ~400     First tool use
  3      Full       ~1200    On demand

Tier 1: Skill name, triggers, dependencies. Just enough to route the query.

Tier 2: Input/output types, constraints, tools available.

Tier 3: Complete handler logic, examples, edge cases.

The meta-orchestration skill alone: 278 lines at Tier 1, 816 with one reference, 3,302 fully loaded. That's 60% savings on every session that doesn't need the full content.
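A toy version of the tiered loading, assuming a per-skill dict of tier bodies (the contents below are placeholders):

```python
# Sketch of 3-tier progressive disclosure. Tier bodies are placeholders;
# the point is that higher tiers load only when a session asks for them.
SKILL = {
    1: "name: meta-orchestration | triggers: session-start",          # ~200 tokens
    2: "inputs: query | outputs: active skills | tools: read, grep",  # ~400 tokens
    3: "full handler logic, worked examples, edge cases ...",         # ~1200 tokens
}

class ProgressiveSkill:
    def __init__(self, tiers: dict[int, str]):
        self.tiers = tiers
        self.loaded = 1  # Tier 1 metadata loads immediately

    def escalate(self, tier: int) -> str:
        """Load up to the requested tier, never more."""
        self.loaded = max(self.loaded, tier)
        return "\n".join(self.tiers[t] for t in range(1, self.loaded + 1))

skill = ProgressiveSkill(SKILL)
skill.escalate(2)  # metadata + schema; Tier 3 stays unloaded
```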

For implementation details and your own skill definitions, see the token optimization guide.


Part 4: Foundational Concepts

Before building complex AI workflows, you need to understand the underlying patterns.

RAG: Retrieval-Augmented Generation

RAG gives LLMs access to external knowledge at inference time. Introduced in a 2020 Meta AI paper and now foundational to production AI systems, RAG pulls in relevant documents before generating—rather than relying solely on training data with a fixed knowledge cutoff.

For the complete RAG explanation with code examples, read: What is RAG? Retrieval-Augmented Generation Explained

The pattern:

  1. Query Processing → 2. Retrieval → 3. Augmentation → 4. Generation

Every time you feed context to Claude before asking questions, you're using RAG. The dev docs workflow is essentially manual RAG—retrieving your context files before generation.
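A toy version of that retrieval-then-augmentation step. Word overlap stands in for real embedding search, and the notes are invented examples:

```python
# Toy RAG sketch: retrieve the most relevant note by word overlap, then
# prepend it to the prompt. Production systems use embeddings instead.
def retrieve(query: str, docs: list[str]) -> str:
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def augment(query: str, docs: list[str]) -> str:
    return f"Context:\n{retrieve(query, docs)}\n\nQuestion: {query}"

notes = [
    "route /blog is prerendered, no server hooks",
    "auth uses postgresql sessions, not jwt",
]
augment("how does auth store sessions?", notes)  # pulls in the auth note
```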

Evidence-Based Verification

"Should work" is the most dangerous phrase in AI development. It indicates confidence without evidence.

For the psychology of verification and the 84% compliance protocol, read: Why 'Should Work' Is the Most Dangerous Phrase

The forced evaluation protocol:

  1. EVALUATE: Score each skill YES/NO with reasoning
  2. ACTIVATE: Invoke every YES skill
  3. IMPLEMENT: Only then proceed

Research shows 84% compliance with forced evaluation vs 20% with passive suggestions. The commitment mechanism creates follow-through.
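The three steps can be sketched as a loop that refuses to skip any skill. The skill predicates here are hypothetical:

```python
# Sketch of forced evaluation: every skill gets an explicit YES/NO
# before implementation proceeds. Skill predicates are invented examples.
def forced_evaluation(query: str, skills: dict) -> list[str]:
    activated = []
    for name, matches in skills.items():
        verdict = "YES" if matches(query) else "NO"  # 1. EVALUATE every skill
        print(f"{name}: {verdict}")
        if verdict == "YES":
            activated.append(name)                   # 2. ACTIVATE every YES
    return activated                                 # 3. only then IMPLEMENT

skills = {
    "testing": lambda q: "test" in q,
    "database": lambda q: "sql" in q or "schema" in q,
}
forced_evaluation("add tests for the sql layer", skills)  # ['testing', 'database']
```

The commitment mechanism is the exhaustive loop: there is no path to the return statement that skips scoring a skill.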


Part 5: Common Failure Modes

After building the quality control system, I've watched colleagues start using Claude Code and make the same mistakes. These are the ones worth knowing before you hit them.

Treating every task as a conversation

Claude Code's memory resets at the start of each session. Most people know this. But they still write context in chat messages instead of files. "Remember that we're using PostgreSQL, not MySQL" gets lost when context compacts. Write it in a file once, reference the file every session.

The dev docs workflow exists precisely for this. Before any significant session, start with: "Read context.md and give me a brief on where we are." Five seconds, no lost context.

Using Claude for decisions instead of execution

Claude Code is a tool for doing things, not deciding what to do. If you're asking it "should I use Redux or Zustand?" or "is this architecture good?", you're using it wrong. Make the decision yourself (or with a separate research session), then give Claude Code a clear, bounded task.

The clearer your input, the higher the quality of your output. "Implement a Redux store for auth state with these specific actions" produces better results than "help me set up state management."

Skipping the gate check when tired

The two-gate system works when you follow it and fails immediately when you skip it. The temptation to skip is highest when you're tired, rushing, or "just need to make one small change." That's exactly when it matters most. Small unverified changes in tired states are where production bugs come from.

The gate isn't bureaucracy. It's the system protecting you from yourself at 11 PM.

Letting context balloon without checkpointing

A session that starts with a clear task and runs for three hours without checkpointing will start producing worse results as context fills. Claude Code sees everything in the window—including the tentative approaches you abandoned, the errors you hit and recovered from, the exploratory tangents. All of that degrades signal.

Checkpoint every 45–60 minutes on long tasks. Run /update-dev-docs. The next sub-session starts clean. Quality stays high.

Part 6: Adapting to Your Stack

The gates, dev docs, and progressive disclosure patterns work across stacks. But how you apply them varies by project type.

SvelteKit + Static Sites

The context file structure for a SvelteKit project should reflect the routing model. Your context.md should document which routes are prerendered, which are server-side, and which are client-side—because Claude will make different assumptions about data loading depending on what it thinks the rendering strategy is.

One mistake I made early: assuming Claude remembered that a specific route was prerendered. It didn't. Every session it would suggest server-side loading patterns that don't apply to static routes. A two-line note in context.md—"route /blog is prerendered—no server hooks, use data exported from +page.ts"—eliminated that class of confusion entirely.

The build gate matters more for SvelteKit than some other stacks because the static adapter has strict requirements. Dynamic imports, server-only code in client components, and missing types cause build failures that don't surface in dev mode. Make pnpm build non-negotiable before marking any task complete.
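One way to make the build gate produce evidence instead of a feeling is to wrap it in a script. A sketch (the wrapper is mine; swap in `pnpm build` or whatever your stack uses):

```python
# Sketch of an evidence-producing build gate: run the build, report the
# exit code, refuse to mark the task complete on failure.
import subprocess

def build_gate(cmd: list[str]) -> bool:
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Gate failed (exit {result.returncode}):", result.stderr[-500:])
        return False
    print("Build completed: exit code 0")
    return True

# build_gate(["pnpm", "build"])  # True only when the static adapter is satisfied
```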

Next.js App Router

The App Router's server component versus client component distinction is the most common source of confusion in Claude Code sessions. Claude will occasionally suggest a hook or browser API inside a server component, or vice versa.

Capture the component hierarchy in your context.md: which components are server components (no 'use client'), which are client components, and which are shared utilities. This isn't excessive documentation—it's the exact information Claude needs to avoid the most common class of error.

Evidence gate addition for Next.js: add tsc --noEmit to your gate checklist. TypeScript errors in App Router code often don't surface until type-checking because the dev server is permissive.

Pure API Projects

API-only projects benefit most from the dev docs workflow because the relevant context is all in files—no visual output to check, no screenshots, just data shapes and endpoint contracts.

Your context.md for an API project should always include: the current database schema, the authentication model, and the active endpoints with their expected inputs and outputs. Keep it to one page. If it's longer, you're capturing implementation details that belong in code comments, not context.

For evidence gates, add a smoke test to the checklist: one curl command per major endpoint that should return a 200. Not comprehensive integration tests—just enough to confirm the service is responding correctly before you consider the session complete.
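That curl loop translates directly into a script. A sketch, with hypothetical endpoint paths:

```python
# Smoke-test sketch: one GET per major endpoint, pass only on HTTP 200.
# Endpoint paths are examples; swap in your own.
import urllib.request

def smoke_test(base_url: str, endpoints: list[str]) -> bool:
    for path in endpoints:
        try:
            status = urllib.request.urlopen(base_url + path, timeout=5).status
        except Exception:  # connection refused, timeout, or non-2xx
            return False
        if status != 200:
            return False
    return True

# smoke_test("http://localhost:3000", ["/health", "/api/users"])
```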

Long-Running Projects

The three-file dev docs structure was designed for task-level work—features and bug fixes that complete in days. For projects that run months, add a fourth file: project-context.md.

project-context.md captures what doesn't change session-to-session: the architectural decisions, the tech stack choices and the reasons for them, the non-negotiable constraints, and the vocabulary the codebase uses. What the project calls "users" versus "accounts" versus "members" matters more than you'd expect.

This file gets read at the start of every session, before context.md. It's the stable foundation that prevents Claude from proposing changes that would violate architectural constraints established months ago.

The investment: 30 minutes once, saved every session for the life of the project.

Getting Started

Minimum Viable Setup

  1. Create a CLAUDE.md in your project root with basic gate enforcement
  2. Set up a dev/ directory for task documentation
  3. Add "continue" handling to resume after compaction

Full Setup

  1. Install the dev docs commands (slash commands or aliases)
  2. Configure hooks for automatic skill activation
  3. Set up build checking on Stop events
  4. Create workspace structure for multi-repo projects

The full system takes a few hours to configure. But it saves that time on every long task thereafter.


Related Guides

Claude Code Fundamentals

Foundational Concepts


The Bottom Line

Claude Code isn't just a code generator. With the right systems, it becomes a quality-controlled collaborator.

The goal isn't trusting AI less. It's trusting evidence more—and building systems that make "should work" impossible to accept.

Start with dev docs. Add the gate system. Implement progressive disclosure. Each piece builds on the last.

The AI was always capable. We just needed guardrails that made evidence the only path forward.
