Tatsuya Shimomoto

Posted on Feb 22 • Edited on Jul 1

The Real Power of Claude Code Isn't Code Generation — It's Autonomous Context Orchestration

#discuss #ai #developertools #productivity

"Let AI write your code." That's the first thing people say about Claude Code. It reads files on its own, writes tests, and gets the build to pass. But after a month of daily use, I realized that code generation is not what makes Claude Code valuable.

The real value lies in autonomous context orchestration.

A Session Where Not a Single Line of Code Was Written

The other day, I published an article that had been sitting as a draft. Here's everything that happened:

Auto memory detected that "the origin article is still a draft"
schedule.json was read to understand the cross-posting URL scheme
Other short articles were read to match the tone and structure
The article was revised and passed through lint
Posted to Qiita, translated to English, cross-posted to Dev.to and Hashnode
schedule.json was updated, committed, and pushed

Zero lines of code were written. It called the existing publish.py, checked against the existing lint config, and updated schedule.json following its existing format. What Claude Code actually did was read scattered context and call the right tools in the right order.

Code Generation AI vs. Context Orchestration AI

This realization crystallized during that cross-posting session. At first, I'd been using Claude Code as "a tool that writes code faster." But the more I used it, the clearer it became that Claude Code doesn't spend most of its time writing code — it spends it deciding what needs to be done.

A code generation AI is an input-output converter. Say "write a sort function" and you get a sort function. Input is a prompt, output is code.

A context orchestration AI is different. From a vague instruction like "pick up where I left off," it autonomously does the following:

Identifies what "where I left off" means — restores the current state from memory, config files, and Git history
Determines what's needed — references existing tools, templates, and style guides
Researches when necessary — reads files, matches patterns, fills in missing information
Assembles the correct sequence of steps — determines execution order considering dependencies

In other words, it's a chain of context access, interpretation, and judgment. Writing code is just a small part of the result.

Five Layers of Context That Enable Autonomy

What makes this autonomous orchestration possible is a stack of accumulated context layers.

Layer 1: ~/.claude/rules/          ← Global rules (21 files)
Layer 2: ~/.claude/skills/         ← Global skills (33 total)
Layer 3: ~/.claude/agents/         ← Specialized agents (14 total)
Layer 4: MyAI_Lab/CLAUDE.md        ← Workspace config
Layer 5: {project}/                ← Project-specific
         ├── CLAUDE.md             ← Project config
         ├── .claude/skills/       ← Project skills
         ├── .claude/agents/       ← Project agents
         └── auto memory           ← Cross-session memory

Layers 1 through 5 are automatically loaded at session start. Claude Code knows "what to do and how to do it in this project" from the very first interaction.

The key is that each layer serves a different role.

Layer	What It Provides	Where It Made a Difference
rules	Judgment criteria and constraints	Stopped needing to say "run a security check" before every commit
skills	Procedures and know-how	Stopped needing to explain the cross-posting workflow every time
agents	Specialized perspectives	A strict editor agent caught AI slop in my articles
CLAUDE.md	Project-specific context	Builds passed without being asked about the tech stack
auto memory	Cross-session continuity	The `:::message` textlint false positive was instantly avoided from the second time onward

Code generation only needs skills. But correct judgment requires all of them — rules, agents, CLAUDE.md, and memory.

The Value of "Remembering"

With conventional AI assistants, every session starts from scratch.

Session 1: "textlint falsely flags :::message in Zenn" → workaround discovered
Session 2: Same problem encountered again → back to investigating from scratch

With auto memory, it goes like this:

Session 1: Workaround discovered → recorded in memory
Session 2: Workaround read from memory → applied immediately

This isn't just "convenient." It directly affects reproducibility and consistency. Avoiding gotchas, maintaining naming conventions, using tools correctly — whether the same quality is maintained across sessions depends on memory.

Memory handles cross-session continuity. But what about cross-project boundaries?

Context Transfer Across Projects

Context also crosses project boundaries. I've used it to capture insights from one project and convert them into article drafts in another, on the spot.

Recently, I spent two days debugging an AI research pipeline. Mem0 MCP hangs, TTY conflicts between gtimeout and claude -p, symlink disappearance from auto-updates — it was an incident response where the root cause kept shifting.

Every time a batch of bugs was resolved, I gave this instruction:

me: Gather the context from this debugging session
    and save it as a draft in zenn-content.

Claude Code collected context from within the daily-research project session. Error logs, hypotheses tried, reverted diffs, rationale behind each fix. It saved all of it to zenn-content/articles/ with Zenn frontmatter and published: false.

After three rounds of this, the draft contained:

The fact that Mem0 MCP worked in interactive sessions but hung with claude -p, along with the debugging process
Specific examples of Claude Code's cognitive bias — assuming "the cause must be in my own recent changes"
Three human interventions where I said "yesterday's logs are noise, focus only on today's success logs"

Later, I used this material as the basis for a postmortem article.
 I never had to think "what was that error message again?" The exact error text, the order of trial and error, the reasons for rejected hypotheses — everything was preserved as context.

Some context only exists during implementation. Hours later, the surrounding details fade. By the next day, you can't reconstruct "why that decision was made." Being able to capture that freshness in a structured form is the practical value of context orchestration.

More Context Means Shorter Sessions

Paradoxically, the richer the context becomes, the shorter each session gets.

Here's an early session (no context):

me: Cross-post this article to Qiita
Claude: Let me look up the Qiita API... Where's the token?
        I'll write the frontmatter conversion logic...
        → 30+ minutes

Here's a current session (with context):

me: y
Claude: publish.py → Qiita → translate → Dev.to → Hashnode → update schedule.json
        → 2 minutes

Context absorbs the cost of repeated explanations. Write coding conventions in rules and you never have to state them again. Write procedures in skills and you never have to teach them again. Record gotchas in memory and you never have to hit them again.

Context Engineering Is Not "Prompt Engineering"

A natural objection: "Isn't this just clever prompting?"

No. Prompt engineering optimizes a single input-output exchange. Context engineering builds a knowledge foundation that spans all sessions.

	Prompt Engineering	Context Engineering
Scope	One request	All sessions
Stored in	The prompt itself	The file system
Accumulates	No	Yes (memory, learned skills)
Automation	Manual	Partially automated (continuous-learning)

The continuous-learning skill automatically extracts patterns discovered during sessions and saves them to learned/. In the next session, those are loaded as skills. A feedback loop where context grows on its own.

Getting Started: A Practical Path

You don't need to build all context layers at once. Stack them in order of impact.

1. Write a CLAUDE.md (5 minutes)

Just write down the project's tech stack, build commands, and key conventions. The per-session explanation cost vanishes.

2. Be Mindful of Auto Memory (0 minutes)

Launch Claude Code inside the project directory. That's it — memory gets scoped to the project. Be careful not to launch from the workspace root, or memory gets scattered (details here).

3. Turn Repeated Procedures into Skills (15 min each)

Any procedure you've repeated three or more times belongs in .claude/skills/ as a SKILL.md. Claude Code references them automatically.

4. Turn Judgment Criteria into Rules (10 min each)

"Always write tests," "run a security check before committing" — any instruction you find yourself repeating goes into ~/.claude/rules/.

Conclusion

Looking back at a month of trial and error, what I'd actually been doing wasn't "writing better prompts." It was building a knowledge foundation that lets Claude Code make autonomous decisions.

Writing judgment criteria in rules made repeated instructions disappear
Writing procedures in skills made cross-posting completable with a single y
Recording gotchas in memory meant the same trap was never hit twice
Requesting drafts during debugging in another project produced fresh material that became an article

The thicker the context layers get, the more you can accomplish per session, the less you need to explain, and the more consistent the quality becomes. This kind of accumulation is qualitatively different from prompt crafting. It doesn't disappear when a session ends, it works across projects, and it becomes more effective over time.

From "let AI write code" to "AI reads my context, makes judgments, and acts." When this perspective shifted, my entire approach to Claude Code changed fundamentally.

DEV Community