Jeff Sinason

Posted on Apr 16 • Originally published at echoforgex.com on Apr 4

The Hidden Cost of Inline Code in Claude Code Command Files

#technical

If you’re building custom slash commands for Claude Code, there’s a good chance you’ve fallen into a trap that silently inflates your token costs and makes your command files harder to maintain. The culprit? Inline code blocks embedded directly in your .md command files.

We discovered this pattern in our own project governance system at EchoForgeX, and the numbers were eye-opening. In this post, we’ll break down the problem, show you the real cost, and walk through how to fix it — plus how to guide Claude Code away from creating this pattern in the first place.

The Pattern: Inline Python in Command Files

Claude Code slash commands live in .claude/commands/ as Markdown files. They contain instructions that Claude follows when you invoke them. When these commands need to interact with external tools — reading YAML plans, querying a catalog, updating task statuses — Claude tends to generate something like this:

cd /path/to/project && python3 -c "
import sys; sys.path.insert(0, 'tools/project-governance')
from planner import load_plan, get_phases, can_advance_phase
from pathlib import Path
plan = load_plan('{PROJECT_ID}', Path('tools/project-governance/plans'))
if not plan:
    print('Plan not found'); sys.exit(1)
phases = get_phases(plan.project_type)
phase_idx = phases.index(plan.current_phase)
print(f'Project: {plan.project_name} ({plan.project_id})')
print(f'Phase: {plan.current_phase} ({phase_idx + 1}/{len(phases)})')
# ... 40 more lines of formatting logic
"

This looks reasonable at first glance. The Python modules exist, the functions are real, and the output is useful. But when you step back and look at the full picture, the costs add up fast.

The Real Cost: We Measured It

We audited three command files in our project governance system: hire.md, manage.md, and plan-check.md. Here’s what we found:

File	Total Lines	Inline Python Lines	% Python
hire.md	230	79	34%
manage.md	311	167	54%
plan-check.md	98	1	1%
Total	639	247	~39%

Nearly 40% of our command file content was inline Python. That translates to roughly 8,700 characters — about 2,000+ tokens — loaded into Claude’s context window every single time one of these commands is invoked. And in manage.md, the status dashboard block alone was 78 lines and 2,599 characters of inline Python.
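The arithmetic behind these figures can be reproduced in a few lines of Python. This is a rough sketch: the line counts come from the table above, while the ratio of four characters per token is a common rule of thumb assumed here, not a measured property of Claude's tokenizer.

```python
# Line counts per command file from the audit table: (total lines, inline Python lines)
AUDIT = {
    "hire.md": (230, 79),
    "manage.md": (311, 167),
    "plan-check.md": (98, 1),
}

# ASSUMPTION: ~4 characters per token is a rough heuristic, not an exact
# property of any particular tokenizer.
CHARS_PER_TOKEN = 4


def inline_share(total_lines: int, inline_lines: int) -> float:
    """Return the percentage of lines that are inline Python."""
    return 100 * inline_lines / total_lines


def estimate_tokens(char_count: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate a token count from a character count using a fixed ratio."""
    return char_count // chars_per_token


total_lines = sum(total for total, _ in AUDIT.values())
total_inline = sum(inline for _, inline in AUDIT.values())

for name, (total, inline) in AUDIT.items():
    print(f"{name}: {inline_share(total, inline):.0f}% inline Python")
print(f"Overall: {inline_share(total_lines, total_inline):.0f}% ({total_inline}/{total_lines} lines)")
print(f"Estimated tokens for 8,700 characters: ~{estimate_tokens(8700)}")
```

Under that assumption, 8,700 characters lands a little above 2,000 tokens, which is consistent with the estimate quoted above.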

The worst part? Every block repeated the same boilerplate:

import sys; sys.path.insert(0, 'tools/project-governance')
from pathlib import Path
plans_dir = Path('tools/project-governance/plans')

That’s three lines of identical setup code repeated 16 times across our command files.
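A count like this can be automated. The sketch below is one possible approach and not necessarily how the audit above was performed: it assumes every inline block opens on a line ending with `python3 -c "` and closes on a line containing only a double quote. The helper name `count_inline_python` and the sample text are hypothetical.

```python
from collections import Counter


def count_inline_python(markdown_text: str) -> tuple[int, Counter]:
    """Count lines inside `python3 -c "..."` blocks and tally each line's text.

    ASSUMPTION: a block opens on a line ending with `python3 -c "` and closes
    on a line that is only a double quote. Real command files may differ.
    """
    inline_lines = 0
    line_tally: Counter = Counter()
    inside = False
    for raw in markdown_text.splitlines():
        line = raw.strip()
        if not inside and line.endswith('python3 -c "'):
            inside = True
            inline_lines += 1  # the opening line counts as part of the block
        elif inside:
            inline_lines += 1
            if line == '"':
                inside = False  # closing quote ends the block
            elif line:
                line_tally[line] += 1
    return inline_lines, line_tally


# Hypothetical command-file content with two inline blocks
sample = '''Show the status:
cd /path/to/project && python3 -c "
import sys; sys.path.insert(0, 'tools/project-governance')
from pathlib import Path
print('status')
"
Then advance:
python3 -c "
import sys; sys.path.insert(0, 'tools/project-governance')
from pathlib import Path
print('advance')
"
'''

total, tally = count_inline_python(sample)
repeated = {line: n for line, n in tally.items() if n > 1}
print(f"{total} inline lines, {len(repeated)} repeated boilerplate lines")
```

Any line that appears in more than one block is a candidate for extraction into shared code.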

Why This Happens

Claude Code generates inline code blocks for a practical reason: the command .md files need to be self-contained instructions. When Claude builds a command that needs to call external Python, the most direct approach is to embed the call inline. It works. It’s correct. And it’s how most developers would write a quick one-off script.

The problem is that command files aren’t one-off scripts. They’re prompt templates loaded into context on every invocation. Every character counts because every character becomes tokens, and tokens cost money and consume context window space that could be used for actual reasoning.

There are three specific costs:

Token bloat: ~2,000 extra tokens per invocation, across every conversation that uses these commands. Over hundreds of invocations, this adds up to real dollars.
Maintainability debt: When the output format needs to change, you’re editing inline Python embedded inside Markdown inside bash code fences. One misplaced quote breaks everything. And the same logic is duplicated across multiple files.
Reliability risk: Claude has to parse 78-line inline Python blocks and correctly substitute {PLACEHOLDER} values. Longer blocks mean more surface area for templating errors.

The Fix: Extract a CLI Layer

The solution is straightforward. The Python modules already have clean function APIs — planner.py, catalog.py, etc. The inline blocks are just glue code. Extract that glue into a proper CLI entry point.

Before: 78 Lines in the Command File

... 70 more lines of formatting and display logic

After: 1 Line in the Command File

python3 tools/project-governance/cli.py status {PROJECT_ID}

The CLI script (tools/project-governance/cli.py) handles imports, path setup, formatting, and output — once, in a tested Python file, not scattered across Markdown.

A typical CLI structure using Python’s built-in argparse:

#!/usr/bin/env python3
"""Project governance CLI: single entry point for all governance operations."""
import argparse
from pathlib import Path

# Running this file directly puts its own directory on sys.path, so the
# sibling modules import without any sys.path.insert boilerplate.
from planner import load_plan, list_plans, advance_phase
from catalog import search_profiles, rehire, create_profile

PLANS_DIR = Path(__file__).parent / "plans"
CATALOG_DIR = Path(__file__).parent / "catalog"

def cmd_status(args):
    plan = load_plan(args.project_id, PLANS_DIR)
    # All formatting logic lives here, tested and maintained in one place
    ...

def cmd_advance(args):
    plan = load_plan(args.project_id, PLANS_DIR)
    ok, msg = advance_phase(plan, PLANS_DIR)
    print(f"{'SUCCESS' if ok else 'BLOCKED'}: {msg}")

parser = argparse.ArgumentParser(prog="governance")
# required=True makes argparse reject a missing subcommand up front instead
# of failing later with an AttributeError on args.func.
sub = parser.add_subparsers(dest="command", required=True)

status_p = sub.add_parser("status")
status_p.add_argument("project_id")
status_p.set_defaults(func=cmd_status)

# ... additional subcommands

if __name__ == "__main__":
    args = parser.parse_args()
    args.func(args)

Every subcommand maps to one function. The command .md files shrink to thin orchestration scripts with one-liner shell calls.
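To make the one-function-per-subcommand pattern concrete, here is a self-contained sketch that runs without the real `planner` module. The `PLANS` dictionary, the `build_parser` and `main` helpers, and the sample project are hypothetical stand-ins for illustration and are not part of our governance tooling.

```python
import argparse
from typing import Optional, Sequence

# HYPOTHETICAL in-memory stand-in for the plans that planner.load_plan reads.
PLANS = {
    "demo": {"name": "Demo Project", "phase": "design", "phases": ["plan", "design", "build"]},
}


def cmd_status(args: argparse.Namespace) -> int:
    """Print a short dashboard for one project and return a shell exit code."""
    plan = PLANS.get(args.project_id)
    if plan is None:
        print("Plan not found")
        return 1
    position = plan["phases"].index(plan["phase"]) + 1
    print(f"Project: {plan['name']} ({args.project_id})")
    print(f"Phase: {plan['phase']} ({position}/{len(plan['phases'])})")
    return 0


def build_parser() -> argparse.ArgumentParser:
    """Wire each subcommand to exactly one handler function."""
    parser = argparse.ArgumentParser(prog="governance")
    sub = parser.add_subparsers(dest="command", required=True)
    status_p = sub.add_parser("status", help="Show project dashboard")
    status_p.add_argument("project_id")
    status_p.set_defaults(func=cmd_status)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse argv (sys.argv[1:] when None) and dispatch to the chosen handler."""
    args = build_parser().parse_args(argv)
    return args.func(args)


# Example invocation, equivalent to running: python3 cli.py status demo
exit_code = main(["status", "demo"])
```

Accepting `argv` as a parameter keeps the dispatch logic testable without spawning a process, and returning an exit code from each handler lets a one-liner in the command file fail visibly when a project is missing.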

The Impact

Metric	Before	After
Inline Python in command files	247 lines / 8,700 chars	~16 one-liners / ~1,200 chars
Tokens per invocation	~2,000+ extra	~300 extra
Places to update formatting logic	16 inline blocks across 3 files	1 CLI file
Runtime performance	No change	No change

Runtime performance stays essentially the same: python3 -c and python3 cli.py both pay the cost of starting a Python interpreter, so their startup times are comparable. The wins are entirely in token efficiency and maintainability.
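If you want to check the startup claim on your own machine, a rough timing sketch like the following can help. It is an illustration under stated assumptions, not a rigorous benchmark: the temporary `cli.py` is a hypothetical stand-in with a trivial body, and results will vary by system and load.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def time_command(cmd: list[str], runs: int = 5) -> float:
    """Return the average wall-clock seconds for running cmd `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True, capture_output=True)
    return (time.perf_counter() - start) / runs


with tempfile.TemporaryDirectory() as tmp:
    # HYPOTHETICAL stand-in for cli.py: a script with the same trivial body
    # as the inline snippet, so only the invocation style differs.
    script = Path(tmp) / "cli.py"
    script.write_text("print('ok')\n", encoding="utf-8")

    inline_avg = time_command([sys.executable, "-c", "print('ok')"])
    script_avg = time_command([sys.executable, str(script)])

print(f"inline: {inline_avg * 1000:.1f} ms, script: {script_avg * 1000:.1f} ms")
```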

Guiding Claude Code Away From This Pattern

The inline code pattern is Claude’s default behavior when it doesn’t know a CLI exists. You can prevent it with a few targeted interventions:

1. Add Rules to Your CLAUDE.md

Your project’s CLAUDE.md file is the most authoritative way to shape Claude Code’s behavior. Add explicit guidance:

## Command File Conventions

When creating or editing `.claude/commands/*.md` files:
- NEVER embed inline Python (`python3 -c "..."`) in command files
- Always call existing CLI tools or scripts instead
- If no CLI exists for the operation, create one in `tools/` first
- Command files should contain orchestration logic and one-liner shell calls, not application code
- Each bash block in a command file should be a single line where possible

2. Document Your CLI Tools

Claude Code reads your project structure. If your CLI tools have clear help text and are documented in CLAUDE.md, Claude will use them instead of reinventing the wheel inline:

## Available CLI Tools

| Command | Purpose |
|---------|---------|
| `python3 tools/project-governance/cli.py status <id>` | Show project dashboard |
| `python3 tools/project-governance/cli.py advance <id>` | Advance project phase |
| `python3 tools/project-governance/cli.py catalog search --role <r>` | Search agent catalog |

3. Use Feedback Memories

If you’re using Claude Code’s memory system, save a feedback memory the first time you correct this behavior. Claude will apply the guidance in future conversations without being told again:

## Feedback Memory Example
"Never embed inline Python in .claude/commands/*.md files.
Why: Inline code bloats token usage by ~2K tokens per invocation and
creates maintenance burden with duplicated logic across files.
How to apply: Always use CLI scripts in tools/ instead."

4. Review Generated Commands Before Committing

When Claude Code creates or modifies a command file, scan for python3 -c blocks before accepting the change. If you see one, ask Claude to extract it into a script. Once corrected, it typically won’t revert to the inline pattern in the same conversation.
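This review step can be partly automated. The sketch below is a hypothetical check, not a Claude Code feature: it walks a commands directory and reports every line containing `python3 -c`. The function name `find_inline_python` and the demo file contents are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def find_inline_python(commands_dir: Path) -> list[tuple[str, int]]:
    """Return (file name, line number) for each `python3 -c` occurrence."""
    hits = []
    for md_file in sorted(commands_dir.glob("*.md")):
        lines = md_file.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if "python3 -c" in line:
                hits.append((md_file.name, number))
    return hits


# Demo against a throwaway directory; in a real repository you would pass
# Path(".claude/commands") instead.
with tempfile.TemporaryDirectory() as tmp:
    commands = Path(tmp)
    (commands / "manage.md").write_text(
        'Run:\npython3 -c "\nprint(1)\n"\n', encoding="utf-8"
    )
    (commands / "plan-check.md").write_text(
        "Run:\npython3 tools/project-governance/cli.py status {PROJECT_ID}\n",
        encoding="utf-8",
    )
    hits = find_inline_python(commands)

for name, number in hits:
    print(f"{name}:{number}: inline Python found, extract it into a script")
```

A check like this could run before each commit, exiting with a non-zero status whenever `hits` is non-empty.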

Beyond Command Files: The Broader Lesson

This isn’t just about Claude Code command files. The same principle applies anywhere AI-generated Markdown contains embedded code:

GitHub Actions workflows with long inline scripts — extract to shell scripts in .github/scripts/
Documentation with embedded setup scripts — link to maintained scripts instead
Prompt templates with inline code examples — reference tested scripts by path

The pattern is always the same: if code is embedded in a document that gets loaded repeatedly, the cost compounds. Extract it once, reference it everywhere.

Key Takeaways

Inline code can make up a third to half of a Claude Code command file (34% and 54% of lines in our two largest files), much of it repeated boilerplate
Extracting to a CLI layer cuts ~85% of that overhead with zero runtime cost
Guide Claude Code’s behavior through CLAUDE.md rules, CLI documentation, and feedback memories
The same principle applies anywhere AI-generated content embeds code that’s loaded repeatedly

At EchoForgeX, we build AI-powered tools and help businesses integrate AI into their workflows. Get in touch to learn how we can help your team work smarter with AI.
