What if Claude Code agents could configure Claude Code infrastructure for any project -- automatically? We built exactly that: a 12-step pipeline where AI agents analyze a codebase and generate complete agent teams, hooks, skills, and slash commands in 30-55 minutes. Across three production migrations, the second project was more complex than the first yet finished in fewer sessions.
The Problem
Claude Code ships with powerful infrastructure: agent definitions, hooks, skills, slash commands, and settings. Most developers use none of it. Configuring a proper Claude Code project takes a full day for an expert. Most people write a basic CLAUDE.md and stop -- getting maybe 20% of Claude Code's potential.
Migration is even harder. An existing codebase has established patterns, implicit conventions, and domain knowledge buried in code that needs to be extracted into Claude Code infrastructure. We asked: what if Claude Code agents could do this work themselves?
What We Built
A meta-framework: Claude Code agents that generate Claude Code agent infrastructure. Point it at an existing codebase (migration mode) or give it a plain-English project description (greenfield mode), and it produces a complete .claude/ configuration tailored to that specific project in 30-55 minutes.
The framework contains no project-specific code. It contains knowledge about how to build Claude Code configurations: 17 reusable skills, 12 slash commands, 17 hook templates, and over 1,000 lines of methodology refined through real production use.
The Three-Folder Architecture: The framework reads the source project but never modifies it. All generated infrastructure lands in a fresh target project. This READ-ONLY invariant held across 18 sessions and was never violated.
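A hook is how the READ-ONLY invariant can be enforced deterministically rather than by prompt. The sketch below shows the shape of such a check, assuming a PreToolUse-style hook that receives the tool name and input; the path and tool names are illustrative, not the framework's actual configuration.

```python
"""Sketch of a read-only guard for the source project. In a real
Claude Code PreToolUse hook, the event arrives as JSON on stdin and a
blocking exit code stops the tool call; here we show only the decision
logic. SOURCE_PROJECT and the tool names are assumptions."""
import os

SOURCE_PROJECT = "/projects/source-app"        # hypothetical source path
WRITE_TOOLS = {"Write", "Edit", "MultiEdit"}   # tools that can modify files

def should_block(event: dict) -> bool:
    """Return True when a write tool targets the read-only source project."""
    if event.get("tool_name") not in WRITE_TOOLS:
        return False                           # reads are always allowed
    target = event.get("tool_input", {}).get("file_path", "")
    # Normalize so ../ tricks cannot escape the check
    root = os.path.normpath(SOURCE_PROJECT) + os.sep
    return os.path.normpath(target).startswith(root)
```

Because the check runs on every tool call, the invariant holds even if an agent's prompt context has long since scrolled past the instruction.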
Before and After
| Before | After |
|---|---|
| Full day of expert configuration per project | 30-55 minute automated pipeline |
| Zero security infrastructure on most projects | Full OWASP Top 10 for Agentic Applications coverage |
| No domain knowledge retention between sessions | Skills provide 140x token efficiency via progressive disclosure |
| No quality enforcement beyond "remember to lint" | Hooks enforce linting, testing, and security on every tool call |
| Each project starts from scratch | Each migration makes the framework smarter |
Key Results
| Metric | Value |
|---|---|
| Production migrations validated | 3 (textToSql-metabase, obsidian-youtube-agent, dotzlaw.com) |
| Hook templates | 17 covering safety, quality, and security |
| Reusable skills | 17 with progressive disclosure architecture |
| Pipeline steps | 12 with parallel execution paths |
| OWASP coverage | 10/10 items addressed |
| Validation checks | 50+ structural and coherence checks before delivery |
| File conflicts across 18 sessions | 0 thanks to agent ownership boundaries |
Deep Dive: Compound Returns Across Three Migrations
The framework's core thesis: each migration makes the next one faster, even when the next project is more complex.
Migration 1: textToSql-metabase -- A text-to-SQL dashboard (FastAPI, React, Metabase, Qdrant, MS SQL Server). 45 Python files ported across 10 sessions. 168 print() statements eliminated. 223 unit tests created from zero. 7 anti-pattern categories fixed during migration. The framework itself was built during this migration -- 3 sessions just for framework knowledge base construction.
Migration 2: obsidian-youtube-agent -- A YouTube-to-Obsidian AI pipeline (FastAPI, React, PostgreSQL, Qdrant, Anthropic Claude). 67 Python files -- more complex than Migration 1. Completed in 8 sessions, not 10. The framework build phase (3 sessions in Migration 1) dropped to zero on reuse. The most dramatic change: the Anthropic Batch API (4+ hour waits, opaque failures) was replaced entirely with asyncio.TaskGroup parallel processing -- seconds per video instead of hours per batch.
Migration 3: dotzlaw.com -- A WordPress-to-Astro migration. 41 articles extracted from a SQL backup file (no live admin access), 187 images redistributed from WordPress's flat upload structure to per-article co-located folders, and a design-matched dark theme rebuilt from scratch. The framework contributed methodology and skills, but the bulk of the work was content transformation and visual design -- domains the framework guides rather than automates.
Compound Returns: Migration 2 was more complex (67 files vs 45, AI/ML integration, full architectural redesign) but completed in fewer sessions. The 3-session framework investment from Migration 1 paid for itself immediately and continues paying on every subsequent project.
Deep Dive: Defense-in-Depth Security
After two production migrations, a security audit against the OWASP Top 10 for Agentic Applications found 11 concrete gaps -- not theoretical risks, but specific vulnerabilities with defined attack paths. We closed all 11 across 14 tasks in 4 phases.
The security architecture uses four concentric defense rings:
- Ring 1 (Per-call): Input sanitization (22 patterns), security scan (17 patterns, two-tier enforcement), rate limiting (per-tool thresholds), artifact validation (JSON Schema), audit logging (JSONL metadata-only)
- Ring 2 (Trajectory): Heartbeat checkpoint every 25 calls detecting 5 anomaly patterns, watchdog timers per pipeline step, optional trajectory analysis agent
- Ring 3 (Structural): File ownership boundaries, tool restrictions, 72 blocked commands, three-folder architecture
- Ring 4 (Session): Pre-commit secrets scanning, 5 hygiene checks, stop hooks, security review step
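Ring 1's two-tier security scan can be illustrated with a minimal pattern matcher. The two pattern lists below are invented examples, far shorter than the framework's 17-pattern set; the point is the block/warn split, where hard-stop patterns reject the call outright and softer ones are logged but allowed.

```python
"""Minimal sketch of a Ring 1 per-call security scan with two-tier
enforcement. Pattern lists are illustrative placeholders."""
import re

BLOCK_PATTERNS = [r"rm\s+-rf\s+/", r"curl[^|]*\|\s*(ba)?sh"]  # hard stop
WARN_PATTERNS = [r"chmod\s+777", r"--no-verify"]              # log, allow

def scan(command: str) -> str:
    """Classify a proposed shell command as 'block', 'warn', or 'ok'."""
    if any(re.search(p, command) for p in BLOCK_PATTERNS):
        return "block"
    if any(re.search(p, command) for p in WARN_PATTERNS):
        return "warn"
    return "ok"
```

In a real hook the "block" branch would exit with a blocking status and the "warn" branch would append a JSONL audit record before letting the call through.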
Per-archetype security patterns cover all 7 project types: Python FastAPI, React Vite, SSG/Astro, Node.js Express, AI/ML, Fullstack, and CLI tools. Each archetype gets security hooks tailored to its specific threat surface.
Defense in Depth: Four concentric rings protect the pipeline. Each ring catches what the others miss. The architecture operates across 4 timescales -- from sub-millisecond per-call hooks to session-level pre-commit scans.
Lessons Learned
The highest-leverage work is improving the framework itself. Every capability added benefits every future project. The cost is paid once; the return compounds indefinitely.
Migration is an opportunity to fix architecture, not just port code. When a component is demonstrably failing, redesign it during migration rather than porting the failure and planning a future rewrite that never happens.
Hooks are the only deterministic control in a probabilistic system. Prompt instructions achieve ~90% compliance. Hooks achieve 100%. For security-critical behavior, "usually works" is not acceptable.
Information asymmetry must be enforced by architecture, not by prompts. If you tell an agent "don't look at another agent's files," it eventually will. If a hook blocks the file read, it cannot.
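Architectural enforcement of information asymmetry can be as simple as an ownership map consulted on every read. The agent names and path globs below are invented for illustration; the framework's real ownership boundaries are project-specific.

```python
"""Sketch of file ownership boundaries enforced per tool call: each
agent may only read paths matching its own globs. Names and globs are
hypothetical examples."""
from fnmatch import fnmatch

OWNERSHIP = {
    "backend-agent": ["src/api/*", "tests/api/*"],
    "frontend-agent": ["src/ui/*", "tests/ui/*"],
}

def may_read(agent: str, path: str) -> bool:
    """An agent with no matching glob is denied by default."""
    return any(fnmatch(path, pat) for pat in OWNERSHIP.get(agent, []))
```

A prompt asking agents to respect these boundaries degrades over a long session; a hook calling `may_read` on every access does not.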
Honesty builds more credibility than perfection. We found 11 security gaps in our own production framework. Publishing the gaps and the fixes earned more trust than claiming it was secure from the start.
Read the Full Series
This cross-post covers the highlights. The full 4-part article series goes deep on architecture, self-improvement, security hardening, and a real WordPress-to-Astro migration case study.
- Part 1: An Agent Swarm That Builds Agent Swarms -- Two production migrations prove the concept
- Part 2: From Prototype to Platform -- The framework improves itself using its own methodology
- Part 3: Securing Agentic AI -- 11 gaps found, 11 gaps closed, 10/10 OWASP coverage
- Part 4: WordPress to Astro -- The third migration and an honest assessment of what worked
Built by Gary, Katrina, and Ryan Dotzlaw