Claude Code is the fastest coder I've ever worked with. It can scaffold a feature, write tests, and open a PR in minutes. But I kept running into t...
I think that plan is solid in itself and tries to enforce some very important software engineering and QA practices.
That said, the assumption that you can make Claude think something in particular or assume a software architect role -- hell, that any LLM can think at all -- is misguided. Instead you're just trying to offset statistics in your favour. Statistics-based output will still bullshit you now and then, and you still won't know whether it does, or not. (I see you trying very hard to, with all that reference output for human oversight, but still: LLM slop remains, no matter how hard you push.)
I will still try (and perhaps keep) using this as part of my ever-growing push for quality.
My point is that caution should go up, not down, with increasingly sophisticated and convincing output.
It’s not a bulletproof solution to let AI build things for you. These are simply guardrails, grounded in old-school, battle-tested software architecture and development principles that guide the AI in the right direction.
Sure, I dress it up with a bit of marketing flair, otherwise where’s the fun? But details aside, it works far better than raw prompt slinging. The goal is simple: make your life easier while still respecting the fundamentals of good software engineering.
The "process over intelligence" framing is exactly right. I've been applying the same thinking to AI outside of coding — specifically to email communication.
Most AI email tools operate in what you'd call "junior mode": they see an email, they generate a reply, they fire. Fast, enthusiastic, and occasionally catastrophic (wrong tone to a client, missing context on a sensitive thread, confidently replying to something that needed a human pause).
Your 8-phase approach maps surprisingly well to non-coding AI workflows, and email in particular has a close equivalent for nearly every phase.
I'm building a Mac app (Drafted) that follows this exact philosophy. It reads your Gmail inbox, assesses confidence on each email (High/Medium/Low), and pre-drafts replies — but crucially, it never sends anything. You always review, edit, and hit send yourself.
The parallel to your CLAUDE.md approach: just like you encode process into the prompt so Claude doesn't skip steps, we encode process into the tool so the AI doesn't skip the "should a human look at this first?" step. The answer is always yes.
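To make the "human always reviews" gate concrete, here is a minimal sketch of that pattern in Python. All names (`Draft`, `assess_confidence`, `prepare_draft`) and the keyword heuristic are hypothetical illustrations, not Drafted's actual implementation — the point is only that the AI drafts and scores, while sending lives outside the tool entirely.

```python
from dataclasses import dataclass
from enum import Enum

class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

@dataclass
class Draft:
    reply: str
    confidence: Confidence
    sent: bool = False  # the tool never flips this; only a human does

def assess_confidence(email_body: str) -> Confidence:
    # Toy heuristic standing in for a model's self-assessment:
    # anything touching money, legal issues, or conflict is flagged
    # for careful human review; short routine emails score high.
    sensitive = ("contract", "refund", "complaint", "legal", "urgent")
    if any(word in email_body.lower() for word in sensitive):
        return Confidence.LOW
    return Confidence.HIGH if len(email_body) < 500 else Confidence.MEDIUM

def prepare_draft(email_body: str, generated_reply: str) -> Draft:
    # The AI drafts and scores -- it never sends. Sending is a
    # separate, human-initiated action that this function cannot reach.
    return Draft(reply=generated_reply,
                 confidence=assess_confidence(email_body))
```

The design choice worth noting: `sent` has no setter anywhere in the pipeline, which is the code-level version of "the answer is always yes, a human looks first."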
Curious if you've thought about applying the /wizard methodology beyond code — documentation, communication, operational workflows? The "think before you act" principle feels universally applicable.
Thanks for the kind words, and Drafted sounds like a genuinely thoughtful take on AI-assisted communication. The confidence-scoring layer is smart -- a lot of tools skip straight to drafting without asking whether this is even the kind of email that should be auto-drafted at all.
Funny you asked about applying the methodology beyond code -- I shipped something yesterday that does exactly that. Battle Mage is a Slack agent powered by Claude that answers questions about a GitHub codebase when you @mention it. It reads your repo in real time, follows up in threads, and even lets you correct it so it builds a shared knowledge base over time. For me it started as a way to help new users onboard to a product without drowning a small team in repetitive questions -- same "think before you act" principle, different domain.
Repo is here if you want to take a look: github.com/vlad-ko/battle-mage
Still needs some polishing, but I plan on announcing it to the world shortly.
P.S. I've got a whole magical AI army all of a sudden 😆 it's lore I can enjoy.
I couldn't agree with your thoughts more. What resonates most in this piece is the idea that the real bottleneck isn’t model intelligence—it’s the lack of process. The article makes this clear when it notes that Claude “defaults to junior mode… not because it lacks knowledge, but because it lacks process”. That distinction matters.
LLMs can generate code at incredible speed, but speed without structure just accelerates the path to subtle regressions, race conditions, and “it worked until it didn’t” failures. What /wizard does well is exactly what senior engineers do instinctively: slow down the beginning so the end goes faster. Planning, exploring the codebase, verifying assumptions, writing mutation‑resistant tests—these are the habits that prevent the 2am incidents described in the article, like the nullable datetime crash or the missing database lock.
But even with a strong process prompt, the human role doesn’t disappear. If anything, it becomes more important. The human is still the one holding the architectural context, the product intent, the long‑term tradeoffs, and the “should we even build this?” perspective. AI can execute steps, but it can’t yet own the big picture.
That’s why I see frameworks like /wizard as less about making AI autonomous and more about making AI reliable. The human sets the direction; the process keeps the AI from cutting corners; and the combination produces work that’s both fast and trustworthy.
In other words: intelligence is useful, but process is what makes intelligence safe and smart to use.
100% agree, Larry. This is exactly why I never let Claude Code work directly on main -- always a feature branch, always a PR, always a human review gate before anything merges. The process prompt keeps AI from cutting corners mid-task; the branch discipline keeps humans in the loop at every stage. Both layers matter.
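That branch discipline is easy to automate as a pre-flight check in whatever wrapper launches the AI session. A minimal sketch, assuming you invoke `git` via `subprocess` (the function names and the `feature/my-change` example are mine, not from any existing tool):

```python
import subprocess

PROTECTED_BRANCHES = {"main", "master"}

def current_branch(repo_dir: str = ".") -> str:
    # Ask git which branch is checked out in repo_dir.
    out = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def assert_safe_branch(branch: str) -> None:
    # The gate: refuse to start an AI coding session on a protected branch.
    if branch in PROTECTED_BRANCHES:
        raise RuntimeError(
            f"Refusing to work on '{branch}'. Create a feature branch "
            "first, e.g. git switch -c feature/my-change"
        )

# Typical use at the top of the wrapper script:
# assert_safe_branch(current_branch())
```

Keeping the check (`assert_safe_branch`) separate from the git call (`current_branch`) also makes the policy trivially testable without a repo.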
The think-then-plan-then-code flow is solid — worth noting that coupling the /wizard skill with spec-driven prompts could let you enforce architectural constraints before any code gets generated, not just testing conventions.
Great point -- spec-driven prompts as an upstream constraint layer is exactly the right direction. Think of it as guardrails before the guardrails. The EXPLORE phase in /wizard already nudges Claude to read existing patterns, but encoding architectural rules explicitly in a spec would make that much harder to accidentally violate. Worth experimenting with.
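One way such a spec could be encoded and enforced before any generated code touches disk: a small layering-rule checker. This is a hypothetical sketch — the rule format, layer names (`ui`, `db`), and function names are invented for illustration, not part of /wizard:

```python
# Each rule forbids one layer from importing another, e.g. the UI
# layer reaching directly into the database layer.
SPEC_RULES = [
    {"from_layer": "ui", "forbid_import": "db"},
    {"from_layer": "db", "forbid_import": "ui"},
]

def violations(module_path: str, imports: list[str]) -> list[str]:
    # Return human-readable violations for a proposed module, so the
    # check can run on AI-generated code before it is written to disk.
    layer = module_path.split("/")[0]
    problems = []
    for rule in SPEC_RULES:
        if layer == rule["from_layer"] and any(
            imp == rule["forbid_import"]
            or imp.startswith(rule["forbid_import"] + ".")
            for imp in imports
        ):
            problems.append(
                f"{module_path}: layer '{layer}' must not "
                f"import '{rule['forbid_import']}'"
            )
    return problems
```

Run against each proposed file's import list, a non-empty result blocks generation and gets fed back to the model as a correction — guardrails before the guardrails, exactly as described.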
Phase 7 is the one that resonated most. The adversarial review mindset — reviewing your own output as an attacker before shipping — applies well beyond code.
I built a Mac app that uses AI to draft email replies, and the hardest design decision was how much autonomy to give the AI. The answer: none for sending. The AI drafts, the human reviews, the human sends. Every time.
The confidence scoring system I built is basically a lightweight Phase 7 — the AI evaluates its own draft and flags how confident it is. High confidence means it nailed the tone and context. Low confidence means "read this carefully before hitting send."
Your line about "enthusiasm does not catch nullable datetime crashes" is perfect. In email, enthusiasm doesn't catch wrong names, misread tone, or confidently wrong recommendations. The adversarial review mindset applies everywhere AI touches human-facing output.
Bookmarking the /wizard skill — the 8-phase methodology is a solid framework for any AI-assisted workflow, not just coding.
Indeed. To be honest, this is largely the same approach I’ve always taken to software architecture and development, long before AI-assisted coding was even a distant possibility. The presence of AI on our side shouldn’t be an excuse to cut corners on tried and proven methodologies. If anything, it gives us a more sophisticated set of tools to follow those principles more consistently and enforce them more effectively.
Claude Code with a properly written CLAUDE.md is 10000000000000x better than this.
Before I installed this, I had clean, secure code.
After I installed this, I had to spend hours cleaning and securing the code.
You did something horribly wrong 😅