DEV Community


I've organised the Claude Code commands, including some hidden ones.

灯里/iku on February 14, 2026

Greetings from the island nation of Japan. In an era where we outsource our cognitive heavy lifting to silicon, keeping up with the relentless upd...
Ned C

the /insights command is genuinely underrated. i ran it after a month of use and realized i was manually doing things that could have been custom commands the whole time. the report basically told me "you type this exact prompt 4x a day, just make a slash command." felt called out.
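for anyone who hasn't set one up yet: a project-scoped slash command is just a markdown file under `.claude/commands/`, named after the command. a quick sketch of scripting that (the `review-pr` name and prompt here are made-up examples, not from the article):

```python
from pathlib import Path

def create_slash_command(project_root: str, name: str, prompt: str) -> Path:
    """Write a project-scoped Claude Code slash command.

    Claude Code picks up markdown files in .claude/commands/ as
    slash commands named after the file (foo.md -> /foo).
    """
    cmd_dir = Path(project_root) / ".claude" / "commands"
    cmd_dir.mkdir(parents=True, exist_ok=True)
    cmd_file = cmd_dir / f"{name}.md"
    cmd_file.write_text(prompt + "\n", encoding="utf-8")
    return cmd_file

# e.g. the prompt you keep retyping 4x a day:
# create_slash_command(".", "review-pr", "Review the current diff for bugs and missing tests.")
```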

the sub-agents vs agent teams comparison table is helpful too. i've been using sub-agents for dedicated test runners and it works well, but the token cost catches up fast if you're not watching /cost between runs. the tip about using Opus as commander + Sonnet for workers is the right call for keeping costs sane.

灯里/iku

@nedcodes
Hi Ned C! Thanks so much for the comment!!

I've honestly been overwhelmed by the sheer volume of Anthropic's official docs and all the recent info. There are still so many commands I didn't know about. It's a good reminder to go back to basics, read the primary sources carefully, and use them more often... yeah, I feel that lol.

I'm glad the comparison table was helpful! That makes me really happy!

Claude Code is excellent, but the costs are definitely something to watch out for, right...!

Using Opus as the commander and Sonnet for workers is a great tip, but I'm still a bit worried about relying only on Claude Code. So I've been experimenting with using OpenAI's Codex for the implementation parts. It might work well as a hybrid approach. (The downside is my AI subscriptions keep piling up... haha)

Ned C

the hybrid approach with Codex for implementation is interesting. do you treat each tool as its own isolated step, or is there some context handoff between them?

and yeah, the subscription creep is real. i started tracking monthly AI spend separately just to stay honest with myself about it.

灯里/iku

@nedcodes
Great question! It's not fully automated yet, but it's more than just isolated steps.

Here's what I've been experimenting with:

  1. Claude plans, Codex implements
    I use Claude (Opus) for high-level design and planning, then hand that off to Codex for implementation. Practically, I run both side by side using tmux — split panes in VS Code's terminal, Claude Code on the left, Codex CLI on the right, same project directory. So context lives in the shared codebase and plan files, no manual copy-pasting needed.

  2. Parallel runs for cross-review
    Same tmux setup — Claude Code as the main driver (e.g., large refactoring) and Codex as a second opinion running in parallel. They catch different kinds of bugs, which is the whole point.

  3. Orchestration via Agent SDK (exploring)
    The dream is using the Agent SDK to orchestrate Claude as the planner and Codex as the coder automatically, with context passed via the SDK. Still early stage for me, but the potential is exciting.
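For what it's worth, the two-pane layout from 1 and 2 can be scripted so it comes up the same way every time. A rough sketch (the session name and launcher commands are my own example; pane targets like pair:0.0 assume tmux's default base-index of 0):

```python
import subprocess

def tmux_pair_commands(session: str, project_dir: str,
                       left: str = "claude", right: str = "codex"):
    """Build the tmux calls for a side-by-side Claude Code / Codex layout.

    Both panes start in the same project directory, so the shared
    codebase acts as the context layer between the two tools.
    """
    return [
        ["tmux", "new-session", "-d", "-s", session, "-c", project_dir],
        ["tmux", "split-window", "-h", "-t", session, "-c", project_dir],
        ["tmux", "send-keys", "-t", f"{session}:0.0", left, "Enter"],
        ["tmux", "send-keys", "-t", f"{session}:0.1", right, "Enter"],
    ]

def launch(session: str = "pair", project_dir: str = "."):
    for cmd in tmux_pair_commands(session, project_dir):
        subprocess.run(cmd, check=True)
    # afterwards: tmux attach -t pair
```

Running launch() and then attaching gives you Claude Code on the left and Codex on the right, same directory, no copy-pasting.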

Honestly, part of it is also a philosophical thing — I don't want to depend too heavily on a single company's model. Same as in the real world, right? One person can't solve everything. Getting a "second perspective" from a different model catches blind spots. It's like applying "the right person for the right job," but for AI models lol.

Personality-wise too, Claude is more chatty and has great vibes for conversation, while Codex is more of a serious worker type. Both have their strengths!

And yeah, the subscription creep is painful... tracking it separately is smart, I should do that too haha.

Here's a Medium article that covers the Claude Code vs Codex comparison well if you're interested:
blog.ivan.digital/claude-code-vs-o...

Ned C

the tmux split-pane setup is practical. i've been thinking about similar workflows where you keep both agents in the same project directory and let the shared codebase be the communication layer instead of trying to pipe context between them programmatically. the Agents SDK orchestration angle is worth exploring, especially if you can define clear boundaries for what each model handles. curious whether you've hit cases where Claude and Codex disagree on approach and how you resolve that. also good call on not depending on a single provider, it's something i think about more now.

灯里/iku

@nedcodes
thanks for the thoughtful comment! the "shared codebase as communication layer" framing is exactly how i think about it too. no fancy piping, just let the filesystem be the interface.

on the Claude vs Codex disagreement question, great timing, i've been meaning to write this up lol

honestly, it's less about "who's right" and more about "whose perspective fills the gap." the models reflect their makers' philosophies more than you'd expect.

here's what i've noticed so far (still experimental, grain of salt etc):

  1. top-down vs bottom-up
    Codex tends to think architecturally first. flags structural issues early and pushes for refactoring before you write more code. Claude jumps in and starts building fast, which feels productive until you hit a wall of edge cases you didn't plan for. for bigger features, Codex's "slow down and think" approach usually wins out.

  2. over-engineering vs shortcuts
    Claude's failure mode is over-abstraction. too many layers, too much modularity for what you actually need. Codex goes the opposite way, cuts corners, skips edge cases. so i literally cross-review: feed Claude's output to Codex and vice versa. they catch each other's blind spots surprisingly well.

  3. greenfield vs precision work
    for creative/new features, Claude moves fast and generates ideas. but Codex sometimes ships a more "complete" result out of the box (tried making a 2D platformer with both. Codex auto-generated sprite cleanup, Claude didn't even build the floor lol). for infra or anything requiring precision, both struggle, but Codex grinds through test-fix cycles longer before giving up.

  4. planning style
    Claude gives you clean markdown with actionable snippets. Codex generates strict XML-style architecture docs, thorough but harder to read. personally i prefer Claude's style for day-to-day work, it just feels more... human to work with.

  5. so how do i resolve disagreements?
    you don't pick a winner. you treat it like a code review between two engineers with different backgrounds. not depending on a single provider isn't just a resilience thing, it genuinely produces better output when you let them challenge each other.

  6. hybrid workflow
    that's exactly why a hybrid approach is working well for me right now. something like: Claude for planning → Codex for review & implementation → Claude for final check. each model plays to its strengths in sequence.

  7. main brain vs sub brain?
    comes down to your preference and what you're trying to build. no universal right answer. i switch the lead role depending on the task, and that flexibility is part of the fun.

that said, this is my personal answer as someone who can freely pick tools. in a business context? often you don't get to choose. company policy, compliance, contracts, etc. can lock you into one provider. so the practical answer is... it depends lol

  8. work in progress
    i've been intentionally throwing the same tasks at both models to compare, and there's still a lot of testing to do. every model update from each company can shift the balance. what's true today might not hold in a few months.

still exploring, still learning. this turned out longer than expected lol. might be worth its own article at some point 😄
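for the cross-review in 2, the wiring is trivial once each CLI is treated as a plain prompt-in/text-out function. a toy sketch (the runner callables are placeholders; in practice they might shell out to claude's print mode or the codex CLI):

```python
def cross_review(task, run_a, run_b):
    """Each model critiques the other's draft of the same task.

    run_a / run_b: any prompt -> text callables (real ones would
    invoke the Claude Code or Codex CLIs under the hood).
    """
    draft_a = run_a(task)
    draft_b = run_b(task)
    return {
        "draft_a": draft_a,
        "draft_b": draft_b,
        "b_reviews_a": run_b(f"Critically review this solution:\n{draft_a}"),
        "a_reviews_b": run_a(f"Critically review this solution:\n{draft_b}"),
    }
```

plugging in stub functions shows the shape; the human still reads both reviews and makes the call.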

Ned C

this is a really solid breakdown. the cross-review pattern where you feed one model's output to the other is something i want to try more deliberately. the personality difference you mention (Claude chatty, Codex serious worker) maps to what i've seen too. curious if you've hit cases where their architectural disagreements were both wrong, or does one usually end up closer to the right call?

灯里/iku

@nedcodes
That's a rather intriguing question!

Short answer: one usually ends up closer to the right call. But there are some fun failure patterns.

  1. Both wrong in the same direction (too optimistic)

I design AI-integrated workflows for organizations. Every company has a different daily stack: Slack for chat, SharePoint for docs, Salesforce for CRM, Google Drive for storage... often a beautiful mess with no clean integration. Very common in Japan, probably everywhere though lol.

When I ask both Claude and Codex to plan architectures for these environments, they both propose elegant solutions that assume humans will actually follow the new workflow. They seriously underestimate how lazy people are. Both end up too idealistic about adoption.

  2. Both wrong in complementary ways (this one's sneaky)

Their different mistakes don't cancel out.

  • Distributed systems: Codex piles on unnecessary config and endpoints (bloat), while Claude proposes fast implementations that ignore race conditions. Both "look like they work" but are fundamentally broken.
  • Large-scale refactoring: Claude suggests beautifully modular code that doesn't scale, Codex produces conservative rewrites of outdated patterns. Both miss the real bottleneck (e.g., a denormalized DB schema that crashes in production).

It's not "same mistake twice." It's "different mistakes that don't offset each other." Root cause? Both hallucinate with confidence.

  3. What I actually do about it
  • Feed both models detailed context about the client's existing tools and daily habits. Anchors them in reality instead of theory.
  • Split PDCA: Plan and Act stay with the human, Do and Check get delegated to AI. Human stays the architect and final judge.
  • Same task to both models, let them "debate," then you make the call. This alone cuts failure rates dramatically. This is why I side-eye the "automate everything!" hype a bit. The real value of cross-review isn't picking the winner. It's that disagreement itself is a signal: when they diverge, think harder, don't just pick one.

It all boils down to the basics: context is key.
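Since "disagreement is a signal" came up: even a crude mechanized check can route diverging answers to a human instead of auto-picking a winner. A toy sketch (the text-similarity metric and the 0.6 threshold are arbitrary assumptions to tune):

```python
import difflib

def divergence_signal(answer_a: str, answer_b: str,
                      threshold: float = 0.6) -> bool:
    """Flag when two model answers diverge enough to warrant human review.

    Uses a rough character-level similarity ratio; a real pipeline
    would compare plans semantically, but the escalation logic is
    the same: low agreement -> a human thinks harder.
    """
    ratio = difflib.SequenceMatcher(None, answer_a, answer_b).ratio()
    return ratio < threshold
```

When this returns True, that's exactly the "when they diverge, think harder" moment, not a tiebreaker to automate away.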

Ned C

the "both wrong in complementary ways" failure mode is the one i hadn't thought through. i was assuming cross-review works because disagreement is a signal, but if they're both confidently wrong in different directions you just end up with two plausible-looking bad answers instead of one. that's way harder to catch than one obviously wrong output. your PDCA split where the human stays as architect makes more sense for that scenario than trying to automate the tiebreaker

灯里/iku

@nedcodes
haha yeah exactly!
When both outputs look plausible, you actually let your guard down. that's the sneaky part.

and to be clear, i'm not against automation at all! i just think LLMs genuinely can't tell when they're wrong, so someone's gotta cover that blind spot. it's less about control and more about... caring for the process, i guess?

we're still super early in figuring out how humans and AI work together. i'd love to keep exploring what that looks like with people who actually think about it like you do;)

Ned C

the tricky part is that the failure modes are different from what we're used to, so we don't even have good instincts for when to double check yet. i think that's what makes the "both wrong in complementary ways" thing so dangerous, you can't just pattern match your way out of it

灯里/iku

@nedcodes
i tried to reply in the thread but dev.to said "you two talk too much" lol.
so continuing here!

good morning! glad to be part of your coffee routine now haha.
it's 11pm here so we're on opposite ends of the day.

and yeah, that's exactly it. when two outputs are both wrong in complementary ways, pattern matching won't save you. the instincts we built from reviewing human code just don't transfer cleanly.

that's exactly why the human has to stay in the loop, not just as a user, but as the one who guards the process and makes the judgment calls. no model can own that responsibility yet, and personally, even if someday a model gets praised for being "reliable enough," i still think accountability has to stay with humans in any real business context.

for hobby projects, sure, let AI take the wheel and see what happens, that's half the fun! but at the end of the day, it's still a human who receives the result and decides "okay, what now?" lol

which is honestly why i think engineers aren't going anywhere. the role shifts, but it shifts toward something harder — redesigning architectures that have AI-generated code baked in, maintaining systems where you didn't write half the code, and knowing when to trust the output and when to push back. structure over instinct becomes the whole game.

anyway, have a great day at work~!

Ned C

ha, dev.to putting a cap on our thread is kind of on brand for what we were talking about. guardrails you didn't ask for.

the "structure over instinct" framing is the cleanest way i've heard anyone put it. i think that's the part that frustrates people who've been writing code for years. the instincts they spent a decade building don't transfer, and rebuilding from scratch feels like starting over even though it isn't.

anyway have a good night, i'll probably be overthinking this thread tomorrow morning with coffee again.

灯里/iku

@nedcodes
honestly the thread cap is the most poetic guardrail we could've asked for.
at least this one came with a natural stopping point instead of a cryptic yaml error.

"rebuilding from scratch feels like starting over even though it isn't": that's the quieter truth underneath all the gatekeeping noise. worth sitting with.

there's a lot of talk about "AI fatigue" and "AI detox" going around, and i get it.
but maybe the fun part isn't figuring out what's next; it's redefining what we want to be. less "what do we solve" and more "who do we become." now i understand why Anthropic has been hiring philosophers. maybe they knew we'd end up here.

you with your coffee, me with my matcha. different drinks, different timezones, same questions. not a bad way to spend a morning.

Julien Avezou

What a great reference for useful Claude Code commands, thanks for sharing. I agree about Plan Mode, I find it so useful for subsequent implementation.
I will try using /insights more, that command does indeed seem valuable.

灯里/iku

@javz
thanks Julien! glad the reference is useful. Plan Mode is seriously underrated, it changed how i approach implementation too.

for /insights, i'd say find a rhythm that works for you rather than overdoing it. running it too frequently tends to surface the same patterns. personally, every couple of weeks works well, or natural breakpoints: finishing a project milestone, wrapping up a big feature, or a round of GitHub pushes. whatever timing fits your workflow, that's the right one. hope it serves you well~!