If the person who invented the oven waits for it to heat properly, you do the same.
If the camera designer adjusts the lighting settings, you do the same.
If the car maker buckles the seatbelt first, you do the same.
Hey I’m Karo 🤗
AI product manager, Substack Bestseller, and someone who believes the most valuable lessons come from people who invent tools and then use them themselves.
This article was originally published here.
Boris Cherny is a Staff Engineer at Anthropic who helped build Claude Code. Today we’re going to look at how he uses it, what he does, what he avoids and, more importantly, what it means.
Link to original tweet.
#1 He Runs a Fleet of Claudes in Parallel
What he does
He keeps roughly 10–15 concurrent Claude Code sessions alive: 5 in the terminal (tabbed, numbered, with OS notifications), 5–10 in the browser, plus mobile sessions he starts in the morning and checks in on later. He hands sessions off between environments and sometimes teleports them back and forth.
What this means
Why this matters: Boris doesn't see AI as a tool you use, but as a capacity you schedule. He’s distributing cognition like compute: allocate it, queue it, keep it hot, switch contexts only when value is ready.
The bottleneck isn’t generation; it’s attention allocation.
How he sees agents: Each session is a separate worker with its own context, not a single assistant that must hold everything. The “fleet” approach is basically: don’t make one brain do all jobs; run many partial brains.
What this means for vibecoders: This goes beyond prompt engineering or context engineering and into pipeline design: multiple pieces in flight, and you only touch what’s ripe.
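To make "multiple pieces in flight, touch only what’s ripe" concrete, here is a minimal Python sketch. It does not call Claude Code at all: `run_session` is a hypothetical stand-in that just sleeps, and the task names and durations are made up. The only point is the scheduling pattern: start everything, then spend attention on whichever piece finishes first.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_session(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one long-running agent session."""
    time.sleep(seconds)  # simulates the agent working unattended
    return f"{name}: ready for review"


def review_as_ready(tasks: dict[str, float]) -> list[str]:
    """Start every session at once, then pick up each result as it completes."""
    reviewed = []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(run_session, name, secs) for name, secs in tasks.items()]
        # as_completed yields futures in finish order, not submission order,
        # so attention goes to whatever is ripe first.
        for future in as_completed(futures):
            reviewed.append(future.result())
    return reviewed


if __name__ == "__main__":
    # Made-up workloads; durations are in seconds.
    for line in review_as_ready({"refactor": 0.3, "docs": 0.1, "tests": 0.2}):
        print(line)
```

The human-facing loop never waits on a specific session; it only reacts to finished work.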
#2 He Uses the Slowest, Smartest Model by Default
What he does
He uses Opus 4.5 with thinking for everything. It’s bigger and slower, but the quality he gets that way makes the whole process faster.
What this means
Why this matters: He’s optimizing for total iteration cost. Single-output speed is a vanity metric.
A wrong fast answer is slower than a right slow answer.
How he sees responsibility: He’s implicitly acknowledging the “correction tax”: every hallucination or half-right patch charges human attention. His choice is to pay compute to reduce human oversight load.
What this means for vibecoders: Don’t optimize for cost per token, optimize for cost per reliable change.
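A quick back-of-the-envelope sketch of "cost per reliable change". The model is an assumption of mine, not something Boris describes: every attempt costs generation time plus human review time, attempts succeed independently with some probability, and so the expected number of attempts is `1 / success_rate`. All the numbers below are invented for illustration.

```python
def cost_per_reliable_change(
    generation_min: float, review_min: float, success_rate: float
) -> float:
    """Expected minutes spent until one change is accepted.

    Assumes independent attempts, so expected attempts = 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return (generation_min + review_min) / success_rate


if __name__ == "__main__":
    # Hypothetical numbers: the fast model answers in 1 minute but is right
    # half the time; the slow model takes 3 minutes and is right 90% of the time.
    fast = cost_per_reliable_change(generation_min=1, review_min=10, success_rate=0.5)
    slow = cost_per_reliable_change(generation_min=3, review_min=10, success_rate=0.9)
    print(f"fast model: {fast:.1f} min per reliable change")
    print(f"slow model: {slow:.1f} min per reliable change")
```

Under these made-up inputs the slower model wins, because human review dominates each attempt and the fast model needs more attempts.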
#3 He Maintains a Shared CLAUDE.md That Turns Mistakes Into Institutional Memory
This is one of my favourite takeaways.
What he does
His team keeps one shared CLAUDE.md checked into git. Everyone updates it multiple times a week. The rule: when Claude does something wrong, add it so it doesn't repeat. Each team owns keeping theirs up to date.
What this means
Why this matters: AI is powerful but forgetful by default, so you must provide memory externally. That memory is an artifact you help maintain.
How he sees responsibility: He’d rather pay once (write the rule) than double-check everything all the time.
What this means for vibecoders: This validates what I wrote in “If You Build With AI, You Need This File.”
CLAUDE.md is your product’s safety rails: never touch prod, always run tests, preferred architecture, etc.
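The "when Claude does something wrong, add it" habit is easy to make frictionless. Below is a small sketch of a helper that appends a rule to CLAUDE.md and skips exact duplicates. The helper name `add_rule` and the one-bullet-per-rule layout are my assumptions; CLAUDE.md is free-form Markdown, and Boris’s team may structure theirs differently.

```python
from pathlib import Path


def add_rule(claude_md: Path, rule: str) -> bool:
    """Append `rule` as a Markdown bullet unless that exact bullet already exists.

    Returns True if the file was changed, False if the rule was already there.
    """
    bullet = f"- {rule.strip()}"
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if bullet in existing.splitlines():
        return False  # already recorded, nothing to do
    # Make sure the new bullet starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    claude_md.write_text(existing + separator + bullet + "\n", encoding="utf-8")
    return True
```

Calling `add_rule(Path("CLAUDE.md"), "Always run tests before committing")` right after a bad output, then committing the file, is the whole loop: one mistake, one line, shared with everyone who pulls.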
#4 He Uses Claude in Code Review to Update the System, Not Just to Approve a PR
What he does
In reviews, he’ll tag @.claude on coworkers’ pull requests to add learnings to CLAUDE.md, using the Claude Code GitHub action.
What this means
Why this matters: Code review isn’t only for catching bugs; it’s for training the development system. Review becomes meta-work: improving the process that produces code.
How he sees agents: Agents belong inside the team’s social workflows (PRs) as participants that can update shared norms. That’s a strong stance: agents aren’t a side tool; they’re part of the collaboration loop.
What this means for vibecoders: If you build with others, consider treating PR reviews as the place where you encode product standards, so future agent outputs stop degrading.
#5 He Plans First, Then Lets Claude Run
The ever-planning PM in me was very happy reading this one.
What he does
He starts in Plan Mode, iterates until the plan is good, then switches into auto-accept edits mode. Claude can execute the entire implementation in one go without needing back-and-forth revisions.
What this means
Why this matters: AI works best when it can commit to a structured plan: what to do, in what order, why. Plan Mode forces explicitness before execution.
What this means for vibecoders: This prevents the classic failure where the agent thinks it’s being helpful and makes 40 changes you didn’t want. Don’t let a system act before you’ve agreed on intent and constraints. Plan Mode is part of the speccing process.
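The same "agree on intent before acting" rule can be expressed in a few lines of code. This is a toy model, not how Plan Mode is implemented: the `PlanGate` class and its methods are hypothetical names I made up to show the idea that execution is refused until a plan is approved, and that any revision resets the approval.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PlanGate:
    """Holds a plan and refuses to execute it until a human approves it."""

    steps: list[str] = field(default_factory=list)
    approved: bool = False

    def revise(self, steps: list[str]) -> None:
        # Any change to the plan invalidates a previous approval.
        self.steps = list(steps)
        self.approved = False

    def approve(self) -> None:
        if not self.steps:
            raise ValueError("cannot approve an empty plan")
        self.approved = True

    def execute(self, run_step: Callable[[str], str]) -> list[str]:
        if not self.approved:
            raise PermissionError("plan has not been approved yet")
        return [run_step(step) for step in self.steps]


if __name__ == "__main__":
    gate = PlanGate()
    gate.revise(["add settings toggle", "persist preference", "write tests"])
    gate.approve()
    print(gate.execute(lambda step: f"done: {step}"))
```

The iteration happens in `revise`; once the plan is approved, `execute` runs every step without further back-and-forth.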
#6 He Uses Subagents as Reusable Workflow Atoms
What he does
Boris treats subagents like slash commands.
What this means
How he sees agents: Agents are not "one big agent." They're modular roles. Reliability comes from specialization plus constraint.
How he sees coding: Coding becomes a pipeline of phases: spec, draft, simplify, verify. Each phase benefits from a different "mind."
How he sees responsibility: Best practices get encoded into tools that run automatically. That’s ethics-by-design: fewer failures from fatigue.
What this means for vibecoders: Consider building one agent that writes PRDs, one that codes, one that checks UX.
If you’ve read my post on prompting, you may recall that I also use a similar setup to build and evaluate all my prompts.
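If it helps to see the "pipeline of phases" idea as code, here is a deliberately tiny sketch. Each role is a plain Python function standing in for a specialized subagent; the four function bodies are placeholders I invented, and in a real setup each one would hand the artifact to a separate agent. What matters is the shape: narrow roles, one artifact passed down the line, and a verify step that can reject the result.

```python
from typing import Callable

# A role takes the current artifact and returns the next one.
Role = Callable[[str], str]


def spec(request: str) -> str:
    return f"spec for: {request}"


def draft(spec_text: str) -> str:
    return f"draft implementing [{spec_text}]"


def simplify(draft_text: str) -> str:
    return draft_text.replace("draft implementing", "simplified draft of")


def verify(candidate: str) -> str:
    # Reject anything that cannot be traced back to a spec.
    if "spec for:" not in candidate:
        raise ValueError("candidate lost its link to the spec")
    return f"verified: {candidate}"


def run_pipeline(request: str, roles: list[Role]) -> str:
    """Pass one artifact through each role in order."""
    artifact = request
    for role in roles:
        artifact = role(artifact)
    return artifact


if __name__ == "__main__":
    print(run_pipeline("add dark mode", [spec, draft, simplify, verify]))
```

Skipping the spec phase makes `verify` fail, which is the point of keeping each role small and constrained.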
#claude #claudecode