DEV Community

Aman Bhandari

Scale-position 5: how I stopped drifting between depth and speed

Every engineering trade-off between "ship faster using the abstraction" and "derive first, build from scratch" is implicitly a position on a 0-to-10 axis. Without codifying the axis and naming where you currently sit, you drift between positions session by session: one day demanding a whiteboard derivation, the next day shipping a library import without reading it. The drift is invisible inside the session. It only shows up weeks later, when the skills you thought you were building turn out to have interrupted each other, and neither compounded.

I fixed this for myself by codifying the axis as a rule file, pinning each position to a named practitioner, setting my current anchor at 5 with a drift target of 6, and running a weekly recalibration. The rule lives in claude-code-agent-skills-framework under scale-position.md.

The axis

Ten positions. Each one is a trade-off profile, not a skill rating.

  • 0 — Pure vibe-coding. Agent writes, operator merges. No spec, no review, no eval loop.
  • 2 — Simon-Willison-style shipper. Prolific throwaway experimentation, explicit about when it is and is not appropriate.
  • 3 — Eugene Yan / Hamel Husain / Jeremy Howard shipping-mode applied practitioner. Strong problem framing, non-ML baselines, manual eval. Gets systems into production quickly; earns depth over time.
  • 5 — Chip-Huyen-balanced ML engineer. Frames the problem, designs the I/O contract, picks the right primitive, reads the math to the level of explaining why a loss is appropriate (not only which one to import). Matches the median Anthropic / OpenAI Member of Technical Staff profile around 2025-2026.
  • 6 — Karpathy-style educator-practitioner. Builds the 40-line version from scratch. Can derive backprop on a whiteboard under interview pressure. Ships a production system in the same week.
  • 7 — Sebastian Raschka / Julia Evans full-mastery. Derives from first principles and descends to OS syscalls or CUDA kernels when an abstraction leaks. Teaches the next generation.
  • 9-10 — Research contribution. Publishes novel architectures, optimizers, training recipes.

Positions 1, 4, and 8 exist as intermediates on the same line. The illustrative poles are the ones you can pin to a working practitioner.

Why the axis matters in practice

Most engineers sitting at 3 and most engineers sitting at 5 do the same daily work. The difference shows up under specific trade-offs:

  • When to reach for an abstraction. At 3, the abstraction is a productivity win and you ship faster. At 5, you read the abstraction's source before importing it and can explain why it works.
  • When a bug lands that the library does not document. At 3, you file an issue. At 5, you trace the bug into the library's code and propose a patch.
  • When a new model ships. At 3, you swap it in and test against your current harness. At 5, you read the model card, understand what changed in the architecture, and update the harness to test the specific capability that changed.

Neither is wrong. They are trade-offs. Position 3 ships more feature surface per week. Position 5 fails differently when the abstraction leaks — and the abstraction always eventually leaks.

The driver — why I anchor at 5

The driver matters because the driver decides which way the drift target points. Mine is AI-replacement moat, not income urgency. Stated precisely: the concern is not "I need more money this month" but "the shipping-mode applied work I am currently competent at is exactly the work coding agents are fastest at replacing."

Position 3 is where LLMs are reaching competence fastest. Shipping-mode applied work — ship against a Jira ticket, write the endpoint, wire the integration — is the profile currently being compressed. Positions 5 and above are where the moat lives: problem framing, quality judgment, cross-layer debugging, first-principles derivation. These are the four skills agents do worst at today, and they are what I train for.

An operator whose driver is income urgency should anchor differently — probably at 3, ship hireable artifacts faster, earn depth after the runway stabilizes. The axis tolerates either anchor. What it does not tolerate is drifting between them without naming it.

The weekly Sunday recalibration

Every Sunday, a four-question check:

  1. Hours. How many hours did the work actually hit this week versus the floor?
  2. Concepts that closed the loop. Which concepts passed the Bransford transfer test this week, i.e. were applied on a novel surface form without reaching for the original analogy?
  3. Fear level. Rising / steady / falling. The compass needle.
  4. Pressure signals. Financial, health, life. Anything that changes runway or energy budget.

Then one of three proposals:

  • Hold at 5. Default. Hours hit, concepts landed, fear steady, no pressure signal.
  • Drift to 5.5. Hours hit comfortably, two or more concepts closed the loop, fear steady or falling. Add one math-depth exercise next week.
  • Compress to 4.5. Hours missed two consecutive weeks, concepts stalled, or pressure signal active. Trim depth, ship a visible artifact, rebuild momentum.
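The three proposals reduce to a small decision rule. Here is a minimal sketch in Python; the function name, the signal representation, and the reading of "concepts stalled" as zero concepts closed are my own illustration, not part of the rule file:

```python
from dataclasses import dataclass

@dataclass
class WeekSignals:
    hours_hit: bool            # weekly hours met the floor
    hours_missed_streak: int   # consecutive weeks below the floor
    concepts_closed: int       # concepts that passed the transfer test
    fear: str                  # "rising" | "steady" | "falling"
    pressure_signal: bool      # financial / health / life pressure active

def sunday_recalibration(s: WeekSignals, anchor: float = 5.0) -> float:
    """Map the four weekly signals to one of three proposals:
    hold, drift +0.5, or compress -0.5. Never moves more than 0.5."""
    # Compress: two consecutive missed weeks, stalled concepts
    # (read here as zero closed), or an active pressure signal.
    if s.hours_missed_streak >= 2 or s.concepts_closed == 0 or s.pressure_signal:
        return anchor - 0.5
    # Drift: hours hit, two or more concepts closed, fear steady or falling.
    if s.hours_hit and s.concepts_closed >= 2 and s.fear in ("steady", "falling"):
        return anchor + 0.5
    # Default: hold.
    return anchor
```

A good week — hours hit, three concepts closed, fear steady, no pressure — returns 5.5; a week with only one concept closed holds at 5.0.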

Logged to CALIBRATION-LOG.md in the repo. One row per Sunday. Not merged with PROGRESS.md — those are different logs serving different readers (session-level vs trajectory-level).
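For concreteness, one row of the log might look like this — the column set is my illustration of the four questions plus the proposal, not the repo's actual schema, and the values are hypothetical:

```markdown
| date       | hours | concepts closed    | fear   | pressure | proposal    |
|------------|-------|--------------------|--------|----------|-------------|
| 2025-06-08 | 14/12 | backprop, KV cache | steady | none     | drift → 5.5 |
```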

Why the recalibration is Sunday-weekly

Weekly is frequent enough to catch drift before it compounds across three sessions. Sunday is the edge between the prior week's evidence and the next week's planning. Running the recalibration on Wednesday loses one signal (the weekend's reflection); running it monthly loses three opportunities to adjust before the next month's plan locks in.

The check is short — fifteen minutes. The discipline is not the minutes; it is the fact that the check happens at all, regardless of how the week went. Skipping the check because "this week was normal" is how the drift gets back in.

The 12-week escape hatch

Every twelve weeks, a longer check. Three possible moves:

  • Compress to 3.5 if evidence is behind schedule and finances are tighter. Trim the depth track. Ship the hireable artifact earlier. Build the moat after income stabilizes.
  • Widen to 5.5 if evidence is on schedule and momentum is strong. Add math depth. Deep-dive into fundamentals that did not make the shipping ordering.
  • Hold at 5 if the evidence is mixed.

The 12-week cadence handles strategic drift. The weekly cadence handles short-term variance. Both are first-class.

What the axis forbids

  • Moving more than one position in a single week. Abrupt drift destabilizes the trajectory. If the recalibration says 5 → 7, something is wrong with the diagnosis and it needs to be rerun.
  • Drifting without evidence. Moving up to 6 requires concepts that closed the loop, not hours sat in a chair. Compressing to 4.5 requires a real pressure signal, not a bad mood.
  • Anchoring permanently. The anchor is the current position. The drift is the trajectory. Both are first-class. If I am still anchored at 5 in six months without having drifted to 5.5 twice, either the framework is wrong or I am not earning the drift.

Why this is the highest-order rule in the framework

Every trade-off during a teaching session or an engineering choice implicitly resolves at some position on the axis. Without the axis codified, the resolution is random — whichever intuition feels strongest in the moment. With the axis codified, the resolution is a reference: "At 5, we derive before deploying, so we pause the shipping impulse and run the Socratic Q&A first."

The cost of the axis is fifteen minutes a week. The payoff is that six months of work compounds in one direction instead of averaging across two directions and producing neither a shipper nor a first-principles engineer.

Pick your position. Write the axis down. Name the driver. Run the first Sunday check. Most of the compounding you want is downstream of those four actions — not of any specific technical choice.


Aman Bhandari. Operator of an AI-engineering research lab running Claude Opus as the coaching partner, plus a QA-automation surface shipping against a real sprint workload. Public artifacts: claude-code-agent-skills-framework and claude-code-mcp-qa-automation. github.com/aman-bhandari.
