DEV Community: Janne Lammi

The Cost of Being Worth Using

Janne Lammi — Sun, 14 Jun 2026 11:39:04 +0000

A 2026 NBER working paper followed more than 100,000 developers across four app marketplaces after they picked up AI coding tools. The tools worked. With the most autonomous of them, commits jumped 180%. Then the gain bled out downstream — about 50% more projects, only 30% more releases.

Total usage of what they shipped: unchanged.

More code. More releases. The same number of people using any of it.

The pattern is simple:

AI is increasing software output.
Usage is not increasing with it.
Attention is consolidating, not expanding.
The scarce skill is no longer building. It is deciding what deserves to exist.

The App Store Confirms It

Zoom out from those developers and the shape repeats everywhere.

Apple's App Store took in 557,000 new app submissions in 2025 — up 24% year over year, its biggest year since 2016, reversing a multi-year slide. The first quarter of 2026 then ran 84% ahead of the year before, the largest quarterly jump in a decade. The firms tracking it name one cause: vibe-coding. Anyone with an idea and a chat window can ship a functional app in a weekend now.

Total app downloads, across every major store, grew under 1%.

That's the cut. The supply of new apps, up double digits and then some. The appetite to download them, flat. The shelves filled up. Nobody new came to shop.

The Hours Are Growing. The Access Isn't.

Here is the stat that looks like it kills the argument, so let's put it on the table.

People spend more time in apps every year, not less. 5.3 trillion hours in 2025, up 3.8% over the prior year. Time-in-app has climbed for a decade.

But read the second derivative. That growth is decelerating — 7.7%, then 5.8%, then 3.8%. And it isn't spreading out. Social and communication apps alone eat about a third of all mobile time; one company's apps take roughly a fifth. The hours are growing, and they're pooling in the handful of places people already were.
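The deceleration can be checked directly from the three growth rates quoted above. A minimal Python sketch, using only the figures cited in this post (the variable names are illustrative):

```python
# Year-over-year growth in time-in-app, in percent, as quoted above.
growth_rates = [7.7, 5.8, 3.8]

# Change in the growth rate from one year to the next, in percentage points.
# Negative values mean growth is still positive but slowing down.
changes = [
    round(later - earlier, 1)
    for earlier, later in zip(growth_rates, growth_rates[1:])
]

decelerating = all(change < 0 for change in changes)
print(changes, decelerating)
```

Hours are still rising each year; it is the rate of increase that keeps shrinking.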

Rising attention doesn't reach your new thing. The pie got bigger. The new entrants still don't get a slice.

The attention ceiling is the number of things any person will ever choose to care about. It didn't move when building got cheap. It won't move when building gets cheaper still.

This Isn't a Phone Problem

Scott Brinker has counted the marketing-software landscape every year since 2011. It went from 150 tools to 15,384 — a hundredfold in fifteen years, AI the latest accelerant.

Cursor, the AI coding tool, went from zero to $2 billion in revenue in about three years — the fastest any business-software company has ever made that climb. Building has never been cheaper or faster.

The buyers didn't get a hundred times more attention. The average knowledge worker switches apps about 1,200 times a day and loses four hours a week just reorienting. Rich-world internet penetration sits at 93%, so there are no new users to go get. And the front door is closing: roughly 60% of Google searches now end without a click, on the way past two-thirds as AI answers swallow the page. The traffic that used to find your new tool isn't being redistributed. It's evaporating.

Where the money goes instead: up. Microsoft's cloud business alone grew 23% to $169 billion last year. Spend is rising — and consolidating into the few platforms big enough to bundle. The long tail competes for what's left.

What AI collapsed	What didn't move
Cost of writing code	Cost of distribution
Time from idea to shipped app	Demand for new products
Who can build software	Available user attention
Speed of iteration	The bar to earn a place in someone's life
Number of apps submitted	Number of apps actually used

The attention ceiling looks the same in enterprise software as it does in mobile. The shelves fill. The buyers don't multiply.

The Filter Is Gone

So why is supply exploding straight into a demand wall?

Because the cost of producing software fell off a cliff. The price of the AI capability itself — the inference to hit a given benchmark — has dropped on the order of 50x a year, by Epoch AI's measure, echoed in Stanford's AI Index. That capability is the raw material of building, and it dragged the cost of prototyping down with it. What took a funded team and two quarters takes one person and an afternoon.

The cost of making something collapsed. Not one cost of being worth using moved an inch. Distribution didn't get easier. Attention didn't expand. The bar for the tenth app in a category that does the same thing sits exactly where it was — except now there are ten of them by Friday instead of one by next quarter.

AI collapsed the cost of building. It didn't touch the cost of being worth using.

The Decision Nobody Makes Anymore

Today anyone can pull on the turtleneck, open a chat window, and feel like a reborn Steve Jobs. So can everyone else, in your exact category. The feeling of building something visionary got cheap. Being right about it didn't.

When building was expensive, building was the filter. You couldn't ship the wrong thing easily, so the cost of construction did your triage for you. Most bad ideas died in the estimate.

That filter is gone. The estimate is now an afternoon. Which means the decision the build used to force — is this worth existing? — doesn't get made by anyone. It gets skipped. The half-formed thing ships and joins the pile of installed-and-never-opened.

The bottleneck moved up the stack. Not to whether you can build it. To whether it should exist — and to what, precisely, earns a slice of attention nobody owes you.

That is a judgment problem. It always was. The cost of building was just hiding it.

What This Means for Builders

Start from evidence, not ideas. Real user friction, not feature assumptions. When execution is cheap, the competitive edge is knowing which friction to resolve.

Define the adoption outcome before the first line of code. Not "build a checkout flow" — but "a user who completes purchase without contacting support." If you can't write the adoption outcome, you don't know what you're building.

Name what must not be built. The cheap build's greatest risk isn't failure — it's drift. Name the constraints explicitly before you start. A spec without boundaries is a blank check.

An intent tool can become part of the flood — one more way to generate more, faster, with less thought. We know it. So the point isn't to add output. It's to force the decision the cheap build skips, on purpose, up front: name what this is for, name what it must not become, name how you'll know it worked. Before, not after.

The 100,000 developers in that study weren't failing to build. They built more than ever. Building more just wasn't enough to get more of it used. The paper's own name for the gap is a weak link: the human work downstream that doesn't scale just because code generation does.

Name the weakest link in that chain and it isn't typing. It's the decision nobody is forced to make anymore: is this worth existing, and what makes it worth a slice of attention no one owes you. Cheap building doesn't make that call for you. It just lets you skip it faster.

That's the trade the whole industry is taking right now without naming it.

The cost of building fell off a cliff. Judgment didn't — which is exactly why it's the only scarce thing left. Context isn't judgment, and judgment is now the whole game.


Sources: Demirer, Musolff & Yang, "Writing Code vs. Shipping Code" (NBER w35275, 2026) for the commits → releases → usage chain; Appfigures for 2025 App Store submissions (Apple only), and The Information via TechCrunch / Entrepreneur for the +84% Q1 2026 jump (Apple only); Sensor Tower State of Mobile 2026 for total downloads (all major stores) and time-in-app; Epoch AI and the Stanford HAI AI Index 2025 for the inference-cost decline; Scott Brinker / MarTech.org for the martech-landscape count (150 → 15,384); TechCrunch for Cursor's ARR trajectory; Harvard Business Review (2022) for the 1,200-toggles-a-day study; the ITU's Facts & Figures 2024 for 93% high-income internet penetration; SparkToro / Datos for zero-click search; and Microsoft's FY2025 results for cloud-revenue concentration. App-market figures are vendor estimates; submission counts are Apple's App Store only, while downloads span all major stores; the inference-cost figure is the median rate to reach a fixed capability, not frontier cost.

The Intent Layer

Janne Lammi — Sat, 30 May 2026 19:43:19 +0000

The software stack has a gap.

On one end: tools that capture signal. Dovetail, Intercom, user research platforms—everything that tells you what users need. On the other end: tools that execute. Cursor, v0, Claude Code—everything that turns instructions into working software.

The middle is compressing. And that middle is where the most important work happens: deciding what gets built and why.

Most teams skip it. They go straight from customer feedback to a prompt in Cursor. The result: code that solves the wrong problem, or solves it without context.

We call this missing middle the Intent Layer.

┌─────────────────────────────────────┐
│         User Signal                 │  ← Feedback, friction, research
├─────────────────────────────────────┤
│       ▶ Intent Layer ◀              │  ← What gets built & why
├─────────────────────────────────────┤
│      Agentic Execution              │  ← Cursor, v0, Claude Code
├─────────────────────────────────────┤
│       Shipped Software              │
└─────────────────────────────────────┘

Prompts Are Not Intent

A prompt is a request. Intent is a specification.

Prompt	Intent Spec
"Add a checkout flow"	Preconditions, expected outcomes, edge cases, traceability to user friction
Ephemeral—typed once, discarded	Versioned, auditable, persistent
Invented by the developer	Derived from user signal
Optimized for speed to first output	Optimized for correctness of final outcome

When a developer opens Cursor, they shouldn't be inventing context from scratch. They should be pulling from a system that already computed what needs to be built and why.

Prompts generate code. Intent specs generate the right code.

Why the Gap Exists

Today, intent is scattered across your organization:

Research lives in Dovetail.
Specs live in Notion or Google Docs.
Tasks live in Jira or Linear.
Context lives in someone's head.

No single system synthesizes user friction into structured, machine-readable intent. Each handoff between these tools introduces drift. Each interpretation introduces error.

The fix isn't another tool in the stack. It's a layer that sits above them—synthesizing signal, structuring intent, and pushing context to every downstream agent.

The Three Layers of Intent

We see software development converging on a 3-layer architecture:

1. Intention — The Raw Signal

Unstructured information: user interviews, support tickets, analytics, NPS surveys. Scattered across disconnected silos. This is where friction surfaces—the raw evidence of where your product fails its users.

2. Structure — Computed Intent

This is the Intent Layer itself. It translates abstract friction into a formalized specification. Not by asking you to write specs from scratch, but by computing them—clustering friction signals and generating structured, evidence-backed specifications.

An IntentSpec is not a document. It's a data object:

objective:
  statement: "Reduce cart abandonment at payment step"
  success_criteria:
    - Checkout completes in under 3 seconds on 3G
    - Cart abandonment drops from 23% to below 15%

constraints:
  - Must work without JavaScript enabled
  - Must support existing payment provider
  - Rate limited to prevent abuse

outcomes:
  - observable: "Conversion rate increase at payment step"
  - verification: "A/B test shows 10%+ improvement"

edge_cases:
  - Payment declined → show retry with alternative method
  - Session expired → preserve cart state
  - Zero items in cart → redirect with explanation

This isn't prose. It's a contract. An agent can execute against it. A human can review it. Both can verify whether it was fulfilled.
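To make "a data object, not a document" concrete, here is a minimal Python sketch of the same structure as a typed object with a completeness check. The names `IntentSpec` (as a class) and `missing_fields` are illustrative assumptions, not Pathmode's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class IntentSpec:
    """Hypothetical in-memory form of the spec shown above."""
    objective: str
    success_criteria: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)


def missing_fields(spec: IntentSpec) -> list[str]:
    """Return the names of empty sections, so a reviewer (or an agent)
    can refuse to start work on an incomplete spec."""
    sections = {
        "objective": spec.objective.strip(),
        "success_criteria": spec.success_criteria,
        "constraints": spec.constraints,
        "outcomes": spec.outcomes,
        "edge_cases": spec.edge_cases,
    }
    return [name for name, value in sections.items() if not value]


checkout = IntentSpec(
    objective="Reduce cart abandonment at payment step",
    success_criteria=["Checkout completes in under 3 seconds on 3G"],
    constraints=["Must support existing payment provider"],
    outcomes=["Conversion rate increase at payment step"],
    edge_cases=["Session expired → preserve cart state"],
)
# A bare prompt carries an objective and nothing else.
prompt_only = IntentSpec(objective="Add a checkout flow")

print(missing_fields(checkout))
print(missing_fields(prompt_only))
```

The point of the check is that a bare prompt fails it: "Add a checkout flow" has an objective and nothing a human or agent could verify against.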

3. Projection — Execution

Where the spec becomes executable artifacts: code, UI, API documentation, tests. This is where Cursor, Claude Code, and v0 live. When they receive an IntentSpec instead of a prompt, they don't guess—they execute against explicit criteria.

Why This Matters Now

AI has made execution cheap. You can prototype a working solution in the time it used to take to argue about the wireframe. This is powerful—but it creates a dangerous illusion.

When building is instant, teams skip the hard work of defining what they're building. Fast execution masks poor intent. A shiny AI-generated feature ships in a day, only to increase drop-off because no one asked whether users actually needed it.

This is what we've been calling The Vibe Coding Hangover. Solo developers can hold context in their heads. Add a PM, a designer, and three AI agents, and context fragments. Prompts become requests with no shared definition of success.

The bottleneck has shifted. It's no longer writing code. It's defining what code to write.

The PM's Role is Changing

The PM's job used to be translation: talk to customers, synthesize problems, write specs, hand them to engineers. The value was in the compression—taking noisy, qualitative input and producing structured output.

But that compression step is exactly what language models do well. When agents can take a well-formed problem and produce working code, the spec becomes the product. The PM's job shifts from writing handoff documents to forming intent clearly enough that agents can act on it directly.

This is not a demotion. Clarity is harder than verbosity. Precision is harder than prose.

What an Intent Layer Does

Captures friction from real user signals—support tickets, research transcripts, analytics—not from intuition alone.
Structures intent into versioned, machine-readable specs with explicit success criteria, constraints, and edge cases.
Feeds agents exactly what they need to execute, leaving less room for hallucination and drift.
Verifies outcomes against the defined spec after shipping. Did the friction actually decrease?
Preserves memory so the team knows why something was built, not just that it was built.
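The "verifies" step in the list above can be sketched as a comparison between a post-ship measurement and the target written into the spec. This is a minimal Python illustration; `MetricTarget` and `verify` are hypothetical names, and the observed values are made-up sample inputs. Only the 23% baseline and 15% target come from the example spec earlier in this post.

```python
from dataclasses import dataclass


@dataclass
class MetricTarget:
    """Hypothetical success criterion: a metric must fall below a threshold."""
    name: str
    baseline: float  # value before shipping
    target: float    # value the spec says must be beaten


def verify(target: MetricTarget, observed: float) -> str:
    """Classify a post-ship measurement against the spec."""
    if observed < target.target:
        return "met"
    if observed < target.baseline:
        return "improved, target not met"
    return "no improvement"


abandonment = MetricTarget("cart_abandonment_pct", baseline=23.0, target=15.0)

# Made-up measurements, one per possible verdict.
print(verify(abandonment, 14.2))
print(verify(abandonment, 19.0))
print(verify(abandonment, 24.5))
```

Keeping the baseline next to the target lets the check separate "shipped and helped a little" from "shipped and changed nothing", which a pass/fail flag would hide.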

Who Owns Intent?

John Moriarty wrote about the shift from products to systems. His framework for the new design role is "Systemic Orchestration"—designers provide the bricks, blueprints, and building codes, not the finished house.

This applies to product work broadly. Designers, PMs, and engineers are all converging on the same discipline: defining intent with enough structure that agents can execute it faithfully.

Someone has to own that layer. That's what we're building at Pathmode.

Input Factories

Janne Lammi — Tue, 19 May 2026 12:30:46 +0000

Everyone is building agent factories.

Cursor, Codex, Claude Code, Devin. Coding agents that plan, write, test, ship. Internal pipelines with five, eight, twelve sub-agents in series. Orchestrators that retry. Eval loops. Tool calls. The pipeline keeps getting better, almost weekly.

This is not the hard part.

A factory amplifies whatever you feed it. If you feed it a thin brief, it builds something thin, faster. If you feed it a contradictory spec, it resolves the contradiction silently — usually wrong. If you feed it nothing, it invents.

Most teams are scaling their ambiguity.

The agent factory is solved. The input factory isn't.

By "input factory" I mean the layer above the build. The thing that decides what the agents read before they generate. It contains:

the intent (why this exists, who it's for, what it must do)
the product spec (scope, edges, what good looks like)
the design rules (tokens, voice, components, don'ts)
the brand (how it sounds, what it never says)
the playbooks (how this team approaches this kind of work)
the reference code (this is how we do it here)

These already exist. They live in Notion, Figma, Slack threads, the repo, somebody's head. They drift. They contradict each other. Nobody owns them. When the build is wrong, no one knows which one to fix.

This is the bottleneck.

Here is the loop most teams are running today:

Agent builds something
Designer or PM reviews the PR
PR is wrong
Fix the PR

Here is the loop they should be running:

Agent builds something
Designer or PM reviews the PR
PR is wrong
Which input was thin, stale, or contradictory?
Fix the input
Run the next thing through the better input

The first loop scales the work. The second loop scales the system.

The first loop is what most teams will look back on as the embarrassing era — when senior people spent their time fixing outputs an agent produced from inputs the senior person never actually wrote.

There is a role shift coming, and it is bigger than it sounds.

The designer stops being the one who fixes the output. They become the one who curates the system that produces the output. When the build comes back wrong, the question is no longer "how do I edit this?" but "what was missing from what the agent read?"

The PM stops chasing tickets. They author the intent — the why — and the evidence behind it. The thing nothing downstream can guess at.

The engineer stops translating. They integrate. They write the reference code, the conventions, the constraints. They make the input layer real in the repo.

In this world the seniority of the work moves upstream. The most leveraged person on the team is whoever owns the inputs.

I think the reason nobody has built this yet is that the inputs feel like just files. Markdown, Figma frames, brand decks, a couple of shared docs. It feels low-status. The factory feels high-status.

But the factory is a commodity. There will be five good ones in eighteen months and they will mostly do the same thing.

The inputs are not a commodity. They are your product's actual intent, captured and made legible to a machine. They are the only thing that makes your factory's output yours instead of generic.

A real input factory does a few things the file-soup version doesn't:

It authors — the inputs are written deliberately, not assembled from drift.

It evidences — every claim points back to the user signal that justifies it.

It governs — somebody owns each input, somebody updates it, the staleness is visible.

It compiles — at build time, the right slice of input is packaged into the agent's context. Not the whole pile. The right slice.

It diffs — when the output is wrong, you can trace which input was insufficient, and improve it.
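Three of those verbs (governs, compiles, diffs) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `InputDoc` shape, the tag-based slicing, the 90-day staleness threshold, and the sample documents. It is not a description of Pathmode or of any existing tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InputDoc:
    """One governed input: who owns it, what it covers, when it was last reviewed."""
    name: str
    owner: str
    tags: set[str]
    last_reviewed: date
    body: str


def stale(doc: InputDoc, today: date, max_age_days: int = 90) -> bool:
    """Governs: make staleness visible instead of implicit."""
    return (today - doc.last_reviewed).days > max_age_days


def compile_context(docs: list[InputDoc], task_tags: set[str]) -> list[InputDoc]:
    """Compiles: keep only inputs whose tags overlap the task, not the whole pile."""
    return [doc for doc in docs if doc.tags & task_tags]


# Made-up example inputs.
docs = [
    InputDoc("checkout-intent", "pm", {"checkout"}, date(2026, 5, 1),
             "Why checkout exists and who it is for."),
    InputDoc("design-tokens", "design", {"checkout", "onboarding"}, date(2025, 11, 3),
             "Spacing, color, and component rules."),
    InputDoc("brand-voice", "brand", {"marketing"}, date(2026, 4, 20),
             "How the product sounds and what it never says."),
]

today = date(2026, 5, 19)
context = compile_context(docs, {"checkout"})

# Diffs (simplified): when a build is wrong, the first suspects are the
# stale inputs inside the slice the agent actually read.
suspects = [doc.name for doc in context if stale(doc, today)]

print([doc.name for doc in context])
print(suspects)
```

A real version would need something better than tag overlap to pick the slice, but even this much gives "which input was thin, stale, or contradictory?" an answer with an owner's name attached.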

Karpathy has been writing about a version of this from the other direction: raw notes compiled into a wiki an LLM can read. He is right about the pipeline. The thing he leaves implicit is that for product work, the raw notes aren't notes. They are intent, spec, brand, design rules, playbooks. The wiki isn't a wiki. It is the input factory.

The companies that figure this out first will look, from the outside, like they have better agents.

They won't. They will have better inputs.

If you are building an agent factory right now, the question worth asking is: what does my factory read, who wrote it, and how do I know it's still true?

If you do not have a good answer, the factory is the wrong place to spend the next quarter.

Build the input factory.

I'm building Pathmode, the input layer for product teams building with AI.

Anthropic Just Made Specs Load-Bearing

Janne Lammi — Wed, 06 May 2026 19:26:33 +0000

Today Anthropic shipped Managed Agents — and inside it, a feature called Outcomes.

Outcomes is small in scope and large in implication. The idea: when you dispatch an agent, you also define what success looks like. A separate grader evaluates the agent's output against those criteria and decides whether the work is done or needs another pass.

Most of the coverage focused on the self-correction loop. The deeper story is what Outcomes assumes — and what it quietly exposes.

1. Outcomes need testable success criteria

A grader can't grade a vibe.

For Outcomes to do anything useful, success has to be expressible in language that a separate model can evaluate without re-doing the work. That means specific, observable, decomposable. "The form submits and shows a confirmation." "No more than one network request per keystroke." "Email arrives within 30 seconds and includes the order number."

This is what verification has always looked like in good engineering. The difference is the audience. Until now, success criteria were a human courtesy — nice when the PM wrote them, fine when they didn't, because the developer would figure it out in review. With Outcomes, success criteria become the contract the agent is held to. They stop being decoration and start being load-bearing.

A vague Outcome doesn't fail loudly. It quietly accepts wrong work.
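Here is one way to picture "specific, observable, decomposable": each criterion becomes a check that either holds or doesn't against what a run actually produced. This is a minimal Python sketch and explicitly not Anthropic's Outcomes API, where grading is done by a separate model. The observation keys and the sample run are hypothetical.

```python
from typing import Any, Callable

# A run result is modeled as a plain dict of observations.
# The keys are hypothetical; this is not Anthropic's Outcomes schema.
Observation = dict[str, Any]
Criterion = tuple[str, Callable[[Observation], bool]]

criteria: list[Criterion] = [
    ("The form submits and shows a confirmation",
     lambda o: o["confirmation_shown"] is True),
    ("No more than one network request per keystroke",
     lambda o: o["requests"] <= o["keystrokes"]),
    ("Email arrives within 30 seconds and includes the order number",
     lambda o: o["email_delay_s"] <= 30 and o["order_id"] in o["email_body"]),
]


def grade(observation: Observation) -> list[str]:
    """Return the description of every criterion the run failed."""
    return [text for text, check in criteria if not check(observation)]


# Made-up observation from a single run.
run = {
    "confirmation_shown": True,
    "requests": 12,
    "keystrokes": 10,
    "email_delay_s": 8,
    "order_id": "A-1042",
    "email_body": "Thanks! Your order A-1042 is confirmed.",
}

print(grade(run))
```

A criterion like "the search feels snappy" cannot be written as one of these checks at all, which is the practical meaning of "a grader can't grade a vibe."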

2. Most teams don't have them. They have tickets and Figmas.

Walk into the average product org and ask to see the success criteria for the next five tickets in the sprint. You'll get one of three answers:

"It's in the Figma." (It isn't.)
"The ACs are at the bottom of the Linear ticket." (Three bullets, all phrased as features.)
"Talk to Sara, she knows what we're trying to do." (Sara is on PTO.)

The intent of the work lives somewhere — in someone's head, in a Slack thread, in a comment on a design file. Almost never in a structured, retrievable, testable form.

This was tolerable when humans did the building, because humans are excellent at filling in gaps. They ask follow-up questions. They reread the ticket three weeks later and figure it out. They notice when something feels wrong even if no one wrote down what right looks like.

Agents don't do any of that. They execute against what they're given. Outcomes makes this concrete: the team that ships the clearer success criteria gets the better agent run. The team that doesn't ships nothing — or worse, ships something convincing but wrong.

The bottleneck used to be writing code. The bottleneck is now knowing what good looks like, in writing, before the work starts.

3. IntentSpec is what an Outcome looks like before it's machine-readable

This is the part most teams underestimate.

An Outcome — the JSON object you hand to the grader — isn't where the work happens. It's the output of the work. The work happens upstream, when someone sits with a real piece of user friction and decides:

What is the actual objective? (Not the feature. The change in user behavior.)
What outcomes would prove this worked? (Observable, decomposable, testable.)
What edge cases must hold? (The boring failure modes that ship bugs.)
What constraints can't be violated? (Invariants that don't show up in a happy path.)
What evidence grounds these decisions? (Tickets, interviews, telemetry — not vibes.)

That's a spec. Specifically, it's an agent-ready spec. When the spec is good, exporting it to an Outcome is a translation step. When the spec is missing, no amount of grader sophistication can rescue the run.
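The "translation step" can be sketched as a small export function: if the five questions above have answers, flattening them into a JSON rubric is mechanical, and if they don't, the export should refuse rather than produce something gradeable-looking. This is a minimal Python sketch. The output shape and the `to_outcome` name are hypothetical, not Anthropic's Outcomes format, and the evidence entry is a placeholder.

```python
import json


def to_outcome(spec: dict) -> str:
    """Translate a spec into a flat JSON rubric for a grader.

    The output shape is a hypothetical illustration. Export is refused
    when any of the five sections is missing or empty.
    """
    required = ("objective", "outcomes", "edge_cases", "constraints", "evidence")
    missing = [key for key in required if not spec.get(key)]
    if missing:
        raise ValueError(f"spec is not agent-ready, missing: {', '.join(missing)}")
    rubric = {
        "objective": spec["objective"],
        # Everything the grader must check, in one flat list.
        "checks": spec["outcomes"] + spec["edge_cases"] + spec["constraints"],
        "evidence": spec["evidence"],
    }
    return json.dumps(rubric, indent=2, ensure_ascii=False)


spec = {
    "objective": "Reduce cart abandonment at payment step",
    "outcomes": ["A/B test shows 10%+ improvement"],
    "edge_cases": ["Payment declined → show retry with alternative method"],
    "constraints": ["Must support existing payment provider"],
    "evidence": ["(placeholder) reference to the tickets or interviews behind this spec"],
}

print(to_outcome(spec))
```

Calling `to_outcome({"objective": "Add a checkout flow"})` raises a `ValueError` naming the four missing sections, so the failure happens before the agent run instead of after it.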

We've been calling this artifact an IntentSpec. The name doesn't matter. What matters is that the artifact exists, persists, and stays anchored to the evidence that justified it.

What this means for your team

Outcomes makes the spec the load-bearing artifact in agentic delivery. That's a one-sentence reframe with three uncomfortable consequences.

The cost of vague tickets just went up. Before, vague tickets cost a clarification meeting. Now they cost an agent run that completes successfully and produces the wrong thing.

Spec quality becomes a measurable input. When the grader rejects the work, you don't have a model problem — you have a specification problem. That's a much faster feedback loop than waiting for QA to find the bug.

Specs need to live somewhere persistent. Not in a Figma comment. Not at the bottom of a Linear ticket. Not in Sara's head. Somewhere the agent — and the next agent, and the next teammate — can read tomorrow and know what done means.

Anthropic just made specs the contract. The teams that already write them well are about to look extremely smart. The teams that don't are about to find out, expensively, what an agent does with ambiguity.

Originally published on pathmode.io. Pathmode is the intent engineering platform — we help product teams write specs that agents can actually execute against.

The Three-Person Team

Janne Lammi — Tue, 05 May 2026 11:44:14 +0000

For most of my career, building software meant coordinating people who weren't in the same room, on the same artifact, or even on the same week of work. Specs went one place, designs went another, code came later. We built whole categories of tools to manage the gaps between them.

I've been thinking about what happens when those gaps close.

The old math made sense for a long time. Implementation was expensive and slow, so it paid to specialize and hand off. A team of twelve was reasonable, because no single person could hold the full picture, and the work itself took long enough to justify the coordination overhead.

That math is changing, and the shape of the team is changing with it.

Three lenses, one loop

When AI absorbs the middle of the work — the actual writing of code — a team stops scaling by adding hands. It scales by adding clearer judgment.

The shape that keeps showing up, when I look at the teams that seem to be working best, is small. Maybe three people. The number matters less than the lenses they bring:

A design lens — someone holding user reality. They notice the friction, and carry the taste.
An engineering lens — someone holding system reality. They know what's possible, what's fragile, and what needs to scale.
A product lens — someone holding business reality. They know the bet, the constraint, and the why.

These aren't really job titles. They're responsibilities for kinds of judgment a team can't outsource to an agent. Everyone on a small team prototypes, talks to users, reads analytics. But each person owns one form of reality, and the team doesn't ship until those three forms agree.

You can see the shift starting to land in hiring. Job listings now ask for designers "comfortable operating without a defined brief," who can "prototype in code and present to stakeholders in the same afternoon," who "ship without waiting for research." Those aren't the traditional expectations of a senior designer. They describe someone carrying a lens on a small, AI-native team. The market is beginning to hire for shape of capability, not years of experience.

One product, one surface

The harder shift to defend, if you've spent years building inside a larger org, is what happens to handoffs.

Handoffs used to be reasonable. PMs wrote specs because engineers couldn't read minds. Designers built Figma files because engineers couldn't ship without pixels. Tickets existed because the work was sliced across roles before it was sliced across time.

What's happening on small AI-native teams is different. They work the same product, against the same artifact, at the same time. The product person isn't writing a doc the engineer will read next sprint. They're editing a living spec that the designer and engineer are also editing, while an agent prototypes against it in the background.

Async work doesn't disappear. Deep work, time zones, review, reflection — those still matter. What disappears is the handoff as the default operating model. The artifact passed between people stops being a Jira ticket and becomes something more like a shared intent: live, current, executable.

There's no handoff because there's no gap.

The unit of work moves upstream

When implementation gets cheap, the bottleneck doesn't disappear. It moves.

The old unit was the pull request. The new unit, more and more, is the spec. Teams that work this way converge on intent before they touch code, because that's where the leverage is. Five minutes of clearer intent saves an hour of regenerated code.

I think this is what makes the trio possible at all. Without a shared, structured definition of what's being built and why, three people pulling in three directions aren't a team. They're three threads of work an agent ships at speed.

A concrete example. A churned customer says onboarding felt "empty" after they signed up. The product lens frames the business risk: activation is stalling. The design lens names the experiential gap: the user has no obvious next action. The engineering lens surfaces the system constraint: setup state is fragmented across three services. The spec they write isn't "add an onboarding checklist." It's something closer to: make the first meaningful project action obvious within two minutes of signup, without introducing a parallel setup flow. That sentence is the unit of work. The agent builds against it, and the team verifies against it.
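"The team verifies against it" implies the sentence can be turned into a measurement. A minimal Python sketch, assuming a product event log with hypothetical event names (`signup`, `project_action`) and made-up sample rows: it uses "first project action within two minutes of signup" as a proxy for "obvious", which is an assumption the team would need to agree on.

```python
from datetime import datetime, timedelta

# Made-up event log: (user_id, event_name, timestamp). Event names are hypothetical.
events = [
    ("u1", "signup", datetime(2026, 5, 1, 9, 0, 0)),
    ("u1", "project_action", datetime(2026, 5, 1, 9, 1, 30)),
    ("u2", "signup", datetime(2026, 5, 1, 10, 0, 0)),
    ("u2", "project_action", datetime(2026, 5, 1, 10, 7, 0)),
    ("u3", "signup", datetime(2026, 5, 1, 11, 0, 0)),
]


def activated_within(events, window: timedelta) -> dict[str, bool]:
    """For each user who signed up, report whether their first project
    action happened within `window` of signup."""
    signup, first_action = {}, {}
    for user, name, ts in sorted(events, key=lambda e: e[2]):
        if name == "signup":
            signup.setdefault(user, ts)
        elif name == "project_action":
            first_action.setdefault(user, ts)
    return {
        user: user in first_action and first_action[user] - start <= window
        for user, start in signup.items()
    }


result = activated_within(events, timedelta(minutes=2))
share = sum(result.values()) / len(result)
print(result, share)
```

Users who never act (like `u3` here) count against the spec, which is the case the churned customer described.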

The substrate the team works on isn't a Notion doc, or a Figma file, or a backlog. It's the intent itself: versioned, structured, traceable back to the friction that justified it.

Evidence instead of phases

The other thing that changes is when discovery happens.

The teams I grew up in ran discovery in phases. A research sprint, then a planning sprint, then implementation. The customer sat somewhere upstream of the work, separated from the build by weeks of process and a slide deck nobody read twice.

A small AI-native team doesn't really do phases. Friction signals (support tickets, session recordings, sales calls, churned-user interviews) flow into the same surface where specs live. Evidence stops being something a PM presents and starts being the substrate the team builds against.

When friction enters the system, the team sees it. When a spec exists, it points back to the friction that justified it. When an agent ships a feature, the team can ask the boring, important question: did the friction actually decrease?

This is the loop that bigger teams used to break by specializing. Smaller teams don't have to break it.

Judgment is what's left

I'm wary of writing anything that sounds like "X is the only thing that matters." Lots of things still matter — distribution, customer trust, proprietary data, domain depth, the brand you've built over years.

But on top of all of that, when the cost of building drops, judgment becomes the new differentiator. The ability to look at a hundred possible features and pick the one that actually matters. The ability to look at five prototypes and feel which one is right. The ability to kill an idea your agent could ship in an afternoon, because it's the wrong idea.

Velocity used to separate teams, but now velocity is becoming a commodity. What's left is the judgment behind what's being built, and the ability to articulate that judgment clearly enough that another person, or an agent, can act on it.

Where this falls apart

The honest version of this argument has a limit.

The three-lens team works when the problem space fits in three heads. It doesn't work as well for deep platform work, for regulated domains, for systems where a wrong intent has catastrophic consequences. A small team probably shouldn't be alone behind a hospital records system or a payments backbone. There's still a strong case for specialization where the cost of being wrong is high.

But for the layer of software most teams actually build — applications, products, internal tools, features — this shape is already starting to show up. Not everywhere. In the teams I'm watching most closely, it already is the operating model.

A small group, one product, one shared surface. Handoffs replaced by something closer to a shared intent that everyone — and every agent — can see.

I don't think AI-native teams are getting smaller because companies want fewer people. I think they're getting smaller because the core work has shifted, from coordinating execution to shaping what's being built. The team of the next decade isn't a smaller version of the one you have today. It's a different shape.

The headcount drop is the symptom. The change in shape is the point.