Shinsuke Matsuda

Posted on Jan 20

Why “Plans” Should Be First-Class Artifacts in AI-Assisted Development

#ai #llm #codereview #productivity

AI-assisted development is no longer experimental.

At this point, it’s fair to say it has become mainstream.

There are, of course, organizations that cannot adopt generative AI yet due to security or regulatory constraints.

But for software teams that can use AI and still choose not to, the issue is no longer technical—it’s a matter of mindset.

Productivity gains from AI-assisted development are real.

Today, shipping and operating small to mid-sized services with minimal handwritten code is no longer unrealistic.

And yet, many teams report the same frustrations:

“We can’t keep track of the massive amount of AI-generated code.”
“AI-written code is hard to maintain.”
“Pull requests are getting huge, and reviews have become the bottleneck.”

I believe these problems are not caused by limitations of AI itself.

They stem from the fact that our development processes haven’t caught up with the AI era.

The Real Problem: Reviews Can’t Even Start

AI dramatically increases the speed of code generation.

But the real pain in AI-assisted development is not that diffs are large.

The real problem is this:

Reviewers don’t know where to start.

When reviewing an AI-generated pull request, reviewers implicitly need answers to questions like:

What is the intent of this change?
What is the scope of this PR? (Where is the review boundary?)
What assumptions are being preserved?
Which parts are risky and need extra attention?

Without an explicit plan, reviewers are forced to reconstruct all of this information from the diff alone.

As AI writes more code faster, this reconstruction step becomes the true bottleneck.

What “Plan” Means in This Article

Before going further, let’s define what “plan” means here.

This is not about tools or features.

It’s about how we structure information to make reviews work in the AI era.

A plan is not a TODO list.

A plan is a compressed decision artifact that makes implementation and review possible.

A good plan typically includes:

Findings from investigating the existing codebase (what matters, what assumptions exist)
Goals and non-goals
References to relevant specs or architecture
Expected impact and risks
The review boundary (how far this PR should go)

The key point is that a plan is small enough to be reviewed.

It is not a long design document.
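To make the list above concrete, here is a minimal sketch of a plan as a data structure. The `Plan` class, its field names, and the 80-line size cap are all assumptions made for illustration; nothing here is a standard format, and a plain Markdown file with the same headings would work just as well.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical plan structure; the field names are illustrative only."""
    title: str
    findings: list[str] = field(default_factory=list)    # results of investigating the codebase
    goals: list[str] = field(default_factory=list)
    non_goals: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)  # relevant specs or architecture docs
    risks: list[str] = field(default_factory=list)       # expected impact and risks
    review_boundary: str = ""                             # how far this PR should go

    def to_markdown(self) -> str:
        """Render the plan as a short Markdown file that can be committed."""
        sections = [
            ("Findings", self.findings),
            ("Goals", self.goals),
            ("Non-goals", self.non_goals),
            ("References", self.references),
            ("Impact and risks", self.risks),
        ]
        lines = [f"# Plan: {self.title}", ""]
        for heading, items in sections:
            lines.append(f"## {heading}")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        lines.append("## Review boundary")
        lines.append(self.review_boundary)
        return "\n".join(lines)

    def is_reviewable(self, max_lines: int = 80) -> bool:
        """Assumed size cap: the rendered plan should stay short enough to review."""
        return len(self.to_markdown().splitlines()) <= max_lines
```

The size check is the part that matters: if a plan outgrows the cap, that is a hint the work should be split before any code is written.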

Why Plans Should Be Committed

Here’s an important point: the size and quality of the plans described above are not theoretical.

Modern AI tools (including Claude Code) can reliably generate plans at this granularity.

That makes plans not temporary notes, but real artifacts worth keeping.

1) A Plan Is a Deliverable

In AI-assisted development, a plan is no longer a personal memo.

A solid plan contains:

Consolidated investigation results
A clear list of affected files
Explicit intent and boundaries
Rejected alternatives and caveats

At this point, the plan has become shared team knowledge.

It is agreed upon before implementation and often referenced longer than the code itself.

If that’s the case, it should be treated like code.

Anything that changes over time should live where change history is tracked best:

in git.

2) Reviews Can Start with Intent, Not Diff Reading

Without a plan, reviews inevitably start like this:

Read the diff top to bottom
Try to infer intent
Guess the impact
Discover late that “this wasn’t the intended direction”

With a plan, reviews start somewhere else:

Is this the right direction?
Is this review boundary safe?
Are the stated invariants preserved?
Which parts deserve the most attention?

As a result, reviews shift from

“code inspection” to “decision validation.”

3) PR Sizes Shrink Structurally

When the review boundary is explicitly written in the plan, agreement comes first.

This PR stops here
Behavior is unchanged in this step
Follow-up work goes into the next PR

With this structure:

Massive PRs become rare
“While I’m here” changes are easier to reject
Reviews stop collapsing under their own weight

In other words, we convert

“how much AI can generate” into “how much humans can reasonably decide.”
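One way to enforce a written review boundary is to compare the files a PR touches against the paths the plan allows. The sketch below assumes the plan expresses its boundary as glob patterns; the function name and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def files_outside_boundary(changed_files, allowed_patterns):
    """Return the changed files that match none of the plan's allowed glob patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical boundary taken from a plan, and a hypothetical list of changed files.
allowed = ["src/billing/*", "tests/billing/*"]
changed = ["src/billing/invoice.py", "src/auth/session.py"]

print(files_outside_boundary(changed, allowed))  # ['src/auth/session.py']
```

A non-empty result does not have to fail the build. It can simply prompt the author to either update the plan or move the extra change into a follow-up PR.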

Plans Improve Review Quality—for Humans and AI

1) Human Reviews Shift Focus

Reviewing diffs alone leads to flat review criteria:

Is the code style okay?
Does it look bug-free?
Are tests present?

These are important, but increasingly they are areas where automated checks and AI reviews excel.

That means human review time is better spent on judgment-heavy areas: design intent, boundaries, invariants, and risk.

With a plan, reviews gain contrast:

High-impact areas get deeper scrutiny
“Non-goals” are explicitly verified
Risky paths are reproduced locally

The result isn’t less review effort—it’s better reviews.

2) AI Reviews Become Intent-Aware

When AI reviews only see diffs, feedback tends to be generic.

When plans are included, AI reviews can instead check:

Are non-goals violated?
Did changes cross the review boundary?
Are promised invariants preserved?
Are impact estimates missing something?

AI reviews evolve from

diff interpretation to intent verification.
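As a rough illustration, an intent-aware review prompt can be assembled by placing the plan ahead of the diff and asking the four questions above explicitly. The function name and prompt wording are assumptions; how the resulting string is sent to a model depends on the tool in use and is left out here.

```python
def build_review_prompt(plan_text: str, diff_text: str) -> str:
    """Assemble a review prompt that puts the plan ahead of the diff."""
    checks = [
        "Are any non-goals violated?",
        "Do any changes cross the review boundary?",
        "Are the promised invariants preserved?",
        "Is anything missing from the impact estimate?",
    ]
    checklist = "\n".join(f"- {check}" for check in checks)
    return (
        "Review the diff against the plan and answer each question.\n\n"
        f"Questions:\n{checklist}\n\n"
        f"Plan:\n{plan_text}\n\n"
        f"Diff:\n{diff_text}\n"
    )
```

The ordering is deliberate: the reviewer, human or AI, reads the intent before reading any changed line.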

The Decisive Value of Plans in the AI Era

1) Plans Save Context Window

Plans change what we pass to AI.

No need to paste full chat histories
No need to dump raw logs
Only relevant documents and summarized findings

A plan is a compressed state representation:

Readable by humans
Easy for AI to understand
Cheap in context window usage

This leaves more room for actual reasoning.

2) Plans Shortcut File Discovery

For similar follow-up tasks, the most expensive step is often figuring out:

Which files matter
What depends on what

If plans are committed, AI doesn’t need to re-scan the entire codebase.

It can reuse explicit file lists, dependencies, and constraints, which avoids unnecessary context pollution.
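For example, if committed plans list their affected files under a known heading, a follow-up task can read that list directly. This sketch assumes a Markdown plan with a `## Affected files` section; both the heading and the file paths are hypothetical conventions, not something any tool requires.

```python
def affected_files(plan_text: str, heading: str = "## Affected files") -> list[str]:
    """Collect the bullet items listed under the given heading of a Markdown plan."""
    files = []
    in_section = False
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Any second-level heading either opens or closes the section we want.
            in_section = stripped == heading
        elif in_section and stripped.startswith("- "):
            files.append(stripped[2:].strip().strip("`"))
    return files


# Hypothetical plan content for illustration.
example_plan = """# Plan: invoice rounding

## Affected files
- `src/billing/invoice.py`
- tests/billing/test_invoice.py

## Non-goals
- Changing the tax calculation
"""

print(affected_files(example_plan))
```

The returned paths can then be handed to the AI as the starting set of files, rather than letting it rediscover them.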

3) Plans Promote Knowledge

Plans are short-lived by nature, but they often contain durable knowledge.

Feature-level knowledge can graduate into documentation
Constraints and boundaries can move into architecture docs

When teams consistently do two things:

Aggregate knowledge in plans
Promote it after completion

they stop relying on past chat logs as an external memory.

Plans Are Not Documents—They Are Process

The goal is to stop treating plans as optional artifacts.

A healthy loop looks like this:

Commit plans
Review plans (intent, boundaries, non-goals)
Update plans alongside implementation
Promote lasting knowledge into canonical docs

This loop becomes the development process in the AI era.
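The first step of the loop can be checked mechanically. The sketch below flags a change that touches code without adding or updating a plan. The `plans/` directory and the list of code suffixes are assumptions; a real team would pick its own layout.

```python
def needs_plan(changed_files, plan_dir="plans/", code_suffixes=(".py", ".ts", ".go")):
    """Return True when a change touches code but adds or updates no plan file."""
    touches_code = any(path.endswith(code_suffixes) for path in changed_files)
    touches_plan = any(
        path.startswith(plan_dir) and path.endswith(".md") for path in changed_files
    )
    return touches_code and not touches_plan
```

Run in CI against the list of changed files, a check like this keeps plans from quietly becoming optional again.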

Summary

Faster AI-generated code makes reviews more fragile
The root cause is not diff size, but the loss of a clear review starting point
Treating plans as first-class artifacts enables:
- Reviews to start from intent
- Structurally smaller PRs
- Higher-quality human and AI reviews
- Lower context window usage
- Cheaper future exploration

Plans are the unit of review, generation, and knowledge in AI-assisted development.

That’s why plans are deliverables—and why they belong in git.

DEV Community