Why Plans Should Be First-Class Artifacts in AI-Assisted Development
AI-assisted development is no longer experimental.
At this point, it’s fair to say it has become mainstream.
There are, of course, organizations that cannot adopt generative AI yet due to security or regulatory constraints.
But for software teams that can use AI and still choose not to, the issue is no longer technical—it’s a matter of mindset.
Productivity gains from AI-assisted development are real.
Today, shipping and operating small to mid-sized services with minimal handwritten code is no longer unrealistic.
And yet, many teams report the same frustrations:
- “We can’t keep track of the massive amount of AI-generated code.”
- “AI-written code is hard to maintain.”
- “Pull requests are getting huge, and reviews have become the bottleneck.”
I believe these problems are not caused by limitations of AI itself.
They stem from the fact that our development processes haven’t caught up with the AI era.
The Real Problem: Reviews Can’t Even Start
AI dramatically increases the speed of code generation.
But the real pain in AI-assisted development is not that diffs are large.
The real problem is this:
Reviewers don’t know where to start.
When reviewing an AI-generated pull request, reviewers implicitly need answers to questions like:
- What is the intent of this change?
- What is the scope of this PR? (Where is the review boundary?)
- What assumptions are being preserved?
- Which parts are risky and need extra attention?
Without an explicit plan, reviewers are forced to reconstruct all of this information from the diff alone.
As AI writes more code faster, this reconstruction step becomes the true bottleneck.
What “Plan” Means in This Article
Before going further, let’s define what “plan” means here.
This is not about tools or features.
It’s about how we structure information to make reviews work in the AI era.
A plan is not a TODO list.
A plan is a compressed decision artifact that makes implementation and review possible.
A good plan typically includes:
- Findings from investigating the existing codebase (what matters, what assumptions exist)
- Goals and non-goals
- References to relevant specs or architecture
- Expected impact and risks
- The review boundary (how far this PR should go)
The key point is that a plan is reviewable in size.
It is not a long design document.
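To make the shape concrete, the elements listed above could be captured as a small structured record. This is only a sketch: the `Plan` class, its field names, and the `missing_sections` helper are hypothetical and not part of any tool mentioned in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical plan record; field names are illustrative, not a standard."""

    title: str
    findings: list[str] = field(default_factory=list)         # what matters in the existing code
    goals: list[str] = field(default_factory=list)
    non_goals: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)       # specs, architecture docs
    risks: list[str] = field(default_factory=list)            # expected impact and risks
    review_boundary: list[str] = field(default_factory=list)  # how far this PR should go

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        sections = {
            "findings": self.findings,
            "goals": self.goals,
            "non_goals": self.non_goals,
            "references": self.references,
            "risks": self.risks,
            "review_boundary": self.review_boundary,
        }
        return [name for name, items in sections.items() if not items]


# Example values are made up for illustration.
plan = Plan(
    title="Extract billing client",
    goals=["Route all billing HTTP calls through one client"],
    non_goals=["Changing retry behavior"],
)
print(plan.missing_sections())
```

A check like `missing_sections` is one way to notice that a plan is not yet reviewable, for example when it states goals but no review boundary.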
Why Plans Should Be Committed
Here’s an important point:
the size and quality of plans described above are not theoretical.
Modern AI tools (including Claude Code) can reliably generate plans at this granularity.
That makes plans not temporary notes, but real artifacts worth keeping.
1) A Plan Is a Deliverable
In AI-assisted development, a plan is no longer a personal memo.
A solid plan contains:
- Consolidated investigation results
- A clear list of affected files
- Explicit intent and boundaries
- Rejected alternatives and caveats
At this point, the plan has become shared team knowledge.
It is agreed upon before implementation and often referenced longer than the code itself.
If that’s the case, it should be treated like code.
Anything that changes over time should live where change history is tracked best:
in git.
2) Reviews Can Start with Intent, Not Diff Reading
Without a plan, reviews inevitably start like this:
- Read the diff top to bottom
- Try to infer intent
- Guess the impact
- Discover late that “this wasn’t the intended direction”
With a plan, reviews start somewhere else:
- Is this the right direction?
- Is this review boundary safe?
- Are the stated invariants preserved?
- Which parts deserve the most attention?
As a result, reviews shift from
“code inspection” to “decision validation.”
3) PR Sizes Shrink Structurally
When the review boundary is explicitly written in the plan, agreement comes first.
- This PR stops here
- Behavior is unchanged in this step
- Follow-up work goes into the next PR
With this structure:
- Massive PRs become rare
- “While I’m here” changes are easier to reject
- Reviews stop collapsing under their own weight
In other words, we convert
“how much AI can generate” into “how much humans can reasonably decide.”
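Once a review boundary is written down, it can also be checked mechanically. The sketch below assumes the boundary is expressed as a list of path patterns, which is one possible convention rather than something this article prescribes; `files_outside_boundary` and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def files_outside_boundary(changed_files, boundary_patterns):
    """Return changed files that match none of the plan's boundary patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in boundary_patterns)
    ]


# Hypothetical boundary from a plan, and the files a PR actually touched.
boundary = ["billing/*", "tests/billing/*"]
changed = [
    "billing/client.py",
    "tests/billing/test_client.py",
    "auth/session.py",  # a "while I'm here" change
]
print(files_outside_boundary(changed, boundary))
```

Any path this returns is a candidate for the next PR rather than this one.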
Plans Improve Review Quality—for Humans and AI
1) Human Reviews Shift Focus
Reviewing diffs alone leads to flat review criteria:
- Is the code style okay?
- Does it look bug-free?
- Are tests present?
These are important, but increasingly they are areas where
automated checks and AI reviews excel.
That means human review time is better spent on judgment-heavy areas:
design intent, boundaries, invariants, and risk.
With a plan, reviews gain contrast:
- High-impact areas get deeper scrutiny
- “Non-goals” are explicitly verified
- Risky paths are reproduced locally
The result isn’t less review effort—it’s better reviews.
2) AI Reviews Become Intent-Aware
When AI reviews only see diffs, feedback tends to be generic.
When plans are included, AI reviews can instead check:
- Are non-goals violated?
- Did changes cross the review boundary?
- Are promised invariants preserved?
- Are impact estimates missing something?
AI reviews evolve from
diff interpretation to intent verification.
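One simple way to include a plan in an AI review is to put it in front of the diff and ask the questions above explicitly. The following is a minimal sketch that only assembles the prompt text; `build_review_prompt` and the section labels are hypothetical, and how the prompt is sent to a model is left out on purpose.

```python
REVIEW_QUESTIONS = [
    "Are any non-goals violated?",
    "Do the changes cross the review boundary?",
    "Are the promised invariants preserved?",
    "Do the impact estimates miss anything?",
]


def build_review_prompt(plan_text: str, diff_text: str) -> str:
    """Assemble a review prompt that places the plan ahead of the diff."""
    questions = "\n".join(
        f"{number}. {question}"
        for number, question in enumerate(REVIEW_QUESTIONS, start=1)
    )
    return (
        "Review the diff against the plan.\n\n"
        f"## Plan\n{plan_text.strip()}\n\n"
        f"## Diff\n{diff_text.strip()}\n\n"
        f"## Questions\n{questions}\n"
    )


# Placeholder inputs; in practice these would be the committed plan and the PR diff.
prompt = build_review_prompt(
    plan_text="Non-goals: changing retry behavior.",
    diff_text="+ retries = 5",
)
print(prompt)
```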
The Decisive Value of Plans in the AI Era
1) Plans Save Context Window
Plans change what we pass to AI.
- No need to paste full chat histories
- No need to dump raw logs
- Only relevant documents and summarized findings
A plan is a compressed state representation:
- Readable by humans
- Easy for AI to understand
- Cheap in context window usage
This leaves more room for actual reasoning.
2) Plans Shortcut File Discovery
For similar follow-up tasks, the most expensive step is often figuring out:
- Which files matter
- What depends on what
If plans are committed, AI doesn’t need to re-scan the entire codebase.
It can reuse the file lists, dependencies, and constraints already recorded in earlier plans,
which avoids unnecessary context pollution.
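As a rough illustration, a follow-up task could start from the file list of an earlier plan instead of a fresh scan. The sketch assumes a plan format with an "Affected files" heading followed by "- " bullets; that layout and the `affected_files` helper are assumptions made for this example.

```python
def affected_files(plan_text, heading="## Affected files"):
    """Collect bulleted paths listed under the given heading of a plan."""
    files = []
    in_section = False
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens the target section or closes it.
            in_section = stripped.lower() == heading.lower()
            continue
        if in_section and stripped.startswith("- "):
            files.append(stripped[2:].strip().strip("`"))
    return files


# Hypothetical committed plan, shortened for the example.
sample_plan = """\
# Plan: extract billing client

## Goals
- Keep behavior unchanged

## Affected files
- `billing/client.py`
- tests/billing/test_client.py

## Risks
- Timeouts differ between environments
"""
print(affected_files(sample_plan))
```

The returned paths can then be handed to the AI as the starting set of files for the follow-up task.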
3) Plans Promote Knowledge
Plans are short-lived by nature, but they often contain durable knowledge.
- Feature-level knowledge can graduate into documentation
- Constraints and boundaries can move into architecture docs
When teams consistently aggregate knowledge in plans and promote it after completion, they stop relying on past chat logs as an external memory.
Plans Are Not Documents—They Are Process
The goal is to stop treating plans as optional artifacts and to make them part of the process.
A healthy loop looks like this:
- Commit plans
- Review plans (intent, boundaries, non-goals)
- Update plans alongside implementation
- Promote lasting knowledge into canonical docs
This loop becomes the development process in the AI era.
Summary
- Faster AI code generation makes reviews more fragile
- The root cause is not diff size, but the loss of a clear review starting point
- Treating plans as first-class artifacts enables:
  - Reviews to start from intent
  - Structurally smaller PRs
  - Higher-quality human and AI reviews
  - Lower context window usage
  - Cheaper future exploration
Plans are the unit of review, generation, and knowledge in AI-assisted development.
That’s why plans are deliverables—and why they belong in git.