I made a mistake that cost me three days of debugging and an uncomfortable conversation with a client. I handed an agent a refactoring task on a codebase it had never "seen" — no context, no architecture overview, nothing. The agent executed. Fast, clean, confident. And it broke exactly what it wasn't supposed to break.
I'm not telling you this to seem humble. I'm telling you because if you're working with AI agents today, you've already done this or you're about to.
The problem isn't that the AI codes badly. It's that it codes fast. And fast without prior reading is a recipe for disaster.
The pattern that kept breaking my agents: coding without investigating
I've been paying attention for months to something that genuinely bothers me about the agent ecosystem. Most of the flows I see — and use — have roughly this structure:
- Prompt with task
- Available tools (filesystem, bash, browser)
- Code output
- Pray
The missing step is obvious the moment you say it out loud: understand the system before modifying it.
While I was working on Project Glasswing, I noticed something specific: the places where AI generated problematic code weren't the most algorithmically complex ones. They were the ones that required implicit context — project conventions, undocumented dependencies, architecture decisions that lived in the original dev's head (mine) and nowhere else.
The agent had no way to know what I didn't tell it. And I assumed it would infer it. That was my mistake, not the model's.
The hypothesis: a research artifact changes the output
I started asking myself: what happens if I force the agent to produce a research document before it touches a single file?
Not a high-level plan. Not a task summary. A concrete artifact with a fixed structure:
- System context: what does this codebase do? what's the architecture?
- Relevant dependencies: which modules/functions are involved?
- Detected implicit decisions: code conventions, repeating patterns
- Identified risks: what can this task break if done wrong?
- Unanswered questions: what the agent can't infer and needs me to confirm
That last point is the most valuable. An agent that knows what it doesn't know is infinitely more useful than one that assumes.
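To make the artifact checkable by the runner instead of just readable by me, it helps to give it a concrete shape. A sketch — the field names here are mine, not part of any library:

```typescript
// Hypothetical shape for the research artifact. Field names are
// illustrative; they mirror the five sections described above.
interface ResearchArtifact {
  systemContext: string;          // what the codebase does, its architecture
  filesInvolved: string[];        // every file the agent will read or modify
  criticalDependencies: string[]; // functions, types, modules being touched
  detectedConventions: string[];  // naming, error handling, import patterns
  identifiedRisks: string[];      // concrete things this task could break
  unansweredQuestions: string[];  // what the agent cannot infer on its own
}

// An artifact with zero open questions is a red flag, not a success.
function looksSuspicious(artifact: ResearchArtifact): boolean {
  return artifact.unansweredQuestions.length === 0;
}
```

Typing it this way also makes the "no questions" smell mechanically detectable instead of something you have to notice by eye.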
I set up the experiment on a real project: a Next.js API with some legacy endpoints that needed refactoring. I ran the same task twice:
Control: agent with filesystem access and the task directly.
Experimental: agent forced to complete the research artifact first, with a checkpoint where I approve or correct before it starts coding.
```typescript
// System prompt for the research phase
// The agent CANNOT use write_file until it completes this artifact
const researchPhasePrompt = `
Before modifying any file, produce a research artifact
in the EXACT following format. No exceptions.

## PRE-CODE RESEARCH

### 1. System context
[Describe in 3-5 sentences what this codebase does, its main architecture
and the purpose of the module you're going to modify]

### 2. Files involved
[List every file you'll read or modify, with one line explaining why]

### 3. Critical dependencies
[What functions, types or external modules does the code you're touching use]

### 4. Detected conventions
[Patterns you found in the existing code that you must respect:
naming conventions, error handling, import structure, etc.]

### 5. Identified risks
[What can break if this task is executed incorrectly.
Be specific: "breaking endpoint X" is better than "compatibility issues"]

### 6. Unanswered questions
[What you CANNOT infer from the code and need human confirmation on.
If you have no questions, something went wrong in your research.]

---
WAIT FOR APPROVAL BEFORE CONTINUING.
`;
```
The checkpoint is the key piece. It's not a prompt the agent ignores and blows past. It's a real waitForApproval() in the flow — the agent literally cannot move forward until I read the artifact and give the green light (or correct its assumptions).
```typescript
// Checkpoint implementation in the agent flow
// Using a simple runner with state control
async function researchDrivenAgent(task: string, projectPath: string) {
  const agent = new AgentRunner({
    // Tools available in the research phase
    // Read-only — write_file is explicitly disabled
    tools: [
      readFile,
      listDirectory,
      searchInFiles,
      // write_file: ABSENT — can't code yet
    ],
  });

  // Phase 1: pure research
  console.log('🔍 Starting research phase...');
  const researchArtifact = await agent.run(
    `${researchPhasePrompt}\n\nTASK: ${task}\nPROJECT: ${projectPath}`
  );

  // Human checkpoint — this is the crucial moment.
  // humanReview blocks until a person approves or corrects the artifact,
  // and approved.artifact comes back with any corrections folded in.
  const approved = await humanReview(researchArtifact);
  if (!approved.ok) {
    // Human corrected assumptions before the agent codes
    console.log('📝 Corrected assumptions:', approved.corrections);
  }

  // Phase 2: coding with validated context
  // Now it finally gets write_file access
  const codingAgent = new AgentRunner({
    tools: [
      readFile,
      writeFile, // enabled only here
      listDirectory,
      searchInFiles,
      runTests,
    ],
    // The (corrected) research artifact goes in as context
    systemContext: `
APPROVED PRE-RESEARCH:
${approved.artifact}

Use this context as your foundation. Don't re-infer what you already researched.
`,
  });

  return await codingAgent.run(task);
}
```
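The `humanReview` call above is doing two jobs: blocking until a person decides, and folding corrections back into the artifact. A minimal console version might look like this — a sketch assuming Node.js; the `ReviewResult` shape simply mirrors how the flow uses `approved.ok`, `approved.artifact`, and `approved.corrections`:

```typescript
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

interface ReviewResult {
  ok: boolean;           // true if approved with no corrections
  artifact: string;      // the artifact, with corrections folded in
  corrections: string[];
}

// Fold human corrections into the artifact so the coding phase sees
// them as explicit, approved context rather than a side channel.
function foldCorrections(artifact: string, corrections: string[]): string {
  if (corrections.length === 0) return artifact;
  return artifact + "\n\n## HUMAN CORRECTIONS\n" +
    corrections.map(c => "- " + c).join("\n");
}

// Minimal blocking checkpoint: print the artifact, collect corrections
// until the reviewer types "yes". A real setup might use a web UI or a
// PR-style review instead.
async function humanReview(artifact: string): Promise<ReviewResult> {
  const rl = readline.createInterface({ input: stdin, output: stdout });
  console.log("\n=== RESEARCH ARTIFACT ===\n" + artifact + "\n");
  const corrections: string[] = [];
  for (;;) {
    const answer = await rl.question('Approve? ("yes" or type a correction): ');
    if (answer.trim().toLowerCase() === "yes") break;
    corrections.push(answer.trim());
  }
  rl.close();
  return {
    ok: corrections.length === 0,
    artifact: foldCorrections(artifact, corrections),
    corrections,
  };
}
```

Writing corrections into the artifact itself (instead of a chat reply) is what makes them survive into phase 2 as first-class context.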
What I measured (and what surprised me)
I don't have scientific metrics. I have concrete observations from four refactoring tasks I ran with this setup.
What improved noticeably:
The control agent broke tests in two out of four tasks. The experimental agent broke none. That alone justifies the overhead.
But the most interesting thing was the "unanswered questions" section. In three out of four tasks, the agent identified something I assumed was obvious in the code but really wasn't. One case: I had an error handling convention I used in new endpoints but not in legacy ones. The control agent ignored it and was inconsistent throughout. The experimental agent specifically asked me which convention to follow.
That's the behavior you want. An agent that admits uncertainty is a trustworthy agent.
What didn't improve:
Total time went up. Not dramatically — we're talking about a 5-10 minute checkpoint of my time to review the artifact — but it went up. If your goal is pure speed, this approach isn't for you.
I also noticed that on very small tasks ("add a field to this DTO"), the research overhead was disproportionate. Deep investigation makes sense for changes with cross-cutting impact, not for micro-edits.
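That gating decision can itself be encoded as a cheap check before the research phase kicks in. A sketch — the thresholds and scope fields are made up for illustration:

```typescript
// Rough gate: skip the research phase for micro-edits, force it for
// anything with cross-cutting impact. Thresholds are illustrative.
interface TaskScope {
  filesTouched: number;
  crossesModuleBoundary: boolean; // touches code outside one module?
  touchesLegacyCode: boolean;     // where undocumented conventions live
}

function needsResearchPhase(scope: TaskScope): boolean {
  if (scope.crossesModuleBoundary || scope.touchesLegacyCode) return true;
  return scope.filesTouched > 2; // "add a field to this DTO" stays fast
}
```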
This reminds me of something I thought about when I was analyzing the Linux kernel's Git history: the most problematic commits historically aren't the biggest ones. They're the small ones that touched something with implicit dependencies nobody documented. Same pattern.
The gotchas you're going to eat
The agent will cheat if it can. If you don't disable write tools during the research phase, some models will "research" and code at the same time. Not out of malice — out of training inertia. Per-phase tool control is not optional.
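The gate belongs in the runner, not in the prompt: even if the model emits a write call during research, the dispatcher refuses to execute it. A sketch — the tool names and phase labels are illustrative:

```typescript
type Phase = "research" | "coding";

// Allow-list of tools per phase. write_file simply does not exist
// during research, no matter what the model asks for.
const TOOLBOX: Record<Phase, readonly string[]> = {
  research: ["read_file", "list_directory", "search_in_files"],
  coding: ["read_file", "write_file", "list_directory", "search_in_files", "run_tests"],
};

// Hard gate at dispatch time, enforced by the runner, not the prompt.
function assertToolAllowed(phase: Phase, toolName: string): void {
  if (!TOOLBOX[phase].includes(toolName)) {
    throw new Error(`Tool "${toolName}" is not available in the ${phase} phase`);
  }
  // ...the runner would actually execute the tool after this check
}
```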
The research artifact can turn into filler. If the prompt isn't specific enough, you'll get five paragraphs of useless generalities. The fixed structure with named sections and concrete expectations is what prevents that. The "unanswered questions" section is especially important — if the agent says it has no questions, ask it back why.
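A mechanical first pass can catch missing or suspiciously thin sections before a human even reads the artifact. A sketch that checks against the section headers from the research prompt — the 20-character threshold is an arbitrary illustration:

```typescript
const REQUIRED_SECTIONS = [
  "### 1. System context",
  "### 2. Files involved",
  "### 3. Critical dependencies",
  "### 4. Detected conventions",
  "### 5. Identified risks",
  "### 6. Unanswered questions",
];

// Returns the problems found; empty means the artifact at least has
// the right shape. This checks structure, not judgment — a human
// still reviews the content.
function lintArtifact(artifact: string): string[] {
  const problems: string[] = [];
  for (const section of REQUIRED_SECTIONS) {
    const start = artifact.indexOf(section);
    if (start === -1) {
      problems.push(`missing section: ${section}`);
      continue;
    }
    // Body = text between this header and the next "###" (or the end)
    const rest = artifact.slice(start + section.length);
    const next = rest.indexOf("###");
    const body = (next === -1 ? rest : rest.slice(0, next)).trim();
    if (body.length < 20) problems.push(`suspiciously thin: ${section}`);
  }
  return problems;
}
```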
The human checkpoint can become a bottleneck. If you're running multiple agents in parallel — like I was experimenting with for MegaTrain with training task orchestration — the 1:1 approval model doesn't scale. You need to think about async approval or auto-approval criteria for low-risk cases.
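One way out of the 1:1 bottleneck is an approval queue: low-risk artifacts pass automatically, high-risk ones park until a human drains the queue. A sketch — the risk classifier is a placeholder for whatever signal you trust:

```typescript
type RiskLevel = "low" | "high";

interface PendingApproval {
  taskId: string;
  artifact: string;
  resolve: (approved: boolean) => void;
}

// Async checkpoint: requestApproval resolves immediately for low-risk
// tasks and otherwise returns a promise that stays pending until a
// human calls decide() from a review UI or CLI.
class ApprovalQueue {
  private pending: PendingApproval[] = [];

  requestApproval(taskId: string, artifact: string, risk: RiskLevel): Promise<boolean> {
    if (risk === "low") return Promise.resolve(true); // auto-approve
    return new Promise(resolve => {
      this.pending.push({ taskId, artifact, resolve });
    });
  }

  // Called when a human makes a decision on a queued artifact
  decide(taskId: string, approved: boolean): void {
    const i = this.pending.findIndex(p => p.taskId === taskId);
    if (i === -1) return;
    this.pending.splice(i, 1)[0].resolve(approved);
  }

  get waiting(): string[] {
    return this.pending.map(p => p.taskId);
  }
}
```

The agent awaiting `requestApproval` just blocks on the promise, so the runner code doesn't need to know whether approval was automatic or human.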
The artifact context can degrade. If the research artifact is very long and gets passed as context in the coding phase, even models with large context windows can "lose" it mid-task. Compress the artifact to the essentials before passing it as phase 2 context.
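A crude compression is to keep only the sections the coding phase actually acts on and drop the narrative. A sketch, keyed to the section headers from the research prompt — which sections count as "essential" is a judgment call:

```typescript
// Keep only the sections the coding phase acts on; drop the rest.
// Header strings match the research-artifact format.
const ESSENTIAL_SECTIONS = [
  "### 4. Detected conventions",
  "### 5. Identified risks",
  "### 6. Unanswered questions",
];

function compressArtifact(artifact: string): string {
  const parts: string[] = [];
  for (const header of ESSENTIAL_SECTIONS) {
    const start = artifact.indexOf(header);
    if (start === -1) continue;
    const rest = artifact.slice(start);
    // Cut at the next "###" header, skipping this header's own "###"
    const next = rest.indexOf("###", header.length);
    parts.push((next === -1 ? rest : rest.slice(0, next)).trim());
  }
  return parts.join("\n\n");
}
```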
A pattern that worries me more broadly: the dependency on the model being honest about what it doesn't know. This is directly related to how much you trust your AI provider — if the model has training incentives to appear confident, it'll fill the uncertainty sections with plausible-sounding nonsense. Evaluate the artifact with skepticism.
FAQ: AI agents that research before they code
Does this work with any model or only the big ones?
It worked well with Claude Sonnet and GPT-4o. With smaller models the artifact quality drops noticeably — especially the "detected conventions" and "risks" sections. For production I use frontier models for the research phase even if I use something lighter for simple coding tasks.
Does the research artifact replace project documentation?
No, and it's important not to confuse the two. The artifact is ephemeral — it's context for that specific task. Project documentation is persistent. That said, if you find the agent is documenting things that should be in the README and aren't, take that as a signal of technical debt.
How much time does this approach add to the workflow?
Depends on the task. For a refactoring with cross-cutting impact: 15-30 extra minutes between the agent's research and my artifact review. For a scoped task: I don't use it. The criterion I use: can this task break something outside the direct scope? If yes, research first.
Can I automate the artifact review with another agent?
Technically yes, and I tried it. A "reviewer agent" that validates the artifact against a checklist. It works for mechanical validations ("does it have all the sections?") but doesn't replace human judgment for architecture assumptions. It's a good first-level filter if you have many agents running in parallel.
How do you handle the "unanswered questions" the agent identifies?
I answer them in plain text directly in the artifact before approving. Not as a separate chat — I write them inside the document so they become explicit context in the coding phase. The agent sees my answers as part of the approved artifact.
Does this approach scale for very large or complex projects?
This is the real limitation. In large projects, the research phase can be superficial if the agent doesn't know what to focus on — it sees too much to read all of it. What works better is well-defined scope: not "research the project", but "research the authentication module and its direct dependencies". Scope is your responsibility, not the agent's.
The problem wasn't the AI, it was me
After months of frustration with agents that dumped code without context, I arrived at an uncomfortable conclusion: the main problem was my workflow, not the model.
I wanted the agent's speed without doing the work of designing the system it operates in. The agent codes fast because that's what we ask of it — literally and figuratively. If you want it to research first, you have to design that explicitly into the flow. Asking nicely in the prompt isn't enough.
What's clear to me after this experiment: the difference between an agent that helps you and one that creates extra work isn't in the model. It's in the flow architecture. The human checkpoint, the phase-gated tools, the fixed-structure artifact — these are design decisions, not prompting tricks.
And yeah, it adds time. But debugging code broken by missing context adds more.
If you're using agents on projects that matter — not for generating boilerplate, but for touching code that's already in production — try this approach. Start with a medium-impact task, design the checkpoint, and see what questions the agent asks you before it codes.
If it asks none, something is wrong. Either the scope is too small, the artifact isn't working, or the model is filling uncertainty with false confidence.
In all three cases, you want to know that before it touches the filesystem.
Are you using any similar mechanism in your agent flows? I want to hear what other "forced research" patterns people are using — write to me or drop a comment.