Key Learnings for Implementing Spec-Driven Development
In previous posts, I shared my conviction that it's possible to produce quality software with AI coding agents and that the key lies in spec-driven development. But to design an agentic programming framework, you need to understand how models operate. Here are five fundamental learnings.
1. Code Quality
A year ago, the biggest issues were poor prompt adherence and degradation over iterations, leading to regressions and spaghetti code. Today, models hallucinate less, deliver higher-quality code, and follow prompts more reliably, making it possible to consistently produce high-quality code from proper specifications.
2. Context Management
Context is the information available to the model at a given moment. Although context windows have grown significantly, more isn't always better: as context is consumed during a session, prompt adherence decreases and misinterpretations of what was requested increase. It's essential to give the model focused, well-defined tasks it can process efficiently. Avoid compaction at all costs: when the model decides what to keep, you lose control of the context.
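One practical way to apply this is a minimal sketch, assuming you can see the running transcript of a session: track rough context usage yourself and start a fresh, focused session before compaction ever kicks in. The window size, the 4-characters-per-token heuristic, and the 50% budget are all assumptions for illustration, not real model parameters.

```python
# Sketch: restart sessions before the model is forced to compact.
# All numbers here are illustrative assumptions.

CONTEXT_WINDOW_TOKENS = 200_000
BUDGET_FRACTION = 0.5  # leave headroom: adherence degrades as context fills

def estimate_tokens(text: str) -> int:
    """Crude heuristic (~4 chars per token), not a real tokenizer."""
    return len(text) // 4

def should_start_fresh_session(transcript: str) -> bool:
    """True once the transcript has consumed enough of the window that
    restarting with a focused prompt beats letting the model compact
    (and silently drop) parts of the context."""
    return estimate_tokens(transcript) > CONTEXT_WINDOW_TOKENS * BUDGET_FRACTION
```

The point isn't the exact threshold: it's that *you* decide when to reset and what to carry forward into the next prompt, instead of delegating that choice to compaction.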
3. The Amnesiac Developer Syndrome
Models are like a developer with amnesia: at the start of each session, they have forgotten everything they did before. They behave like a senior dev who just joined the project: without proper onboarding, they make mistakes not from lack of expertise, but from unfamiliarity with the project and its code. That's why the quality of the context injected into the agent is essential for reliable results.
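That onboarding can be made mechanical. Here's a minimal sketch of the idea: a fixed set of project reference docs concatenated into a context block that gets prepended to the agent's first prompt in every session. The file names and structure are my own illustrative assumptions, not any framework's convention.

```python
# Sketch of session "onboarding": assemble project context to inject at
# the start of every agent session. File names are illustrative.

from pathlib import Path

ONBOARDING_FILES = [
    "SPEC.md",          # what we are building
    "ARCHITECTURE.md",  # how the codebase is organized
    "CONVENTIONS.md",   # naming, style, and testing rules
]

def build_onboarding_context(project_root: str) -> str:
    """Concatenate the project's reference docs into one block to
    prepend to the agent's first prompt in a new session."""
    sections = []
    for name in ONBOARDING_FILES:
        path = Path(project_root) / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text()}")
        else:
            sections.append(f"## {name}\n(missing -- flag this to the user)")
    return "\n\n".join(sections)
```

Because the same docs are injected every time, the "amnesiac" starts each session with the same institutional knowledge a human would accumulate over months.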
4. How Models Respond
Models are designed to deliver complete responses, using inference to fill in whatever the prompt leaves out. If the prompt is broad, ambiguous, or incomplete, the model fills the gaps with its own assumptions. In the process, you lose control of the result and may get code that doesn't match your expectations without ever realizing it. That's why it's critical to provide complete, clear, and unambiguous specifications.
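A cheap guard against this is to gate every task on a completeness checklist before it reaches the agent, so gaps get filled by you rather than by the model's assumptions. This is a sketch: the section names are assumptions, and a real check would be semantic rather than a substring match.

```python
# Sketch: flag the parts of a spec the model would otherwise guess at.
# Section names are illustrative assumptions.

REQUIRED_SECTIONS = ("Goal", "Inputs", "Outputs", "Constraints", "Out of scope")

def missing_sections(spec: str) -> list[str]:
    """Return the checklist sections the spec never mentions."""
    return [s for s in REQUIRED_SECTIONS if s.lower() not in spec.lower()]

spec = """
Goal: add CSV export to the report page.
Inputs: the existing report query results.
Outputs: a downloadable UTF-8 CSV with a header row.
"""
print(missing_sections(spec))
```

Here the check would flag `Constraints` and `Out of scope`: exactly the two areas where an agent's silent assumptions tend to diverge most from what you actually wanted.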
5. Models Make Mistakes
Agents are not yet capable of consistently delivering error-free code, and these errors don't always stem from gaps in your specifications; they can also arise from inherent model limitations. Agentic workflows must be designed on the assumption that models will introduce bugs, much like you'd design processes for a team of human developers.
Boris Cherny (head of Claude Code at Anthropic) reinforces this point: separate context windows are what make subagents work. One agent can introduce bugs and another, using the exact same model, can find them. Until agents write perfect, bug-free code, multiple uncorrelated context windows are a good approach.
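The uncorrelated-context idea can be sketched in a few lines: the same model, called twice with independent histories, where one session writes and a fresh session reviews. Here `call_model` is a deterministic stand-in for a real API client (so the sketch runs without a key); the message shape is an assumption modeled loosely on common chat APIs.

```python
# Sketch of "uncorrelated context windows": same model, two sessions
# with deliberately independent histories.

def call_model(messages: list[dict]) -> str:
    """Stand-in for a real model call; echoes the last message so the
    sketch runs offline. Replace with your provider's client."""
    return f"(model output for: {messages[-1]['content'][:40]}...)"

def implement_and_review(task: str) -> tuple[str, str]:
    # Session 1: writer. Its context holds the task and its own output.
    writer_history = [{"role": "user", "content": f"Implement: {task}"}]
    code = call_model(writer_history)

    # Session 2: reviewer. Deliberately does NOT share the writer's
    # history, so it can't inherit the writer's assumptions or mistakes.
    reviewer_history = [{"role": "user",
                         "content": f"Review this diff for bugs:\n{code}"}]
    review = call_model(reviewer_history)
    return code, review
```

The key design choice is that `reviewer_history` starts empty: the reviewer sees only the artifact, not the reasoning that produced it, which is what keeps its errors uncorrelated with the writer's.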
Understanding how models operate is the first step to designing an agentic development process that produces consistent, quality results.
I'm working on a longer article that expands on these points with real-world examples and practical lessons. I'll share the link here once it's published. After that, I'll cover how I put spec-driven development into practice with concrete implementation patterns.
Have you run into any of these behaviors when working with agents? I'd love to hear how you've dealt with them.