DEV Community

Alessandro Bahgat

Posted on • Originally published at abahgat.com

The Ghost in the Training Set

Over the last several weeks, I've found myself setting up MCP servers a few times and noticed something surprising. MCP has been gaining popularity and, as the ecosystem matures, it has gone through paradigm shifts. In early 2025, the recommended way to build MCP servers over HTTP switched from SSE (Server-Sent Events) to Streamable HTTP.

To my surprise, the agents I use most (Gemini and Claude) kept reverting to SSE. It wasn't until I started digging that I realized what was happening: the models were haunted by the statistical momentum of their own training data.

Even when LLMs are aware that Streamable HTTP is the standard now—and can competently answer questions about it when asked—the "statistical momentum" in their training data pulls them back to the old standard. Because most of the examples they have seen were written using the old approach, they default to it when generating code.

Note: Claude astutely ships with an /mcp-builder skill, a specialized instruction package that steers it toward the current standard. Try building an MCP server with Gemini today, though, and you may be surprised to get a perfectly functional implementation built on a deprecated pattern.

Why This Is Happening: The Invisible Weight of Training Bias

LLMs don't just "read" instructions in a traditional sense; they weigh them against their internal probability map. If the majority of MCP implementations in their training set used SSE, that creates a massive bias in that direction.

This is a sneaky pattern. We don't naturally think about how old (or new) a model's training set is. If you're working on a bleeding-edge domain, you may find yourself with an agent offering a beautiful implementation that is actually a frozen snapshot of last year's best practices.

Agents thrive on Common Knowledge, but they struggle with Private Context. When we use bespoke patterns or fast-moving standards, we are essentially moving the agent into a zero-shot environment without even realizing it.

From Instructions to Infrastructure

You may be tempted to overcome this through prompting ("ALWAYS use Streamable HTTP"), but over time you should move these guardrails into your agents.md files. We need to shift technical standards out of lossy prompts and into the tooling infrastructure.
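As an illustration, a transport guardrail that lives in the repo rather than in ad-hoc prompts might look like this (the file name and wording are just one possible convention):

```markdown
## MCP server standards

- Use the Streamable HTTP transport for all MCP servers.
- Do NOT use the deprecated SSE (Server-Sent Events) transport.
- See docs/MCP_STANDARDS.md for the full rationale and examples.
```

Checked in next to the code, this survives model updates, new teammates, and new agent sessions in a way that a one-off prompt never will.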

Well-written skills and tools help a lot here. Anthropic's /mcp-builder is extremely effective in ensuring you land with a well-functioning implementation that overcomes the inherent bias in the models.

The Trap of "Contextual Debt"

Just like code accumulates technical debt, continuously adding to agents.md without cleaning up leads to Contextual Debt. Files become bloated with a mountain of "Don't do X" or "Remember Y."

We are getting to a point where our "Instruction Budget" is as important as our compute budget. If you have clashing instructions across multiple files, you're not just wasting tokens; you're creating "hallucination traps" that are far more expensive to debug than a standard syntax error.
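To make the clash problem concrete, here is a toy linter that flags directives repeated across instruction files. The file names, directive prefixes, and overlap heuristic are all illustrative, not a real tool:

```python
# Toy "contextual debt" linter: flags directive lines that appear in
# more than one instruction file. Purely illustrative heuristics.
from collections import defaultdict

def find_clashes(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each normalized directive line to the files it appears in,
    keeping only lines that occur in more than one file."""
    seen = defaultdict(list)
    for name, text in files.items():
        for line in text.splitlines():
            norm = line.strip().lower().rstrip(".")
            # Only consider imperative "guardrail"-style lines.
            if norm.startswith(("always", "never", "don't", "remember")):
                seen[norm].append(name)
    return {line: names for line, names in seen.items() if len(names) > 1}

files = {
    "agents.md": "Always use Streamable HTTP.\nNever commit secrets.",
    "docs/MCP_STANDARDS.md": "Always use Streamable HTTP.",
}
print(find_clashes(files))
# → {'always use streamable http': ['agents.md', 'docs/MCP_STANDARDS.md']}
```

Even something this crude surfaces the duplicated (or worse, subtly contradictory) directives that quietly eat your instruction budget.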

Strategies for Garbage Collecting agents.md

Here are a few things that seem to work for me:

  • Progressive Disclosure: Borrow from the Claude skills playbook. Instead of one giant instruction file, use a modular approach (e.g., a docs/MCP_STANDARDS.md file linked from your root agents.md).
  • The "Zero-Prompt Test": Periodically run your project with a blank instruction file, especially after model updates. If it works well, your instructions have become cruft. Delete them.
  • Project-Level Ground Truth: Get your team to own agent configs the way they own their editor configurations. Up-to-date documentation is now more precious than ever.
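The zero-prompt test above can be as simple as a few lines of shell. This is an illustrative harness only: the agent invocation on the commented line is a placeholder for whatever CLI you actually use, and the sample instruction file stands in for your real one.

```shell
# Illustrative "zero-prompt test" harness; the commented agent
# invocation is a placeholder, not a real CLI.
set -e
printf 'ALWAYS use Streamable HTTP.\n' > agents.md  # stand-in instruction file
mv agents.md agents.md.bak    # stash the real instructions
: > agents.md                 # run against a blank instruction file
# your-agent run "scaffold an MCP server"   # <- your actual invocation
test ! -s agents.md           # the file really is empty during the run
mv agents.md.bak agents.md    # restore the original instructions
```

If the agent does just as well with the blank file, the stashed instructions are cruft and can be deleted for good.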

Conclusion: Managing the Agent's AI "Memory"

Regardless of the tropes around "software engineering being dead," more of our job is moving up the stack from writing code.

We are increasingly managing the attention and memory of our agents. The most sustainable systems will be the ones where instructions and scaffolding are pruned as ruthlessly as—if not more than—the code itself.


Enjoyed this?

I write about the intersection of engineering leadership and the "agentic" era. If you're navigating similar paradigm shifts in your own team, let's connect:

Top comments (1)

Aryan Choudhary

Great post Alessandro! This is a huge challenge for LLMs, and I've seen it with my own projects: they're so good at finding patterns in data, but when that data is outdated, they can reinforce it instead of adapting to the present.