This is the final post in a three-part series based on a talk I gave under the same title.
In the first post, I talked about why MCP exists at all — as a way to standardise how models interact with tools. In the second, I zoomed in on Chrome DevTools MCP and showed how browser-level instrumentation becomes dramatically more powerful when AI can reason over it.
In this final post, I want to step back and look at the bigger picture: what happens when you don’t just use MCPs, but start extending and composing them to fit your own workflows.
Chrome MCP is powerful on its own, but most developer workflows don’t fail because we lack data — they fail because we lack accumulated understanding. Using a simple performance tracker MCP built on top of Chrome MCP as the example, I’ll show how chaining MCPs adds memory and automation, turning isolated tools into an end-to-end workflow.
The missing piece: memory
Think about a typical performance investigation.
You run a trace. You spot something slow. You apply a fix. You rerun the trace. You move on.
The artefacts are valuable:
- traces
- metrics
- screenshots
- network waterfalls
But once the task is done, they usually disappear. Some results live in ad-hoc notes. Some stay buried in folders. Some only exist in a developer’s head. And crucially, there’s no consistent way to compare runs, track trends, or answer questions like:
- Has this page actually improved over the last three releases?
- Which changes made performance worse - and when?
- Are we fixing the same regressions repeatedly?
This is the gap an end-to-end MCP workflow is designed to fill.
Composing MCP servers, not building monoliths
The instinctive reaction might be: “Let’s just make a smarter Chrome MCP.”
But that’s exactly the wrong direction.
Instead of adding more responsibility to one server, MCP encourages composition.
In the workflow I’ll describe here, we deliberately split responsibilities:
- Chrome DevTools MCP
Responsible only for measurement
It observes the browser and returns raw facts.
- A custom Performance Tracker MCP
Responsible for memory and comparison
It stores results, tracks changes, and exposes history.
- The LLM (via the host/client)
Responsible for reasoning
It decides what to run, what to store, what to compare, and how to explain outcomes.
Each piece does one job well. None of them becomes a god object.
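To make the composition concrete, here is a minimal sketch of how a host process might connect to both servers side by side. It assumes the official TypeScript MCP SDK’s client classes; the tracker’s entry-point path is hypothetical, and you should check the current Chrome DevTools MCP docs for its exact npx invocation.

```ts
// Run as an ES module (top-level await).
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn one MCP server as a child process and return a connected client.
async function connect(name: string, command: string, args: string[]) {
  const client = new Client({ name: `host-${name}`, version: "0.1.0" });
  await client.connect(new StdioClientTransport({ command, args }));
  return client;
}

// Two independent servers: neither knows the other exists.
const chrome = await connect("chrome", "npx", ["chrome-devtools-mcp@latest"]);
const tracker = await connect("tracker", "node", ["./perf-tracker/dist/index.js"]); // hypothetical path

// The host exposes the union of both tool sets to the model,
// which decides what to call and in what order.
const tools = [
  ...(await chrome.listTools()).tools,
  ...(await tracker.listTools()).tools,
];
console.log(tools.map((t) => t.name).join("\n"));
```

The composition lives entirely in the host: each server only ever sees its own requests, which is what keeps them small and replaceable.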
A clean separation of concerns
This architecture works because each layer stays honest about its role.
- Chrome MCP doesn’t “understand” performance - it just measures it.
- The custom MCP doesn’t interpret metrics - it just remembers and exposes them.
- The LLM doesn’t collect raw data - it reasons over what it’s given.
This separation makes the system:
- easier to evolve
- easier to test
- easier to reuse across tools and teams
Most importantly, it prevents us from baking intelligence into places where it doesn’t belong.
What the custom MCP actually does
The custom MCP server is intentionally boring - and that’s a good thing.
It doesn’t talk to Chrome.
It doesn’t run traces.
It doesn’t optimise anything.
Instead, it focuses on the three kinds of capability MCP is built around: tools, resources, and prompts.
Tools: actions with side effects
Tools represent deliberate, explicitly invoked operations:
- start an experiment
- save a diagnostic run
- log a code change
- compare two runs
Each tool:
- has a strict schema
- performs one deterministic action
- writes or retrieves data from storage
They’re auditable, composable, and predictable.
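As a sketch of what one of these tools could look like: the tool name, fields, and the local JSONL store below are my own illustrative choices, and I’m assuming the official TypeScript MCP SDK. (The transport that actually serves this is shown further down.)

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { appendFile } from "node:fs/promises";
import { randomUUID } from "node:crypto";
import { z } from "zod";

const server = new McpServer({ name: "perf-tracker", version: "0.1.0" });

// One deterministic action: append a diagnostic run to a local JSONL store.
server.tool(
  "save_diagnostic_run",
  {
    experimentId: z.string(),
    url: z.string().url(),
    metrics: z.object({ lcpMs: z.number(), cls: z.number(), inpMs: z.number() }),
    notes: z.string().optional(),
  },
  async (run) => {
    const record = { runId: randomUUID(), savedAt: new Date().toISOString(), ...run };
    await appendFile("runs.jsonl", JSON.stringify(record) + "\n");
    return { content: [{ type: "text", text: `Saved run ${record.runId} for ${run.url}` }] };
  }
);
```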
Resources: structured memory
Resources expose read-only views of state:
- experiment metadata
- run history
- change timelines
This allows the LLM to ground its reasoning in facts:
“Compare the last three runs.”
“Show me regressions since commit X.”
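Continuing the same hypothetical server and `runs.jsonl` store, a run-history resource might look like this (the URI scheme is mine):

```ts
import { ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";
import { readFile } from "node:fs/promises";

// Read-only view: every saved run for one experiment, oldest first.
server.resource(
  "run-history",
  new ResourceTemplate("perf://experiments/{experimentId}/runs", { list: undefined }),
  async (uri, { experimentId }) => {
    const lines = (await readFile("runs.jsonl", "utf8")).trim().split("\n");
    const runs = lines
      .map((line) => JSON.parse(line))
      .filter((run) => run.experimentId === experimentId);
    return { contents: [{ uri: uri.href, text: JSON.stringify(runs, null, 2) }] };
  }
);
```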
Prompts: reusable reasoning scaffolds
Prompts capture common analysis patterns:
- explain a regression
- summarise improvements
- identify trends over time
They reduce prompt duplication and keep reasoning consistent across sessions and users.
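A prompt can be as small as a parameterised template. The wording below is just one possible scaffold, registered on the same hypothetical server:

```ts
import { z } from "zod";

// A reusable scaffold: the same regression-analysis framing every time.
server.prompt(
  "explain_regression",
  { experimentId: z.string(), baselineRunId: z.string(), currentRunId: z.string() },
  ({ experimentId, baselineRunId, currentRunId }) => ({
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text:
            `For experiment ${experimentId}, compare run ${currentRunId} against baseline ${baselineRunId}. ` +
            `List each Core Web Vital that regressed, quantify the change, and relate it to the logged code changes.`,
        },
      },
    ],
  })
);
```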
The overall setup of this MCP is laid out as follows:

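In code terms (and very much as a sketch rather than the full server), that setup is just a thin entry point: register the capabilities above, then speak stdio.

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const server = new McpServer({ name: "perf-tracker", version: "0.1.0" });

// ...the server.tool(...), server.resource(...), and server.prompt(...)
// registrations from the previous sections go here...

// The server never talks to Chrome; it only answers requests over stdio.
await server.connect(new StdioServerTransport());
```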
The key idea here is subtle but important:
A well-designed MCP server exposes actions, memory, and reasoning scaffolds — not intelligence.
How the end-to-end workflow plays out
Once these pieces are in place, the workflow becomes almost boringly smooth.
1. A developer asks a high-level question:
“Improve performance on the checkout page.”
2. The LLM calls Chrome MCP to:
- launch a browser
- navigate to the page
- run a performance trace
3. Chrome MCP returns raw artefacts:
- Core Web Vitals
- traces
- network data
- screenshots
4. The LLM analyses the data and decides:
- what’s worth keeping
- what changed
- what to fix next
5. The LLM calls the custom MCP to:
- save the diagnostic run
- log associated code changes
- compare against previous runs
6. Fixes are applied.
7. The exact same measurements are rerun.
8. The LLM compares before and after:
- explains improvements
- highlights regressions
- summarises trends over time
At no point does any single component need to know the whole system.
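To make the comparison step (5 and 8) concrete, here is a hedged sketch of what the tracker’s compare tool might do, reusing the hypothetical `runs.jsonl` store and `server` instance from earlier. Note that it only produces numbers; the explanation is left to the model.

```ts
import { readFile } from "node:fs/promises";
import { z } from "zod";

// Deterministic comparison: no interpretation, just the two runs and their deltas.
server.tool(
  "compare_runs",
  { baselineRunId: z.string(), currentRunId: z.string() },
  async ({ baselineRunId, currentRunId }) => {
    const runs = (await readFile("runs.jsonl", "utf8"))
      .trim()
      .split("\n")
      .map((line) => JSON.parse(line));
    const baseline = runs.find((run) => run.runId === baselineRunId);
    const current = runs.find((run) => run.runId === currentRunId);
    if (!baseline || !current) {
      return { content: [{ type: "text", text: "Unknown run id." }], isError: true };
    }
    // Positive delta = the metric got worse (all three are "lower is better").
    const delta = Object.fromEntries(
      Object.keys(baseline.metrics).map((key) => [key, current.metrics[key] - baseline.metrics[key]])
    );
    return {
      content: [{ type: "text", text: JSON.stringify({ baseline, current, delta }, null, 2) }],
    };
  }
);
```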
Why this feels different from “AI tools”
What’s interesting about this setup is that nothing here is particularly magical.
There’s no new model capability.
No clever prompt trick.
No opaque automation.
The leverage comes from structure:
- consistent measurement
- persistent memory
- explicit composition
- clear responsibility boundaries
This is why MCP feels less like an AI feature and more like an architectural shift.
From one-off assistance to continuous workflows
Most AI dev tools today are optimised for moments:
“Generate this code.”
“Explain this error.”
“Summarise this diff.”
MCP workflows are optimised for continuity:
- across runs
- across changes
- across time
- across people
That’s the real shift. Not smarter models - but systems that can remember, compare, and evolve alongside the codebase.
Final thoughts
Across these three posts, I’ve tried to make one argument:
AI becomes genuinely useful in developer workflows not when it gets better at guessing, but when it’s embedded into the same systems we already trust - browsers, filesystems, version control, and now, structured protocols like MCP.
Chrome MCP shows what’s possible when AI can see the browser.
Custom MCPs show what’s possible when AI can remember.
Together, they point toward workflows that are faster, more reliable, and easier to reason about.
That’s not a future vision. It’s already buildable - one focused MCP server at a time.
This is the first time I’ve turned a talk into a blog post series, so I’d genuinely love any feedback — especially on what resonated, what didn’t, or what you’d like me to expand on next, as I’ll be continuing to develop this material for more conference talks throughout the year. Looking forward to hearing from you, and see you in the next post!


