I'm writing this fully aware that predictions about AI often age badly.
I don't want to sound like those CEOs who confidently announce that AI will replace engineers in six months, only to quietly move the timeline when nothing happens. Instead, this is a personal thought experiment.
I've been experimenting with AI-assisted coding since it was still taboo to admit you were doing it. I started in 2021 while working at GitHub, helping developers understand the value of well-written prompts through GitHub Copilot. I was an early user of ChatGPT, alongside Claude and many other tools, long before "prompting" became its own discipline.
Today, I'm a Developer Advocate for goose, which serves as a reference implementation for the Model Context Protocol and one of the first MCP clients. I use multiple MCP servers in my daily workflow to solve real problems.
All of that gives me a decent sense of where things might head next.
So I decided to make a few predictions for 2026, mostly to sharpen my own visionary skills. Will any of these come true? Would I tweak them a year from now? Let's find out.
These are my personal opinions. I'm not speaking on behalf of my employer or any project I work on.
Prediction 1: AI Code Review Gets Solved
By the end of 2026, I believe we'll have cracked AI code review.
Right now, one of the biggest bottlenecks in software development, especially in open source, is review capacity. People generate code faster than ever with AI, but that speed shifts pressure downstream. Maintainers, tech leads, and engineering managers now face more pull requests, more diffs, and more surface area to validate.
We already see AI-powered code review tools, but none fully hit the mark. They often feel noisy, overly rigid, or disconnected from real-world developer workflows.
Recently, Aiden Bai publicly shared thoughtful, constructive feedback on how AI code review tools like CodeRabbit could improve.
Beyond the controversy around how CodeRabbit responded, the attention his tweet received signaled something important: developers are actively hoping for a better solution.
By 2026, I expect either an existing product to meaningfully level up or a new company to enter and get it right. This is one of the most pressing problems in the space, and I think the industry will prioritize fixing it.
If you want to stay on top of developments in AI code review, I recommend following Nnenna Ndukwe (@nnennahacks).
Prediction 2: MCP Apps Become the Default
I think MCP Apps will become a core part of how people interact with AI agents.
MCP Apps are the successor to MCP-UI, which first showed that agents didn't need to respond with text alone, but could render interactive interfaces directly inside the host environment. Think embedded web UIs, buttons, toggles, and selections. Users express intent through interaction rather than explanation.
As this pattern gained traction, it became clear that interactive interfaces needed first-class support in the protocol itself. MCP Apps build on that momentum and are now being incorporated into the MCP standard.
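To make that concrete, here's a minimal sketch of a tool returning a UI resource with the MCP-UI TypeScript helper. I'm writing this from memory, so treat the exact function names and fields (`createUIResource`, the `ui://` URI scheme, the plan-picker form) as assumptions and check the current SDK docs before copying anything.

```typescript
// Minimal sketch using the MCP-UI server helper. The API names are my
// recollection of @mcp-ui/server, and the form itself is hypothetical.
import { createUIResource } from "@mcp-ui/server";

// Instead of answering with prose, the tool hands the host an embeddable UI
// resource it can render inline: buttons, toggles, selections, and so on.
const planPicker = createUIResource({
  uri: "ui://billing/pick-plan", // hypothetical identifier for this UI snippet
  content: {
    type: "rawHtml",
    htmlString: `
      <form>
        <label><input type="radio" name="plan" value="free" /> Free</label>
        <label><input type="radio" name="plan" value="pro" /> Pro</label>
        <button type="submit">Choose plan</button>
      </form>`,
  },
  encoding: "text",
});

// A tool handler returns it as part of its result content, roughly like so.
// The user then expresses intent by clicking, not by typing another prompt.
const toolResult = { content: [planPicker] };
```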
Below is a video of MCP-UI in action:
This matters beyond developer ergonomics. For years, companies tried to keep users inside their apps with embedded chatbots, hoping increased "stickiness" would drive revenue. That approach never fully worked. Meanwhile, user behavior shifted. People now go directly to AI tools like ChatGPT for answers instead of navigating websites, even if they aren't engineers.
MCP Apps flip the model. Instead of pulling users into your app, your app meets users inside their AI environment.
We already see early adoption. OpenAI is moving in this direction with ChatGPT, and goose adopted MCP-UI early and is close to shipping full MCP Apps support. Other platforms are taking similar steps.
To learn more about MCP Apps, check out this blog post.
Prediction 3: Agents Become Portable Across Platforms
I think agents will follow users wherever they work.
Today, I benefit heavily from MCP servers because they make it possible to connect agents to tools and systems. Still, there's friction. Many users grow attached to a specific agent and want it available across environments without constant reconfiguration.
This is where Agent Client Protocol becomes interesting. ACP allows an agent to run inside any editor or environment that supports the protocol, without tightly coupling it to a specific plugin or extension.
We felt this pain firsthand with goose. Maintaining a VS Code extension proved difficult. goose would evolve, the extension would lag, and users would hit breakage. ACP changed that dynamic. Instead of tightly coupling the agent to a plugin, the editor becomes the client.
Zed Industries introduced this model. When I tried goose inside the Zed editor, the experience felt noticeably smoother. Editors from JetBrains have also adopted the protocol. ACP tends to get less attention than MCP, partly because it's less flashy and partly because the acronym overlaps with other agent-related protocols. Even so, the impact is real.
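If you haven't looked at ACP yet, the mental model is simple: the editor spawns the agent as a subprocess and the two exchange JSON-RPC messages over stdio. Below is a rough sketch of that exchange; the method names and fields are my paraphrase of the spec rather than an authoritative reference, so verify them against the ACP documentation before building on them.

```typescript
// Rough shape of an ACP exchange between an editor (client) and an agent.
// Method names and params are approximations, shown only for intuition.

// 1. Editor -> agent: negotiate protocol version and capabilities.
const initialize = {
  jsonrpc: "2.0",
  id: 1,
  method: "initialize",
  params: { protocolVersion: 1, clientCapabilities: { fs: { readTextFile: true } } },
};

// 2. Editor -> agent: open a session rooted in the current workspace.
const newSession = {
  jsonrpc: "2.0",
  id: 2,
  method: "session/new",
  params: { cwd: "/path/to/project", mcpServers: [] },
};

// 3. Editor -> agent: forward the user's prompt as structured content.
const prompt = {
  jsonrpc: "2.0",
  id: 3,
  method: "session/prompt",
  params: {
    sessionId: "sess-1",
    prompt: [{ type: "text", text: "Refactor the auth module" }],
  },
};

// 4. Agent -> editor: streamed notifications the editor renders as they arrive.
const update = {
  jsonrpc: "2.0",
  method: "session/update",
  params: { sessionId: "sess-1", update: { sessionUpdate: "agent_message_chunk" } },
};
```

The point of the sketch is that none of these messages care what's on the other end. The same exchange works whether the client is Zed, a JetBrains IDE, or something else entirely, which is exactly why the agent no longer needs a bespoke plugin per editor.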
Here's where I get more ambitious. I don't think this stops at editors. Over time, agent portability may extend to design tools, browsers, and other platforms. I can imagine bringing goose, Codex, or Claude Code directly into tools like Figma without rebuilding the integration each time. This part is more speculative, but the direction feels plausible.
Prediction 4: DIY Agent Configuration Hits a Ceiling
This one feels riskier to say out loud, but I think we'll eventually move away from heavy context engineering and excessive configuration.
Right now, we compensate for model limitations by adding layers of structure: rules files, memory files, subagents, reusable skills, system prompt overrides, toggles, and switches. All of these help agents behave more reliably, and in many cases, they're necessary, especially for large codebases, legacy systems, and high-impact code changes.
As an engineer, I find this exciting. Configuring my setup feels participatory. I enjoy shaping how an agent reasons and responds. There's satisfaction in tuning behavior instead of treating AI as a black box.
But there's another side we haven't fully felt the consequences of yet.
Every week introduces a new "best practice." Another rule or configuration users feel pressure to adopt. At some point, the overhead may outweigh the benefit. Instead of building, people spend more time configuring the act of building.
I already see developers opting out. Some reject AI because of poor early experiences. Others reject it because the process feels exhausting. They just want to write code.
I've seen this pattern before. When Kubernetes became widely adopted, it unlocked enormous power but also exposed developers to infrastructure complexity they weren't meant to manage. The response wasn't to turn every developer into a Kubernetes expert, but to introduce platform teams, DevOps roles, and abstractions that absorbed that complexity.
I don't want to leave anyone behind in this AI era.
It genuinely makes me sad when we say things like "people will get left behind," so I'm brainstorming ways to make sure everyone "eats."
When we approach a similar inflection point with agents, I see two likely paths forward:
1. Tooling improves to the point where most configuration fades into the background.
2. Companies formalize roles around AI enablement. I've already seen early versions of this. We have internal AI champions and enablement groups (led by my manager Angie Jones) that help teams use agents safely and effectively.
Personally, I hope for balance. I enjoy configuration and depth, but I don't think productivity scales if every repo demands a complex setup just to get started.
Those are my predictions for 2026. Let's revisit this in a year and see what holds up.
What are your predictions? And what do you think of mine?
Top comments (9)
I completely agree. #1 and #2 particularly struck me.
I'm the lead developer of Autodock, and within the project, there's a tension between #1 and #2 that I think you may find interesting.
Autodock launched first as an MCP Agent for provisioning and syncing preview environments. We saw that, outside of a few forward-looking devs, adoption stalled pretty fast. So that's your #2.
Then, we noticed that people were mostly creating preview environments for PRs and we added features that allowed it to automatically spin up GH envs. The feature uses MCP internally (all the same primitives). It's now mostly used as a way for agents and humans to quickly validate PRs. So that's your #1.
My prediction is that, next year, #2 will overtake #1 again. I actually think the PR is an antiquated vehicle for communicating about features, and tbh I even think "features" is an antiquated way to look at what units of work are in coding. Rather, I think that projects will evolve to have environments, and the environments will all diverge in some meaningful respect from the main environment. As ideas coalesce in the side-environments, and as they're tested out with different audiences, they'll be brought into the main environment. That's exactly how Supercell operates in Finland - it grows tens of small projects a year, most of them are killed off, some of them have aspects that are incorporated into one or several games, and some become games outright.
The main orchestrator of this will be MCP. And the main place where folks will go to hang out is not the PR, but the environment. No one will really read code anymore, but people will mess around with apps and funnel their reactions into the agents that are pumping out code. So code review will basically die and be replaced with what we now call QA. But it won't be the type of black-box QA we have today where a separate team that's not hacking on the code evaluates an app and reports feedback - it will be a QA performed by the developers themselves that's informed by and plugged into an agentic ecosystem.
I really believe Autodock will be part of this (obv I'm biased given my role), but more importantly than any one tool, I see this as the trajectory for the next year.
Your vision of environments replacing PRs is honestly mind-blowing. I look forward to this potential shift from "reading diffs" to "experiencing running apps"
One small clarification from my side: when I talk about MCP Apps, I’m not equating them with MCP servers. Autodock sounds like an MCP server, whereas MCP Apps are about rendering interactive UI directly inside the agent’s chat. I linked a short video in my post that shows what I mean. Sorry if I'm over-explaining and you already know this 😅
But your comment is making me think: your vision where devs mess around with apps and funnel reactions into agents...we can use MCP Apps for that 👀
I look forward to seeing how things change...cuz AI has been making tech move at record speed.
Thanks so much for your comment. Sharing visionary ideas is so fun for me!
Ah actually that's an elision that I made but I shouldn't have - I completely glossed over the App part. Rereading the article, I see exactly what you mean now and I learned something new there, it's a category I didn't know existed which is why my brain skipped over it. I'll dig into that!
no worries...the naming of MCP Clients, MCP Servers, and MCP Apps can be muddy!
Spot on with Prediction 3! Honestly, the move toward ACP is such a breath of fresh air. Being able to hop between Zed and Figma without feeling like you're 'locked in' to one ecosystem is exactly what we should be aiming for. I was just thinking though—how do you see us balancing that portability with local data privacy? It feels like that's going to be the big hurdle for us to clear in 2026. Really appreciate you sharing these thoughts!
Your prediction about DIY agent configuration hitting a ceiling is the one I'm watching most closely. MCP is at 97M monthly SDK downloads now — the technical adoption curve is steep. But most AGENTS.md implementations I've reviewed are copy-pasted templates with no actual boundary documentation.
The parallel to Kubernetes complexity spawning platform teams is apt. I expect we'll see "AI enablement" specialists emerge who own the context engineering layer — the people who actually understand what agents should and shouldn't touch.
Question: with goose as a reference implementation, are you seeing teams treat MCP server configuration as a devops concern or a documentation concern? That distinction will probably determine whether this consolidates around infrastructure teams or technical writers.
This is such a solid take. I especially appreciate the 'Kubernetes' comparison in Prediction 4. We’re definitely at that point where 'context engineering' is starting to feel like a full-time job. I love the power of a perfectly tuned .cursorrules or rule file, but if we don't find a way to abstract that away, we’re just trading one type of manual labor for another.
Prediction 2 is the one I’m watching most closely. The shift from 'chatbots in a sidebar' to interactive MCP Apps feels like the 'iPhone moment' for agent UX. Meeting the user where they already are (whether that's in Zed, a browser, or even Figma) is a much bigger deal than people realize.
It’s cool to see the work you’re doing with goose—it feels like one of the few projects actually pushing the standard forward rather than just reacting to it.
I'm curious—on Prediction 1, do you think AI code review will ever get 'human' enough to handle the 'why' of a PR, or will it stay focused on the 'how' (logic/security/perf)?
We're building an observability platform specifically for AI agents and need your input.
The Problem:
Building AI agents that use multiple tools (files, APIs, databases) is getting easier with frameworks like LangChain, CrewAI, etc. But monitoring them? Total chaos.
When an agent makes 20 tool calls and something fails:
Which call failed?
What was the error?
How much did it cost?
Why did the agent make that decision?
What We're Building:
A unified observability layer that tracks:
LLM calls (tokens, cost, latency)
Tool executions (success/fail/performance)
Agent reasoning flow (step-by-step)
MCP Server + REST API support
The Question:
1. How are you currently debugging AI agents?
2. What observability features do you wish existed?
3. Would you pay for a dedicated agent observability tool?
We're looking for early adopters to test and shape the product.