Alvarito1983
The next phase of AI isn't smarter models. It's infrastructure.

Everyone is talking about the models.

GPT-5. Claude Opus 4. Gemini Ultra. Which one scores higher on benchmarks. Which one writes better code. Which one is worth the subscription.

I think that's the wrong conversation. And I think in 18 months, most people will agree.

The next phase of AI isn't about smarter models. It's about infrastructure. And I say that as someone who has spent 15 years building infrastructure for a living.


Where we are now

The shift that already happened — and that most people haven't fully absorbed — is the move from conversational AI to agentic AI.

Conversational AI waits for you. You ask, it answers. You prompt, it responds. The human is the engine; the AI is the tool.

Agentic AI plans and executes. You give it a goal. It reads your codebase, breaks the problem into steps, executes them in sequence, checks the results, fixes what broke, and reports back. The AI is the engine; the human is the director.

This isn't future speculation. It's what Claude Code, GitHub Copilot's agent mode, and a dozen other tools are doing right now. I've been running multi-agent workflows for months — orchestrator agents coordinating specialist sub-agents building entire features in parallel while I review the output.
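The orchestrator pattern described above can be sketched in a few lines. This is a minimal illustration, not any tool's actual API: the "specialist agents" here are plain functions standing in for real sub-agent invocations, and the task names are made up.

```python
# Minimal sketch of an orchestrator fanning work out to specialist
# sub-agents in parallel. The specialist functions are hypothetical
# stand-ins for real agent sessions (e.g. separate Claude Code runs).
from concurrent.futures import ThreadPoolExecutor

def backend_agent(task: str) -> str:
    return f"[backend] implemented: {task}"

def frontend_agent(task: str) -> str:
    return f"[frontend] implemented: {task}"

def orchestrate(feature_tasks: dict[str, str]) -> dict[str, str]:
    """Dispatch each task to its specialist in parallel, gather results."""
    specialists = {"backend": backend_agent, "frontend": frontend_agent}
    with ThreadPoolExecutor() as pool:
        futures = {
            area: pool.submit(specialists[area], task)
            for area, task in feature_tasks.items()
        }
        # The human reviews this combined output before anything ships.
        return {area: fut.result() for area, fut in futures.items()}

results = orchestrate({"backend": "add auth endpoint",
                       "frontend": "add login form"})
```

The real versions of those specialists run for minutes and touch real files, which is exactly why the infrastructure questions below start to matter.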

But here's what I think most people are missing: this transition creates massive unsolved infrastructure problems. And those problems are going to define the next 18-24 months.


What I think happens next

1. Agents are going to need their own infrastructure

Right now, agents run in your terminal, on your machine, inside someone else's cloud. That works at small scale. It breaks at large scale.

Think about what a production agent system actually needs:

  • Persistent state — an agent mid-task needs to survive a restart. Where does its working memory live?
  • Networking — agents calling other agents, agents calling external APIs, agents accessing internal services. Who manages that network?
  • Identity and auth — if an agent is making API calls, creating files, pushing commits, what identity does it have? How do you audit what it did?
  • Resource limits — a runaway agent can consume compute indefinitely. Who enforces the limits?
  • Observability — when something goes wrong in a multi-agent workflow, how do you trace which agent made which decision at which step?
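Two of those requirements, persistent state and an auditable decision trace, can be sketched together. The file layout and field names here are assumptions for illustration; no such standard exists yet, which is the point.

```python
# Sketch: agent working memory that survives a restart, plus an
# append-only decision trace for auditing. Schema is hypothetical.
import json
import tempfile
import time
from pathlib import Path

class AgentState:
    def __init__(self, agent_id: str, state_dir: Path):
        self.path = state_dir / f"{agent_id}.json"
        self.data = {"agent_id": agent_id, "step": 0, "trace": []}
        if self.path.exists():
            # Resume from disk instead of starting over.
            self.data = json.loads(self.path.read_text())

    def record(self, decision: str) -> None:
        """Append a trace entry, then persist the whole state to disk."""
        self.data["step"] += 1
        self.data["trace"].append(
            {"step": self.data["step"], "ts": time.time(), "decision": decision}
        )
        self.path.write_text(json.dumps(self.data))

state_dir = Path(tempfile.mkdtemp())
state = AgentState("builder-1", state_dir)
state.record("read repo layout")
resumed = AgentState("builder-1", state_dir)  # simulates a restart
```

A production version would need atomic writes, locking, and shared storage rather than a local temp directory, and that is precisely the infrastructure gap.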

None of this exists in a coherent form yet. We're running agents the way we ran web apps in 2003 — on single servers, with manual restarts, hoping nothing crashes.

Someone is going to build the Kubernetes of agents. Probably in the next 18 months. And when they do, someone is going to have to run it.

That someone is infrastructure engineers.

2. Prompt engineering is going to die

Not immediately. But the trend is clear.

Right now, there's an entire cottage industry around "prompt engineering" — the art of asking AI the right question in the right way to get the right answer. It's a real skill. It matters today.

But it's a transitional skill, not a permanent one.

As agentic systems mature, the question stops being "how do I write the perfect prompt?" and starts being "how do I design this system of agents so it reliably solves this class of problems?"

That's not prompt engineering. That's systems design.

It's the same shift that happened with databases. Early practitioners had to be experts at writing optimal SQL queries. Then query optimizers got good enough that you could trust the system to figure it out — and the skill that mattered became schema design, indexing strategy, and query planning at the architecture level.

The same thing is going to happen with AI. The people who will matter aren't the ones who can write clever prompts. They're the ones who can design reliable systems.

I've been building a 6-tool Docker management ecosystem with Claude Code. The prompts matter less than I expected. What matters almost entirely is: clear architecture, explicit scope boundaries, good context management, and knowing when something is wrong. Those are systems thinking skills, not prompt writing skills.
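One of those practices, explicit scope boundaries, translates directly into code. Rather than prompting an agent to "please stay inside src/", the harness enforces it. This is a hedged sketch; the class name and paths are invented for illustration.

```python
# Sketch of an enforced scope boundary: the harness, not the prompt,
# decides which paths an agent may touch. Names are hypothetical.
from pathlib import PurePosixPath

class ScopedWorkspace:
    def __init__(self, allowed_roots: list[str]):
        self.allowed = [PurePosixPath(r) for r in allowed_roots]

    def check(self, path: str) -> bool:
        """True only if path is one of the allowed roots or under one."""
        p = PurePosixPath(path)
        return any(root == p or root in p.parents for root in self.allowed)

ws = ScopedWorkspace(["src", "tests"])
inside = ws.check("src/api/handlers.py")   # within an allowed root
outside = ws.check("/etc/passwd")          # rejected regardless of prompt
```

The design choice is the point: a prompt is a request, a boundary check is a guarantee, and only one of them holds when the model misbehaves.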

3. Agents are going to start modifying themselves

This one is further out, but the early signs are already there.

Right now, agents execute within the boundaries you set. They read your CLAUDE.md, follow your instructions, build what you ask.

But agents already write code. And some of that code is agent infrastructure — the scaffolding, the context management, the workflow definitions. The logical next step is agents that notice their own inefficiencies and propose modifications to their own workflows.

We're not there yet. But the gap between "agent that executes a workflow" and "agent that improves a workflow" is narrower than it looks.

When it closes, the role of the human changes again. Not from director to spectator — the human still needs to validate, approve, understand what changed and why. But the cycle time between "this workflow has a problem" and "the workflow is fixed" compresses dramatically.

The implication: the humans who remain effective in the loop are the ones who understand systems well enough to evaluate a proposed change. Generalists who can prompt but can't reason about system behavior will struggle here.

4. Self-hosted AI is going to become serious infrastructure

This is the one I'm most confident about, because it's already starting.

Right now, most AI runs on someone else's infrastructure. Anthropic's servers. OpenAI's servers. Google's servers. You send your data there, get a response back, trust that the provider handles it responsibly.

For consumer use, this is fine. For enterprise use — especially in regulated industries, sensitive domains, or organizations that have genuinely learned the lesson about vendor dependency — this is becoming a problem.

The models are getting small enough to run on-premise. Llama, Mistral, Qwen — capable open-weight models that you can run on hardware you control. The tooling around self-hosted inference is maturing fast.

And when organizations start running AI on their own infrastructure, someone has to manage it. GPUs don't configure themselves. Model updates need to be evaluated and deployed. Inference infrastructure needs to be monitored, scaled, and maintained.

That's not a developer job. That's an infrastructure job.

I manage 7,000 servers professionally. I can see exactly where the self-hosted AI infrastructure conversation is heading — it's heading toward the same conversations we had about on-premise databases and private cloud ten years ago. Same problems, new stack.


What this means in practice

I'm not making predictions about timelines with false precision. But the direction seems clear:

The model layer is becoming a commodity. When you have ten capable models competing for your subscription, the model itself stops being the differentiator. The system around it is.

The infrastructure layer is becoming critical. Agents need to run somewhere, persist state somewhere, authenticate somewhere, get monitored somewhere. That infrastructure doesn't exist yet in mature form.

The skills that matter are shifting. Prompt writing → systems design. Single-agent workflows → multi-agent orchestration. Cloud-hosted AI → self-hosted AI infrastructure.

I've been building on the leading edge of this — running agent workflows, building self-hosted tools, thinking about how multiple services coordinate and communicate. And what I keep noticing is that the problems I'm solving aren't AI problems. They're infrastructure problems wearing AI clothes.

The next chapter of AI isn't written by the people building smarter models. It's written by the people building the systems those models run in.


NEXUS Ecosystem is my attempt to build serious infrastructure for self-hosted Docker environments. Open source, 6 tools, unified control plane.

  • GitHub: github.com/Alvarito1983
  • Docker Hub: hub.docker.com/u/afraguas1983

#ai #devops #infrastructure #claudecode #selfhosted #programming #discuss #career
