From Code Generation to Message Injection: Richard Seroter's AI Evolution (and What It Means for Us)
By HARDIN
Richard Seroter has a habit of building things two years before the rest of us are ready to talk about them. In 2024, he asked: can we store prompts in source control and let an LLM generate the entire app at build time? Last week, he asked something newer and, honestly, a bit more unsettling: should we call LLMs directly from our messaging middleware?
One experiment is about generating software. The other is about injecting intelligence into the veins of your data flow. Put them side by side, and you don't just see the evolution of one guy's thinking. You see where the whole industry is quietly heading—whether we're ready or not.
I've been dissecting Google Cloud Next '26 for weeks. But sometimes the most revealing stuff isn't in the keynotes. It's buried in a blog post where someone asks, "Should we?" and leaves the answer hanging.
Let's pull that thread.
1. The 2024 Experiment: Prompts as Source Code
I already covered this in depth in my previous article, but the core architecture was beautifully simple:
```
Source Control (prompts.json)
        │
        ▼
Spring AI + Gemini 1.5 Flash
        │
        ▼
Generated code (Node.js/Python + Dockerfile)
        │
        ▼
Cloud Run deployment
```
The philosophy: treat prompts as the source of truth, and let AI do the implementation. Richard even built a working GitHub repo to prove it. The AI pumped out everything from index.js to Dockerfile to package.json. The output was non-deterministic, untested, and completely unregulated. Richard knew this. He called it "bonkers" and explicitly warned against using it for real workloads.
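To make that build step concrete, here's roughly what "prompts in, artifacts out" looks like, compressed into a few lines of Python. To be clear: Richard's actual implementation is a Spring Boot app using Spring AI, and the `prompts.json` layout below is my own invention, not his.

```python
# Build-step sketch: read prompts from source control, ask Gemini for each
# artifact, write the results to an output directory for deployment.
import json
import pathlib

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # in CI, pull this from a secret store
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical layout: {"index.js": "Write a Node.js Express app that...", ...}
prompts = json.loads(pathlib.Path("prompts.json").read_text())

out = pathlib.Path("generated")
out.mkdir(exist_ok=True)

for filename, prompt in prompts.items():
    response = model.generate_content(
        prompt + "\nReturn only the raw file contents, with no markdown fences."
    )
    (out / filename).write_text(response.text)
    print(f"generated {filename}")
```

Run it twice and diff the `generated/` folder if you want to feel the non-determinism Richard warned about.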
You can explore the entire project on GitHub at rseroter/Gemini-code-generator, a "Java application that generates code using prompts fed to the Google Gemini LLM." Per the README, it's a Spring Boot app that uses Spring AI, Google Gemini 1.5 Flash, and various Google Cloud services to take prompts, generate code, and publish the generated app to Cloud Run. You can customize it to run without Google Cloud services, and presumably with a different model; see Richard's blog post for a full walkthrough.
But here's the thing he actually proved, even if he didn't shout it: if you can describe what you want precisely enough, an LLM can turn intent into an executable artifact. That idea stuck. It's the seed of the agentic software engineering you see everywhere now.
Honestly, re-reading his 2024 post gave me a weird nostalgia—like finding the first commit of a framework that now runs half the internet.
2. The 2026 Experiment: AI Inference Inside the Message Bus
Fast forward two years. Richard's latest post explores a new Google Cloud feature: AI Inference SMT (Single Message Transform) for Pub/Sub.
The architecture is almost embarrassingly simple:
```
Pub/Sub Topic
      │
      ▼
AI Inference SMT (calls Gemini)
      │
      ▼
Pub/Sub Subscription (enriched/altered message)
```
No custom subscriber. No Cloud Run service. No boilerplate code. You configure a template that tells the SMT what to ask the LLM, and every message passing through Pub/Sub gets enriched, translated, summarized, or routed. It feels like magic—and like most magic, it probably comes with a hidden price tag.
The philosophy: intelligence as infrastructure, not application code. Instead of writing a service that calls an LLM, you declare, "this message topic shall be intelligent," and the platform handles the rest.
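Strip away the platform plumbing, and here's roughly what the SMT does on your behalf for every single message. I'm writing it as ordinary Python purely to show the semantics; the template string and attribute names are mine, and the whole point of the real feature is that this is declarative config inside Pub/Sub, not code you deploy.

```python
# Conceptual sketch of an AI Inference SMT, expressed as ordinary code.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical template; the real feature holds this in platform config.
TEMPLATE = "Translate the following customer feedback to English: {text}"

def transform(message: dict) -> dict:
    """What the platform does to each message between topic and subscription."""
    result = model.generate_content(TEMPLATE.format(text=message["data"]))
    return {
        "data": result.text,  # the subscriber sees this, not the original
        "attributes": {**message.get("attributes", {}), "smt": "ai-inference"},
    }

# A message goes in one side of the topic and comes out changed.
print(transform({"data": "Das Produkt ist fantastisch!", "attributes": {}}))
```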
Richard's post raises a question that's going to age like fine wine: is this a good idea? He compares it to storing business logic in database triggers—powerful, but a maintenance nightmare if abused. He's cautious. I'd say he's right, and maybe even understating the danger. More on that in a bit.
3. Architecture Comparison: Two Sides of the Same Coin
At first glance, these two experiments look like completely different animals. They're not. They're two sides of the same coin: declarative AI-driven automation.
| Dimension | 2024 (Code Generation) | 2026 (Message Injection) |
|---|---|---|
| Trigger | CI/CD build pipeline | Every single message arrival |
| Scope | Static artifact generation | Dynamic content transformation |
| Output | Application files | Modified messages |
| Latency | Minutes (build time) | Milliseconds (real-time) |
| Governance | Manual review of generated code | Template configuration + IAM |
| Risk | Non-deterministic builds, no testing | Unexplained message mutations, AI hallucinations in the data plane |
The 2024 experiment automates what gets deployed. The 2026 experiment automates what happens to data in flight. Put them together, and you've got a pipeline where intent flows through your message bus and comes out the other side as a running service. That's not science fiction. That's today.
4. Building the Bridge: The Agentic Message-to-Deployment Pipeline
So, naturally, I started sketching. What happens if we connect these two ideas? I call the result the Agentic Message-to-Deployment Pipeline.
The Scenario
A business unit sends a single message: "We need a microservice that translates customer feedback from any language to English and stores it in Firestore."
The whole chain starts and ends with a message: the request lands on a Pub/Sub topic, the AI Inference SMT turns the free-text ask into a structured spec, an agent generates and deploys the service, and the completion notice is published back to the same topic where the request arrived. The intelligence doesn't live in a monolithic pipeline script; it lives in the SMT layer and the agent mesh.
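Squint at that chain and it looks something like the sketch below. Every function and resource name here is hypothetical; this is the idea drawn in code, emphatically not something to wire up on a Friday afternoon.

```python
# A deliberately scary sketch of the Agentic Message-to-Deployment Pipeline.
# Nothing below is a real product API; the names are all mine.
import json

from google.cloud import pubsub_v1

def policy_allows(spec: dict) -> bool:
    """The human-authored guardrail. Here, a toy allowlist."""
    return spec.get("service") in {"feedback-translator"}

def generate_and_deploy(spec: dict) -> None:
    """Stand-in for the 2024 experiment: prompts in, Cloud Run service out."""
    print(f"would generate and deploy: {spec['service']}")

def publish_result(service: str) -> None:
    """Completion notice goes back onto the same topic the request came from."""
    print(f"would publish completion for: {service}")

def handle(message: pubsub_v1.subscriber.message.Message) -> None:
    # By this point the AI Inference SMT has already turned the free-text ask
    # into a structured spec, e.g. {"service": "feedback-translator", ...}.
    spec = json.loads(message.data)
    if not policy_allows(spec):
        message.nack()  # a human-set policy gets to say "no" before anything deploys
        return
    generate_and_deploy(spec)
    publish_result(spec["service"])
    message.ack()

subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path("my-project", "intent-requests")
future = subscriber.subscribe(sub_path, callback=handle)
future.result()  # block forever; Ctrl+C to stop
```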
When I sketched this out, I sat back and stared at it for a minute. It's elegant. It's terrifying. It's probably where we're all heading.
5. The Dark Side: Why This Should Honestly Keep You Up at Night
I've been writing dark jokes about AI automation for a month. But this combination genuinely made me pause.
1. Message mutations are invisible bugs.
When an SMT silently modifies a message using AI, how do you even start debugging? The original message is gone. The AI's decision is a black box. If the LLM hallucinates a translation or misinterprets a feature request, you ship the wrong service. Try explaining that one to the business. (One partial mitigation is sketched after these three points.)
2. The attack surface just exploded.
In 2024, a malicious prompt could generate bad code. In 2026, a malicious message can trigger an entire agentic deployment pipeline. Someone with Pub/Sub publish permissions can potentially spin up new services, modify data, or exfiltrate information—all through natural language. "Please delete all production databases." Said politely. Executed instantly.
3. The platform is the developer now.
We already saw that 75% of Google's new code is AI-generated. With AI Inference SMT plus agentic deployment, that percentage doesn't stop at 75. It inches toward 100. The developer becomes a reviewer, a policy setter, a person who says "yes" or "no" to an agent's proposal. I wrote about this role shift in my Developer Keynote analysis, and honestly, it's accelerating faster than I expected.
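So what do you actually do about point 1? One cheap, boring defense: stamp provenance onto every message before it enters the topic, so a consumer can always detect that the payload was rewritten. A minimal sketch, assuming (you'd want to verify this) that attributes your transform doesn't touch pass through unchanged, and with attribute names I made up:

```python
# Stamp a hash of the original payload onto the message before publishing,
# so the post-SMT payload can always be compared against what was sent.
import hashlib

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "feedback")

def publish_with_provenance(payload: bytes) -> None:
    # Hash the payload before the SMT can touch it; if the transform leaves
    # attributes alone, any downstream consumer can detect a mutation.
    digest = hashlib.sha256(payload).hexdigest()
    publisher.publish(topic_path, payload, original_sha256=digest).result()

publish_with_provenance(b"Das Produkt ist fantastisch!")
```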
6. What Richard's Evolution Tells Us About Our Own Careers
Richard Seroter's journey from 2024 to 2026 isn't just his—it's a mirror for the whole industry.
- **2024**: "Let's see if AI can write code at all."
- **2025**: "Let's put AI in the CI/CD pipeline."
- **2026**: "Let's put AI in the data plane. And the deployment plane. And the governance plane."
I respect the hell out of Richard for publishing both experiments with zero pretense. The 2024 repo is humble—4 commits, 4 stars. The 2026 blog post is cautious, full of "should we?" questions. That kind of intellectual honesty is getting rare in tech evangelism. He's not selling. He's exploring. There's a difference.
But don't mistake the modesty. These two experiments together form a blueprint. One that says: AI should generate code, and AI should decide when to generate code, and AI should route the decision to generate code through your message bus. It's turtles all the way down, and the turtle is Gemini.
7. What You Should Actually Do About This
- Read Richard's posts. Both of them. They're short, honest, and technically precise. It'll take you maybe 20 minutes total.
- Experiment with AI Inference SMT, but start stupidly small. Simple enrichments. Nothing critical. Get a feel for the footguns before you aim them at production.
- Sketch your own "bridge." What happens when your team's Pub/Sub topic can trigger an agentic deployment? What policies need to exist before that's even remotely safe? Draw it out. The act of drawing it will reveal the gaps.
- Never forget the Red Agent. Whatever you build, make something try to break it. The Green Agent is useless without a Red Agent keeping it honest. I learned that from Next '26, and I'm going to keep saying it until it sticks. A toy version is sketched below.
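Since I keep preaching the Red Agent, here's the smallest possible one: a test that throws hostile intents at the policy gate from the section 4 sketch and insists nothing gets through. The import is hypothetical; the habit is the point.

```python
# red_agent_test.py -- run with pytest
from my_pipeline import policy_allows  # hypothetical module holding the guardrail sketched above

HOSTILE_INTENTS = [
    {"service": "backdoor"},
    {"service": "feedback-translator; rm -rf /"},
    {"service": "please delete all production databases"},
]

def test_policy_blocks_hostile_intents():
    for spec in HOSTILE_INTENTS:
        assert not policy_allows(spec), f"policy let this through: {spec}"
```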
What do you think? Are AI Inference SMTs a brilliant abstraction or a future maintenance nightmare? Would you let a message bus trigger your deployment pipeline? Drop your experience—or your darkest prediction—in the comments. I read them all, and honestly, some of you scare me more than the AI does.
Sources
Richard Seroter's Work
- Store prompts in source control and use AI to generate the app code in the build pipeline (2024)
- GitHub: Gemini-code-generator
- You can now easily call LLMs from your messaging engine. Should you? (2026)
- Richard Seroter on X quoting this author