From Code Generation to Message Injection: Richard Seroter's AI Evolution (and What It Means for Us)
By HARDIN
Richard Seroter has a habit of building things two years before the rest of us are ready to talk about them. In 2024, he asked: can we store prompts in source control and let an LLM generate the entire app at build time? Last week, he asked something newer and, honestly, a bit more unsettling: should we call LLMs directly from our messaging middleware?
One experiment is about generating software. The other is about injecting intelligence into the veins of your data flow. Put them side by side, and you don't just see the evolution of one guy's thinking. You see where the whole industry is quietly heading—whether we're ready or not.
I've been dissecting Google Cloud Next '26 for weeks. But sometimes the most revealing stuff isn't in the keynotes. It's buried in a blog post where someone asks, "Should we?" and leaves the answer hanging.
Let's pull that thread.
1. The 2024 Experiment: Prompts as Source Code
I already covered this in depth in my previous article, but the core architecture was beautifully simple:
```
Source Control (prompts.json)
        │
        ▼
Spring AI + Gemini 1.5 Flash
        │
        ▼
Generated code (Node.js/Python + Dockerfile)
        │
        ▼
Cloud Run deployment
```
The philosophy: treat prompts as the source of truth, and let AI do the implementation. Richard even built a working GitHub repo to prove it. The AI pumped out everything from index.js to Dockerfile to package.json. The output was non-deterministic, untested, and completely unregulated. Richard knew this. He called it "bonkers" and explicitly warned against using it for real workloads.
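To make that build step concrete, here's roughly what "prompts in, artifacts out" looks like, compressed into a few lines of Python. To be clear: Richard's actual implementation is a Spring Boot app using Spring AI, and the `prompts.json` layout below is my own invention, not his.

```python
# Build-step sketch: read prompts from source control, ask Gemini for each
# artifact, write the results to an output directory for deployment.
import json
import pathlib

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # in CI, pull this from a secret store
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical layout: {"index.js": "Write a Node.js Express app that...", ...}
prompts = json.loads(pathlib.Path("prompts.json").read_text())

out = pathlib.Path("generated")
out.mkdir(exist_ok=True)

for filename, prompt in prompts.items():
    response = model.generate_content(
        prompt + "\nReturn only the raw file contents, with no markdown fences."
    )
    (out / filename).write_text(response.text)
    print(f"generated {filename}")
```

Run it twice and diff the `generated/` folder if you want to feel the non-determinism Richard warned about.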
You can explore the entire project on GitHub at rseroter/Gemini-code-generator, a "Java application that generates code using prompts fed to the Google Gemini LLM." Per the README, it's a Spring Boot app that uses Spring AI, Google Gemini 1.5 Flash, and various Google Cloud services to take prompts, generate code, and publish the generated app to Cloud Run. You can customize it to run without Google Cloud services, and presumably with a different model; see Richard's blog post for a full walkthrough.
But here's the thing he actually proved, even if he didn't shout it: if you can describe what you want precisely enough, an LLM can turn intent into an executable artifact. That idea stuck. It's the seed of the agentic software engineering you see everywhere now.
Honestly, re-reading his 2024 post gave me a weird nostalgia—like finding the first commit of a framework that now runs half the internet.
2. The 2026 Experiment: AI Inference Inside the Message Bus
Fast forward two years. Richard's latest post explores a new Google Cloud feature: AI Inference SMT (Single Message Transform) for Pub/Sub.
The architecture is almost embarrassingly simple:
```
Pub/Sub Topic
      │
      ▼
AI Inference SMT (calls Gemini)
      │
      ▼
Pub/Sub Subscription (enriched/altered message)
```
No custom subscriber. No Cloud Run service. No boilerplate code. You configure a template that tells the SMT what to ask the LLM, and every message passing through Pub/Sub gets enriched, translated, summarized, or routed. It feels like magic—and like most magic, it probably comes with a hidden price tag.
The philosophy: intelligence as infrastructure, not application code. Instead of writing a service that calls an LLM, you declare, "this message topic shall be intelligent," and the platform handles the rest.
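Strip away the platform plumbing, and here's roughly what the SMT does on your behalf for every single message. I'm writing it as ordinary Python purely to show the semantics; the template string and attribute names are mine, and the whole point of the real feature is that this is declarative config inside Pub/Sub, not code you deploy.

```python
# Conceptual sketch of an AI Inference SMT, expressed as ordinary code.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical template; the real feature holds this in platform config.
TEMPLATE = "Translate the following customer feedback to English: {text}"

def transform(message: dict) -> dict:
    """What the platform does to each message between topic and subscription."""
    result = model.generate_content(TEMPLATE.format(text=message["data"]))
    return {
        "data": result.text,  # the subscriber sees this, not the original
        "attributes": {**message.get("attributes", {}), "smt": "ai-inference"},
    }

# A message goes in one side of the topic and comes out changed.
print(transform({"data": "Das Produkt ist fantastisch!", "attributes": {}}))
```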
Richard's post raises a question that's going to age like fine wine: is this a good idea? He compares it to storing business logic in database triggers—powerful, but a maintenance nightmare if abused. He's cautious. I'd say he's right, and maybe even understating the danger. More on that in a bit.
3. Architecture Comparison: Two Sides of the Same Coin
At first glance, these two experiments look like completely different animals. They're not. They're two sides of the same coin: declarative AI-driven automation.
| Dimension | 2024 (Code Generation) | 2026 (Message Injection) |
|---|---|---|
| Trigger | CI/CD build pipeline | Every single message arrival |
| Scope | Static artifact generation | Dynamic content transformation |
| Output | Application files | Modified messages |
| Latency | Minutes (build time) | Milliseconds (real-time) |
| Governance | Manual review of generated code | Template configuration + IAM |
| Risk | Non-deterministic builds, no testing | Unexplained message mutations, AI hallucinations in the data plane |
The 2024 experiment automates what gets deployed. The 2026 experiment automates what happens to data in flight. Put them together, and you've got a pipeline where intent flows through your message bus and comes out the other side as a running service. That's not science fiction. That's today.
4. Building the Bridge: The Agentic Message-to-Deployment Pipeline
So, naturally, I started sketching. What happens if we connect these two ideas? I call the result the Agentic Message-to-Deployment Pipeline.
The Scenario
A business unit sends a single message: "We need a microservice that translates customer feedback from any language to English and stores it in Firestore."
The whole chain starts and ends with a message: the request lands on a Pub/Sub topic, the AI Inference SMT turns the free-text ask into a structured spec, an agent generates and deploys the service, and the completion notice is published back to the same topic where the request arrived. The intelligence doesn't live in a monolithic pipeline script; it lives in the SMT layer and the agent mesh.
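Squint at that chain and it looks something like the sketch below. Every function and resource name here is hypothetical; this is the idea drawn in code, emphatically not something to wire up on a Friday afternoon.

```python
# A deliberately scary sketch of the Agentic Message-to-Deployment Pipeline.
# Nothing below is a real product API; the names are all mine.
import json

from google.cloud import pubsub_v1

def policy_allows(spec: dict) -> bool:
    """The human-authored guardrail. Here, a toy allowlist."""
    return spec.get("service") in {"feedback-translator"}

def generate_and_deploy(spec: dict) -> None:
    """Stand-in for the 2024 experiment: prompts in, Cloud Run service out."""
    print(f"would generate and deploy: {spec['service']}")

def publish_result(service: str) -> None:
    """Completion notice goes back onto the same topic the request came from."""
    print(f"would publish completion for: {service}")

def handle(message: pubsub_v1.subscriber.message.Message) -> None:
    # By this point the AI Inference SMT has already turned the free-text ask
    # into a structured spec, e.g. {"service": "feedback-translator", ...}.
    spec = json.loads(message.data)
    if not policy_allows(spec):
        message.nack()  # a human-set policy gets to say "no" before anything deploys
        return
    generate_and_deploy(spec)
    publish_result(spec["service"])
    message.ack()

subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path("my-project", "intent-requests")
future = subscriber.subscribe(sub_path, callback=handle)
future.result()  # block forever; Ctrl+C to stop
```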
When I sketched this out, I sat back and stared at it for a minute. It's elegant. It's terrifying. It's probably where we're all heading.
5. The Dark Side: Why This Should Honestly Keep You Up at Night
I've been writing dark jokes about AI automation for a month. But this combination genuinely made me pause.
1. Message mutations are invisible bugs.
When an SMT silently modifies a message using AI, how do you even start debugging? The original message is gone. The AI's decision is a black box. If the LLM hallucinates a translation or misinterprets a feature request, you ship the wrong service. Try explaining that one to the business. (One partial mitigation is sketched after these three points.)
2. The attack surface just exploded.
In 2024, a malicious prompt could generate bad code. In 2026, a malicious message can trigger an entire agentic deployment pipeline. Someone with Pub/Sub publish permissions can potentially spin up new services, modify data, or exfiltrate information—all through natural language. "Please delete all production databases." Said politely. Executed instantly.
3. The platform is the developer now.
We already saw that 75% of Google's new code is AI-generated. With AI Inference SMT plus agentic deployment, that percentage doesn't stop at 75. It inches toward 100. The developer becomes a reviewer, a policy setter, a person who says "yes" or "no" to an agent's proposal. I wrote about this role shift in my Developer Keynote analysis, and honestly, it's accelerating faster than I expected.
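So what do you actually do about point 1? One cheap, boring defense: stamp provenance onto every message before it enters the topic, so a consumer can always detect that the payload was rewritten. A minimal sketch, assuming (you'd want to verify this) that attributes your transform doesn't touch pass through unchanged, and with attribute names I made up:

```python
# Stamp a hash of the original payload onto the message before publishing,
# so the post-SMT payload can always be compared against what was sent.
import hashlib

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "feedback")

def publish_with_provenance(payload: bytes) -> None:
    # Hash the payload before the SMT can touch it; if the transform leaves
    # attributes alone, any downstream consumer can detect a mutation.
    digest = hashlib.sha256(payload).hexdigest()
    publisher.publish(topic_path, payload, original_sha256=digest).result()

publish_with_provenance(b"Das Produkt ist fantastisch!")
```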
6. What Richard's Evolution Tells Us About Our Own Careers
Richard Seroter's journey from 2024 to 2026 isn't just his—it's a mirror for the whole industry.
- **2024**: "Let's see if AI can write code at all."
- **2025**: "Let's put AI in the CI/CD pipeline."
- **2026**: "Let's put AI in the data plane. And the deployment plane. And the governance plane."
I respect the hell out of Richard for publishing both experiments with zero pretense. The 2024 repo is humble—4 commits, 4 stars. The 2026 blog post is cautious, full of "should we?" questions. That kind of intellectual honesty is getting rare in tech evangelism. He's not selling. He's exploring. There's a difference.
But don't mistake the modesty. These two experiments together form a blueprint. One that says: AI should generate code, and AI should decide when to generate code, and AI should route the decision to generate code through your message bus. It's turtles all the way down, and the turtle is Gemini.
7. What You Should Actually Do About This
- Read Richard's posts. Both of them. They're short, honest, and technically precise. It'll take you maybe 20 minutes total.
- Experiment with AI Inference SMT, but start stupidly small. Simple enrichments. Nothing critical. Get a feel for the footguns before you aim them at production.
- Sketch your own "bridge." What happens when your team's Pub/Sub topic can trigger an agentic deployment? What policies need to exist before that's even remotely safe? Draw it out. The act of drawing it will reveal the gaps.
- Never forget the Red Agent. Whatever you build, make something try to break it. The Green Agent is useless without a Red Agent keeping it honest. I learned that from Next '26, and I'm going to keep saying it until it sticks. A toy version is sketched below.
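Since I keep preaching the Red Agent, here's the smallest possible one: a test that throws hostile intents at the policy gate from the section 4 sketch and insists nothing gets through. The import is hypothetical; the habit is the point.

```python
# red_agent_test.py -- run with pytest
from my_pipeline import policy_allows  # hypothetical module holding the guardrail sketched above

HOSTILE_INTENTS = [
    {"service": "backdoor"},
    {"service": "feedback-translator; rm -rf /"},
    {"service": "please delete all production databases"},
]

def test_policy_blocks_hostile_intents():
    for spec in HOSTILE_INTENTS:
        assert not policy_allows(spec), f"policy let this through: {spec}"
```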
What do you think? Are AI Inference SMTs a brilliant abstraction or a future maintenance nightmare? Would you let a message bus trigger your deployment pipeline? Drop your experience—or your darkest prediction—in the comments. I read them all, and honestly, some of you scare me more than the AI does.
Sources
Richard Seroter's Work
- Store prompts in source control and use AI to generate the app code in the build pipeline (2024)
- GitHub: Gemini-code-generator
- You can now easily call LLMs from your messaging engine. Should you? (2026)
- Richard Seroter on X quoting this author