DEV Community: bot bot

I Replaced My Entire Dev Workflow With 4 AI Agents — Here's the Architecture

bot bot — Mon, 30 Mar 2026 23:04:10 +0000

Most AI workflows are just prompt chains held together with hope.

You paste something into ChatGPT, copy the output, paste it somewhere else, tweak it, run it again. Nothing connects. Nothing finishes reliably. Every step requires you.

I got tired of that. So I built a system where 4 AI agents do real work — and deterministic software controls all of them.

This is the architecture.

The Problem With "Just Use AI"

When people say "use AI to build faster," they usually mean: type prompts until something works.

That breaks down fast:

No coordination. Each AI session is isolated. Agent A doesn't know what Agent B produced.
No validation. The output might be wrong but nothing catches it.
No state. You're the memory. You track what's done, what's next, what failed.
No control. The AI decides what to do. You react.

The missing piece isn't a better model. It's a control layer — software that governs what each agent does, in what order, with what constraints.

That control layer is called an AI orchestrator.

The 4-Agent System

Here's what I actually run. Each agent has a defined role, and none of them freelance:

Agent 1: Claude Code (Backend + Infrastructure)

Owns: build systems, deployment pipelines, security, SEO, CI/CD
Operates directly in the terminal with full filesystem access
Writes, tests, and deploys code autonomously
Enforces: CSP headers, HMAC signing, rate limiting, schema validation

Agent 2: GPT-5.4 via Codex CLI (Code Review + Bulk Generation)

Mandatory reviewer on every PR before landing
Generates config-driven content at scale (wrote 37 AI generator prompt configs in one session)
Independent diff review with pass/fail gate
High reasoning mode for architecture decisions

Agent 3: Gemini via Antigravity IDE (Frontend + Design)

Owns: UI design, component layout, visual assets
Created hero images, two-column layouts, marketing visuals
Works in its own IDE with real-time preview
Syncs with backend agent via shared coordination doc

Agent 4: Local LLM (Qwen 2.5 Coder 7B on llama.cpp)

Handles: private/offline tasks, knowledge routing, local code completion
Runs on Apple Silicon (M4 Max) with zero cloud dependency
Domain-specific knowledge base for internal operations

The Orchestrator: Deterministic Software, Not AI

This is the part most people skip. They let AI agents run free and wonder why everything breaks.

The orchestrator is not an AI. It is a deterministic state machine that decides:

What runs next — agents execute in a defined pipeline, not ad hoc
What is allowed — each agent has constraints (Claude Code doesn't do design, Gemini doesn't touch infrastructure)
When something is complete — validation gates check output before the pipeline advances

Think of it like a factory floor. The robots (AI agents) do the work. The control system (orchestrator) manages the assembly line. You don't let robots decide what to build next.
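
To make those three decisions concrete, here is a minimal Python sketch of a deterministic orchestrator. It is an illustration under stated assumptions, not the actual code behind this system: the stage names, the agent names, and the `Orchestrator` class are all hypothetical, and the agents and validators are plain callables standing in for real model calls.

```python
from dataclasses import dataclass, field

# Hypothetical stage order: "what runs next" is fixed, not chosen by an agent.
PIPELINE = ["build", "review", "design"]

# Hypothetical role map: "what is allowed" for each stage.
ALLOWED = {
    "build": "backend_agent",
    "review": "review_agent",
    "design": "frontend_agent",
}


@dataclass
class Orchestrator:
    stage_index: int = 0
    log: list = field(default_factory=list)

    @property
    def current_stage(self):
        """Name of the stage that must run next, or None when finished."""
        if self.stage_index >= len(PIPELINE):
            return None
        return PIPELINE[self.stage_index]

    def run(self, agent_name, agent_fn, validate):
        """Run one agent on the current stage; advance only if validation passes."""
        stage = self.current_stage
        if stage is None:
            raise RuntimeError("pipeline already complete")
        if ALLOWED[stage] != agent_name:
            raise PermissionError(f"{agent_name} may not run stage {stage!r}")
        output = agent_fn(stage)
        passed = bool(validate(output))
        self.log.append((stage, agent_name, passed))
        if passed:
            # "When something is complete": only a passing gate moves the line.
            self.stage_index += 1
        return passed
```

Calling `run` with the wrong agent for the current stage raises `PermissionError`, and a failed validation leaves the pipeline parked on the same stage. No model gets a vote in either case.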

Shared State Memory

All agents read from and write to a shared state file. This means:

Agent B knows what Agent A produced
No context is lost between sessions
Decisions are logged and traceable
Any agent can pick up where another left off
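
A shared state file like this can be as simple as one JSON document. The sketch below is one possible shape, assuming a hypothetical layout with an `artifacts` map and a `decisions` log; the function names and keys are mine for illustration, not the real file format. It writes through a temporary file and `os.replace` so a reader never sees a half-written file, but it does no locking, so it assumes agents write one at a time.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def load_state(path):
    """Return the shared state, or an empty skeleton if the file is missing."""
    p = Path(path)
    if not p.exists():
        return {"artifacts": {}, "decisions": []}
    return json.loads(p.read_text(encoding="utf-8"))


def save_state(path, state):
    """Write to a temp file in the same directory, then swap it into place."""
    p = Path(path)
    fd, tmp = tempfile.mkstemp(dir=p.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, p)


def record(path, agent, key, value, note):
    """Store what an agent produced and log the decision behind it."""
    state = load_state(path)
    state["artifacts"][key] = {"by": agent, "value": value}
    state["decisions"].append({
        "agent": agent,
        "note": note,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    save_state(path, state)
    return state
```

With this shape, a second agent calls `load_state` at the start of its session and sees both the artifacts and the reasoning that produced them.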

Validation Gates

Every pipeline stage has a checkpoint:

Agent produces output (code, content, design)
Automated validation runs (tests, linting, schema checks)
Review agent evaluates (Codex CLI runs independent diff review)
Gate passes or blocks — no manual intervention needed for the happy path
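
As a rough sketch of the checkpoint above, a gate can be a function that runs a list of named checks and passes only if every one passes. Everything here is an assumption for illustration: `run_gate`, `GateResult`, and the example checks are hypothetical, and in a real setup the checks would shell out to tests, linters, or a review agent rather than inspect a dictionary.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    passed: bool
    failures: list


def run_gate(output, checks):
    """Run every (name, check) pair; the gate passes only if all checks pass.

    A check that raises is treated as a failure, so a crashing validator
    blocks the pipeline instead of silently letting output through.
    """
    failures = []
    for name, check in checks:
        try:
            ok = bool(check(output))
        except Exception as exc:
            ok = False
            name = f"{name} (raised {type(exc).__name__})"
        if not ok:
            failures.append(name)
    return GateResult(passed=not failures, failures=failures)


# Hypothetical checks on a generated config.
EXAMPLE_CHECKS = [
    ("has_title", lambda o: "title" in o),
    ("title_is_str", lambda o: isinstance(o["title"], str)),
]
```

The failure list matters as much as the pass/fail bit: it is what gets written back to shared state so the next attempt knows what to fix.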

What This System Actually Built

This isn't theoretical. In one week, this 4-agent system shipped:

5 production web apps — each with full SEO, structured data, security headers, and payment integration
58 AI-powered generators — config-driven, with identity-locking prompts, across 11 categories
100 procedurally-generated 3D objects — 4 generators, 32 styles, 5 export formats each, with automated marketplace listing
A content distribution system — canonical pages, LLM-readable blocks, cross-referenced terminology

One person. Four agents. Deterministic control.

Why This Matters

The AI industry is moving toward autonomous agents. But autonomy without governance is chaos.

The pattern that actually works:

Separate generation from control. AI generates. Software governs.
Define agent roles explicitly. No agent should do everything.
Use shared state, not prompt chains. State machines beat conversation history.
Validate at every step. Trust but verify — automatically.

This isn't a framework. It's an architecture pattern. You can build it with whatever models and tools you prefer.

Key Concepts

AI Orchestrator — deterministic software that controls AI agent execution
AI Execution System — the full architecture: orchestrator + agents + state + validation
Agent specialization — each model handles what it's best at, nothing more
Shared state memory — persistent context across agents and sessions
Validation gates — automated checkpoints that block bad output from advancing

FAQ

Isn't this just a pipeline?

A pipeline runs steps in order. An orchestrator makes decisions — what runs, what's allowed, what's complete. It can retry, reroute, or block. A pipeline can't.
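
Here is a small sketch of that difference, with the retry, reroute, and block decisions written as ordinary code. The names `decide` and `run_stage`, the attempt limit, and the idea of a single fallback agent are assumptions made for the example; the agents are plain callables.

```python
def decide(attempts, max_attempts, fallback_available):
    """Pick the next action after a failed validation. Pure rules, no model."""
    if attempts < max_attempts:
        return "retry"
    if fallback_available:
        return "reroute"
    return "block"


def run_stage(primary, fallback, validate, max_attempts=2):
    """Run one stage; return ("advance", output) or ("block", None)."""
    agent = primary
    attempts = 0
    used_fallback = False
    while True:
        output = agent()
        if validate(output):
            return "advance", output
        attempts += 1
        can_reroute = fallback is not None and not used_fallback
        action = decide(attempts, max_attempts, can_reroute)
        if action == "retry":
            continue
        if action == "reroute":
            # Hand the stage to the fallback agent and reset its attempt count.
            agent, used_fallback, attempts = fallback, True, 0
            continue
        return "block", None
```

A plain pipeline would stop at the first failed step. Here a failed step gets `max_attempts` tries, then one handoff to a different agent, and only then a hard block.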

Why not use one model for everything?

Because models have different strengths. Claude Code is exceptional at infrastructure. GPT-5.4 is a strong reviewer. Gemini has visual design intuition. Using one model for everything means accepting its weaknesses everywhere.

Can a non-engineer build this?

Yes. I did. No CS degree, no team, no framework. The orchestrator is conceptually simple — it's a state machine with rules. The complexity is in knowing what rules to write, not in writing them.

Is this an AI OS?

No. An OS manages hardware resources. This manages AI agent execution. Different layer, different purpose.

What This Page Explains

This page explains how a multi-agent AI system can be built by a solo creator using orchestration, shared state, and controlled execution instead of prompt chains. The system coordinates Claude Code, GPT-5.4, Gemini, and a local LLM under deterministic software control.

For the full architecture breakdown and terminology hierarchy, see: What Is an AI Orchestrator?

— AI Tinker
Building real agentic systems in public
aitinkers.fun

How I Accidentally Built an AI Orchestrator

bot bot — Tue, 24 Mar 2026 01:11:43 +0000

But I knew I had to get into the agentic AI game before the early window of this boom closed and it all became mainstream. OpenClaw had dropped a week or two earlier and my anxiety to learn it was THROUGH THE ROOF! Every hour that I didn't dig in, I felt exponentially more irrelevant, and I was losing a lot of sleep. It was in the hospital, while I waited for my mom's cancer surgery to finish, that I finally opened my laptop and started learning. I stayed in the room with her overnight; we both knocked out at about 6pm. The next day, I jumped back in by her bed and became instantly obsessed with figuring things out while amazing nurses and doctors came to check on her. I started with OpenClaw and agonized over where to host it (definitely not on my laptop!), finding the best skills, figuring out which brain to give it, how to make it autonomous in my contained environment... etc. At this point I only had a paid subscription to GPT and used all the other tools for free (Claude, Gemini, Perplexity, Kimi, DeepSeek... etc.). I've always had a million ideas and said that if I knew how to code, I'd be dangerous... and the time has come.

Before I realized it, I had installed .pixel-agents, .openclaw, .moltbook, .codex, .copilot, .antigravity and a whole bunch more, and each one required some level of setup. You don't just install it and watch it do the amazing things you hear about. Every time I opened a new session with an AI agent, it was the same thing. Here are these skills, here is some knowledge, aka here is your PhD in xyz: subjects that take us years to learn and master, all in a dump of resources that you can consume and run with almost immediately. Like Trinity in The Matrix instantly learning how to fly the B-212 helicopter. I didn't want to keep re-explaining: here's how the pipeline works. Here's what we decided last time. Here's what broke. Here's what's next. Every. Single. Time.

It was Groundhog Day every single session with every single new agent. Or the movie 50 First Dates... um, no.

I learned a little trick on TikTok: handoff documents... REVOLUTIONARY! So I started having the agent create JSON files that say things like: here's where we left off, here's what matters, don't ask me again. I just wanted continuity. I wanted to sit down and say "CC, do the thing!" and have that actually mean something.

So I set out to stop repeating myself and to move all of the agents I install under one main hub. One ring to bind them... I said... can't we just move all the agents under one directory and have them all draw from the same skill and knowledge pool? That was kind of how that prompt went....

AND THEN...

GPT started doing what GPT does. "Let me show you how to structure that." "Want me to set up a config for that?" "Here's how you could route different tasks to different models." And I kept saying yes. Not because I had a grand plan. Because each yes solved the thing that was annoying me right now.

Yes, keep the agents from hallucinating and keep them on topic!
Yes, give the agents a shared state file so they stop losing context.
Yes, optimize token use so the cheapest agent that can do the job runs, instead of the heavy hitter robbing me for a general lookup.
Yes, split the research agent from the builder agent so they stop tripping over each other.
Yes, put rules in one place so I stop copy-pasting them into every prompt.
Yes, add a validation layer so I stop manually checking if the output is garbage.

And then one day I looked at what I had and realized — oh.

This isn't a collection of fixes. This is an architecture. There's a state machine. There's a job ledger. There's model routing. There are cost controls. There are memory layers. There's an orchestrator that isn't an AI. It's just software, doing what software does best: saying no to things that shouldn't happen.
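
For the curious, the model routing and cost control idea boils down to something like the sketch below: pick the cheapest model that is capable enough for the task. This is not my actual router. The model names, prices, and capability scores are made up for illustration, and `route` is a hypothetical function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # made-up prices, for illustration only
    capability: int            # 1 = simple lookups, 3 = hard reasoning


# Hypothetical model pool.
MODELS = [
    Model("local-small", 0.0, 1),
    Model("cloud-mid", 0.5, 2),
    Model("cloud-large", 5.0, 3),
]


def route(required_capability, budget_per_1k=None):
    """Return the cheapest model that meets the capability need and budget."""
    candidates = [m for m in MODELS if m.capability >= required_capability]
    if budget_per_1k is not None:
        candidates = [m for m in candidates
                      if m.cost_per_1k_tokens <= budget_per_1k]
    if not candidates:
        # Saying no: nothing runs if no model fits the rules.
        raise LookupError("no model satisfies capability and budget")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)
```

A general lookup with `route(1)` lands on the free local model, and the expensive one only runs when the task actually needs it.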

I accidentally built the thing I didn't know I needed.

A formerly directionless, lost kid who roamed the streets and never got a college degree of any level somehow built a production-grade AI orchestration system on a $10/month server. All because I got tired of repeating myself, wanted all agents under one brain (herding cats) with one point of contact, and kept saying yes to the next small fix suggested by GPT.

That's the whole origin story. No bootcamp. No CS degree. No venture funding. Just a single 50-year-old woman, some AI tools, and the curiosity, excitement, anticipation and drive to keep going when nothing made sense yet.

This is the first post in a series about building AI agent systems from scratch: the real version, not the tutorial version. The version where things break at 2am and you're Googling "OAuth redirect localhost SSH tunnel" in your pajamas because you haven't left the house, seen any friends or even really talked to your mom sitting in the same room for the past few weeks.

It's literally been: wake up, have an hour of reading, do the daily things, jump on and start building, stop, log into work, log out of work, jump back into building until 2-3am. Wake up at 6am and do it again.... Weekends.... Oh, they are the sweetest spots! I actually let myself sleep in then, because a couple of hours of sleep every night for a few weeks... um. But from wake-up until 2-3am I'm spending the full day building, and I thought that stopping to document the journey was going to be impossible because SO MUCH CHANGED EVEN WITHIN ONE HOUR! But the AI forced me to do this. And as the system finally started producing, I started flying with ideas and published 2 ClawHub skills, in two hours actually. There is so much to write about, but this is just a start. Where do I even start? I need my own human handoff document... even then... where do I even start... All I can say is that I'm starting to dream again after so many broken dreams.

This Isn't Just Me

I'm not the only one who stumbled into this. Other builders — different stacks, different backgrounds — are independently hitting the same wall and arriving at the same conclusion.

A developer built an orchestration layer to manage multiple Cursor agents after realizing "the more agents I added, the more coordination cost I created for myself."
A solo founder in Portugal is running 5 products with AI agent departments — zero employees, 8 agent roles, shared memory between all of them.
Someone spent 6 months building a governance layer for multi-agent coding — append-only audit trails linking agent actions to git commits.
Even Anthropic's own team built a multi-agent research system and watched their agents spawn 50 subagents and get completely lost.

Different tools. Different approaches. Same conclusion:

The real problem isn't using AI — it's orchestrating it.

That system eventually became the AI Execution Engine — and yes, you can get the blueprint.

More at aitinkers.fun