DEV Community

Tobiloba Adedeji

Agent Architectures: A Map

Last year I was tasked with building an "agentic pet": a Pudgy Penguin that could talk, react, and feel alive. I slapped an LLM on Vercel's AI SDK and built a chat UI with tool calls wired up. Then someone asked if the penguin could remember things between sessions, if it could react to events on its own, if its health could change in real-time. Each question implied different infrastructure: a persistent store, an event loop, a process supervisor. I hadn't thought about any of it. I'd been building a chatbot and calling it an agent, and the word had hidden every decision that actually mattered.

Below is a map of four architectures you might actually mean when you say "agent," and why you should sort that out before you write any code.

The Map

Four structural modes, classified by two questions about runtime behavior: who initiates action, and when does the process end?

These two axes are useful because they determine your infrastructure, at least most of the time. Do you need a server or a cron job? A persistent store or a disposable context?

| Mode | Who initiates | When it ends | Infrastructure shape |
| --- | --- | --- | --- |
| Copilot | Human, every turn | When the human stops | Request-response server, session-scoped context, no persistence |
| Daemon | Human once, then events | Runs until killed or crashes | Long-running process, persistent state, crash recovery, observability |
| Subagent | Parent process | When the scoped task completes | Parent-managed lifecycle, scoped context, no independent recovery |
| Workflow | Orchestrator script | When the sequence finishes | Fixed pipeline, LLM as a step, no adaptive control flow |

Copilot

The human initiates every turn, whether single-shot or multi-turn, and the system never acts unprompted. There is no lifecycle beyond the conversation. Close the tab and nothing is running.

ChatGPT in conversation mode, GitHub Copilot, Claude AI, and Claude Code in interactive mode are all copilots.

*(Figure: copilot UI)*

In practice this means a request-response server and session-scoped context. You can skip persistent storage; nothing has to survive past the tab. If it crashes, the user just asks again, though they'll get annoyed if that happens too often.
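A minimal sketch of that shape, with `fakeLLM` standing in for a real model call (the function and its reply format are illustrative assumptions, not any particular SDK's API):

```typescript
// Copilot sketch: request-response with session-scoped context.
// Context lives only in memory; nothing survives the process.
type Message = { role: "user" | "assistant"; content: string };

const sessions = new Map<string, Message[]>(); // session-scoped, not persisted

function fakeLLM(history: Message[]): string {
  // Stand-in for a real model call; replies with the turn count.
  return `reply #${history.filter((m) => m.role === "user").length}`;
}

function handleTurn(sessionId: string, userText: string): string {
  const history = sessions.get(sessionId) ?? [];
  history.push({ role: "user", content: userText });
  const reply = fakeLLM(history);
  history.push({ role: "assistant", content: reply });
  sessions.set(sessionId, history); // gone when the process exits
  return reply;
}
```

The human initiates every call to `handleTurn`; if the process dies, the user re-asks and a fresh session starts.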

*(Figure: copilot request-response flow)*

Daemon

A persistent background process that stays alive between events or triggers, acts on them, and goes back to idle. The human kicks it off once (sets it up, points it at a channel or a log stream) and after that, it runs on its own.

Crash recovery is most obviously a problem in this mode. A daemon that loses its memory between restarts is broken. You also need to think about supervision (what happens when it dies at 3 AM?), event ingestion (webhooks, queues, polling), and observability for a process that runs unattended.

A Slack bot that wakes on mentions, a monitoring agent polling logs, a customer support agent that remembers prior conversations, OpenClaw heartbeat.md / cron, Codex running async tasks: these are all daemons.

A daemon needs a long-lived server or worker with persistent storage for state and memory. You also need a supervision strategy, plus logging and alerting so you know when it falls over, and you'll probably burn a few nights discovering which failures you forgot to instrument.
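A toy sketch of the core loop, assuming a JSON file as the persistent store (a real daemon would use a database, and the `STATE_FILE` path and event shape are illustrative):

```typescript
// Daemon sketch: wake on an event, act, persist state so a restart
// doesn't lose memory. State is reloaded on every event, so a crash
// between events loses nothing.
import * as fs from "fs";

const STATE_FILE = "daemon-state.json"; // hypothetical store

type State = { handled: number };

function loadState(): State {
  try {
    return JSON.parse(fs.readFileSync(STATE_FILE, "utf8"));
  } catch {
    return { handled: 0 }; // first boot, or state file missing
  }
}

function onEvent(event: string): State {
  const state = loadState(); // recover whatever a previous run saw
  state.handled += 1; // act on the event
  fs.writeFileSync(STATE_FILE, JSON.stringify(state)); // survives restarts
  return state;
}
```

The part this sketch omits is exactly what the paragraph above warns about: supervision when `onEvent` throws at 3 AM, and alerting when the process silently stops receiving events.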

*(Figure: daemon event flow)*

Subagent

Like a Unix process: it can be short-lived or long-running, you can communicate with it while active, but it has no lifecycle beyond the parent that created it. When the task is done, it terminates.

A subagent usually has no persistent memory and no awareness of its siblings. It reads what it's given, does the work, hands back a result, and exits. Need the same work again? Spawn a fresh one. Retry logic, failure handling, all of that lives in the parent.

Claude Code's Task tool, OpenAI Agents SDK handoffs, or a micro-service that spins up an LLM worker for a scoped job all fit this pattern.

You'll need a parent process that manages spawn and termination, plus scoped context (only what the subagent needs for its task). Any shared state should go through the filesystem or a database, not through conversation.
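A sketch of that division of labor, with a stub in place of the LLM worker (the function names and failure condition are made up for illustration):

```typescript
// Subagent sketch: the child gets only scoped context, returns a result,
// and keeps no state. Retry and failure handling live in the parent.
type Result = { ok: boolean; output?: string };

function runSubagent(task: string, context: string): Result {
  // Stand-in for an LLM worker; this stub fails on empty context.
  if (context.length === 0) return { ok: false };
  return { ok: true, output: `done: ${task}` };
}

function parent(task: string, context: string, maxRetries = 2): string {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const result = runSubagent(task, context); // fresh subagent per attempt
    if (result.ok && result.output) return result.output;
  }
  throw new Error(`subagent failed after ${maxRetries + 1} attempts`);
}
```

Note that `runSubagent` has no way to remember a previous attempt; each call is a fresh spawn, which is the point.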

*(Figure: subagents in action)*

Streaming is where this gets ugly. Picture subagents running in parallel: one fetching docs, another analyzing code, a third running tests, all producing output at their own pace. The first time I tried this, the UI turned into a slot machine. You need to funnel all of it back through a single UI coherently, nested inside a parent stream, displayed to one user staring at one screen. That's the UX engineering problem most frameworks don't address.
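One common fix is to tag every chunk with its source agent so the renderer can group output per agent instead of interleaving raw text. A sketch, with the agent names and chunk shape as illustrative assumptions:

```typescript
// Multiplexing sketch: parallel subagent streams funneled into one
// channel, with each chunk tagged so the UI can render per-agent panes.
type Chunk = { agent: string; text: string };

async function* subagentStream(agent: string, parts: string[]): AsyncGenerator<Chunk> {
  for (const text of parts) yield { agent, text }; // stand-in for LLM output
}

// Drain all streams concurrently into one tagged list. Chunks from
// different agents may interleave, but each agent's own order is kept.
async function merge(streams: AsyncGenerator<Chunk>[]): Promise<Chunk[]> {
  const out: Chunk[] = [];
  await Promise.all(
    streams.map(async (s) => {
      for await (const c of s) out.push(c);
    })
  );
  return out;
}

// The renderer's job: reassemble one coherent view per agent.
function groupByAgent(chunks: Chunk[]): Map<string, string> {
  const grouped = new Map<string, string>();
  for (const c of chunks) grouped.set(c.agent, (grouped.get(c.agent) ?? "") + c.text);
  return grouped;
}
```

The tagging is what stops the slot-machine effect: the transport interleaves freely, and the UI layer decides how to present it.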

*(Figure: subagent architecture flow)*

Workflow

A scripted sequence of steps where one or more involve an LLM call. You define the sequence in code as a DAG, a pipeline, or a chain of function calls; it executes in order and is fully specified before it starts.

This works best when you know the steps in advance. The value is reliable orchestration across time, not adaptive decision-making; when people try to cram a constantly-changing process into a fixed workflow, it gets ugly fast.


You can't keep a script in memory for three days. You need to pause, persist state to disk or database, survive process restarts, and resume exactly where you left off. Before we used anything like Temporal or Inngest, we kept discovering half-finished runs after a deploy and had no idea what the system had already done. That's the niche those tools and Trigger.dev occupy: durable execution across long time spans with async human behavior in the loop.
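A toy version of the checkpointing idea, using a JSON file where those tools use a durable store (the file path, step names, and step functions are all illustrative, and this ignores the determinism and versioning problems real durable-execution engines solve):

```typescript
// Workflow sketch: a fixed sequence of steps with a persisted checkpoint,
// so a restart resumes at the next unfinished step instead of re-running
// the whole pipeline.
import * as fs from "fs";

const CHECKPOINT = "workflow-checkpoint.json"; // hypothetical durable store

const steps: Array<(input: string) => string> = [
  (s) => s + ":fetched",    // e.g. gather inputs
  (s) => s + ":summarized", // e.g. the LLM step
  (s) => s + ":notified",   // e.g. post the result
];

function runWorkflow(input: string): string {
  // Resume from the checkpoint if one exists, else start fresh.
  let { step, value } = (() => {
    try {
      return JSON.parse(fs.readFileSync(CHECKPOINT, "utf8"));
    } catch {
      return { step: 0, value: input };
    }
  })();
  for (; step < steps.length; step++) {
    value = steps[step](value);
    // Persist progress after every step; a crash here loses at most one step.
    fs.writeFileSync(CHECKPOINT, JSON.stringify({ step: step + 1, value }));
  }
  fs.unlinkSync(CHECKPOINT); // run finished; clear the checkpoint
  return value;
}
```

The sequence itself never changes at runtime; only the resume point does. That is the workflow mode in miniature: the LLM is a step, not the driver.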

*(Figure: workflow mode)*

Hybrids

Hybrids exist, and they make the taxonomy useful rather than invalidating it, because they show how the modes interact in real systems.

Claude Code spans three modes in a single session. In interactive mode, it is a copilot: you type, it responds. Its Task tool spawns subagents that execute a scoped job and terminate. In agent mode, it takes multiple autonomous actions in sequence. Ask it to refactor authentication and it reads files, spawns an exploration subagent, edits twelve files, runs tests. Three modes, one session, each with its own way of failing, consuming resources, and exposing an interface. The taxonomy tells you which mode you are in at any given moment.

*(Figure: hybrid agent combining modes)*

A note on frameworks

Agent frameworks tend to emphasize character-like setups (a role, some backstory, a set of goals). They give comparatively little guidance on how the thing actually runs or what it persists. CrewAI is the clearest example. LangGraph and AutoGen follow the same pattern: they focus on agent identity and mostly ignore runtime shape. You end up configuring a "Senior Research Analyst" with a backstory and a delegation strategy before you've even decided whether the system needs a persistent store, and the architecture requirements hide behind the character sheet.

Close

Most people use "agent" as a loose marketing term that tells you almost nothing about the actual architecture.

If you find yourself reaching for the word, force yourself to write down the concrete runtime you actually need: who starts it, how it dies, and what must survive in between. The right infrastructure usually falls out of those three answers.
