The April 2026 issue of the Thoughtworks Technology Radar moved LangGraph out of Adopt. The reasoning behind the move is an architecture argument:
Instead of starting with a rigid graph and a massive shared state, this approach favors simple agents communicating through code execution, with graph structures added later when needed. … Because each agent only has access to the state it needs, reasoning, testing and debugging become easier.
Two claims are doing the work there:
- Communicate through code execution, not a shared mutable blackboard.
- Each agent gets only the state it needs, not a view into one global object.
We build a cloud-security evaluator — a CLI, not an agent framework. We never wrote a graph orchestrator or a shared-state store. But when we recently traced an agent-driven workflow through our tool end to end, we realized we had landed on precisely the pattern the Radar is now recommending — by accident, by way of an older discipline: make every capability a deterministic command that reads files and writes files.
The workflow that gave it away
Here is a four-agent task an engineer might hand to an LLM orchestrator, using our tool's commands:
Engineer: "Connect Steampipe to AWS and produce observations for S3 and IAM."
Agent 1:  reads contracts/steampipe/aws_s3_bucket.yaml
          reads contracts/steampipe/aws_iam_role.yaml
          queries Steampipe, transforms, validates → observations/

Engineer: "Evaluate and show me compound risks."
Agent 2:  stave apply → findings.json
          stave gaps → gap-report.json

Engineer: "Prove whether anonymous access to PHI is reachable."
Agent 3:  reads reasoning-specs/.../z3-public-read-bucket/spec.yaml
          stave export-sir → SMT-LIB facts
          follows the spec → SAT / UNSAT

Engineer: "Map findings to HIPAA Technical Safeguards."
Agent 4:  reads the compliance crosswalk
          stave export compliance --framework hipaa → status report
Look at what is not here:
- No shared AgentState object threaded through a graph.
- No orchestrator that has to know about all four steps in advance.
- No agent that can read or corrupt another agent's working memory.
Each agent's state is exactly the slice of the filesystem its job requires. Agent 1 touches Steampipe contracts and writes observations. Agent 3 touches a reasoning spec and the exported facts. Agent 3 has no idea Agent 1 exists — it consumes a snapshot, not Agent 1's internal variables. The integration surface between agents is the thing every engineer already knows how to inspect, diff, and version: files and exit codes.
That is "communicate through code execution" in its most literal form. The agent runs a command; the command's output is the message.
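As a rough illustration, here is a minimal Python sketch of an agent step that treats a command's exit code and output file as the entire message. The `run_step` helper and the stand-in command are hypothetical. A real step would invoke one of the tool's commands instead, whose exact arguments are not reproduced here.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_step(argv: list[str], output_path: Path) -> dict:
    """Run one command; its exit code and output file are the only message."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"step failed ({result.returncode}): {result.stderr.strip()}")
    return json.loads(output_path.read_text())


# Stand-in for a real command (hypothetical): a tiny script that writes
# a JSON file to the path given as its first argument.
STAND_IN = (
    "import json, sys, pathlib; "
    "pathlib.Path(sys.argv[1]).write_text(json.dumps({'findings': ['example']}))"
)

with tempfile.TemporaryDirectory() as workdir:
    out = Path(workdir) / "findings.json"
    message = run_step([sys.executable, "-c", STAND_IN, str(out)], out)
    print(message)
```

Nothing flows between steps except what lands on disk and the exit status, so the next agent can be handed the file path and nothing else.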
Why "the state it needs" falls out for free
In a global-shared-state design, scoping is something you have to impose — you write reducers, you namespace keys, you hope no node mutates a field another node depends on. The Radar's critique is that this is where reasoning and debugging go to die: when everything can touch everything, a wrong value has N possible authors.
Our agents get scoped state without anyone designing scoping, because the unit of work is a single-responsibility command over an explicit input:
| LangGraph-as-default | Command-and-file composition |
|---|---|
| One graph, defined up front | No graph; agents call commands ad hoc |
| Global shared state object | State = the files each command reads |
| Scoping is engineered (reducers, namespaces) | Scoping is the command's argument list |
| A bad value has many possible authors | A bad value came from one command's input |
| Test a node by mocking the whole state | Test a command with an input file |
stave gaps cannot accidentally read the reasoning spec. stave export compliance cannot mutate the findings. Not because we forbade it — because those things were never in scope. The argument list is the scope.
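The last table row can be made concrete with a short Python sketch. The `count_open_findings` function and the findings schema (a list of objects with a `status` field) are assumptions made up for illustration, not the tool's real format. The point is only that the test fixture is an input file, with no shared state to mock.

```python
import json
import tempfile
from pathlib import Path


def count_open_findings(findings_path: Path) -> int:
    """Hypothetical leaf step: its entire scope is the one file it is handed."""
    findings = json.loads(findings_path.read_text())
    return sum(1 for finding in findings if finding.get("status") == "open")


def test_count_open_findings() -> None:
    # The fixture is just a file; there is no graph state to construct.
    with tempfile.TemporaryDirectory() as workdir:
        fixture = Path(workdir) / "findings.json"
        fixture.write_text(json.dumps([
            {"id": "a", "status": "open"},
            {"id": "b", "status": "closed"},
        ]))
        assert count_open_findings(fixture) == 1


test_count_open_findings()
```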
The hidden requirement: determinism
Here is the part that is easy to miss. "Agents communicate through code execution" only works if a command's output is trustworthy as a message — which means the same input must always produce the same output. If Agent 2's apply returned subtly different findings each run, Agent 4's compliance mapping would be building on sand, and you'd be back to debugging a distributed system where the state is non-reproducible.
We made determinism a founding rule long before any of this was about agents: same inputs + same --now produce byte-identical output. Snapshots instead of live API calls. Time as an explicit input. Sorted, canonical JSON. A byte-for-byte verification command and golden tests that fail the build the instant output drifts.
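A minimal Python sketch of what that rule implies, under stated assumptions: `render_report` and the snapshot shape are hypothetical stand-ins, not the tool's implementation. It shows time passed in as an argument and JSON serialized with sorted keys and fixed separators, so the same data yields the same bytes regardless of input ordering.

```python
import hashlib
import json


def render_report(snapshot: dict, now: str) -> bytes:
    """Hypothetical evaluator: output depends only on `snapshot` and `now`."""
    report = {
        "evaluated_at": now,  # time is an input, never read from the clock
        "findings": sorted(
            name for name, props in snapshot.items() if props.get("public")
        ),
    }
    # Sorted keys and fixed separators give one canonical byte form.
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Same data, different key order.
snapshot_a = {"bucket-b": {"public": True}, "bucket-a": {"public": True}}
snapshot_b = {"bucket-a": {"public": True}, "bucket-b": {"public": True}}

first = render_report(snapshot_a, now="2026-01-01T00:00:00Z")
second = render_report(snapshot_b, now="2026-01-01T00:00:00Z")

# A golden test would compare against stored bytes; here the two runs
# are compared with each other.
assert hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
```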
That rule turns out to be the thing that makes code-execution composition safe:
- An agent can cache and reuse a prior step's output, because re-running would produce the same bytes.
- An agent can verify a claim by re-deriving it: export-sir on the same snapshot yields the same facts, so the SAT/UNSAT proof is reproducible, not a one-time oracle reading.
- A human can debug the pipeline by running any single command in isolation and getting the exact output the agent saw.
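The first two points can be sketched in Python. `StepCache` and `sort_lines` are hypothetical names, and `sort_lines` is only a stand-in for any deterministic step. The sketch assumes a step is a pure function from input bytes to output bytes, which is the property the determinism rule is meant to guarantee.

```python
import hashlib
from typing import Callable, Dict

Step = Callable[[bytes], bytes]


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class StepCache:
    """Hypothetical cache keyed by step name and input bytes."""

    def __init__(self) -> None:
        self._outputs: Dict[str, bytes] = {}
        self.executions = 0

    def _key(self, name: str, data: bytes) -> str:
        return digest(name.encode("utf-8") + b"\0" + data)

    def run(self, name: str, step: Step, data: bytes) -> bytes:
        # Reuse is safe only because the step is deterministic.
        key = self._key(name, data)
        if key not in self._outputs:
            self.executions += 1
            self._outputs[key] = step(data)
        return self._outputs[key]

    def verify(self, name: str, step: Step, data: bytes) -> bool:
        # Re-derive the output and compare it with what was cached.
        cached = self._outputs.get(self._key(name, data))
        return cached is not None and digest(step(data)) == digest(cached)


def sort_lines(snapshot: bytes) -> bytes:
    """Stand-in deterministic step: emit the input lines in sorted order."""
    return b"\n".join(sorted(snapshot.splitlines()))


cache = StepCache()
snapshot = b"fact-b\nfact-a"
facts = cache.run("sort_lines", sort_lines, snapshot)
facts_again = cache.run("sort_lines", sort_lines, snapshot)
assert facts == facts_again and cache.executions == 1
assert cache.verify("sort_lines", sort_lines, snapshot)
```

If a step were nondeterministic, `verify` would fail intermittently, which is exactly the signal a golden test is meant to surface.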
A global mutable graph state gives you none of this. A value in the blackboard has no provenance — you can't re-derive it, you can only trust that whatever node wrote it was correct. Every fact our tool emits, by contrast, carries a deterministic id that traces back through the export and the projector to the specific observation property that produced it. Provenance is a property of the architecture, not a logging afterthought.
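How the ids are actually derived is not described here, so the following Python sketch is only one possible scheme, not the tool's: hash the fact's origin together with its value. The `fact_id` function, the file name, and the property path are all made up for illustration.

```python
import hashlib


def fact_id(observation_file: str, property_path: str, value: str) -> str:
    """Hypothetical id scheme: a hash of where a fact came from and what it says."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    material = "\x1f".join([observation_file, property_path, value])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


# Made-up origin: the same observation property always yields the same id.
id_one = fact_id("observations/example.json", "bucket.acl.public_read", "true")
id_two = fact_id("observations/example.json", "bucket.acl.public_read", "true")
id_other = fact_id("observations/example.json", "bucket.acl.public_read", "false")

assert id_one == id_two
assert id_one != id_other
```

Because the id is a function of its inputs, anyone holding the observation file can recompute it and confirm where a fact came from.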
We made this stronger by deleting things
The most counterintuitive move was subtraction. Over the last stretch we removed the commands that didn't fit "snapshots in → findings out": continuous monitoring, remediation planning, incident timelines, external enrichment, multi-account orchestration. Those are real jobs — but they are orchestration jobs, and orchestration must be owned by the calling agent (or CI, or a scheduler).
In LangGraph terms, we resisted the temptation to grow our own graph. We kept the tool a set of leaf functions — evaluate, export, prove, map — and let the agent be the graph, when a graph is even needed. That mirrors the Radar's final point: add graph structure later, when the use case demands it, instead of paying for it everywhere up front.
The result is a tool that is agent-ready by being boring: deterministic, file-based, single-responsibility commands with no shared state to corrupt and no orchestration opinions to fight. An LLM can compose them. A bash script can compose them. A human can run them one at a time. The composition layer is free to be as simple — or, when warranted, as graph-shaped — as the problem actually requires.
Global State and Simplicity
The Radar's pullback on LangGraph is not "graphs are bad." It's "don't start with a rigid graph and a global state when simple agents over code execution would be leaner, and easier to reason about, test, and debug."
If you're building a capability for agents to use — rather than the agent framework itself — the lesson is sharper: expose deterministic commands over explicit files, scope every command to its inputs, and let the caller own the graph. You'll get the scoping, testability, and debuggability the Radar is asking for, and you'll get them without writing a single reducer.
We didn't build for agents. We built for determinism and single responsibility. It turns out that's the same thing.