<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alessio Masucci</title>
    <description>The latest articles on DEV Community by Alessio Masucci (@mscalessio).</description>
    <link>https://dev.to/mscalessio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F506769%2Ffda41aff-0c94-4d5a-8065-4434d9ed6dea.png</url>
      <title>DEV Community: Alessio Masucci</title>
      <link>https://dev.to/mscalessio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mscalessio"/>
    <language>en</language>
    <item>
      <title>I Tried to Run Symphony for Real on Rentello. It Broke in Exactly the Right Place.</title>
      <dc:creator>Alessio Masucci</dc:creator>
      <pubDate>Thu, 26 Mar 2026 08:10:26 +0000</pubDate>
      <link>https://dev.to/mscalessio/i-tried-to-run-symphony-for-real-on-rentello-it-broke-in-exactly-the-right-place-j4a</link>
      <guid>https://dev.to/mscalessio/i-tried-to-run-symphony-for-real-on-rentello-it-broke-in-exactly-the-right-place-j4a</guid>
      <description>&lt;p&gt;A few weeks ago, I published a build diary about porting OpenAI's Symphony to Claude Code.&lt;/p&gt;

&lt;p&gt;That piece was about getting the machine to exist.&lt;/p&gt;

&lt;p&gt;This one is about what happened when I tried to trust it.&lt;/p&gt;

&lt;p&gt;Not in a toy repo. Not with a fake ticket. Not with a carefully staged demo that only exercised the happy path. I wired the workflow into a real project, Rentello, pointed it at a real Linear board, and tried to make Codex, Claude, and Symphony-family orchestrators coexist on the same execution model.&lt;/p&gt;

&lt;p&gt;That is when the most useful bug of the whole project appeared.&lt;/p&gt;

&lt;p&gt;It wasn't a crash inside the orchestration loop.&lt;br&gt;
It wasn't a retry bug.&lt;br&gt;
It wasn't even a Linear label bug.&lt;/p&gt;

&lt;p&gt;It was a hidden architectural assumption inside one innocent file: &lt;code&gt;WORKFLOW.md&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Phase two of the same story&lt;/h2&gt;

&lt;p&gt;The first article was the infrastructure story.&lt;/p&gt;

&lt;p&gt;I had taken OpenAI's Symphony architecture, ported the orchestration ideas to work with Claude Code, and documented the engineering journey: configuration, polling, workspaces, retries, MCP tooling, the terminal dashboard, the CLI quirks, the Linear GraphQL edges. It was the "can I build this?" phase.&lt;/p&gt;

&lt;p&gt;This sequel starts after that.&lt;/p&gt;

&lt;p&gt;Once the port existed, the obvious next question was: can this become a reliable workflow across agent surfaces?&lt;/p&gt;

&lt;p&gt;I didn't just want a Claude-only orchestrator anymore.&lt;/p&gt;

&lt;p&gt;I wanted a system where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;planning could happen in Codex App or Codex web&lt;/li&gt;
&lt;li&gt;work could be split into Linear issues that were safe for parallel or sequential execution&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;exec:agent&lt;/code&gt; work could be handled by generic agent surfaces&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;exec:symphony&lt;/code&gt; work could be handled by Symphony-family orchestrators&lt;/li&gt;
&lt;li&gt;the same repository could remain deterministic whether the active surface was Codex or Claude&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sounds tidy when you say it fast.&lt;/p&gt;

&lt;p&gt;In practice, it meant turning "AI can work on tickets" into an actual contract.&lt;/p&gt;

&lt;h2&gt;The experiment: make the workflow explicit&lt;/h2&gt;

&lt;p&gt;The first big change was conceptual, not technical.&lt;/p&gt;

&lt;p&gt;I stopped treating the workflow as an informal set of conventions and turned it into a real repository contract:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;shared execution-owner labels: &lt;code&gt;exec:agent&lt;/code&gt; and &lt;code&gt;exec:symphony&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;wave labels like &lt;code&gt;wave:1&lt;/code&gt;, &lt;code&gt;wave:2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;a stable issue template&lt;/li&gt;
&lt;li&gt;a single persistent &lt;code&gt;## Workpad&lt;/code&gt; comment per issue&lt;/li&gt;
&lt;li&gt;a shared machine-readable contract plus shared docs/templates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important detail here is that I did &lt;strong&gt;not&lt;/strong&gt; want separate "Codex rules" and "Claude rules" at the planning layer.&lt;/p&gt;

&lt;p&gt;The workflow itself needed to be generic.&lt;/p&gt;

&lt;p&gt;The orchestrator choice should affect how the work is run, not how the work is defined.&lt;/p&gt;

&lt;p&gt;So I introduced a shared contract and kept runtime-specific prompts only where they belonged: under &lt;code&gt;.codex/&lt;/code&gt; and &lt;code&gt;.claude/&lt;/code&gt;, as adapters to the same underlying rules.&lt;/p&gt;

&lt;p&gt;That gave me a clean execution model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;exec:agent&lt;/code&gt; means "owned by the non-Symphony agent surface"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;exec:symphony&lt;/code&gt; means "owned by a Symphony-compatible orchestrator"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;blockedBy&lt;/code&gt; remains authoritative&lt;/li&gt;
&lt;li&gt;wave labels are scheduling metadata, not permission to ignore dependencies&lt;/li&gt;
&lt;/ul&gt;
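&lt;p&gt;As a minimal TypeScript sketch of those four rules: the &lt;code&gt;PlannedIssue&lt;/code&gt; shape, the function names, the issue ids, and the &lt;code&gt;Done&lt;/code&gt; state are all assumptions for illustration, not code from the repo. Ownership comes from the single &lt;code&gt;exec:*&lt;/code&gt; label, and readiness comes only from &lt;code&gt;blockedBy&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; not taken from the real repo.
type Owner = "agent" | "symphony";

interface PlannedIssue {
  id: string;
  labels: string[];
  state: string;
  blockedBy: string[]; // ids of issues that must finish first
}

// Resolve the execution owner; exactly one exec:* label is required.
function ownerOf(issue: PlannedIssue): Owner {
  const owners = issue.labels.filter((label) => label.startsWith("exec:"));
  if (owners.length !== 1) {
    throw new Error(`${issue.id}: expected exactly one exec:* label`);
  }
  const owner = owners[0].slice("exec:".length);
  if (owner === "agent" || owner === "symphony") return owner;
  throw new Error(`${issue.id}: unknown owner "${owner}"`);
}

// Wave is scheduling metadata only; canRun below never reads it.
function waveOf(issue: PlannedIssue): number {
  const wave = issue.labels.find((label) => label.startsWith("wave:"));
  return wave ? Number(wave.slice("wave:".length)) : Number.POSITIVE_INFINITY;
}

// An issue can run on a surface only if that surface owns it and
// every blocker is known and already in a done state.
function canRun(
  issue: PlannedIssue,
  surface: Owner,
  all: PlannedIssue[],
  doneStates: string[],
): boolean {
  if (ownerOf(issue) !== surface) return false;
  return issue.blockedBy.every((id) => {
    const blocker = all.find((candidate) => candidate.id === id);
    return blocker !== undefined ? doneStates.includes(blocker.state) : false;
  });
}

const issues: PlannedIssue[] = [
  { id: "A-1", labels: ["exec:agent", "wave:1"], state: "Todo", blockedBy: [] },
  { id: "A-2", labels: ["exec:symphony", "wave:2"], state: "Todo", blockedBy: ["A-1"] },
];
```

&lt;p&gt;In this sketch, &lt;code&gt;canRun(issues[1], "symphony", issues, ["Done"])&lt;/code&gt; stays false until &lt;code&gt;A-1&lt;/code&gt; reaches a done state, whatever the wave labels say.&lt;/p&gt;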

&lt;p&gt;That was the theory.&lt;/p&gt;

&lt;p&gt;Then I decided to test it for real.&lt;/p&gt;

&lt;h2&gt;The smoke test: two issues, one dependency, no excuses&lt;/h2&gt;

&lt;p&gt;I created a new Linear project called &lt;code&gt;Agent Workflow Smoke Test&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then I used the repo's planning contract to generate the smallest meaningful DAG I could think of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MAS-26&lt;/code&gt; — &lt;code&gt;exec:agent&lt;/code&gt;, &lt;code&gt;wave:1&lt;/code&gt;, &lt;code&gt;research&lt;/code&gt;, &lt;code&gt;docs&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MAS-27&lt;/code&gt; — &lt;code&gt;exec:symphony&lt;/code&gt;, &lt;code&gt;wave:2&lt;/code&gt;, &lt;code&gt;research&lt;/code&gt;, &lt;code&gt;docs&lt;/code&gt;, &lt;code&gt;blockedBy MAS-26&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was deliberately low-risk: a docs-only smoke test with a real sequencing edge.&lt;/p&gt;

&lt;p&gt;If the contract was sound, I should be able to verify:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;planning against the real repo&lt;/li&gt;
&lt;li&gt;issue creation in Linear with the right labels and dependency edges&lt;/li&gt;
&lt;li&gt;Codex App dispatch respecting &lt;code&gt;exec:agent&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Symphony respecting &lt;code&gt;exec:symphony&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;a single persistent workpad model across the whole system&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first pass went well.&lt;/p&gt;

&lt;p&gt;The repo-level checks passed.&lt;br&gt;
The Linear issues were created correctly.&lt;br&gt;
The labels were right.&lt;br&gt;
The &lt;code&gt;blockedBy&lt;/code&gt; relation was correct.&lt;/p&gt;

&lt;p&gt;Then I moved &lt;code&gt;MAS-26&lt;/code&gt; into &lt;code&gt;Todo&lt;/code&gt; and let Codex App handle the bootstrap.&lt;/p&gt;

&lt;p&gt;That part worked exactly the way I wanted.&lt;/p&gt;

&lt;p&gt;Codex App picked up only the &lt;code&gt;exec:agent&lt;/code&gt; issue, moved it to &lt;code&gt;In Progress&lt;/code&gt;, and created a single &lt;code&gt;## Workpad&lt;/code&gt; comment. More importantly, the next run reused that same comment and updated it in place instead of spamming the issue with milestone chatter.&lt;/p&gt;

&lt;p&gt;That detail matters more than it sounds.&lt;/p&gt;

&lt;p&gt;A lot of agent workflows die the death of a thousand comments. If every automation pass creates another "status update," the issue becomes unreadable. Reusing one persistent workpad is the difference between an automation system that looks operationally credible and one that looks like a bot farm.&lt;/p&gt;
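&lt;p&gt;The reuse rule is small enough to sketch. The version below is an illustration under assumptions: the &lt;code&gt;IssueComment&lt;/code&gt; shape and &lt;code&gt;planWorkpadWrite&lt;/code&gt; are hypothetical names, and the actual Linear calls are left out. It only decides whether a run should create the workpad or update the existing one.&lt;/p&gt;

```typescript
// Hypothetical comment shape; a real Linear client returns richer objects.
interface IssueComment {
  id: string;
  body: string;
}

type WorkpadAction =
  | { kind: "create"; body: string }
  | { kind: "update"; commentId: string; body: string };

const WORKPAD_HEADING = "## Workpad";

// Decide whether to create the workpad or update the existing one in place.
function planWorkpadWrite(comments: IssueComment[], notes: string): WorkpadAction {
  const body = `${WORKPAD_HEADING}\n\n${notes}`;
  const existing = comments.find((comment) =>
    comment.body.trim().startsWith(WORKPAD_HEADING),
  );
  if (existing === undefined) {
    return { kind: "create", body };
  }
  return { kind: "update", commentId: existing.id, body };
}
```

&lt;p&gt;Because the decision keys on the heading rather than on run history, a second pass over the same issue produces an update instead of another comment.&lt;/p&gt;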

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figmjnrnsabi8n5y1p6mn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figmjnrnsabi8n5y1p6mn.png" alt="Linear issue created" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0p4o6at05yjtx6f2kjf8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0p4o6at05yjtx6f2kjf8.png" alt="Notes in the issue" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At that point, the first conclusion felt obvious:&lt;/p&gt;

&lt;p&gt;the workflow contract was working.&lt;/p&gt;

&lt;p&gt;That conclusion was wrong.&lt;/p&gt;

&lt;h2&gt;The first false conclusion&lt;/h2&gt;

&lt;p&gt;Once the Codex App leg worked, I moved to the Symphony leg.&lt;/p&gt;

&lt;p&gt;I launched &lt;code&gt;openai/symphony&lt;/code&gt; against the repo's &lt;code&gt;WORKFLOW.md&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And immediately noticed something odd: it was looking at the &lt;strong&gt;real&lt;/strong&gt; Rentello project instead of the smoke-test one.&lt;/p&gt;

&lt;p&gt;The reason was simple in hindsight.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;WORKFLOW.md&lt;/code&gt; was doing two very different jobs at once:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;it was the shared execution contract&lt;/li&gt;
&lt;li&gt;it was the runtime entrypoint for a specific orchestrator, with a specific project slug and a specific launcher configuration&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That meant my "shared workflow file" wasn't really shared at all.&lt;/p&gt;

&lt;p&gt;It was secretly carrying environment-specific assumptions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which Linear project to poll&lt;/li&gt;
&lt;li&gt;which states count as active&lt;/li&gt;
&lt;li&gt;which command to spawn for the agent runtime&lt;/li&gt;
&lt;li&gt;what approval policy and sandbox mode to use&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So before I even got to the deeper integration problem, I had already hit a design smell:&lt;/p&gt;

&lt;p&gt;I was treating a runtime wrapper as if it were a universal contract.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgae0gty885304tgtm75.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzgae0gty885304tgtm75.png" alt="Wrong project slug in symphony" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I worked around it temporarily with a copied workflow file that targeted the smoke-test project.&lt;/p&gt;

&lt;p&gt;That got me to the real failure.&lt;/p&gt;

&lt;h2&gt;The real failure: not labels, not blockers, not Linear&lt;/h2&gt;

&lt;p&gt;The next visible symptom was noisy and misleading.&lt;/p&gt;

&lt;p&gt;Symphony would pick up the smoke-test issue, then back off with timeouts and failed runs.&lt;/p&gt;

&lt;p&gt;At first glance, it looked like the kind of thing that sends you down the wrong rabbit hole:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;maybe the new &lt;code&gt;exec:*&lt;/code&gt; label routing was wrong&lt;/li&gt;
&lt;li&gt;maybe the issue state transitions were inconsistent&lt;/li&gt;
&lt;li&gt;maybe the &lt;code&gt;blockedBy&lt;/code&gt; logic had a bug&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of that was the root cause.&lt;/p&gt;

&lt;p&gt;The actual problem was much deeper and much simpler:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;openai/symphony&lt;/code&gt; expected a Codex-compatible app-server runtime, while the repo's &lt;code&gt;WORKFLOW.md&lt;/code&gt; was still configured for Claude-style execution.&lt;/p&gt;

&lt;p&gt;In other words, the orchestrator and the launcher contract disagreed about what "the agent" even was.&lt;/p&gt;

&lt;p&gt;The workflow file still contained a Claude-oriented runtime block. &lt;code&gt;openai/symphony&lt;/code&gt; expected something more like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Codex app-server command&lt;/li&gt;
&lt;li&gt;a compatible approval policy&lt;/li&gt;
&lt;li&gt;the sandbox settings the OpenAI implementation expects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the visible failure was timeout and retry noise.&lt;br&gt;
The real failure was a runtime mismatch.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;p&gt;If I had blamed the new workflow model, I would have "fixed" the wrong layer.&lt;/p&gt;

&lt;p&gt;The issue was not &lt;code&gt;exec:agent&lt;/code&gt; vs &lt;code&gt;exec:symphony&lt;/code&gt;.&lt;br&gt;
The issue was not the shared issue template.&lt;br&gt;
The issue was not the workpad model.&lt;/p&gt;

&lt;p&gt;The issue was that one literal &lt;code&gt;WORKFLOW.md&lt;/code&gt; cannot honestly be the runtime entrypoint for both &lt;code&gt;openai/symphony&lt;/code&gt; and a Claude-oriented Symphony port.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcj3txywb0ej5wqib1vs2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcj3txywb0ej5wqib1vs2.png" alt="" width="800" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The fix that changed the architecture&lt;/h2&gt;

&lt;p&gt;This was the moment the architecture got better.&lt;/p&gt;

&lt;p&gt;The correct lesson was not "make &lt;code&gt;WORKFLOW.md&lt;/code&gt; more clever."&lt;/p&gt;

&lt;p&gt;It was: &lt;strong&gt;stop asking one file to represent two incompatible runtime contracts.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So I split the model into three layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;a shared execution contract&lt;/li&gt;
&lt;li&gt;a shared Symphony instruction body&lt;/li&gt;
&lt;li&gt;runtime-specific workflow wrappers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Concretely, the repo now has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a shared machine-readable contract for labels, states, and workpad expectations&lt;/li&gt;
&lt;li&gt;shared docs/templates for planning and issue execution&lt;/li&gt;
&lt;li&gt;a shared Symphony body template&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WORKFLOW.openai.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WORKFLOW.claude.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;a small render script to generate the wrappers from the shared body&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That was the missing separation all along.&lt;/p&gt;

&lt;p&gt;The shared workflow body defines &lt;strong&gt;how the issue should be executed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The runtime wrapper defines &lt;strong&gt;how the orchestrator should launch the agent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Those are not the same concern.&lt;/p&gt;

&lt;p&gt;Trying to force them into one file had worked only because I was not yet testing both runtimes seriously enough.&lt;/p&gt;
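&lt;p&gt;A render step for that split could look roughly like the sketch below. It is not the repo's actual script: &lt;code&gt;RuntimeWrapper&lt;/code&gt; and &lt;code&gt;renderWorkflow&lt;/code&gt; are hypothetical names, and the OpenAI launcher command is a placeholder rather than a real value. The point is only that each wrapper owns its own front matter while the body is shared verbatim.&lt;/p&gt;

```typescript
// Hypothetical wrapper description; real keys depend on each orchestrator.
interface RuntimeWrapper {
  fileName: string;
  frontMatter: string[]; // YAML lines owned by this runtime only
}

// Join runtime-specific front matter with the shared instruction body.
function renderWorkflow(wrapper: RuntimeWrapper, sharedBody: string): string {
  return ["---", ...wrapper.frontMatter, "---", "", sharedBody.trim(), ""].join("\n");
}

const sharedBody = "You are working on Linear ticket `{{ issue.identifier }}`...";

const wrappers: RuntimeWrapper[] = [
  {
    fileName: "WORKFLOW.openai.md",
    // Placeholder: substitute the Codex app-server command the orchestrator expects.
    frontMatter: ["codex:", "  command: CODEX_APP_SERVER_COMMAND"],
  },
  {
    fileName: "WORKFLOW.claude.md",
    frontMatter: ["codex:", "  command: claude"],
  },
];

// One rendered document per runtime; writing them to disk is left out here.
const rendered = wrappers.map((wrapper) => ({
  fileName: wrapper.fileName,
  content: renderWorkflow(wrapper, sharedBody),
}));
```

&lt;p&gt;Generating both files from one body keeps the behavioral instructions from drifting apart while the launcher blocks stay free to differ.&lt;/p&gt;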

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkhbct1dssewggpzqhblt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkhbct1dssewggpzqhblt.png" alt="Architecture diagram" width="800" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once I made that split, the system became easier to reason about immediately.&lt;/p&gt;

&lt;p&gt;The repository now says, in effect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the execution rules are shared&lt;/li&gt;
&lt;li&gt;the runtime launcher is not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much healthier contract.&lt;/p&gt;

&lt;h2&gt;The second failure: the GraphQL ghost was still waiting&lt;/h2&gt;

&lt;p&gt;Of course, fixing the runtime mismatch didn't magically make the full run clean.&lt;/p&gt;

&lt;p&gt;After the split, Symphony could start correctly.&lt;br&gt;
It could pick up &lt;code&gt;MAS-27&lt;/code&gt;.&lt;br&gt;
It could begin doing real work.&lt;/p&gt;

&lt;p&gt;And then it hit another boundary failure: stale assumptions in the Linear GraphQL layer.&lt;/p&gt;

&lt;p&gt;The logs told the story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Field "identifier" is not defined by type "IssueFilter".&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Cannot query field "blockedByIssues" on type "Issue".&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was a different class of bug entirely.&lt;/p&gt;

&lt;p&gt;Now the runtime was correct, but the orchestrator's GraphQL expectations were out of sync with the actual Linear schema. The result was a partial execution: the issue moved into &lt;code&gt;In Progress&lt;/code&gt;, but the run did not complete cleanly, and no workpad comment was created on &lt;code&gt;MAS-27&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That was important to verify because from the outside, a half-working orchestration system can look deceptively healthy.&lt;/p&gt;

&lt;p&gt;The dashboard shows activity.&lt;br&gt;
The issue state changes.&lt;br&gt;
The agent session starts.&lt;/p&gt;

&lt;p&gt;But the operational contract is still broken if the workpad never appears and the run dies on a schema edge.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb11wdo0obfy6x2ofmtfh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb11wdo0obfy6x2ofmtfh.png" alt="This is a schema/client mismatch, not a workflow-contract bug." width="800" height="182"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;What actually worked&lt;/h2&gt;

&lt;p&gt;This is the part I care about most, because I don't want the story to sound more broken than it really was.&lt;/p&gt;

&lt;p&gt;A lot &lt;strong&gt;did&lt;/strong&gt; work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the shared execution-owner model with &lt;code&gt;exec:agent&lt;/code&gt; and &lt;code&gt;exec:symphony&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;the planning contract based on shared docs/templates&lt;/li&gt;
&lt;li&gt;the Linear issue generation for &lt;code&gt;MAS-26&lt;/code&gt; and &lt;code&gt;MAS-27&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;blocker gating via &lt;code&gt;blockedBy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Codex App automation dispatching only &lt;code&gt;exec:agent&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;in-place reuse of a single &lt;code&gt;## Workpad&lt;/code&gt; comment on Linear&lt;/li&gt;
&lt;li&gt;Symphony respecting the &lt;code&gt;exec:symphony&lt;/code&gt; lane&lt;/li&gt;
&lt;li&gt;the runtime split into &lt;code&gt;WORKFLOW.openai.md&lt;/code&gt; and &lt;code&gt;WORKFLOW.claude.md&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a small list.&lt;/p&gt;

&lt;p&gt;In fact, the test did exactly what a good smoke test should do:&lt;/p&gt;

&lt;p&gt;it validated the architectural core, then exposed the next real boundary failure.&lt;/p&gt;

&lt;p&gt;First the runtime wrapper problem.&lt;br&gt;
Then the stale GraphQL problem.&lt;/p&gt;

&lt;p&gt;That is progress.&lt;/p&gt;

&lt;h2&gt;The actual lesson&lt;/h2&gt;

&lt;p&gt;If you want cross-agent orchestration, the wrong abstraction is "one workflow file for everything."&lt;/p&gt;

&lt;p&gt;What you actually need is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one shared execution contract&lt;/li&gt;
&lt;li&gt;one shared behavioral prompt body, if the execution model is the same&lt;/li&gt;
&lt;li&gt;separate runtime launch wrappers for each orchestrator/runtime pair&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reason is simple.&lt;/p&gt;

&lt;p&gt;Codex and Claude are not just different models. In this kind of system, they are different &lt;strong&gt;runtime surfaces&lt;/strong&gt; with different launcher assumptions, different session protocols, and different orchestration expectations.&lt;/p&gt;

&lt;p&gt;The workflow itself should stay stable across those surfaces.&lt;/p&gt;

&lt;p&gt;The launcher should not.&lt;/p&gt;

&lt;p&gt;That is the architectural correction this test forced me to make.&lt;/p&gt;

&lt;p&gt;And in hindsight, it's exactly the kind of correction you only get from trying to run the system for real.&lt;/p&gt;

&lt;p&gt;The happy path rarely teaches you where your abstractions are lying.&lt;br&gt;
Live orchestration does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vs0n17o2molunwblox9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vs0n17o2molunwblox9.png" alt="Same workflow rules, different launchers." width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Where this goes next&lt;/h2&gt;

&lt;p&gt;The runtime split is in.&lt;br&gt;
The generic &lt;code&gt;exec:agent&lt;/code&gt; / &lt;code&gt;exec:symphony&lt;/code&gt; workflow is in.&lt;br&gt;
The Codex App leg is validated.&lt;/p&gt;

&lt;p&gt;The next cleanup is clear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;patch the stale Linear GraphQL assumptions in the Symphony runtime&lt;/li&gt;
&lt;li&gt;rerun the &lt;code&gt;MAS-27&lt;/code&gt; path cleanly&lt;/li&gt;
&lt;li&gt;verify the full end-to-end chain with both executor lanes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That will be the point where the system stops being "an interesting orchestration prototype" and starts becoming something I would trust against a real backlog.&lt;/p&gt;

&lt;p&gt;And honestly, that is the part I find most interesting now.&lt;/p&gt;

&lt;p&gt;Porting the orchestrator was fun.&lt;/p&gt;

&lt;p&gt;Finding the seam between shared execution contracts and runtime-specific launch contracts was the part that actually made it usable.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>linear</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Ported OpenAI's Symphony to Claude Code: A Complete Build Diary</title>
      <dc:creator>Alessio Masucci</dc:creator>
      <pubDate>Fri, 13 Mar 2026 15:20:17 +0000</pubDate>
      <link>https://dev.to/mscalessio/i-ported-openais-symphony-to-claude-code-a-complete-build-diary-4h61</link>
      <guid>https://dev.to/mscalessio/i-ported-openais-symphony-to-claude-code-a-complete-build-diary-4h61</guid>
      <description>&lt;p&gt;A few weeks ago, OpenAI open-sourced &lt;a href="https://github.com/openai/symphony" rel="noopener noreferrer"&gt;Symphony&lt;/a&gt; — an Elixir-based orchestrator that polls a project board, claims tickets, spins up Codex agents in isolated workspaces, manages multi-turn sessions, and handles retries. It's the missing piece between "AI can write code" and "AI can work through a backlog."&lt;/p&gt;

&lt;p&gt;I watched the demo and immediately thought: &lt;strong&gt;I want this, but for Claude Code.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the build diary of that port. Every phase, every bug, every design decision — and the one architectural flaw that made me question how AI agents should handle failure.&lt;/p&gt;




&lt;h2&gt;Credit first: this isn't my idea&lt;/h2&gt;

&lt;p&gt;Let me be clear upfront: the architecture is OpenAI's. &lt;a href="https://github.com/openai/symphony" rel="noopener noreferrer"&gt;Symphony&lt;/a&gt; is their project, their design, their Elixir implementation. The polling loop, workspace isolation, multi-turn agent sessions, exponential backoff — all of that comes from their work.&lt;/p&gt;

&lt;p&gt;What I built is a port. Same concept, different stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;OpenAI Symphony&lt;/th&gt;
&lt;th&gt;My Port&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Language&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Elixir&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Codex&lt;/td&gt;
&lt;td&gt;Claude Code CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tracker&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Linear&lt;/td&gt;
&lt;td&gt;Linear&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Runtime&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;BEAM/OTP&lt;/td&gt;
&lt;td&gt;Node.js&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I ported it because I use Claude Code, not Codex. And I wanted to understand every layer deeply enough to extend it for my own workflow.&lt;/p&gt;

&lt;p&gt;I had a side project — Rentello — with a Linear backlog growing faster than I could work through it. Well-scoped tickets, clear acceptance criteria. A capable AI agent could handle most of them. I just needed the orchestration layer.&lt;/p&gt;

&lt;p&gt;So I started building.&lt;/p&gt;




&lt;h2&gt;Phase 1: One file to rule them all&lt;/h2&gt;

&lt;p&gt;The first design decision was putting everything in a single &lt;code&gt;WORKFLOW.md&lt;/code&gt; file per project.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;tracker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;linear&lt;/span&gt;
  &lt;span class="na"&gt;project_slug&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;symphony-claude-12ab34cd56fg78"&lt;/span&gt;
  &lt;span class="na"&gt;active_states&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Todo&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;In Progress&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Merging&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Rework&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;workspace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;root&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~/symphony-workspaces&lt;/span&gt;
&lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;after_create&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;git clone --depth 1 https://github.com/mscalessio/symphony-claude .&lt;/span&gt;
&lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;max_concurrent_agents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;max_turns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;span class="na"&gt;codex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude&lt;/span&gt;
  &lt;span class="na"&gt;approval_policy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bypassPermissions&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="s"&gt;You are working on Linear ticket `{{ issue.identifier }}`...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;YAML front-matter defines the config. Everything below the separator is a Liquid template that becomes the agent's prompt. One file controls everything: which project to watch, which states trigger work, how to set up workspaces, and the full behavioral instructions.&lt;/p&gt;
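&lt;p&gt;To make the two halves concrete, here is a small sketch of splitting such a file and filling in the prompt. It is a stand-in, not the port's code: &lt;code&gt;splitWorkflow&lt;/code&gt; and &lt;code&gt;renderPrompt&lt;/code&gt; are hypothetical names, the fences are assumed to be exactly &lt;code&gt;---&lt;/code&gt; on their own line, and the placeholder substitution covers only plain &lt;code&gt;{{ dotted.path }}&lt;/code&gt; lookups rather than full Liquid.&lt;/p&gt;

```typescript
// Split a WORKFLOW.md-style document into raw YAML front matter and prompt body.
function splitWorkflow(source: string): { frontMatter: string; body: string } {
  const lines = source.split(/\r?\n/);
  if (lines[0] !== "---") {
    throw new Error("workflow file must start with a --- fence");
  }
  const end = lines.indexOf("---", 1);
  if (end === -1) {
    throw new Error("front matter is never closed");
  }
  return {
    frontMatter: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n").trim(),
  };
}

// Replace {{ dotted.path }} placeholders; a stand-in for a real Liquid engine.
function renderPrompt(template: string, context: { [key: string]: unknown }): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path: string) => {
    let value: unknown = context;
    for (const key of path.split(".")) {
      if (typeof value !== "object" || value === null) {
        throw new Error(`unknown placeholder: ${path}`);
      }
      value = (value as { [key: string]: unknown })[key];
    }
    if (value === undefined) {
      throw new Error(`unknown placeholder: ${path}`);
    }
    return String(value);
  });
}
```

&lt;p&gt;The front matter string would then go to a YAML parser and schema validation, while the body is rendered once per ticket.&lt;/p&gt;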

&lt;p&gt;I used Zod for schema validation — type safety at both compile time and runtime. If the YAML is malformed, the system says exactly what's wrong before anything starts. The config is hot-reloadable: a file watcher detects changes to &lt;code&gt;WORKFLOW.md&lt;/code&gt;, re-parses, re-validates, and swaps the live config without restarting.&lt;/p&gt;




&lt;h2&gt;Phase 2: Talking to Linear — GraphQL lessons&lt;/h2&gt;

&lt;p&gt;Connecting to Linear's GraphQL API seemed straightforward. I needed three queries:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Candidate issues&lt;/strong&gt; — all tickets matching the project and active states&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State refresh&lt;/strong&gt; — current state for running workers' tickets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terminal cleanup&lt;/strong&gt; — issues in terminal states that need workspace removal&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The candidate query worked immediately. Then I tried to filter blocker relations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight graphql"&gt;&lt;code&gt;&lt;span class="err"&gt;inverseRelations(&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;"&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="err"&gt;")&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;HTTP 400.&lt;/strong&gt; &lt;code&gt;GRAPHQL_VALIDATION_FAILED&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Linear's &lt;code&gt;inverseRelations&lt;/code&gt; field doesn't accept a &lt;code&gt;type&lt;/code&gt; argument. Unlike GitHub's API, there's no server-side relation filtering. Every poll tick was failing silently: no candidates fetched, no agents dispatched. The system looked idle, but the poll was actually erroring out every 5 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Fetch all relations, filter client-side in the normalization layer. Simple, but it cost me an afternoon of staring at "0 candidates" wondering why.&lt;/p&gt;
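&lt;p&gt;A minimal sketch of that client-side filter, assuming each relation node carries a &lt;code&gt;type&lt;/code&gt; string alongside the &lt;code&gt;issue&lt;/code&gt; fields from the query above. The &lt;code&gt;RelationNode&lt;/code&gt; shape, the &lt;code&gt;unresolvedBlockers&lt;/code&gt; helper, and the terminal state names are illustrative, not the project's actual code:&lt;/p&gt;

```typescript
// Hypothetical shape of one inverse-relation node after fetching all relations.
interface RelationNode {
  type: string; // assumed field, e.g. "blocks" or "related"
  issue: { id: string; identifier: string; state: { name: string } };
}

// Assumed terminal states; a real board may name them differently.
const TERMINAL_STATES = ["Done", "Canceled"];

// Keep only "blocks" relations whose blocking issue is not finished yet.
function unresolvedBlockers(nodes: RelationNode[]): string[] {
  return nodes
    .filter((n) => n.type === "blocks")
    .filter((n) => !TERMINAL_STATES.includes(n.issue.state.name))
    .map((n) => n.issue.identifier);
}
```

&lt;p&gt;A ticket is then eligible for dispatch only when this list comes back empty.&lt;/p&gt;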

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; GraphQL APIs are snowflakes. What works on one platform won't work on another. Read the schema, don't assume.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 3: Building the brain — the orchestration engine
&lt;/h2&gt;

&lt;p&gt;The orchestrator is a state machine running on a timer. Every tick runs three phases:&lt;/p&gt;

&lt;h3&gt;
  
  
  Reconciliation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Check every running agent for stalls (no activity in 5+ minutes)&lt;/li&gt;
&lt;li&gt;Refresh ticket states from Linear — if a human moved a ticket to "Done" while an agent was working, kill the agent&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Validation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Sanity-check the config (project slug not empty, API key set, etc.)&lt;/li&gt;
&lt;/ul&gt;
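&lt;p&gt;The validation phase can be sketched in plain TypeScript as below. The field names in &lt;code&gt;OrchestratorConfig&lt;/code&gt; and the &lt;code&gt;validateConfig&lt;/code&gt; helper are assumptions made for illustration; the real project runs its config through a Zod validator instead.&lt;/p&gt;

```typescript
// Hypothetical config shape; field names are assumptions for illustration.
interface OrchestratorConfig {
  projectSlug: string;
  linearApiKey: string;
  pollIntervalMs: number;
}

// Return a list of human-readable problems; an empty list means the tick may proceed.
function validateConfig(config: OrchestratorConfig): string[] {
  const problems: string[] = [];
  if (config.projectSlug.trim() === "") problems.push("project slug is empty");
  if (config.linearApiKey.trim() === "") problems.push("Linear API key is not set");
  if (!(config.pollIntervalMs > 0)) problems.push("poll interval must be positive");
  return problems;
}
```

&lt;p&gt;Returning every problem at once, rather than throwing on the first, makes a misconfigured tick easier to diagnose from a single log line.&lt;/p&gt;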

&lt;h3&gt;
  
  
  Dispatch
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Find eligible tickets: not already running, in an active state, with no unresolved blockers, and only while a concurrency slot is free&lt;/li&gt;
&lt;li&gt;Sort by priority, then creation date&lt;/li&gt;
&lt;li&gt;Spawn workers&lt;/li&gt;
&lt;/ul&gt;
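&lt;p&gt;The dispatch rules above can be sketched as a single pure function. The &lt;code&gt;Candidate&lt;/code&gt; shape and &lt;code&gt;selectForDispatch&lt;/code&gt; are hypothetical, and the sketch assumes a lower priority number means a more urgent ticket:&lt;/p&gt;

```typescript
// Hypothetical candidate shape; the real normalized issue type is not shown in this post.
interface Candidate {
  id: string;
  state: string;
  priority: number; // assumption: lower number means more urgent
  createdAt: string; // ISO 8601 timestamp
  blockedBy: string[]; // identifiers of unresolved blockers
}

function selectForDispatch(
  candidates: Candidate[],
  runningIds: string[],
  activeStates: string[],
  freeSlots: number,
): Candidate[] {
  const eligible = candidates.filter((c) => {
    if (runningIds.includes(c.id)) return false; // already running
    if (!activeStates.includes(c.state)) return false; // not in an active state
    return c.blockedBy.length === 0; // no unresolved blockers
  });
  // Sort by priority first, then oldest first.
  eligible.sort((a, b) =>
    a.priority !== b.priority
      ? a.priority - b.priority
      : Date.parse(a.createdAt) - Date.parse(b.createdAt),
  );
  // Hand out only as many tickets as there are free concurrency slots.
  return eligible.slice(0, Math.max(0, freeSlots));
}
```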

&lt;p&gt;Each agent is tracked as a &lt;code&gt;RunningEntry&lt;/code&gt; with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;code&gt;AbortController&lt;/code&gt; for clean cancellation&lt;/li&gt;
&lt;li&gt;Token counters (input + output)&lt;/li&gt;
&lt;li&gt;A stall timer (last activity timestamp)&lt;/li&gt;
&lt;li&gt;The current turn number and session ID&lt;/li&gt;
&lt;li&gt;The most recent stream event (for TUI display)&lt;/li&gt;
&lt;/ul&gt;
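&lt;p&gt;As a rough sketch, the entry and the stall check from the reconciliation phase might look like this. The field names are guesses based on the list above, and &lt;code&gt;abortStalled&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```typescript
// Sketch of the tracked state; field names are guesses based on the list above.
interface RunningEntry {
  issueId: string;
  abort: AbortController; // clean cancellation
  inputTokens: number;
  outputTokens: number;
  lastActivityAt: number; // epoch milliseconds, doubles as the stall timer
  turn: number;
  sessionId: string | null;
  lastEvent: string | null; // most recent stream event, for the TUI
}

const STALL_MS = 5 * 60 * 1000; // "no activity in 5+ minutes"

// Abort every stalled entry and return the issue IDs that were cancelled.
function abortStalled(entries: RunningEntry[], now: number): string[] {
  const stalled = entries.filter((e) => now - e.lastActivityAt >= STALL_MS);
  for (const e of stalled) e.abort.abort();
  return stalled.map((e) => e.issueId);
}
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; in as an argument keeps the check deterministic, which makes it straightforward to unit test.&lt;/p&gt;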

&lt;p&gt;The retry system was the trickiest part. I needed two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Normal retries&lt;/strong&gt; — the agent completed a turn but the ticket is still in an active state. Quick 1-second delay, re-dispatch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abnormal retries&lt;/strong&gt; — crashes, timeouts, API errors. Exponential backoff: 10s, 20s, 40s, up to a configurable maximum (default 5 minutes).&lt;/li&gt;
&lt;/ul&gt;

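&lt;p&gt;The exponential backoff prevents thundering-herd behavior when Linear has a transient outage: failed agents wait progressively longer instead of hammering the API every second.&lt;/p&gt;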
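&lt;p&gt;The two retry modes reduce to one small delay function. This is a sketch under the numbers quoted above (1 second, a 10-second base that doubles, a 5-minute default cap); the constant and function names are mine, and the 1-based &lt;code&gt;attempt&lt;/code&gt; counter is an assumption:&lt;/p&gt;

```typescript
const NORMAL_RETRY_MS = 1_000; // quick re-dispatch after a clean turn
const BACKOFF_BASE_MS = 10_000; // first abnormal retry waits 10s
const DEFAULT_MAX_BACKOFF_MS = 300_000; // 5 minutes

// attempt is 1-based: 1 gives 10s, 2 gives 20s, 3 gives 40s, capped at maxMs.
function retryDelayMs(
  kind: "normal" | "abnormal",
  attempt: number,
  maxMs: number = DEFAULT_MAX_BACKOFF_MS,
): number {
  if (kind === "normal") return NORMAL_RETRY_MS;
  const exponent = Math.max(0, attempt - 1);
  return Math.min(BACKOFF_BASE_MS * 2 ** exponent, maxMs);
}
```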




&lt;h2&gt;
  
  
  Phase 4: The subprocess bet
&lt;/h2&gt;

&lt;p&gt;The biggest architectural decision was how to run Claude. Two options:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option A: Claude SDK.&lt;/strong&gt; Make API calls directly. Manage conversation state, tool execution, and permission enforcement in my code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option B: Claude CLI.&lt;/strong&gt; Spawn &lt;code&gt;claude -p --output-format stream-json&lt;/code&gt; as a child process. Parse the JSON event stream from stdout.&lt;/p&gt;

&lt;p&gt;I chose the CLI. Here's the reasoning:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;SDK&lt;/th&gt;
&lt;th&gt;CLI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool execution&lt;/td&gt;
&lt;td&gt;I implement it&lt;/td&gt;
&lt;td&gt;CLI handles it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session state&lt;/td&gt;
&lt;td&gt;I manage it&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--resume&lt;/code&gt; handles it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permissions&lt;/td&gt;
&lt;td&gt;I enforce them&lt;/td&gt;
&lt;td&gt;CLI enforces them&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;I build it&lt;/td&gt;
&lt;td&gt;Stream events give it to me&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coupling&lt;/td&gt;
&lt;td&gt;Tight to SDK version&lt;/td&gt;
&lt;td&gt;Loose to CLI interface&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The CLI does the hard work. The orchestrator just observes and reacts. The multi-turn loop is elegant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;spawnClaudeTurn&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;workspacePath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// resume if we have one&lt;/span&gt;
    &lt;span class="nx"&gt;mcpConfigPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Check if ticket is still active&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tracker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchIssueStatesByIds&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;activeStates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;turnNumber&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;maxTurns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
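&lt;p&gt;Parsing the event stream is the other half of each turn. The sketch below assumes the CLI writes one JSON object per line to stdout and that stdout chunks can split a line in the middle. &lt;code&gt;StreamEvent&lt;/code&gt; and &lt;code&gt;parseChunk&lt;/code&gt; are hypothetical names, and only a &lt;code&gt;type&lt;/code&gt; field is assumed on each event:&lt;/p&gt;

```typescript
// One parsed event; only "type" is assumed here, everything else stays opaque.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Feed raw stdout chunks in; get complete events out plus the unfinished tail.
function parseChunk(
  buffered: string,
  chunk: string,
): { events: StreamEvent[]; rest: string } {
  const lines = (buffered + chunk).split("\n");
  const rest = lines.pop() ?? ""; // last piece may be a partial line
  const events: StreamEvent[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    try {
      events.push(JSON.parse(line) as StreamEvent);
    } catch {
      // Not JSON (for example a stray warning); skip it rather than crash the worker.
    }
  }
  return { events, rest };
}
```

&lt;p&gt;The caller keeps &lt;code&gt;rest&lt;/code&gt; between chunks and passes it back in as &lt;code&gt;buffered&lt;/code&gt; on the next call.&lt;/p&gt;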



&lt;p&gt;The downside — dependency on CLI argument parsing — would bite me later. Hard.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 5: The MCP server trick
&lt;/h2&gt;

&lt;p&gt;Each agent needs to interact with Linear: read ticket details, post workpad comments, update states. But I didn't want API keys in the prompt.&lt;/p&gt;

&lt;p&gt;I built a tiny MCP (Model Context Protocol) server — a standalone Node.js process exposing a single &lt;code&gt;linear_graphql&lt;/code&gt; tool. For each agent session, the orchestrator writes a per-workspace MCP config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"symphony-linear"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/linear-graphql-server.js"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"LINEAR_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lin_api_..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"LINEAR_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://api.linear.app/graphql"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent calls &lt;code&gt;linear_graphql&lt;/code&gt; like any other tool. It doesn't know about authentication. It doesn't know about the orchestrator. Complete isolation.&lt;/p&gt;

&lt;p&gt;This turned out to be the cleanest design decision in the project. Each workspace is a self-contained unit: its own directory, its own git clone, its own MCP config, its own credentials.&lt;/p&gt;
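&lt;p&gt;Generating that file per workspace takes only a few lines. In this sketch, &lt;code&gt;buildMcpConfig&lt;/code&gt; and &lt;code&gt;renderMcpConfig&lt;/code&gt; are hypothetical helpers, and the arguments stand in for values the orchestrator already holds:&lt;/p&gt;

```typescript
// Build the per-workspace MCP config object shown above.
// serverPath, apiKey and endpoint come from the orchestrator, never from the prompt.
function buildMcpConfig(serverPath: string, apiKey: string, endpoint: string) {
  return {
    mcpServers: {
      "symphony-linear": {
        command: "node",
        args: [serverPath],
        env: { LINEAR_API_KEY: apiKey, LINEAR_ENDPOINT: endpoint },
      },
    },
  };
}

// Serialized form that would be written into the workspace directory.
function renderMcpConfig(serverPath: string, apiKey: string, endpoint: string): string {
  return JSON.stringify(buildMcpConfig(serverPath, apiKey, endpoint), null, 2);
}
```

&lt;p&gt;Because the key lives only in this file and in the MCP server's environment, it never appears in the rendered prompt or in the agent's transcript.&lt;/p&gt;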




&lt;h2&gt;
  
  
  Phase 6: The terminal dashboard
&lt;/h2&gt;

&lt;p&gt;I wanted the same visual experience as the original Elixir Symphony — a live terminal dashboard showing running agents, token throughput, and retry queues.&lt;/p&gt;

&lt;p&gt;I built it with raw ANSI escape codes. No blessed, no ink, no external library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; SYMPHONY  3 agents | 1,247 tok/s | 0:14:32 | 847,291 tokens
 ─────────────────────────────────────────────────────────
 ID      STATE        PID    AGE    TURN  TOKENS  EVENT
 MAS-5   In Progress  4821   2:31   1     12,847  Writing tests...
 MAS-8   In Progress  4935   1:07   1      5,221  Reading file...
 MAS-12  Todo         5012   0:03   1        423  Planning...

 BACKOFF QUEUE
 MAS-3   retry in 18s (attempt 2) — Linear API timeout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternate screen buffer (like vim), hidden cursor, 1-second refresh, SIGWINCH resize handling. When the TUI is active, Pino logs redirect to a file so they don't corrupt the display.&lt;/p&gt;

&lt;p&gt;The renderer is a pure function — state in, ANSI string out. Easy to test: 24 tests covering column formatting, truncation, empty states, and terminal width edge cases.&lt;/p&gt;
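&lt;p&gt;To illustrate the "state in, ANSI string out" idea, here is a much smaller renderer than the real one. The &lt;code&gt;AgentRow&lt;/code&gt; model, the column widths, and both function names are made up for this sketch:&lt;/p&gt;

```typescript
// Hypothetical row model; the real TUI state is richer than this.
interface AgentRow {
  id: string;
  state: string;
  turn: number;
  tokens: number;
  event: string;
}

// Pad or truncate text to exactly `width` columns.
function fit(text: string, width: number): string {
  if (!(width > 0)) return "";
  if (text.length > width) return text.slice(0, width - 1) + "…";
  return text.padEnd(width);
}

const BOLD = "\u001b[1m";
const RESET = "\u001b[0m";

// State in, ANSI string out: no I/O, so it can be unit tested directly.
function renderRows(rows: AgentRow[]): string {
  const header =
    BOLD + [fit("ID", 8), fit("STATE", 13), fit("TURN", 6), fit("TOKENS", 8), "EVENT"].join("") + RESET;
  if (rows.length === 0) return header + "\n (no running agents)";
  const body = rows.map((r) =>
    [
      fit(r.id, 8),
      fit(r.state, 13),
      fit(String(r.turn), 6),
      fit(String(r.tokens), 8),
      fit(r.event, 20),
    ].join(""),
  );
  return [header, ...body].join("\n");
}
```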




&lt;h2&gt;
  
  
  Phase 7: The cascade — three bugs hiding behind each other
&lt;/h2&gt;

&lt;p&gt;Everything was built. 144 unit tests passing. TypeScript compiling clean. Time for the real test: a live ticket against a real Linear board.&lt;/p&gt;

&lt;p&gt;I started the orchestrator. It claimed MAS-5. And crashed in 200 milliseconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 1: ENAMETOOLONG
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: Invalid MCP configuration:
Failed to read file: Error: ENAMETOOLONG: name too long, open
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system was trying to &lt;code&gt;open()&lt;/code&gt; a 3,000-character string as a filename. But what string?&lt;/p&gt;

&lt;p&gt;I traced the shell command being constructed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nt"&gt;--output-format&lt;/span&gt; stream-json &lt;span class="nt"&gt;--mcp-config&lt;/span&gt; /path/config.json &lt;span class="s2"&gt;"You are working on..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last argument — the rendered prompt template — was a positional arg. With &lt;code&gt;--mcp-config&lt;/code&gt; present, Claude CLI changes how positional arguments are parsed. The prompt was being treated as a file path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Always pipe the prompt through stdin. Never pass it as a positional argument.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 2: The silent flag requirement
&lt;/h3&gt;

&lt;p&gt;Fixed Bug 1. Rebuilt. Ran again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: When using --print, --output-format=stream-json requires --verbose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A new CLI validation rule. Not in any changelog I could find. Just a new gate that appeared between "last time this worked" and now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Add &lt;code&gt;--verbose&lt;/code&gt; to the args. One line.&lt;/p&gt;
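&lt;p&gt;Taken together, the first two fixes can be sketched like this. Only flags already mentioned in this post are used; &lt;code&gt;TurnOptions&lt;/code&gt;, &lt;code&gt;buildClaudeArgs&lt;/code&gt;, &lt;code&gt;PromptSink&lt;/code&gt;, and &lt;code&gt;sendPrompt&lt;/code&gt; are illustrative names, and passing the session ID as the value of &lt;code&gt;--resume&lt;/code&gt; is an assumption:&lt;/p&gt;

```typescript
// Flags taken from the post; the option names on this object are illustrative.
interface TurnOptions {
  mcpConfigPath: string;
  sessionId?: string;
}

// The prompt is deliberately absent: it goes through stdin, never argv.
function buildClaudeArgs(opts: TurnOptions): string[] {
  const args = [
    "-p",
    "--output-format", "stream-json",
    "--verbose", // required alongside stream-json in print mode (Bug 2)
    "--mcp-config", opts.mcpConfigPath,
  ];
  if (opts.sessionId !== undefined) args.push("--resume", opts.sessionId);
  return args;
}

// Minimal writable surface, satisfied by a child process's stdin in Node.js.
interface PromptSink {
  write(text: string): unknown;
  end(): unknown;
}

function sendPrompt(stdin: PromptSink, prompt: string): void {
  stdin.write(prompt);
  stdin.end(); // closing stdin signals that the prompt is complete
}
```

&lt;p&gt;In the orchestrator, the args would go to the child-process spawn call and &lt;code&gt;sendPrompt&lt;/code&gt; would receive the child's stdin. Keeping the argument list in one pure function also gives a natural place to pin down CLI flag changes in a test.&lt;/p&gt;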

&lt;h3&gt;
  
  
  Bug 3: The GraphQL ghost
&lt;/h3&gt;

&lt;p&gt;Fixed Bug 2. An agent spawned. It read the ticket. It planned. It wrote code. It completed its first turn successfully.&lt;/p&gt;

&lt;p&gt;Then the orchestrator tried to check if the ticket was still active:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight graphql"&gt;&lt;code&gt;&lt;span class="err"&gt;Cannot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;query&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;"&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="err"&gt;"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;"&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="err"&gt;".&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The state-refresh query was using &lt;code&gt;nodes(ids: [...])&lt;/code&gt;, the plural form of Relay's &lt;code&gt;node(id:)&lt;/code&gt; convention for fetching any object by global ID. GitHub has it. Shopify has it. Linear doesn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This query had been broken since day one.&lt;/strong&gt; It never ran because Bug 1 killed the process before any agent could complete a turn. Fix Bug 1, Bug 2 appears. Fix Bug 2, Bug 3 appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Replace with &lt;code&gt;issues(filter: { id: { in: $ids } })&lt;/code&gt;.&lt;/p&gt;
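&lt;p&gt;A sketch of the corrected state refresh, built around that filter. The variable type &lt;code&gt;[ID!]&lt;/code&gt;, the selected fields, and the helper names are assumptions; check them against Linear's schema before relying on them:&lt;/p&gt;

```typescript
// Query text based on the fix above; the variable type and selected fields are assumptions.
const ISSUE_STATES_QUERY = `
  query IssueStates($ids: [ID!]) {
    issues(filter: { id: { in: $ids } }) {
      nodes { id state { name } }
    }
  }
`;

interface IssueStatesResponse {
  data: { issues: { nodes: { id: string; state: { name: string } }[] } };
}

// Request body to POST to the GraphQL endpoint.
function buildStateRefreshBody(ids: string[]): string {
  return JSON.stringify({ query: ISSUE_STATES_QUERY, variables: { ids } });
}

// Map issue ID to current state name; IDs missing from the response are simply absent.
function statesById(response: IssueStatesResponse): { [id: string]: string } {
  const result: { [id: string]: string } = {};
  for (const node of response.data.issues.nodes) result[node.id] = node.state.name;
  return result;
}
```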

&lt;h3&gt;
  
  
  The meta-lesson
&lt;/h3&gt;

&lt;p&gt;Layered failures are the defining challenge of agent orchestration systems. In a traditional web app, a database error is a database error. In an agent system, a CLI parsing quirk masks a missing GraphQL field, which masks a design flaw in the rework flow.&lt;/p&gt;

&lt;p&gt;You have to fix bugs in strict sequential order. Each fix reveals the next failure. And unit tests — all 144 of them — caught none of these because each component worked perfectly in isolation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 8: The design flaw
&lt;/h2&gt;

&lt;p&gt;With all three code bugs fixed, the system worked end-to-end. An agent claimed a ticket, implemented it, opened a PR, and moved the card to "Human Review."&lt;/p&gt;

&lt;p&gt;But while tracing the full lifecycle during debugging, I read the Rework flow. When a reviewer requests changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Close the existing PR&lt;/li&gt;
&lt;li&gt;Delete all progress notes&lt;/li&gt;
&lt;li&gt;Create a fresh branch from main&lt;/li&gt;
&lt;li&gt;Start the entire implementation from scratch&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A reviewer says "rename this variable" and the agent throws away everything — 47 changed files, hours of compute — to re-implement from zero.&lt;/p&gt;

&lt;p&gt;This isn't a code bug. It's a design flaw in the prompt template. The Rework instruction treats every review cycle as a total reset rather than an incremental fix.&lt;/p&gt;

&lt;p&gt;The right approach: read the review comments, address each one on the existing branch, push updates to the same PR. But that requires a more sophisticated prompt — one that can parse GitHub review comments, map them to specific code locations, and make targeted changes while preserving everything else.&lt;/p&gt;

&lt;p&gt;I haven't fixed it yet. It's the next evolution. But I would never have found it if the three code bugs hadn't forced me to trace the full lifecycle manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sometimes the best thing a bug gives you is a reason to actually read your own system.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The final architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WORKFLOW.md
    |
    v
[Config Loader] ---&amp;gt; [Zod Validator] ---&amp;gt; [Resolver ($VAR, ~)]
    |
    v
[Orchestrator]
    |-- tick() every N seconds
    |-- [Reconciler] --&amp;gt; stall detection + Linear state refresh
    |-- [Dispatcher] --&amp;gt; eligibility + priority sorting
    |-- [Retry Queue] --&amp;gt; exponential backoff
    |
    v
[Worker Runner]
    |-- create workspace
    |-- write MCP config
    |-- run hooks
    |-- multi-turn loop:
    |     spawn claude CLI --&amp;gt; parse stream --&amp;gt; accumulate usage
    |     check ticket state --&amp;gt; continue or break
    |-- cleanup
    |
    v
[Claude CLI Subprocess]
    |-- stdin: prompt
    |-- stdout: stream-json events
    |-- MCP: linear_graphql tool
    |
    v
[Observers]
    |-- TUI Dashboard (ANSI)
    |-- HTTP API + Web Dashboard
    |-- Pino structured logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;~2,200 lines of source code. 144 tests. 11 test files. 10+ modular layers.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Integration tests from day one.&lt;/strong&gt; I had excellent unit test coverage and zero integration tests. The three-bug cascade would have been caught by a single test that spawned a real Claude process against a mock Linear API, provided the mock validated queries against Linear's real schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Stdin from the start.&lt;/strong&gt; Positional arguments for the CLI prompt were a ticking bomb. Stdin is universally stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Incremental rework, not reset.&lt;/strong&gt; The "throw everything away" approach to review feedback was baked into the initial prompt design. I should have designed for the review cycle from the beginning, not as an afterthought.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. GraphQL schema introspection.&lt;/strong&gt; Instead of assuming query patterns from other APIs, I should have introspected Linear's schema on startup. That would have caught the &lt;code&gt;nodes&lt;/code&gt; and &lt;code&gt;inverseRelations&lt;/code&gt; issues immediately.&lt;/p&gt;
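&lt;p&gt;A startup check along those lines might look like the sketch below. It uses the standard GraphQL &lt;code&gt;__type&lt;/code&gt; introspection query and assumes the endpoint allows introspection. &lt;code&gt;Expectation&lt;/code&gt; and &lt;code&gt;missingFromSchema&lt;/code&gt; are hypothetical names, and the same check would be repeated for the &lt;code&gt;Query&lt;/code&gt; type to catch the &lt;code&gt;nodes&lt;/code&gt; case:&lt;/p&gt;

```typescript
// Standard GraphQL introspection for one type's fields and their arguments.
const ISSUE_FIELDS_QUERY = `
  query { __type(name: "Issue") { fields { name args { name } } } }
`;

interface IntrospectedField {
  name: string;
  args: { name: string }[];
}

// One field (and optionally one argument) that the orchestrator's queries rely on.
interface Expectation {
  field: string;
  arg?: string;
}

// Return the expectations the schema does not satisfy.
function missingFromSchema(
  fields: IntrospectedField[],
  expected: Expectation[],
): Expectation[] {
  return expected.filter((e) => {
    const found = fields.find((f) => f.name === e.field);
    if (found === undefined) return true; // field does not exist at all
    if (e.arg === undefined) return false; // field exists, no argument required
    return !found.args.some((a) => a.name === e.arg); // argument must exist too
  });
}
```

&lt;p&gt;Failing fast on a non-empty result at startup turns a silent "0 candidates" afternoon into a single explicit error message.&lt;/p&gt;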




&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AI agent systems have emergent behavior.&lt;/strong&gt; Individual components work perfectly. The failures only appear at integration boundaries — between your code and the CLI, between the CLI and the API, between the API and the schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI tools are moving targets.&lt;/strong&gt; When you shell out to an external tool, you're taking a dependency on its argument parsing, its validation rules, its undocumented behaviors. Treat it like an untrusted external service, not a function call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Run the full loop early.&lt;/strong&gt; The gap between "all tests pass" and "the system works" was enormous. End-to-end execution with real APIs is where the truth lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debug sessions are design reviews.&lt;/strong&gt; I found the biggest design flaw — destructive rework — while debugging unrelated code bugs. Tracing the full lifecycle forced me to read my system as a user would experience it.&lt;/p&gt;




&lt;p&gt;The system is running. It watches my Linear board, claims tickets, writes code, and opens PRs. Most of the time it works. Sometimes it throws away a perfectly good PR because someone asked it to rename a variable.&lt;/p&gt;

&lt;p&gt;We're working on that part.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building similar systems — autonomous agents operating on real codebases with real project trackers — I'd love to hear what's working for you. The tooling is evolving fast, and half the battle is just keeping up with the tools we depend on.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>automation</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
