DEV Community

Gustavo Gondim

Posted on • Originally published at ggondim.notion.site

Multi-agents on multi-projects with multi-providers via multi-channels

TL;DR

This is a “state of multi-agentic-driven development” article. It is a personal consolidation of my last month's learnings, thoughts and vision.

During the research for this article, I also compiled a dataset of agent configuration formats and features across providers, and I tracked the emergence of common patterns and standards in the ecosystem.

In the end, I made a multi-agent development repository pattern called Monoswarm, built to work with many agent formats, many providers, and many projects at the same time. It is a monorepo with a shared .ai submodule for agent definitions and symlinks to provider-specific config files in each project. The orchestration logic is still a missing piece, but it would be an OpenClaw plugin that routes events between agents and workflows without relying on LLM inference.

If my opinion offends or contradicts someone, remember that everyone is still learning about this industrial revolution every day, and there is no certainty about the outcome.

Obviously, Claude helped me write this article, but I ended up rewriting almost all of it myself, as it didn't fully capture my perspective.

Table of contents

  • Why multi-agents
  • Why multi-projects
  • Why multi-providers
  • Why multi-channels
  • Exploring deterministic agent orchestration
  • OpenClaw pitfalls
  • Lack of standards + nightly revolutions
  • Comparison of provider capabilities
  • Old solutions that work
  • The Monoswarm pattern
  • Still a missing piece: an OpenClaw orchestration plugin
  • Non-addressed theme: Where is multi-instance?

Why multi-agents

The default mental model for AI coding assistants is a single agent answering questions in a chat window. This works for isolated tasks, like explaining a function, generating a snippet, or fixing a bug. But as soon as you try to build a sustained development workflow, a single agent becomes a bottleneck for two distinct reasons:

  • Role separation. Different tasks demand different system prompts, personas, and domain knowledge. Also, each agent benefits from its own identity - a name, a behavioral profile, constraints on how it should respond, and explicit boundaries on what it should not do.

  • Skill/tool set. AI coding agents operate through tool calls: file operations, shell commands, API requests, browser actions. A code review agent needs read-only access to a repository and the ability to post review comments. A deployment agent needs access to CI/CD pipelines and infrastructure tooling. A testing agent needs to run test suites and parse their output. Granting all tools to a single agent increases the attack surface and the likelihood of unintended side effects. It also forces the LLM to navigate a larger tool catalog on every invocation, which dilutes attention and wastes context window tokens on irrelevant tool definitions.
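The role/tool split above can be made concrete as a small data structure. This is a minimal sketch under assumed names — the role names and tool identifiers are illustrative, not any provider's real schema:

```typescript
// Hypothetical per-role tool allowlists: each agent only sees
// the tools its role actually needs, shrinking both the attack
// surface and the tool catalog the LLM has to reason over.
type Tool = "read_file" | "write_file" | "shell" | "post_review" | "run_tests";

interface AgentRole {
  name: string;
  systemPrompt: string;
  allowedTools: Tool[];
}

const reviewer: AgentRole = {
  name: "reviewer",
  systemPrompt: "You review code. You never modify files.",
  allowedTools: ["read_file", "post_review"],
};

const tester: AgentRole = {
  name: "tester",
  systemPrompt: "You run test suites and report failures.",
  allowedTools: ["read_file", "run_tests"],
};

// Reject tool calls outside the role's allowlist before they reach the runtime.
function canInvoke(role: AgentRole, tool: Tool): boolean {
  return role.allowedTools.includes(tool);
}
```

The point is that the reviewer never even sees `shell` or `write_file` in its tool catalog, so neither the attack surface nor the context window pays for them.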

Why multi-projects

  • Parallel clones. Having a well-defined team of agents - a programmer, a reviewer, a tester, etc. - solves the role separation problem. But a single team can only work on one project at a time. If you manage multiple repositories, you need the ability to spawn clones of the same agent configuration across different projects simultaneously. A "reviewer" agent for project A and a "reviewer" agent for project B should share the same behavioral template but operate in completely independent sessions, with no shared context or state between them.

This is a parallel workforce problem. The agent definitions (system prompts, tool permissions, model selection) act as blueprints. Each project instantiates its own copy of the team, running in its own workspace with its own git branch, session history, and checkpointing. In platforms like OpenClaw, this maps to isolated session keys per project, though these keys are tightly tied to the channel/routing layer. In Ralph Orchestrator, each loop operates on its own workspace directory. The multiplier is straightforward:

  N agent roles × M projects = N×M concurrent sessions
  • Project-level instructions. Beyond cloning, each project also carries its own custom instructions. Project-level directives, like coding standards, architectural guidelines, and definition of done, need to be injected into every agent that works on that project, layered on top of the agent's own role-specific instructions. The orchestration system must support this two-dimensional configuration:
  agent identity (role) × project context (goals and constraints)
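This two-dimensional composition can be sketched in a few lines. A hedged sketch, not any platform's real API — the type names and layering order are my own assumptions:

```typescript
// Hypothetical two-dimensional configuration: the final instructions an
// agent receives are its role template layered with project-level context.
interface RoleTemplate {
  role: string;
  instructions: string; // role-specific persona, constraints, boundaries
}

interface ProjectContext {
  project: string;
  directives: string; // coding standards, architecture, definition of done
}

// Layer project directives on top of the role's own instructions.
function composeInstructions(role: RoleTemplate, project: ProjectContext): string {
  return [
    `# Role: ${role.role}`,
    role.instructions,
    `# Project: ${project.project}`,
    project.directives,
  ].join("\n\n");
}
```

The same reviewer template composed with project A and with project B yields two independent instruction sets, which is exactly where the N×M multiplier comes from.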

Why multi-providers

  • Model diversity and volatility. The AI model landscape shifts weekly. A model that leads benchmarks today may be surpassed tomorrow by a release from a different provider. Locking your entire agent infrastructure to a single provider means you cannot capitalize on these shifts without re-architecting.

  • Cost optimization. Provider diversity also has a direct cost dimension. Different models have vastly different pricing per token, and not every task requires the most expensive model. An agent that plans or reviews work benefits from a high-reasoning model like Claude Opus or GPT-5.2-Codex. A worker agent executing straightforward file edits or running shell commands can use a faster, cheaper model (Haiku, GPT-5-mini, or a local open-weight model using Ollama) without meaningful quality loss.

  • Feature specialization. Finally, providers differ in features beyond raw model quality. Some offer native tool use with parallel execution, others have larger context windows, others support image input or structured output with JSON schema validation. Some have better streaming performance, others have more generous rate limits. A multi-provider setup lets you match each agent to the provider whose feature set best fits its role, rather than accepting the lowest common denominator of a single vendor.

ℹ️ Projects like OpenRouter and OpenCode attempt to abstract this fragmentation by providing a unified API layer across multiple providers. They solve the interface problem: you get a single endpoint that speaks the same protocol regardless of the underlying model. But they are wrappers, not consolidators. You still manage separate API keys, separate billing accounts, separate rate limits, and separate subscription tiers across providers. The operational complexity of multi-provider doesn't disappear, it just moves one layer down.

Why multi-channels

AI coding agents today are mostly confined to IDE integrations and terminal CLIs. You open VS Code, or run Claude Code in a shell, and interact with the agent in that context. This works when you're sitting at your workstation, but it breaks as soon as you step away from the keyboard.

In daily work, development happens across a spectrum of contexts, each in a different communication channel (Slack, email, messaging apps, etc.). If your agents only listen to one channel on your machine, you miss the opportunity to capture these moments as actionable inputs. OpenClaw's architecture was built around this idea from the start, treating channels as interchangeable transport layers. The same agent, with the same identity and memory, can receive tasks from a Telegram message, a Discord command, a REST API call, or a CLI invocation.

This matters for two practical reasons:

  • First, task acquisition. Not every task starts at your desk. You might spot a bug while reviewing a PR on your phone, or get a production alert on Slack while commuting. Multi-channel support turns these moments into actionable inputs. The task enters the pipeline without waiting for you to open a terminal.

  • Second, background and remote work. Agents don't need to run on your local machine. An OpenClaw Gateway running on a home server, a VPS, or any always-on host can execute agent sessions independently of your workstation. You close your laptop, and the agents keep working. This decouples agent execution from your personal computing environment entirely. GitHub's Copilot coding agent already demonstrated this model: you assign an issue to @copilot and it works autonomously in a GitHub Actions runner. The difference with a multi-channel setup is that you retain interactive access to these remote agents through messaging platforms, turning what would be a fire-and-forget job into a supervised but location-independent workflow.

Exploring deterministic agent orchestration

Once you accept the multi-agent premise, the next question is: who decides what runs when? There are two schools:

  1. let the LLM orchestrate (non-deterministic), or
  2. let code orchestrate (deterministic).

The distinction matters because LLMs are unreliable routers. They forget steps (especially after context compaction), miscount iterations, and silently skip transitions. Relying on LLMs is relying on inference.

Work needs repeatable, auditable processes: determinism makes outcomes predictable and debuggable. Organizations already enforce this by layering procedures and state machines on top of inherently non-deterministic human behavior. AI agents require the same treatment. Leaving sequencing and routing decisions to LLM inference introduces fragile, non-repeatable behavior, just like relying on human memory alone, and that's not a good look for "AI improving human workflows".

For reliable, predictable, and inspectable multi-agent workflows, orchestration must be deterministic and implemented in code or a typed runtime, not delegated to the model.
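The difference between the two schools fits in a few lines of code. A deterministic router is just a lookup table from (current step, event) to next step — no inference involved. This is a generic sketch, not any framework's API:

```typescript
// Deterministic orchestration: routing is a pure lookup, so the same
// event sequence always produces the same agent sequence. An LLM never
// decides what runs next.
type Step = "code" | "review" | "test" | "done";
type AgentEvent =
  | "code_complete"
  | "review_approved"
  | "review_rejected"
  | "tests_passed";

// The entire workflow is data: inspectable, diffable, auditable.
const transitions: Record<string, Step> = {
  "code:code_complete": "review",
  "review:review_approved": "test",
  "review:review_rejected": "code",
  "test:tests_passed": "done",
};

function next(current: Step, event: AgentEvent): Step {
  const target = transitions[`${current}:${event}`];
  // An unknown (step, event) pair is a hard error, not a silent skip.
  if (!target) throw new Error(`No transition for ${event} in step ${current}`);
  return target;
}
```

Replaying the same event log through `next` always reproduces the same path — which is precisely the repeatability and auditability argument above.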

In that world, we have some options for deterministic orchestration:

  • Lobster is OpenClaw's built-in workflow engine. It takes the deterministic path: YAML-defined pipelines where steps run sequentially, data flows as JSON between them, and approval gates halt execution until explicitly confirmed. The LLM never decides what happens next; Lobster does. Each step can invoke any OpenClaw tool, including agent-send for inter-agent messaging and llm-task for structured LLM calls with schema validation. The result is a system where LLMs do what they're good at (generating and analyzing code) while a typed runtime handles the plumbing (sequencing, looping, conditional branching). However, Lobster was originally designed for single-agent pipelines. It lacked native loop support for sub-workflows, a gap that I unsuccessfully tried to fill with a Lobster pull request.

  • Ralph Orchestrator takes a different approach. It implements the "Ralph Wiggum technique" (autonomous agent loops with hard context resets between iterations), but it augments Ralph with a "hat-based orchestration framework", which means agents emit structured events (e.g., [event:code_complete], [event:review_rejected]) that trigger transitions to other listening agents. The routing logic is still non-deterministic, as it relies on the LLM to emit the right event at the right time, but at least the decision of "what happens next" is externalized from the agent's system prompt and implemented in code. This is a step in the right direction, but it still leaves a lot of room for error (what if the agent forgets to emit an event? what if it emits the wrong one? what if it emits multiple events?).

  • OpenProse is a markdown-first orchestration language. It lets you define agents, spawn parallel sessions, and merge results, all in .prose files with a declarative syntax. It is the most expressive option for multi-agent workflows and this expressiveness comes at a cost: OpenProse programs are interpreted by the LLM itself, which means flow control is ultimately non-deterministic. The LLM reads the .prose spec and simulates execution, which works until it doesn't. For workflows where predictability matters more than flexibility, OpenProse is better suited as a planning and preparation layer (define agents, gather context) that hands off to a deterministic engine like Lobster for the actual execution.
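The failure modes listed for the hat-based model (no event, a wrong event, multiple events) are exactly what a deterministic layer has to guard against when it parses agent output. A minimal sketch, assuming only the `[event:...]` marker syntax quoted above:

```typescript
// Extract [event:name] markers from raw agent output and enforce that
// exactly one recognized event was emitted. Every failure mode becomes
// an explicit error instead of a silently broken workflow.
const KNOWN_EVENTS = new Set(["code_complete", "review_approved", "review_rejected"]);

function extractEvent(output: string): string {
  const raw = output.match(/\[event:([a-z_]+)\]/g) ?? [];
  const events = raw.map(m => m.slice(7, -1)); // strip "[event:" and "]"

  if (events.length === 0) throw new Error("Agent emitted no event");
  if (events.length > 1) throw new Error(`Agent emitted multiple events: ${events.join(", ")}`);
  if (!KNOWN_EVENTS.has(events[0])) throw new Error(`Unknown event: ${events[0]}`);
  return events[0];
}
```

This doesn't make the LLM reliable, but it turns "the agent forgot to emit an event" from a stalled pipeline into a catchable, retryable error.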

OpenClaw pitfalls

Despite being the most complete open-source agent platform available (multi-agent, multi-channel, multi-provider, with a plugin ecosystem and 150K+ GitHub stars), OpenClaw carries significant baggage that makes adoption non-trivial.

  • Too much context. OpenClaw's architecture revolves around concepts like "souls" (persistent agent identity files), memory compaction, personality evolution, and self-improvement loops. For developers who just want a coding agent pipeline, this is cognitive overhead. Take a look at what OpenClaw injects into the system prompt of every agent session. Context is a precious resource, and OpenClaw assumes a rigid and expensive structure at a time when we know very little about how to optimize context in development workflows. Recent papers evidence this lack of certainty.

  • Opinionated structure. OpenClaw imposes a specific organizational model: agents with isolated workspaces, skills as installable packages, tools with allowlists and permission scoping, a Gateway daemon as the central runtime. This structure makes sense for OpenClaw's core use case (a personal AI assistant connected to messaging platforms), but it becomes friction when you try to use it as pure orchestration infrastructure. You can't easily bring your own agent definition format, your own project layout, or your own tool integration pattern. Everything must conform to OpenClaw's conventions in a time when there are no established conventions in the first place.

But it is the only viable option.

IMHO, this is the uncomfortable reality. As of early 2026, no other open-source project combines multi-agent support, multi-channel transport, a roughly-made workflow engine (Lobster), and a plugin architecture with an active community.

Alternatives solve subsets of the problem:

  • Ralph Orchestrator seems to be good at agent orchestration, but the event/hat-based model is still fundamentally non-deterministic, as it relies on each agent to emit the right event at the right time. It also lacks a bridge between the execution layer and the communication layer (you run workflows from the command line, so you can't trigger them from an external channel).
  • Gas Town Hall is a hardcoded solution, too lyrical for deterministic orchestration, and it lacks multi-channel capability.
  • Agor is beautiful and one of my favorite agentic-development projects, but it is more a graphical interface for managing multiple agents than an orchestration engine. It lacks a deterministic workflow layer - the only way to coordinate agents is through LLM inference.

None of them offer the full stack. If you need the complete picture - agents, channels, orchestration, tools, memory - OpenClaw is the only game in town, pitfalls included.

Lack of standards + nightly revolutions

The AI coding agent space has no equivalent of HTTP, SQL, or even REST in terms of common protocols/patterns. Each provider ships its own agent protocol, its own tool calling format, its own configuration schema, and its own orchestration primitives. Claude Code uses CLAUDE.md and markdown-based project instructions. GitHub Copilot uses .github/copilot-instructions.md and AGENTS.md. Cursor uses .cursor/rules. Windsurf, Cline, Augment... each has its own convention. There is no shared specification for how an agent should discover project context, what format its instructions should follow, or how it should report results.

Some open initiatives are trying to close this gap:

  • AGENTS.md is gaining traction as a de facto standard for project-level agent instructions - a single file that any compliant agent can read to understand project conventions, regardless of provider.
  • The Model Context Protocol (MCP) standardizes how agents connect to external tools and data sources through a server-client architecture, giving agents a portable way to access databases, APIs, and file systems without provider-specific integrations.
  • Agent Skills proposes a shared format for agent capabilities that can be installed and discovered across platforms.

But these are early-stage efforts, and adoption is fragmented. MCP has the most momentum, backed by Anthropic and adopted by multiple editors and platforms. AGENTS.md is simple enough to gain organic adoption but lacks a formal spec. Agent Skills is still finding its audience.

Meanwhile, the ground shifts constantly. A new model release, a new agent framework, a new orchestration pattern, sometimes multiple in the same week. Any architecture you build today must account for the fact that the ecosystem's conventions will look different in three months.

⚠️ Betting on a single provider's format is a guaranteed migration headache. Betting on emerging standards is a calculated risk, but at least the migration path is shared with the rest of the community.

Comparison of provider capabilities

I have been tracking the agent configuration formats and features (agents, instructions, skills, prompts, tools, etc.) of every major provider in a Notion workspace.

While the dataset is still incomplete and rapidly evolving, it already makes clear that a primitive common alignment is emerging across providers.

AI Agents Dataset

Old solutions that work

While the ecosystem chases new standards and frameworks, the most reliable tools for managing multi-agent, multi-project configurations are decades-old Unix and Git primitives. This is how I'm planning to survive the next few months of rapid change:

  • Git submodules for agent workforce. Your agent definitions (system prompts, skills, tool configurations, behavioral profiles) are just files. They belong in a repository. When multiple projects need the same agent team, a git submodule lets you share a single source of truth for agent configurations across all of them. Update the submodule, pull in every project, and every agent team is in sync. No package registry, no plugin marketplace, no sync daemon. Just Git.

  • Symlinks to deduplicate and universalize provider files. Different AI coding tools expect their configuration in different paths and formats: .cursor/rules, AGENTS.md, .github/copilot-instructions.md, .clinerules, and so on. The content is often largely the same - project conventions, coding standards, architectural guidelines - but each provider demands its own file. Symlinks let you maintain a single canonical source and point every provider-specific path to it. One file to edit, N providers served. Want to know where I found this solution? In OpenClaw's own monorepo.

  • Dynamic file includes via references. Some agent instruction formats support file references or includes, loading content from other files at runtime. This enables composable instructions: a base set of project conventions shared across all agents, with role-specific overrides layered on top. Instead of duplicating instructions across agent configs, you reference a shared file and keep the delta minimal.

  • Monorepos for related projects and shared documentation. When your projects share agents, libraries, or infrastructure, a monorepo eliminates the coordination overhead of keeping multiple repositories in sync. Agent configurations, shared skills, project-specific overrides, and orchestration workflows all live in one tree. Cross-project references are just relative paths. Combined with the previous points, this creates a powerful synergy: a shared .ai submodule for the workforce, symlinks for provider configs, and project-level instructions all coexisting in a single monorepo structure.

The Monoswarm pattern

Combining the primitives from the previous section into a coherent structure yields what I call the Monoswarm pattern: a monorepo layout designed to host and manage a swarm of AI coding agents across multiple projects.

The core structure:

monoswarm/
├── .ai/
│   ├── common/                 # git submodule - shared AI definitions
│   └── ...                     # project-level - local AI overrides
├── .claude/
│   └── CLAUDE.md               → ../.ai/common/instructions/always-on.md (symlink)
├── .github/
│   ├── copilot-instructions.md → ../.ai/common/instructions/always-on.md (symlink)
│   └── custom.instruction.md   → ../.ai/instructions/project-specific.md (symlink)
├── packages/                   # project source code, split into per-project packages
├── docs/                       # project-level documentation
└── AGENTS.md                   → .ai/common/instructions/always-on.md (symlink)
  • The .ai/common directory is a git submodule: a standalone repository containing every agent definition, skill, tool, prompt, and other shared resources. It is the single source of truth for the workforce. Every project in the monorepo mounts it, and every developer (or CI runner) that clones the monorepo gets the same agent team. Updating agent behavior across all projects is a submodule bump.

  • The .ai/ directory also contains project-specific overrides: instructions or definitions that only apply to a subset of projects. This is where you put project-level and agent-specific context that needs to be injected into agents working on that project, without affecting the global workforce.

  • Symlinks bridge the gap between the shared .ai configs and each provider's expected file paths. Where each provider expects its own configuration file, you point it to the shared source or the project-specific override as needed. This way, you maintain a single canonical set of instructions and definitions, but every provider gets what it needs without duplication.

  • The docs/ directory holds cross-cutting documentation that humans and agents can reference: architecture decision records, API contracts, shared conventions. This is context that doesn't belong to any single project but is relevant to agents working across the monorepo.

💡 The next step? Building a CLI tool to automate the setup of this structure, manage submodule updates, and help build symlinks for existing provider configs.
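As a starting point for such a tool, the symlink layout can be computed with nothing but Node built-ins. The provider paths and canonical source below mirror the tree above; a real CLI would then apply the plan with `fs.symlinkSync(target, link)`:

```typescript
import * as path from "node:path";

// Canonical instruction file and the provider-specific paths that should
// point to it, mirroring the Monoswarm tree (paths are illustrative).
const CANONICAL = ".ai/common/instructions/always-on.md";
const PROVIDER_PATHS = [
  "AGENTS.md",
  ".claude/CLAUDE.md",
  ".github/copilot-instructions.md",
];

// Compute the relative target each symlink needs so the links survive
// cloning the repo to any location.
function planSymlinks(links: string[], canonical: string): Record<string, string> {
  const plan: Record<string, string> = {};
  for (const link of links) {
    plan[link] = path.relative(path.dirname(link), canonical);
  }
  return plan;
}
```

Relative targets matter: an absolute symlink would break for every other developer (and CI runner) that clones the monorepo to a different path.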

Still a missing piece: an OpenClaw orchestration plugin

The Monoswarm pattern solves the configuration and file structure problem. OpenClaw provides the agent runtime, channels, and tool ecosystem. Lobster handles deterministic workflow execution. But there is still a gap between them: there is no orchestration layer that ties agent events to workflow transitions across projects.

  • Using the power of internal hooks. OpenClaw's plugin architecture exposes lifecycle hooks: TypeScript handlers that fire on events like message_sent, tool_result_persist, session_start, and others. These hooks are the natural extension point. An orchestration plugin can intercept structured events emitted by agents (e.g., [event:code_complete], [event:review_rejected]) and route them to the appropriate next step without the LLM making that decision. The agent writes code and emits a completion event; the plugin catches it and triggers the review agent. The review agent rejects; the plugin routes back to the programmer with the feedback. The LLM never touches the routing logic.

  • Building an event hub. The plugin acts as a lightweight event bus within the OpenClaw Gateway. Agents publish events, the hub matches them against registered workflow rules, and dispatches the corresponding actions (spawning sessions, sending messages to other agents, triggering Lobster pipelines), just like the hat-based orchestration framework proposed by Ralph Orchestrator. Event schemas are defined per project, so code_complete in project A can trigger a different workflow than in project B. The hub maintains a registry of active pipelines and their current state, enabling pause, resume, and inspection.

  • Integrating DAGs for complex workflows. Simple linear pipelines (code → review → test) are a starting point, but real development workflows branch. A review might pass on the first attempt or require multiple iterations. A test failure might route back to the programmer or escalate to a human. These are directed acyclic graphs, not sequences. The orchestration plugin needs to support conditional transitions, fan-out (parallel agents working on different aspects), fan-in (merging results before proceeding), and iteration caps. All defined declaratively, all executed deterministically.

  • Inserting human-in-the-loop gateways. Not every transition should be automatic. Deploying to staging, merging to main, approving a security-sensitive change - these require human judgment. The plugin should support approval gates at any point in the DAG, exposed through OpenClaw's channel system. This is Lobster's approval mechanism elevated to the orchestration level, operating across agents and projects rather than within a single pipeline.
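To make the hub idea concrete, here is one shape it could take. This is a hypothetical sketch — OpenClaw's real plugin and hook interfaces may look nothing like this — but the point survives any API: routing rules live in data, per project, and the LLM never touches them.

```typescript
// Hypothetical event hub for an orchestration plugin: agents publish
// events, the hub matches them against per-project rules and dispatches
// the next action (spawn a session, message an agent, run a pipeline).
interface WorkflowRule {
  project: string;
  event: string; // e.g. "code_complete", "review_rejected"
  action: (payload: unknown) => void;
}

class EventHub {
  private rules: WorkflowRule[] = [];

  register(rule: WorkflowRule): void {
    this.rules.push(rule);
  }

  // Deterministic dispatch: the same (project, event) pair always
  // triggers the same rules. Returns how many rules fired.
  publish(project: string, event: string, payload?: unknown): number {
    const hits = this.rules.filter(r => r.project === project && r.event === event);
    for (const r of hits) r.action(payload);
    return hits.length;
  }
}
```

Because rules are keyed by project, `code_complete` in project A triggering a different workflow than in project B is just two rules with the same event name and different project keys.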

💡 The next step? Building a prototype of this plugin, as I proposed in my last article.

Non-addressed theme: Where is multi-instance?

This article covers multi-agents, multi-projects, multi-providers, and multi-channels. There is a fifth dimension that deserves mention but sits outside the scope of this discussion: multi-instance.

Most developers operate in at least two contexts: personal and work. Each context has its own repositories, its own API keys, its own cost budgets. These are fundamentally different trust boundaries with different data isolation needs.

OpenClaw already supports this through multiple Gateway instances running on the same host with isolated profiles. A personal Gateway and a work Gateway can coexist on the same machine - or run on separate hosts entirely - with zero shared state between them.

Although it is not the focus of this article, the proposed submodule and symlink patterns could also be used for extending/reusing agent definitions across instances. A personal Gateway could mount the same .ai/common submodule as the work Gateway, if the agent definitions are generic enough and not too sensitive to be shared.
