Jason

Posted on May 20

How Markus Builds AI Teams That Actually Ship — Not Just Chat

#agents #ai #llm #softwareengineering

The age of single-agent chat is over. The age of AI teams is here.

The 'Alice in Wonderland' Problem of LLMs

Large language models excel at conversation. Give one a question, and it returns a polished answer. Give it a code request, and it produces a working function. But ask it to build a feature, coordinate a code review, deploy to production, and report the outcome — and the illusion breaks.

This is the Alice in Wonderland problem of LLMs: strong at chatter, weak at delivery. A single AI agent can write code, but it cannot form a team. It cannot delegate a subtask to a specialist, review the result for quality, maintain context across a week-long project, or escalate a blocker to a human manager. The agent sits in a chat window, waiting for the next prompt — forever reactive, never proactive.

The industry response has been to build better tools. Agent frameworks, prompt chaining libraries, and LLM orchestrators all attempt to squeeze more capability out of a single agent. But the limit is not the agent. The limit is the organizational layer. A company of one — even a brilliant one — cannot match the throughput of a coordinated team with roles, governance, memory, and parallel execution.

Markus solves this problem by providing that organizational layer: an open-source AI workforce platform that runs complete AI teams, not just chat agents.

Problem: Single AI Agent Limitations

A single agent — whether Claude Code, Codex, ChatGPT, or any copilot — is effective at one task at a time. But as the Markus README states, single agents do not:

Coordinate. They cannot delegate subtasks to other agents or track dependencies across parallel workstreams.
Remember. Context evaporates when the session ends. Every new conversation starts from zero.
Operate proactively. They wait for your prompt, every time.
Review each other. There is no quality gate between "agent said done" and "actually done."
Scale. Running ten agents means ten independent sessions with zero shared visibility.

These limitations are not fixable by improving the underlying LLM. They are structural.

The missing ingredient is an organizational layer — roles, teams, task boards, reviews, governance, persistent memory, and a dashboard. Markus provides exactly this layer.

Markus's Solution: The Operating System for an AI Workforce

What sets Markus apart from other approaches is its three-layer design:

Layer	What It Provides
Agent Runtime	Full LLM-powered workers with built-in tools
Team Layer	Role-based collaboration with A2A protocol
Governance Layer	Progressive trust, formal delivery, audit trail

Markus works with any LLM provider: Anthropic, OpenAI, Google, DeepSeek, MiniMax, SiliconFlow, OpenRouter, and more, with automatic failover between providers.
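To make the failover idea concrete, here is a minimal sketch of trying providers in order until one succeeds. It is not Markus's actual implementation: `complete_with_failover`, `AllProvidersFailed`, and the two stand-in provider functions are hypothetical names invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a provider is a name plus a function that takes a prompt.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, completion)."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-ins for real provider clients (assumptions, not real SDK calls).
def flaky(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"
```

Calling `complete_with_failover("hi", [("primary", flaky), ("backup", healthy)])` falls through to the second provider. A production version would likely add retries, timeouts, and per-provider health tracking.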

Core Technical Architecture

Three-Layer Memory System (Tulving)

Layer	Storage	Role
Procedural	`ROLE.md` + skills	How the agent operates
Semantic	`MEMORY.md` + `memories.json`	What the agent knows
Episodic	`sessions/*.json` + SQLite	What happened

Memory persists across restarts. The Dream Cycle runs periodically to consolidate and promote recurring patterns.
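One way to picture consolidation is as a pass that promotes observations recurring in episodic memory into semantic memory. The sketch below assumes a simple occurrence threshold; `AgentMemory`, `dream_cycle`, and `min_occurrences` are hypothetical, and the real Dream Cycle may use very different criteria.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AgentMemory:
    procedural: str                                     # stands in for ROLE.md + skills
    semantic: Set[str] = field(default_factory=set)     # stands in for MEMORY.md + memories.json
    episodic: List[str] = field(default_factory=list)   # stands in for sessions/*.json + SQLite


def dream_cycle(memory: AgentMemory, min_occurrences: int = 3) -> List[str]:
    """Promote observations seen at least `min_occurrences` times (assumed rule)."""
    counts = Counter(memory.episodic)
    promoted = sorted(
        obs
        for obs, seen in counts.items()
        if seen >= min_occurrences and obs not in memory.semantic
    )
    memory.semantic.update(promoted)
    return promoted
```

Running the pass twice on the same episodes promotes nothing the second time, which keeps consolidation idempotent.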

Single-Thread Attention Model

Each agent processes one thing at a time through the Mailbox and Attention Controller system. The AgentMailbox is a priority queue that accepts 13 message types. The AttentionController manages focus using yield points, a decision engine, and triage with read-only tools.
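The mailbox idea maps naturally onto a heap-backed priority queue. The following is a rough sketch, not the AgentMailbox API: the `Mailbox` class, its methods, and the message type names are assumptions made for illustration.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class Mailbox:
    """Priority queue: lower number = more urgent, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, Any]] = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, priority: int, kind: str, payload: Any) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), kind, payload))

    def get(self) -> Optional[Tuple[str, Any]]:
        """Pop the most urgent message, or None if the mailbox is empty."""
        if not self._heap:
            return None
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload


box = Mailbox()
box.put(5, "heartbeat", None)            # hypothetical message types
box.put(1, "human_message", "stop")
box.put(5, "task_update", "t1")
```

Because the agent pops exactly one message at a time, a high-priority human message overtakes queued routine work without any two messages being handled concurrently.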

Heartbeat Mechanism

Markus agents are not purely reactive. The HeartbeatScheduler drives periodic check-ins. During each heartbeat, the agent checks active tasks, retries failed tasks, processes notifications, and saves insights.
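A single heartbeat tick could look roughly like the sketch below, which walks through the four steps in order. `AgentState` and `heartbeat` are hypothetical names, and the task statuses are assumed; a scheduler would call something like this on a timer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentState:
    tasks: Dict[str, str]                                # task id -> "active" | "failed" | "done" (assumed statuses)
    notifications: List[str] = field(default_factory=list)
    insights: List[str] = field(default_factory=list)


def heartbeat(state: AgentState) -> List[str]:
    """Run one check-in and return a log of the actions taken."""
    log: List[str] = []
    for task_id, status in sorted(state.tasks.items()):
        if status == "active":
            log.append(f"check {task_id}")
        elif status == "failed":
            state.tasks[task_id] = "active"              # re-queue the failed task
            log.append(f"retry {task_id}")
    while state.notifications:
        log.append(f"notify {state.notifications.pop(0)}")
    state.insights.append(f"{len(log)} actions this heartbeat")
    return log
```

The point of the pattern is that work resumes without a human prompt: failed tasks are picked up again on the next tick rather than waiting in a chat window.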

Team Collaboration in Practice

A2A Protocol

Agents communicate through a built-in Agent-to-Agent (A2A) protocol. This enables a manager-worker architecture where managers delegate tasks, monitor progress, and handle escalations.

Subagent Spawning

Any agent can spawn lightweight LLM subagents using `spawn_subagent` or `spawn_subagents`. These are parallel workers that handle focused subtasks and return results to the parent agent.
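The fan-out and fan-in pattern behind parallel subagents can be sketched with `asyncio.gather`. This does not reproduce the real `spawn_subagents` signature: `run_subagent` and `spawn_subagents_sketch` are hypothetical, and the LLM call is replaced by a no-op await.

```python
import asyncio
from typing import List


async def run_subagent(subtask: str) -> str:
    """Stand-in for one focused subagent; a real one would call an LLM."""
    await asyncio.sleep(0)
    return f"done: {subtask}"


async def spawn_subagents_sketch(subtasks: List[str]) -> List[str]:
    """Run all subtasks concurrently; results come back in input order."""
    return list(await asyncio.gather(*(run_subagent(t) for t in subtasks)))
```

A parent agent would run `asyncio.run(spawn_subagents_sketch(["write tests", "update docs"]))` and receive one result per subtask, in the same order the subtasks were given.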

Progressive Trust Levels

Trust Level	Condition	Permissions
`probation`	New agent or score < 40	All tasks require human approval
`standard`	Score ≥ 40, ≥ 5 deliveries	Routine tasks auto-approved
`trusted`	Score ≥ 60, ≥ 15 deliveries	Higher autonomy, can review peers
`senior`	Score ≥ 80, ≥ 25 deliveries	Highest autonomy, key reviewer role
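The thresholds in the table translate directly into a small lookup function. This is a sketch under one assumption the table does not spell out: an agent that meets a score threshold but not the matching delivery count falls back to the next lower level. The function name `trust_level` is hypothetical.

```python
def trust_level(score: float, deliveries: int) -> str:
    """Map a score and delivery count to a trust level, highest tier first."""
    if score >= 80 and deliveries >= 25:
        return "senior"
    if score >= 60 and deliveries >= 15:
        return "trusted"
    if score >= 40 and deliveries >= 5:
        return "standard"
    # New agents and anything below the standard thresholds (assumed fallback).
    return "probation"
```

For example, an agent with a score of 85 but only 20 deliveries would be `trusted` rather than `senior` under this reading, since both conditions must hold.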

Submit-Review-Merge Pipeline

Every deliverable passes through the same pipeline: `task_submit_review` → quality gates (TypeScript, ESLint, Vitest) → merge-conflict pre-check → review → accept or request a revision.
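The pipeline reads as a chain of checks that stops at the first failure. The sketch below models each stage as a callable; it assumes that a failed gate or a detected conflict sends the work back for revision, which the description implies but does not state. `submit_for_review` and its parameters are hypothetical, and real gates would invoke the TypeScript compiler, ESLint, and Vitest.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a gate is a name plus a check returning True on success.
Gate = Tuple[str, Callable[[], bool]]


def submit_for_review(
    gates: List[Gate],
    has_conflicts: Callable[[], bool],
    reviewer_accepts: Callable[[], bool],
) -> str:
    """Run quality gates, then the conflict pre-check, then the review."""
    for name, passes in gates:
        if not passes():
            return f"revision: {name} failed"       # stop at the first failing gate
    if has_conflicts():
        return "revision: merge conflict"
    if reviewer_accepts():
        return "accepted"
    return "revision: reviewer requested changes"
```

Ordering the cheap automated gates before the review means a reviewer, human or agent, only sees work that already compiles, lints, and passes its tests.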

Why Markus Is Different

Factor	Other Agent Frameworks	Markus
Runtime	Orchestrator with external CLI tools	Full embedded agent runtime
Memory	Session-scoped or minimal	Three-layer persistent memory
Proactivity	Reactive	Heartbeat-driven
Governance	None or minimal	Progressive trust, Submit-Review-Merge pipeline, audit trail
Team model	Manual orchestration code	A2A protocol, subagent spawning
Quality gates	None	TypeScript, ESLint, Vitest enforced
Observability	CLI logs per agent	Centralized dashboard, WebSocket events

Markus is open source (AGPL-3.0) and installs with a single command:

curl -fsSL https://markus.global/install.sh | bash

The age of single-agent chat is over. The age of AI teams is here.

