
Mathias Markl

How I Got Multiple AI Coding Agents to Stop Losing Track of Their Work

I recently wrote about agent-comm — a communication layer that lets multiple AI coding agents talk to each other. But talking is only half the problem. The other half? Knowing what work needs to be done, who's doing it, and what stage it's in.

The problem

When you run multiple AI agents in parallel — say three Claude Code sessions working on different parts of a feature — things fall apart fast:

  • No shared backlog. Each agent only knows what you told it in its prompt. There's no central place to see all pending work.
  • No pipeline visibility. Is the spec done? Has anyone started implementing? Did tests pass? Nobody knows.
  • No dependency tracking. Agent B can't start until Agent A finishes, but there's nothing enforcing that.
  • No artifacts. Specs, plans, test results — they live in chat context and vanish when the session ends.

I needed something that gives AI agents the same project management primitives that human teams take for granted — but exposed as MCP tools they can call directly.

The fix: a pipeline for AI agents

agent-tasks is an open-source pipeline-driven task management server that AI coding agents can use via MCP. Think of it as a lightweight Jira — but designed for machines, not humans.

Every task flows through configurable pipeline stages:

backlog → spec → plan → implement → test → review → done

Agents claim tasks, advance them through stages, attach artifacts at each step, and block on dependencies — all through 31 MCP tools.

What agents can do with it

1. Pipeline-driven workflow

Tasks aren't just "open" or "closed." They move through stages, and each stage produces artifacts:

  • Spec stage: Agent writes a specification and attaches it
  • Plan stage: Agent breaks the task into subtasks
  • Implement stage: Agent writes code, attaches a summary
  • Test stage: Agent runs tests, attaches results
  • Review stage: Another agent reviews and approves
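
The stage progression is easy to picture as a small state machine that collects artifacts as it moves. Here's an illustrative Python sketch of the concept — my own model, not agent-tasks' actual implementation:

```python
# Illustrative model of a pipeline-driven task (not the actual
# agent-tasks code): a task advances through fixed stages and
# accumulates named artifacts along the way.
STAGES = ["backlog", "spec", "plan", "implement", "test", "review", "done"]

class Task:
    def __init__(self, title):
        self.title = title
        self.stage = "backlog"
        self.artifacts = {}  # artifact name -> content

    def add_artifact(self, name, content):
        self.artifacts[name] = content

    def advance(self):
        i = STAGES.index(self.stage)
        if i == len(STAGES) - 1:
            raise ValueError("task is already done")
        self.stage = STAGES[i + 1]
        return self.stage

task = Task("Add rate limiting to API")
task.advance()                      # backlog -> spec
task.add_artifact("spec", "Rate limit: 100 req/min per key")
task.advance()                      # spec -> plan
print(task.stage)                   # plan
```

The point is that "advance" is the only way forward, so every stage transition is an auditable event rather than a flag flip.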

2. Dependencies and blocking

task_add_dependency(task_id=5, depends_on=3, type="blocks")

Task 5 literally cannot advance until task 3 is done. The system detects cycles too — no deadlocks.
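
Cycle detection on a dependency graph boils down to a reachability check: adding "5 depends on 3" is only safe if task 5 isn't already reachable from task 3. A minimal DFS sketch of that idea (illustrative only, not the server's actual logic):

```python
# Minimal cycle check for a task dependency graph (illustrative, not
# the agent-tasks implementation). `deps` maps a task id to the list
# of task ids it depends on.
def would_create_cycle(deps, task_id, depends_on):
    """Return True if adding task_id -> depends_on closes a cycle."""
    # A cycle appears iff depends_on can already reach task_id.
    stack, seen = [depends_on], set()
    while stack:
        node = stack.pop()
        if node == task_id:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps.get(node, []))
    return False

deps = {5: [3], 3: [2]}
print(would_create_cycle(deps, 2, 5))   # True: 5 -> 3 -> 2 -> 5
print(would_create_cycle(deps, 7, 3))   # False
```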

3. Multi-agent collaboration

Agents can be assigned roles on tasks:

  • Collaborator: actively working on it
  • Reviewer: needs to approve before advancement
  • Watcher: gets notified of changes

4. Approval gates

task_request_approval(task_id=5, stage="review")
# Another agent:
task_approve(task_id=5)

This enforces a maker-checker pattern. No task moves to "done" without explicit sign-off.
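
The core rule behind maker-checker is simple: the agent that did the work can't be the one that signs it off. A toy sketch of that invariant (my own illustration, assuming the class and agent names):

```python
# Sketch of a maker-checker gate (illustrative; agent-tasks' real
# logic may differ): a task can only finish after an agent other
# than the maker approves it.
class ApprovalGate:
    def __init__(self, maker):
        self.maker = maker
        self.approved_by = None

    def approve(self, agent):
        if agent == self.maker:
            raise PermissionError("maker cannot approve their own work")
        self.approved_by = agent

    def can_finish(self):
        return self.approved_by is not None

gate = ApprovalGate(maker="agent-2")
try:
    gate.approve("agent-2")         # self-approval is rejected
except PermissionError as e:
    print(e)
gate.approve("agent-3")
print(gate.can_finish())            # True
```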

5. Artifacts with versioning

Every spec, plan, test result, and review note is stored as a versioned artifact attached to the task. Previous versions are chained so you can see how decisions evolved.
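
Version chaining can be modeled as a linked list where each artifact keeps a reference to its predecessor. Another illustrative sketch, with hypothetical names — the server's actual storage scheme may differ:

```python
# Illustrative sketch of versioned artifacts: each new version links
# back to its predecessor, so history can be walked newest-first.
class Artifact:
    def __init__(self, name, content, previous=None):
        self.name = name
        self.content = content
        self.previous = previous
        self.version = 1 if previous is None else previous.version + 1

    def history(self):
        node, chain = self, []
        while node:
            chain.append((node.version, node.content))
            node = node.previous
        return chain                 # newest first

v1 = Artifact("spec", "Limit: 100 req/min")
v2 = Artifact("spec", "Limit: 100 req/min, burst 20", previous=v1)
print(v2.version)                    # 2
print(v2.history())
```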

6. Full-text search

Need to find that task about the auth middleware rewrite? task_search(query="auth middleware") uses SQLite FTS5 to search across all task titles and descriptions.
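
If you haven't used FTS5 before, here's what that kind of query looks like at the SQLite level — a minimal standalone demo (assuming your Python's SQLite build includes FTS5, which recent builds do; this is not agent-tasks' schema):

```python
import sqlite3

# Minimal FTS5 demo: a virtual table over titles and descriptions,
# queried with MATCH. Bare terms are implicitly AND-ed, so
# "auth middleware" matches rows containing both words.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE tasks USING fts5(title, description)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [
        ("Rewrite auth middleware", "Replace session checks with JWT"),
        ("Add rate limiting", "Token bucket per API key"),
    ],
)
rows = conn.execute(
    "SELECT title FROM tasks WHERE tasks MATCH ?", ("auth middleware",)
).fetchall()
print(rows)   # [('Rewrite auth middleware',)]
```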

Setup

Install it:

npm install -g agent-tasks

Add to your MCP config (Claude Code settings.json, Cursor, etc.):

{
  "mcpServers": {
    "agent-tasks": {
      "command": "npx",
      "args": ["-y", "agent-tasks"]
    }
  }
}

That's it. The dashboard auto-starts at http://localhost:3422.

A real coordination pattern

Here's how I use it with three agents working on a feature:

Agent 1 (planner) creates the task and writes a spec:

task_create(title="Add rate limiting to API", project="backend")
task_claim(task_id=42)
task_add_artifact(task_id=42, name="spec", content="...")
task_advance(task_id=42)  # spec → plan

Agent 2 (implementer) picks up the next unblocked task:

task_next(project="backend")  # returns task 42 at plan stage
task_claim(task_id=42)
task_expand(task_id=42, subtasks=["Add middleware", "Add Redis store", "Add config"])
# ... implements each subtask ...
task_advance(task_id=42)  # implement → test

Agent 3 (reviewer) reviews and approves:

task_claim(task_id=42)
task_add_artifact(task_id=42, name="review-notes", content="LGTM, clean implementation")
task_approve(task_id=42)
task_advance(task_id=42)  # review → done

The whole flow is visible in the dashboard — a kanban board that updates in real time via WebSocket.

The dashboard

The built-in dashboard gives you a kanban view of all tasks across pipeline stages. You can:

  • Drag tasks between columns
  • Filter by project, assignee, or priority
  • Expand task details with artifacts and comments
  • View artifact diffs between versions
  • Track subtask progress
  • Toggle dark/light theme

No frameworks — it's vanilla HTML/CSS/JS with morphdom for efficient DOM updates.

(Screenshot: task detail panel showing subtasks, artifacts, and metadata)

(Screenshot: dashboard in dark mode)

Technical details

  • TypeScript + Node.js, zero-framework architecture
  • SQLite with WAL mode for concurrent reads
  • 3 transport layers: MCP (stdio), REST API (18 endpoints), WebSocket
  • 31 MCP tools covering tasks, subtasks, dependencies, artifacts, comments, approvals, and collaboration
  • 337+ tests with vitest
  • MIT licensed

It pairs naturally with agent-comm — when a task changes, agents get notified through the communication bridge.

What's next

I'm actively using this to coordinate up to 5 Claude Code agents working simultaneously. If you're running multi-agent setups and losing track of what's happening, give it a try.

GitHub: github.com/keshrath/agent-tasks

Feedback welcome — open an issue or drop a comment below.
