Neil Agentic

I shipped 706 commits in 5 days with Taskwarrior + Claude Code

Last week I merged 38 PRs across 5 repos. 706 commits. One person, max 5 Claude Code sessions at a time.

I'm sharing this because I think most CC users are hitting the same ceiling I was.

The ceiling

If you use Claude Code, you've probably tried scaling up to multiple sessions. Open a few terminals, give each one a task, and... immediately start context-switching between them. Which session just finished? What does this one need from that one? Are two sessions editing the same file?

The CC founder reportedly runs 10+ parallel sessions. The difference isn't superhuman multitasking. It's a system that eliminates the coordination overhead.

The stack

I call it TTAL — The Taskwarrior Agents Lab. Three tools:

| Tool        | Role                         |
| ----------- | ---------------------------- |
| Taskwarrior | Task queue + event system    |
| Zellij      | Terminal session manager     |
| Claude Code | The agent that does the work |
Taskwarrior hooks spawn Zellij panes. Each pane runs a CC session with task context injected. When a session finishes, the next highest-urgency task auto-starts. You don't manage sessions. You manage tasks.
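The wiring described above can be sketched as a Taskwarrior hook. This is a minimal illustration, not the actual TTAL hooks: the hook filename, the pane-naming convention, and launching the session with `claude -p <prompt>` are my assumptions, and the real setup almost certainly does more.

```python
#!/usr/bin/env python3
"""Sketch of a Taskwarrior on-modify hook that spawns a Zellij pane.

Hypothetical: install as e.g. ~/.task/hooks/on-modify.spawn-agent.py.
Taskwarrior feeds on-modify hooks two JSON lines on stdin (original task,
modified task) and expects the modified task echoed back on stdout.
"""
import json
import subprocess
import sys

def build_spawn_command(task: dict) -> list[str]:
    """Build the `zellij run` invocation for one task's agent session."""
    short_id = task["uuid"][:8]
    prompt = f"Task {short_id}: {task['description']}"
    return [
        "zellij", "run",
        "--name", f"agent-{short_id}",  # pane title doubles as the task id
        "--",
        "claude", "-p", prompt,          # inject task context into the session
    ]

def main() -> None:
    lines = sys.stdin.read().splitlines()
    if len(lines) < 2:
        return  # not invoked by Taskwarrior
    original, modified = json.loads(lines[0]), json.loads(lines[1])
    # Spawn a pane only on the pending -> started transition.
    if "start" in modified and "start" not in original:
        subprocess.Popen(build_spawn_command(modified))
    print(json.dumps(modified))  # hooks must echo the modified task back

if __name__ == "__main__":
    main()
```

The key design point: the hook fires on a state change in the task queue, so starting a task *is* starting a session. No terminal juggling.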

Mon: 199 commits — voice/ASR pipeline + agent heartbeat system
Tue: 182 commits — backend features + TUI contributions
Wed: 122 commits — infrastructure + documentation
Thu:  49 commits — rate-limited, did reviews instead
Fri: 154 commits — config consolidation + new features

Thursday is the tell — API rate limit hit, throughput dropped 75%. The system was the bottleneck, not me.

On-demand human-in-the-loop

This is the design principle that makes it click: agents never block waiting for me.

Most CC workflows are synchronous — you give a task, watch it work, review, give the next task. You are the bottleneck at every step.

In TTAL, agents pick up tasks, do the work, commit, and move on. I review PRs when I'm ready — not when the agent needs me. That's why 5 async sessions outperform 10 synchronous ones.

The full system is documented at ttal.guion.io. The architecture isn't locked to Claude Code — Zellij doesn't care which CLI agent runs inside the pane.

The bottleneck was never the AI. It was the glue.


Part 1 of the TTAL series. Follow along at ttal.guion.io.

Top comments (10)

Mykola Kondratiuk

The async human-in-the-loop insight is underrated tbh. I've been doing something similar building my side projects — once you stop watching the AI type and just let it batch work while you review later, your throughput goes way up.

Curious about one thing tho: how do you handle cases where two sessions step on each other's toes? Git conflicts are obvious but I'm more worried about subtle logic conflicts that only show up at runtime

Mykola Kondratiuk

Honestly this resonates. I've been running something similar but smaller scale - one main agent orchestrating 2-3 sub-agents on different tasks. The Mother-Teacher split is smart, especially the expertise pipeline part.

One question: how do you handle the reflection loop when an agent's specialty changes? Like if your backend agent suddenly needs to handle more frontend stuff because the project evolved. Does Teacher retrain or does Mother spawn a new specialist?

Neil Agentic

It depends. If the agent is still backend-focused but needs to do some frontend work, I'll create a +respawn task to help Mother modify the agent's definition. But if it's shifting toward more frontend-focused tasks—like UI/UX design work—it's better to spawn a new frontend-focused agent with some backend knowledge too. Then we tag them and route tasks based on their specialization.

For how teachers join: when we find an agent not performing well—meaning they really lack some knowledge—we create a +teaching task for Teacher to draft a +learning task for them.
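As a rough sketch of that routing (the +respawn and +teaching tag names come from the reply above; the drift metric, threshold, and command shapes are invented for illustration):

```python
# Hypothetical sketch of Mother/Teacher task routing. Tags +respawn and
# +teaching mirror the workflow described above; DRIFT_THRESHOLD and the
# "share of off-specialty tasks" metric are illustrative inventions.
import subprocess

DRIFT_THRESHOLD = 0.5  # assumed: fraction of recent tasks outside the specialty

def task_add(description: str, *tags: str) -> list[str]:
    """Build a `task add` command with the given tags."""
    cmd = ["task", "add", description, *[f"+{t}" for t in tags]]
    # subprocess.run(cmd, check=True)  # uncomment to actually enqueue
    return cmd

def handle_drift(agent: str, frontend_share: float) -> list[str]:
    if frontend_share < DRIFT_THRESHOLD:
        # Still mostly backend: ask Mother to amend the existing definition.
        return task_add(f"Amend {agent} definition for light frontend work",
                        "respawn")
    # Specialty has shifted: spawn a new frontend specialist instead.
    return task_add(f"Spawn frontend specialist to replace {agent}",
                    "spawn", "frontend")

def flag_knowledge_gap(agent: str, topic: str) -> list[str]:
    """Underperformance becomes a +teaching task for Teacher to act on."""
    return task_add(f"Draft +learning task on {topic} for {agent}", "teaching")
```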

Mykola Kondratiuk

the teaching loop is interesting - having agents that can identify knowledge gaps and create learning tasks for other agents. we tried something simpler where the orchestrator just passes error context to sub-agents but yours sounds more structured. does the teacher agent use any specific evaluation criteria or is it more pattern-based?

Neil Agentic

Good catch. Here's how we handle it: if two sessions hit subtle logic conflicts, that usually signals a dependency the task setup missed.

We use a task manager agent whose job is to spot those patterns and build a dependency graph—marking which tasks must run sequentially (because they share state or build on each other) vs which can run in parallel (fully isolated).

Then a scheduler agent respects that graph when assigning work.

The real insight: it's way easier to manage once you split concerns. Manager team (dependency tracking, scheduling) + executor team (actual work). Each agent owns one responsibility rather than trying to do everything.

So two executors don't step on each other because the graph already prevents it. And if they do, it surfaces as a gap in the dependency model—which the manager team can fix next heartbeat.
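The sequential-vs-parallel split can be sketched as a topological layering of the dependency graph. This is the generic scheduling pattern, not TTAL's actual implementation; the task names are illustrative.

```python
# Minimal sketch of the manager-side dependency graph: tasks that share
# state get edges, and a topological layering yields waves that are safe
# to run in parallel. A cycle means the dependency model itself is wrong.

def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves: each wave only depends on earlier waves.

    deps maps task -> set of tasks it depends on.
    """
    remaining = {t: set(d) for t, d in deps.items()}
    batches: list[list[str]] = []
    while remaining:
        # Everything with no unmet dependency can run concurrently.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle detected")
        batches.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return batches
```

A collision between two executors then shows up as a missing edge in `deps`, which is exactly the "gap in the dependency model" the manager team patches on the next heartbeat.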

Mykola Kondratiuk

The manager/executor split makes a lot of sense. I think that's where most people trip up with multi-agent setups - they try to make every agent too smart instead of having clear ownership.

The heartbeat-based gap fixing is interesting too. So if executors do collide somehow, the manager learns from it and updates the dependency model for next time? That's basically turning production bugs into training data for the orchestration layer.

Neil Agentic • Edited

Part 2 is live: The Specialization Loop: Mother Creates, Teacher Trains, Agents Become Experts Through Daily Reflection

Part 1 solved throughput scaling. Part 2 digs into how agents actually become experts — Agent-Mother generates specialized agents, Agent-Teacher builds expertise pipelines, and daily heartbeat reflection enables continuous improvement.

@itskondrat — this builds on the multi-agent workflow patterns we discussed. Would love your thoughts on the Mother-Teacher-Self loop approach.


Neil Agentic

One more layer: there could be a dedicated PR reviewer agent who pulls full task context (not just PR diffs) and reviews dependent task PRs together.

So if Task B depends on Task A, the reviewer sees both PRs + both task contexts at once. Catches logic conflicts you'd miss reviewing PRs in isolation.

Example: Task B's code assumes Task A's auth layer works a certain way, but the PR only shows Task B's changes. Reviewer agent sees the dependency + both implementations = catches the conflict before merge.
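A sketch of that reviewer's context-gathering step, with invented record types: the point is that the bundle walks dependency edges, so the auth-layer assumption in Task B is reviewed next to Task A's actual implementation instead of in isolation.

```python
# Hypothetical sketch of the cross-PR reviewer's input. TaskRecord and its
# fields are invented for illustration; in practice the task context and
# diff would come from Taskwarrior and the PR, respectively.
from dataclasses import dataclass, field

@dataclass
class TaskRecord:
    task_id: str
    context: str                           # description the agent worked from
    pr_diff: str                           # the resulting PR's diff
    depends_on: list[str] = field(default_factory=list)

def review_bundle(task_id: str, tasks: dict[str, TaskRecord]) -> list[TaskRecord]:
    """Collect a task's record plus all transitive dependencies for review."""
    bundle: list[TaskRecord] = []
    stack = [task_id]
    visited: set[str] = set()
    while stack:
        tid = stack.pop()
        if tid in visited:
            continue
        visited.add(tid)
        rec = tasks[tid]
        bundle.append(rec)
        stack.extend(rec.depends_on)   # pull dependency context too
    return bundle
```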

Mykola Kondratiuk

Yeah that makes sense. The cross-PR context thing is underrated - I've definitely shipped conflicts that looked fine in isolation but broke assumptions from other branches.

The dependency graph approach you mentioned earlier is probably the right abstraction layer for this. If the graph already knows Task B depends on Task A, the reviewer can pull both contexts automatically without having to infer it from code.

Thanks for following btw - your workflow is basically what I've been trying to build, just way more systematized.