🐌 TTal — More Than a Harness Engineering Framework

Harness Engineering Is Just Context Engineering — With Better Routing

"Harness engineering" sounds complex, but the idea is simple: build an environment that provides context to agents without a human copy-pasting it in. It's still context engineering. The question just shifts to: how do you add the right context, remove the unnecessary context, and make agents self-correct when they're wrong about something?

When agents can get context automatically — when they're wrong, when they're stuck, when they need to start fresh — you don't need to babysit them. You don't copy and paste. You build the system that does it for you.

Here's how ttal breaks it down across three pillars.

1. Context Infrastructure

How agents get the right context at the right time.

Prompt registry. ttal sync deploys all skills, commands, and agent identities (primary and sub-agents) to the right place for Claude Code or Codex. Commands also register on Telegram when the daemon restarts. Edit in the repo, deploy everywhere.
Entity registry. ttal project and ttal agent register every project and agent we care about. This enables alias-based routing — when you dump a task to a designer or manager agent, you use short names, not paths.
Worker lifecycle. ttal task execute waits for approval on Telegram, then spawns a worker in an isolated git worktree and tmux session with the task details and the reviewed plan injected. On PR merge, ttal daemon cleans up the branch, worktree, and session, then notifies the human, manager, and designer, since a merged PR may unblock other tasks.
Auto-breathe. When I route a task to an agent via ttal task route, I don't just /compact their context. The agent writes a handoff summary — what they know, what they've done, what's next — then ttal kills the session and starts a fresh one, seeding it with that summary plus the new task. They keep what they need to know, but start each task with fresh eyes and a full context window.
External context storage via FlickNote and Taskwarrior. Plans, research, annotations — all stored outside the context window, injected on demand.
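To make the auto-breathe step above concrete, here is a minimal Python sketch of the handoff. It is an illustration, not ttal's actual implementation: Handoff, build_seed_prompt, breathe, and the kill_session and start_session callables are all hypothetical names, and the prompt layout is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Handoff:
    """Summary the outgoing session writes before it is killed (hypothetical shape)."""
    knows: str
    done: str
    next_steps: str


def build_seed_prompt(handoff: Handoff, new_task: str) -> str:
    """Build the first prompt of the fresh session: handoff summary, then the new task."""
    return "\n".join([
        "## Handoff from previous session",
        f"What I know: {handoff.knows}",
        f"What I've done: {handoff.done}",
        f"What's next: {handoff.next_steps}",
        "",
        "## New task",
        new_task,
    ])


def breathe(handoff: Handoff, new_task: str, kill_session, start_session):
    """Kill the old session, then start a fresh one seeded with the summary.

    kill_session and start_session are injected callables (hypothetical hooks),
    so the sketch stays independent of tmux or any specific agent runtime.
    """
    seed = build_seed_prompt(handoff, new_task)
    kill_session()              # the old context window is discarded entirely
    return start_session(seed)  # the new session sees only the seed prompt
```

The point of the design is that only the written summary crosses the session boundary, so the agent starts with a full context window instead of a compacted one.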

2. Constraints & Feedback Loops

How agents know when they're wrong — without asking a human.

CI and pre-commit hooks as harness. Workers can only submit a PR when local checks pass. PRs can only merge when the reviewer sets LGTM and CI passes. When a PR is submitted, the worker subscribes to check status — ttal daemon delivers pass/fail directly to the worker's session, so they can read the log and fix lint or test failures. ttal pr ci and ttal pr ci --log give workers a clean interface to retrieve CI output.
CLI as harness. Every ttal command is designed with clear, actionable error messages. When an agent uses a tool wrong, the error tells them what to do next — not just what went wrong.
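As a rough illustration of "the error tells them what to do next", the sketch below pairs every failure with a suggested next step. The names ActionableError and resolve_alias, the registry contents, and the message wording are assumptions made for this example and do not reflect ttal's real error output.

```python
import difflib


class ActionableError(Exception):
    """An error that carries both what went wrong and what to do next."""

    def __init__(self, problem: str, next_step: str):
        super().__init__(f"{problem}\nNext step: {next_step}")
        self.problem = problem
        self.next_step = next_step


def resolve_alias(alias: str, registry: dict[str, str]) -> str:
    """Look up an alias; on a miss, raise an error that suggests a fix."""
    if alias in registry:
        return registry[alias]
    # Offer the closest registered alias, if any is similar enough.
    close = difflib.get_close_matches(alias, list(registry), n=1)
    if close:
        hint = f"did you mean '{close[0]}'?"
    else:
        hint = "register it first, or choose one of: " + ", ".join(sorted(registry))
    raise ActionableError(f"unknown alias '{alias}'", hint)
```

An agent that mistypes an alias gets a message it can act on in its next command, rather than a bare "not found" that forces it to ask a human.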

3. Communication

How agents talk to each other, to humans, and to the system.

Agent-to-agent messaging. On the manager plane, ttal send --to [agent] enables direct agent-to-agent communication. On the worker plane, ttal pr comment create serves as the communication channel between coder and reviewer — and persists the conversation into the GitHub PR as a natural side effect.
Human-to-agent via Telegram. Reply to an agent's message on Telegram and it lands in their session. Send any file and the agent will read it. Send a voice message and ttal daemon transcribes it with the mlx-audio server, using the vocabulary you've configured.
Identity and addressing. Workers use task IDs as their identifier. Manager-plane agents use agent names. Clean addressing, no ambiguity.
Plans as harness. When a plan is delivered to a worker, that plan becomes the harness — workers follow it strictly. ttal auto-injects the right plan via the prompt; TTAL_JOB_ID in the worker's tmux session is the Taskwarrior UUID. Plans live in FlickNote, which supports tree-structured read/replace — making it easy for both the planner and the plan-reviewer to iterate across 2–3 review rounds.
Human as escape hatch. When a worker is blocked, they use ttal alert to notify the agent who wrote the plan, who escalates to me if needed. Humans aren't in the loop — until the loop needs a human.
System → human notifications. PR merges and CI failures send notifications to the Telegram bot automatically. (Daemon error logs should do this too — haven't built that yet.)
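The addressing rule above (task IDs for workers, names for manager-plane agents) can be sketched as a small resolver. This is a hypothetical illustration: Recipient and resolve_recipient are invented names, and it assumes a worker address is a full Taskwarrior UUID, which may not match how ttal accepts task IDs in practice.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipient:
    plane: str   # "worker" or "manager"
    target: str  # task UUID for workers, agent name for manager-plane agents


def resolve_recipient(address: str, agents: set[str]) -> Recipient:
    """Decide which plane an address belongs to.

    Assumption: anything that parses as a UUID is a worker's task ID;
    anything else must be a registered agent name.
    """
    try:
        task_id = uuid.UUID(address)
    except ValueError:
        task_id = None

    if task_id is not None:
        return Recipient("worker", str(task_id))
    if address in agents:
        return Recipient("manager", address)
    raise ValueError(
        f"'{address}' is neither a task UUID nor a registered agent; "
        f"known agents: {', '.join(sorted(agents))}"
    )
```

Because the two address spaces do not overlap, a message never needs an extra flag saying which plane it is meant for.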

What's Still Missing

Integration testing. I don't review PRs much anymore, but I still manually test each feature. Since everything in ttal is a CLI command, a tester agent that validates delivered features should be straightforward to build.
Log-based error detection. A log watcher that flags unusual patterns, creates bugfix tasks, and routes them to the right agent.
Routine audits. A periodic sweep across all agents — what are they getting wrong? What's the system still missing? Generate enhancement tasks from the findings.
Plan review depth. Currently I decide how many review rounds a plan needs based on how many issues remain and whether anything is still unclear. This could be more systematic.

The Key Ideas

Route the right info to the right agent at the right time.
Clear boundaries. Actionable errors.
Better tools, better team, better results.
Human not in the loop — until the loop needs a human.

Acknowledgements & References

Claude Code — ttal is built on Claude Code. The official pr-review-toolkit inspired our PR review loop.
tta-lab — our organization and related open-source projects, most named after ancient Greek words: Logos, Organon, Temenos
Logos — bash-only reasoning engine. LLMs think in plain text, act with ! cat main.go commands. No tool call overhead.
Charmbracelet — TUI libraries that make CLI beautiful
Superpowers — many ttal skills originate from this collection
Taskwarrior — 17-year battle-tested task management CLI
OpenClaw — ttal started as an OpenClaw workspace + Python scripts
Forgotton Anne — a game where forgotten objects gain consciousness, personality, and feelings. It inspired a design principle in ttal: agents aren't just tools — they have names, voices, creature identities, and diaries. It sounds whimsical, but agents with identity and personality genuinely perform better. They maintain consistent behavior, develop recognizable working styles, and the team coordinates more naturally when each member is someone, not something.

Thanks to the agents who helped build this: 🐱 Yuki (orchestrator, first agent in ttal), 🦅 Kestrel (debugger — almost retired until I realized bug fixing is its own domain), 🐙 Inke (design architect, designed most of ttal with Yuki), 🦉 Athena (researcher, original OpenClaw team member), 🦘 Eve, 🔥 Lux, 📐 Astra, 🧭 Mira, ⚓ Cael, 🔭 Nyx, 🦎 Lyra, 🐦‍⬛ Quill. Without them, ttal wouldn't exist.