Adrien Cossa

Introducing Teatree: Parallel Multi-Repo Development with AI Agents

I'm a Customer Success Engineer at Oper Credits. My daily work involves a multi-repo project — backend, frontend, translations, configuration — and I use AI coding agents constantly. The friction isn't writing code; agents handle that well. It's everything surrounding it: following different conventions across codebases, coordinating changes across services, managing local environments that diverge from what's in git, and encoding the workflow patterns we could all benefit from.

The agent can figure out most of these things, but it struggles with the specifics — it loops on troubleshooting, tries approaches that don't match the project's actual setup, and burns tokens on trial and error. I started putting together teatree to write down that knowledge so the agent doesn't have to rediscover it every session. It's also a way to define and automate your personal workflow without adding friction with your team — build it on your own, then push for adoption once it works.

This post walks through the architecture, the design choices I landed on, and how the pieces fit together. It's long because there's a lot of ground to cover. If you just want the quick pitch, the README has that.


Table of Contents

  1. What it looks like
  2. The problem
  3. Skills as markdown and scripts
  4. The lifecycle graph
  5. Multi-repo worktree management
  6. The overlay and extension system
  7. Auto-loading hooks
  8. The retrospective loop
  9. Companion skills
  10. Getting started
  11. When it helps (and when it doesn't)

What it looks like

Tell your AI agent what you want. Teatree skills guide it through the entire lifecycle:

https://gitlab.com/org/repo/-/issues/1234

The agent fetches the ticket, creates synchronized worktrees, provisions isolated databases and ports, implements the feature with TDD, writes a test plan, runs E2E tests, self-reviews, then pushes and creates the merge request.

Fix PROJ-5678

The agent fetches the failed test report from CI, reproduces locally, fixes, pushes, and monitors the pipeline until green.

Review https://gitlab.com/org/repo/-/merge_requests/456

The agent fetches the ticket for context, inspects every commit individually, and posts draft review comments inline on the correct file and line.

Run the test plan for !789

The agent generates a test plan from the MR changes, runs E2E tests, and posts evidence screenshots on the MR.

Follow up on my open tickets

The agent batch-processes your assigned tickets, checks CI statuses, nudges stale MRs, and starts work on anything that's ready.


The problem

AI coding agents can do a lot — reason about architecture, run tests, create merge requests. But without your project's specific context, they spend tokens and time rediscovering things you already know. Your repo layout, your CI conventions, your team's practices, your local tooling — none of that is in training data.

The friction is especially pronounced with:

  • Multi-repo setups — creating branches across 3+ repos for a single ticket, provisioning isolated databases, allocating non-conflicting ports
  • Atypical local environments — personal tooling that differs from what's in git, dev configurations the team hasn't adopted yet
  • Operational workflows — self-reviewing before pushing, creating properly formatted merge requests, monitoring pipelines, running retrospectives

The agent can attempt all of these. But without explicit guidance, it either asks twenty questions or confidently does the wrong thing — and when something fails, it loops instead of applying the fix you already know.

I tried shell scripts and aliases first, sometimes Python scripts too. They worked for the happy path but couldn't handle edge cases — the database import that fails because VPN is down, the port conflict because another worktree is still running, the CI format check that rejects your MR title. A shell script can't say "if the test fails, check if it's a known flake — here are the patterns." An AI agent can.

So I started writing this stuff down — as markdown instructions with tested Python and shell scripts for the mechanical parts. The markdown gives the agent enough context to handle edge cases; the scripts handle deterministic operations where you don't want the agent improvising.


Skills as markdown and scripts

A teatree skill starts with a markdown file (SKILL.md) with YAML frontmatter, but the heavy lifting often happens in scripts that ship alongside it. Teatree currently has 15 Python executables, 9 library modules, and 3 shell scripts — backed by 26 test files. Here's a simplified example of the markdown side:

---
name: t3-code
description: Writing code with TDD methodology.
requires:
  - t3-workspace
metadata:
  version: 0.0.1
---

# Writing Code (TDD)

## Dependencies

- **t3-workspace** (required) — provides dev servers for live reload.

## Workflow

### 1. Plan First (Non-Negotiable)

Always make a plan before writing code. Never jump straight to coding.
- Identify scope: which files, modules, and repos are affected.
- Review existing patterns in the codebase before writing new code.

### 2. TDD Cycle

Write failing test → Implement → Green → Refactor

### 3. Follow Conventions

- Language/framework conventions from the project's convention skills.
- Repository-specific patterns take precedence over generic guidance.

A few things to note:

Skills contain both instructions and scripts. The markdown tells the agent when and why to do things. The Python scripts handle deterministic operations: worktree creation, port allocation, database provisioning, branch finalization. A script the agent calls is more robust than a 15-step procedure in a markdown file. Instructions for judgment calls, scripts for mechanical work.

Skills declare dependencies. The requires: field in the frontmatter tells the loading system which other skills need to be present. When t3-code is loaded, t3-workspace comes along automatically. This eliminates wasted round-trips where the agent reads a skill, sees "Load /t3-workspace now", and then has to make a second call.

Skills use progressive disclosure. Most SKILL.md files are 80–160 lines, with detailed procedures in references/ files that the agent reads on demand. This keeps the typical skill set well within a reasonable context budget.

Skills have rules marked (Non-Negotiable). These are things I've had to learn the hard way. "Always verify services respond via HTTP before declaring running" sounds obvious, but without it, the agent will say "servers started" without checking whether anything actually came up.
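The "verify via HTTP" guardrail is mechanical enough to script. Here is a minimal sketch of such a check (the helper name and polling scheme are illustrative, not teatree's actual implementation):

```python
# Hypothetical helper illustrating the guardrail: poll a health URL
# until it responds, instead of trusting that a server "started".
import time
import urllib.request
import urllib.error

def wait_until_responding(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once `url` answers any HTTP status, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=2)
            return True
        except urllib.error.HTTPError:
            return True  # the server is up, even if it returned 4xx/5xx
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False
```

The agent calls something like this and reports "servers started" only when it returns True.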


The lifecycle graph

Teatree organizes development into phases, each handled by a dedicated skill:

diagram

The flow is: ticket → code → test → review → ship → retro, with t3-workspace providing infrastructure to all phases and t3-debug available whenever something breaks.

Here's what each skill does:

| Skill | Phase | What it handles |
|---|---|---|
| t3-setup | Bootstrapping | Interactive setup wizard, health checks, overlay scaffolding |
| t3-workspace | Infrastructure | Multi-repo worktrees, port allocation, DB provisioning, env files, dev servers, cleanup |
| t3-ticket | Intake | Fetch the issue, extract acceptance criteria, detect affected repos, detect tenant/variant, create worktrees |
| t3-code | Implementation | Plan-first workflow, TDD cycle, convention enforcement, feature flag checks |
| t3-test | Verification | Test execution, CI interaction, E2E test plans, quality gates |
| t3-debug | Troubleshooting | Systematic 5-phase debugging protocol, user-hint-first investigation |
| t3-review | Code review | Self-review checklist, giving review, receiving feedback |
| t3-ship | Delivery | Commit formatting, branch finalization, MR creation, pipeline monitoring |
| t3-review-request | Notifications | Post MR links to review channels, check for duplicate requests |
| t3-retro | Improvement | Conversation audit, root cause analysis, skill updates, privacy scans |
| t3-contribute | Contribution | Push skill improvements to fork, open upstream issues |
| t3-followup | Batch ops | Process assigned tickets, check CI statuses, nudge stale MRs |

The skills mirror how development actually works. Implementing a ticket touches intake, coding, testing, review, and delivery, often across multiple repos. Making the skills fully independent would mean duplicating knowledge across every one of them, and duplicated knowledge inevitably diverges over time.

The follow-up dashboard

One skill worth highlighting is t3-followup. It runs your daily routine: batch-processing new tickets, checking CI statuses, advancing tickets through their lifecycle, and nudging reviewers about stale MRs.

As it works, it builds a persistent cache (followup.json) of all in-flight work — tickets, merge requests, pipeline statuses, review request states, and review comment tracking. From that cache, it generates an HTML dashboard:

t3-followup dashboard

The dashboard gives you a single view of everything that's in flight: ticket lifecycle status, pipeline results (color-coded pills), review request state, and tracked review comments. Everything is a clickable link — tickets, MRs, CI pipelines, Slack messages — so you can jump directly into any conversation.

The cache is a plain JSON file, so project overlays can inject extra fields (external tracker status, deployment state, tenant info) via the followup_enrich_data extension point. Stale tickets are purged automatically after their MRs have been merged for 14 days (configurable via T3_FOLLOWUP_PURGE_DAYS).
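A project-layer enrichment handler might look like this (the payload shape and field names are assumptions for illustration):

```python
# Hypothetical overlay handler for the followup_enrich_data extension
# point. The cache structure shown here is an assumed shape, not the
# documented followup.json schema.
def followup_enrich_data(cache: dict) -> dict:
    """Stamp each tracked ticket with a project-specific field."""
    for ticket in cache.get("tickets", []):
        # e.g., look up deployment state in your own tooling
        ticket["deploy_state"] = "staging" if ticket.get("mr_merged") else "none"
    return cache

# Registered at the 'project' layer in project_hooks.py, e.g.:
# register("followup_enrich_data", followup_enrich_data, "project")
```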


Multi-repo worktree management

This is where I started, and it's the feature I use most.

Suppose your project has three repos: acme-backend, acme-frontend, and acme-translations. You're about to work on ticket PROJ-1234. Running t3_ticket PROJ-1234 creates this structure:

diagram

Each ticket gets its own directory containing one git worktree per affected repo — lightweight checkouts that share the .git directory with the main clone but have their own branch and working tree. A shared .env.worktree file provides allocated ports, database name, and variant configuration.
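To make the port idea concrete, here is one way non-conflicting ports could be derived per ticket. This scheme (base ports, slot count, modulo on the ticket number) is a sketch of the concept, not teatree's actual allocation algorithm:

```python
# Illustrative port allocation: derive a stable offset per ticket so
# two simultaneously running worktrees don't collide. Base ports and
# the slot range are assumptions for this example.
BASE_BACKEND, BASE_FRONTEND, SLOTS = 8000, 4200, 100

def allocate_ports(ticket: str) -> dict[str, int]:
    slot = int("".join(filter(str.isdigit, ticket)) or 0) % SLOTS
    return {
        "BACKEND_PORT": BASE_BACKEND + slot,
        "FRONTEND_PORT": BASE_FRONTEND + slot,
    }

# Written into the shared env file, e.g.:
# with open(".env.worktree", "w") as f:
#     for key, value in allocate_ports("PROJ-1234").items():
#         f.write(f"{key}={value}\n")
```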

After creating the worktrees, t3_setup provisions the environment:

  1. Symlinks — .venv, node_modules, .python-version, and configurable shared directories are symlinked from the main repo (so you don't reinstall dependencies for every worktree)
  2. Environment files — .env.worktree with unique ports, database URL, variant-specific overrides
  3. Database — creates an isolated DB, imports from a snapshot or dump, runs migrations
  4. direnv — auto-loads environment variables when you cd into the worktree
  5. Frontend dependencies — installs if the lockfile changed

Then t3_start brings everything up: Docker services, migrations, backend server, frontend dev server. Each worktree is fully isolated — its own database, its own ports, its own services. You can have ticket 1234 and ticket 5678 running simultaneously without conflicts.

Why this matters

Without isolation, the most common failure is contamination between tickets. You're working on ticket A, make a database change, then switch to ticket B which expected the old schema — migrations fail, the frontend shows stale data, and you spend time figuring out what went wrong. Worktree isolation avoids this. Each ticket is a clean room.

The other benefit is parallelism. While waiting for CI on ticket A, start working on ticket B in a completely separate environment. No branch switching, no stashing, no "wait, which database am I pointing at?"

Multi-tenant awareness

If your project serves multiple tenants — each with their own configuration, feature flags, and sometimes database — teatree handles that too. The variant system (wt_detect_variant) auto-detects the target tenant from ticket labels, descriptions, or external trackers, then provisions tenant-specific databases, environment variables, and configuration. Feature flag checks during code review ensure changes are properly scoped per tenant.

The project overlay wires in your tenant-to-variant mapping; teatree handles the rest. This means "set up a worktree for ticket X" automatically produces an environment configured for the correct tenant — no manual env file editing, no guesswork about which tenant you're in.

Why t3_ticket instead of raw git commands

The convention is <ticket>/<repo>/ — a ticket directory containing worktrees. Raw git worktree add creates flat worktrees at whatever path you give it, which breaks the ticket-directory structure that every other tool expects. t3_ticket enforces the convention, handles branch naming (with your prefix), and creates worktrees across all affected repos in one call. The skill file marks this as (Non-Negotiable) because flat worktrees cause subtle breakage downstream.
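Stripped of branch-naming and env provisioning, the core of what t3_ticket does could be sketched like this (the function name and directory layout follow the convention above; the helper itself is illustrative):

```python
# Rough sketch of t3_ticket's worktree step: one branch and one
# worktree per affected repo, grouped under a shared <ticket>/ directory.
import subprocess
from pathlib import Path

def create_ticket_worktrees(ticket: str, repos: list[Path],
                            workdir: Path, prefix: str = "me") -> None:
    branch = f"{prefix}/{ticket}"
    ticket_dir = workdir / ticket
    ticket_dir.mkdir(parents=True, exist_ok=True)
    for repo in repos:
        # worktrees share the main clone's .git but get their own branch
        subprocess.run(
            ["git", "-C", str(repo), "worktree", "add",
             "-b", branch, str(ticket_dir / repo.name)],
            check=True,
        )
```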


The overlay and extension system

Teatree knows how to create worktrees, allocate ports, and orchestrate a development lifecycle. It doesn't know how to start your backend, import your database, or create your merge requests. That project-specific knowledge lives in a project overlay.

The three-layer architecture

diagram

When teatree needs to do something project-specific (start the backend, import a database, create an MR), it calls an extension point through a registry. The registry resolves the implementation using a 3-layer priority:

| Priority | Layer | Source | Example |
|---|---|---|---|
| Highest | Project | Your overlay's project_hooks.py | t3_start that runs Docker + Django + Angular |
| Middle | Framework | Framework integration (e.g., Django) | wt_post_db that runs manage.py migrate |
| Lowest | Default | Teatree core fallback | Usually a no-op or "not configured" message |

The registry itself is simple — 45 lines of Python:

from typing import Callable

_LAYERS = ("default", "framework", "project")
_LAYER_RANK = {layer: i for i, layer in enumerate(_LAYERS)}
_registry: dict[str, list[tuple[str, Callable]]] = {}

def register(point: str, fn: Callable, layer: str = "default") -> None:
    entries = _registry.setdefault(point, [])
    entries[:] = [(lyr, func) for lyr, func in entries if lyr != layer]
    entries.append((layer, fn))
    entries.sort(key=lambda x: _LAYER_RANK[x[0]])

def get(point: str) -> Callable | None:
    entries = _registry.get(point)
    if not entries:
        return None
    return entries[-1][1]  # highest priority = last entry

def call(point: str, *args, **kwargs):
    fn = get(point)
    if fn is None:
        raise KeyError(f"No handler registered for extension point {point!r}")
    return fn(*args, **kwargs)

Registering a handler at the "project" layer automatically overrides anything at "framework" or "default". The framework layer is there so teatree can ship framework integrations (Django is the first) that work out of the box but can still be overridden by project-specific needs.

What an overlay looks like

A project overlay is a directory with this structure:

acme-overlay/
├── SKILL.md                    # Skill description + loading order
├── scripts/
│   └── lib/
│       ├── bootstrap.sh        # Shell wrappers (sourced after teatree)
│       ├── shell_helpers.sh    # Env loading, variant detection
│       └── project_hooks.py    # Extension point overrides
├── hook-config/
│   ├── context-match.yml       # Patterns that trigger this overlay
│   └── reference-injections.yml # References to load per lifecycle phase
└── references/
    ├── prerequisites-and-setup.md
    ├── troubleshooting.md
    └── playbooks/
        └── README.md

The project_hooks.py file registers your overrides:

from lib.registry import register

def register_acme():
    def wt_env_extra(envfile):
        with open(envfile, "a") as f:
            f.write("ACME_API_KEY=dev-key\n")

    def wt_db_import(db_name, variant, main_repo):
        # Import from your team's shared dump
        from lib.db import db_restore
        db_restore(db_name, f"{main_repo}/dumps/{variant}_latest.sql")
        return True

    def wt_run_backend(*args):
        import subprocess
        subprocess.run(["python", "manage.py", "runserver", "0.0.0.0:8000"],
                      check=False)

    register("wt_env_extra", wt_env_extra, "project")
    register("wt_db_import", wt_db_import, "project")
    register("wt_run_backend", wt_run_backend, "project")

The teatree core scripts call registry.call("wt_run_backend"), and your project handler runs instead of the default "not configured" stub. You only override what you need — everything else falls through to the framework or default layer.

There are 25 extension points

They cover the full lifecycle:

| Category | Extension Points |
|---|---|
| Workspace setup | wt_symlinks, wt_env_extra, wt_services, wt_detect_variant |
| Database | wt_db_import, wt_post_db, wt_restore_ci_db, wt_reset_passwords |
| Dev servers | wt_run_backend, wt_run_frontend, wt_build_frontend, wt_start_session |
| Testing | wt_run_tests, wt_trigger_e2e, wt_quality_check |
| Delivery | wt_create_mr, wt_monitor_pipeline, wt_send_review_request, wt_fetch_failed_tests, wt_fetch_ci_errors |
| Ticket management | ticket_check_deployed, ticket_update_external_tracker, ticket_get_mrs |
| Follow-up | followup_enrich_data, followup_enrich_dashboard |

The /t3-setup wizard can scaffold an overlay for you. Tell it your repos, your backend framework, and your database, and it generates the skeleton with commented-out examples for each relevant extension point. From there, fill in the blanks — or ask your AI agent to fill them in if it already knows your codebase (e.g., after working in the repos for a while).

The sourcing chain

Shell functions are loaded in order:

# In .zshrc:
source ~/.teatree                                     # load config
source "$T3_REPO/scripts/lib/bootstrap.sh"            # teatree core functions
source "$T3_OVERLAY/scripts/lib/bootstrap.sh"         # project overlay overrides

The overlay's bootstrap has a guard — it checks that teatree was sourced first (_T3_SCRIPTS_DIR must be set). This prevents confusing errors from running the overlay standalone.

Inside Python scripts, the pattern is similar:

import lib.init
lib.init.init()                 # registers defaults + auto-detects framework
from lib.project_hooks import register_project
register_project()              # registers project overrides at 'project' layer
from lib.registry import call as ext
ext("wt_post_db", project_dir)  # calls highest-priority handler

Auto-loading hooks

Skills don't help if the agent doesn't load them. I got tired of manually telling it which skill to read, so I added a hook that suggests the right skills automatically based on what you're doing.

The mechanism is ensure-skills-loaded.sh, a hook that runs before every message (in Claude Code, this is a UserPromptSubmit hook; other agent platforms would use their own equivalent). It does three things:

diagram

1. Project context detection

The hook scans all skill directories for hook-config/context-match.yml files. If any pattern in the file matches the current working directory or the active-repo tracker, that skill is identified as the project overlay. This is how teatree knows you're working in a specific project without you having to say so.

# hook-config/context-match.yml
cwd_patterns:
  - "acme-backend"
  - "acme-frontend"

If your $PWD contains acme-backend, the hook knows you're in the acme project and will suggest loading the ac-acme overlay alongside whatever lifecycle skill you need.

2. Intent detection

The hook parses the prompt to figure out which lifecycle phase you're in. It checks for:

  • URL patterns — a GitLab issue URL triggers t3-ticket, a Sentry URL triggers t3-debug
  • Keyword patterns — "implement" triggers t3-code, "push" triggers t3-ship, "broken" triggers t3-debug
  • End-of-session phrases — "done", "all set", "that's it" trigger t3-retro (only if at least one other skill was loaded this session)
  • Bare imperative verbs — "Fix the login page" triggers t3-code

If nothing matches and you're in project context, it defaults to t3-code — because most prompts in a project directory are about coding.

3. Dependency resolution and suggestion

Once the hook knows which skill you need, it:

  1. Parses the skill's requires: frontmatter to find dependencies
  2. Checks which skills are already loaded (tracked in a session file)
  3. Builds a suggestion list of skills that need loading
  4. Adds companion skills (e.g., ac-django for backend work in a Django project)
  5. Adds reference file injections from reference-injections.yml

The output looks like:

LOAD THESE SKILLS NOW: /t3-workspace, /t3-code, /ac-acme.
ACME references to read: references/prerequisites-and-setup.md

The agent sees this as a system message and loads the skills before doing anything else. The wording is intentionally forceful ("LOAD THESE SKILLS NOW") — softer phrasing ("Consider loading...") gets ignored by models.

Symlink health checks

The hook also runs a once-per-session health check on skills that you maintain (determined by an ownership config):

  • Verifies skill symlinks are actual symlinks (not stale copies)
  • Checks that the source is a real git repository (not a downloaded zip)
  • Validates that symlinks point into git repos (so retrospective commits work)

If anything is broken, it either auto-fixes (re-running the installer) or warns with a specific remediation.


The retrospective loop

After every non-trivial session, t3-retro runs a retrospective — a systematic audit of the conversation that produces concrete skill improvements and optionally contributes them upstream.

diagram

What the audit catches

The retrospective categorizes issues into specific types:

| Category | What went wrong | Example |
|---|---|---|
| False completion | Claimed "done" without full verification | Said feature was complete but didn't run the test suite |
| Skill not loaded | A relevant skill existed but wasn't loaded | Worked in project context without the overlay |
| Playbook miss | A playbook covered the task but wasn't consulted | Didn't check the deployment playbook before pushing |
| Over-engineering | Did unnecessary work | Built a migration when admin config would have sufficed |
| Under-engineering | Missed required work | Updated the backend but forgot the frontend changes |
| Hook gap | Auto-loading should have triggered but didn't | Hook didn't detect intent from "fix the flaky test" |
| Stale guidance | Followed outdated instructions | Playbook referenced pre-refactoring patterns |

For each issue, the retrospective determines the root cause and writes the fix directly into the skill system — a new guardrail, an updated playbook, a troubleshooting entry, a hook pattern.

Where improvements go

The retrospective respects a clear hierarchy:

  • Project overlay ($T3_OVERLAY) — receives project-specific improvements (troubleshooting, playbooks, guardrails). This is the default target when T3_CONTRIBUTE is false.
  • Core skills ($T3_REPO) — only modified when T3_CONTRIBUTE=true, and only for generic improvements (missing verification steps, hook gaps, stale core guidance)
  • Personal config (memory files, agent config like AGENTS.md) — for user preferences and environment-specific facts. Also serves as a fallback location when the overlay isn't maintained by the user.

The contribution model

When you enable T3_CONTRIBUTE=true:

  1. The retrospective creates a local commit on the current branch in your fork. It never pushes automatically.
  2. A privacy scan checks for emails, home directory paths, API keys, internal hostnames, and any terms in $T3_BANNED_TERMS.
  3. When you're ready, /t3-contribute reviews what will be pushed, checks for fork divergence, and optionally opens an issue on the upstream repo.

The idea is that every user's failures make the system better for all users — but only through an explicit, reviewed contribution path. Nothing happens without your consent. The default is T3_CONTRIBUTE=false, which means the retrospective only improves your project overlay and personal config.
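A privacy scan of this kind is straightforward to sketch (the patterns below are illustrative; the real scan's rules and its handling of $T3_BANNED_TERMS may differ):

```python
# Illustrative privacy scan: flag emails, home-directory paths,
# credential-looking assignments, and user-configured banned terms.
import os
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "home path": re.compile(r"/(?:home|Users)/\w+"),
    "api key": re.compile(r"(?i)(?:api[_-]?key|token)\s*[:=]\s*\S+"),
}

def privacy_scan(text: str) -> list[str]:
    findings = [name for name, pat in PATTERNS.items() if pat.search(text)]
    for term in os.environ.get("T3_BANNED_TERMS", "").split(","):
        if term and term in text:
            findings.append(f"banned term: {term}")
    return findings
```

A non-empty result blocks the commit until the offending text is scrubbed.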

A concrete example

Suppose during a session, the agent set up a multi-repo worktree and claimed it was ready, but the backend server failed to start due to port conflicts with a previous worktree. The agent didn't verify that the infrastructure was actually running before declaring complete.

The retrospective would:

  1. Audit: Identify this as "false completion" — claimed infrastructure ready without verification evidence
  2. Root cause: The t3-workspace script runs through all setup steps but has no way for projects to define and verify health checks before the agent declares the worktree usable
  3. Fix (core): Add a new extension point wt_health_check to t3-workspace that projects can implement
  4. Fix (overlay): Implement wt_health_check in the project's project_hooks.py to curl the backend, check the frontend dev server, verify the database is accessible
  5. Verify: Check that the skill file parses, the extension point is registered correctly, and the overlay hook runs without errors
  6. Commit: If T3_CONTRIBUTE=true, commit the core extension point to the fork's teatree core skills; overlay changes go to the project overlay repo

Next time the agent sets up a worktree, t3-workspace runs the project's health checks before finishing — the core provides the mechanism, the project overlay provides the specifics. Both are enforced going forward.
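The overlay-side fix from step 4 might look like this (the wt_health_check signature, env keys, and endpoint paths are all assumptions for illustration):

```python
# Hypothetical project implementation of the wt_health_check extension
# point: probe the backend over HTTP and the frontend dev server's port.
import socket
import urllib.request

def wt_health_check(env: dict) -> list[str]:
    """Return a list of failures; an empty list means healthy."""
    failures = []
    try:
        urllib.request.urlopen(
            f"http://localhost:{env['BACKEND_PORT']}/health", timeout=3)
    except OSError:
        failures.append("backend not responding")
    try:
        with socket.create_connection(("localhost", int(env["FRONTEND_PORT"])),
                                      timeout=3):
            pass
    except OSError:
        failures.append("frontend dev server not reachable")
    return failures

# register("wt_health_check", wt_health_check, "project")
```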

It adds up

A single retrospective might fix one guardrail. After enough sessions, you've accumulated a lot of them — each one from a specific failure that actually happened.


Companion skills

Teatree handles the lifecycle — ticket intake, worktree management, TDD, review, delivery. It doesn't know about your programming language's conventions or your framework's best practices. That's what companion skills are for.

Companion skills are standalone skills that live in separate repos and are loaded alongside teatree when relevant. I maintain a few (souliane/skills) covering Django and Python conventions, but the best companion skill for your stack is one you find (or build) yourself. I wrote a separate post about skill-driven development and the skills I'm open-sourcing.

The project overlay's hook-config/context-match.yml wires companion skills to repo patterns:

companion_skills:
  ac-django:
    - "acme-backend"
  ac-python:
    - "acme-backend"

When the hook detects you're working in acme-backend, it suggests loading ac-django and ac-python alongside the lifecycle skill. You get framework conventions without cluttering the core lifecycle skills with language-specific details.

This separation matters. Django conventions change on a different cadence than worktree management. Keeping them in separate skills means you can update one without touching the other, and teams using Flask or Express aren't burdened with Django-specific guidance.

Companion skills vs framework layer

These are different things. The framework layer is teatree's built-in middle priority in the 3-layer extension point registry — it ships stock implementations for common frameworks (e.g., a Django integration that auto-registers manage.py migrate as the post-DB hook). Companion skills are external standalone skills that teach the agent coding conventions — they don't register extension points, they provide guidelines. The framework layer handles infrastructure (how to run migrations); companion skills handle conventions (how to write good Django code).


Getting started

Prerequisites

  • An AI coding agent (the auto-loading hooks currently target Claude Code, but the skills and scripts work with any agent that can read files and run commands)
  • Python 3.12+
  • uv (Python package manager)

Installation

Teatree requires a local git clone — it has shared infrastructure (scripts/, references/, integrations/) that lives outside the individual skill directories, so npx skills add alone isn't enough.

Fork the repo on GitHub (or just clone it directly if you don't plan to contribute back), then:

git clone git@github.com:YOUR_USERNAME/teatree.git ~/workspace/teatree
cd ~/workspace/teatree
./scripts/install_skills.sh

The install script creates symlinks from your agent's skills directory to the clone. Then open your agent and run /t3-setup — it handles config, shell integration, hooks, and optionally scaffolds a project overlay for your repos.

If you want the retrospective loop to write improvements back into skill files, set T3_CONTRIBUTE=true in ~/.teatree (created by /t3-setup). This requires a fork — the agent pushes to your fork, not to the upstream repo.

The setup wizard:

  1. Checks prerequisites — verifies all required tools are installed, reports a summary table
  2. Creates ~/.teatree — asks for workspace path, branch prefix, issue tracker, chat platform
  3. Scaffolds a project overlay (optional) — ask it about your repos, framework, and database, and it generates the skeleton
  4. Configures shell integration — adds sourcing lines to .zshrc or .bashrc
  5. Installs skill symlinks — creates the symlink chain from the agent's skills directory to your clone
  6. Configures hooks — sets up ensure-skills-loaded.sh and the statusline (Claude Code-specific; other agents would configure their own hooks)
  7. Runs a smoke test — verifies hooks parse, statusline runs, Python imports work

After setup, restart your agent (or start a new conversation). Try: "start working on ticket PROJ-1234" — the hook should suggest /t3-ticket + /t3-workspace, and the agent will take it from there.

You can re-run /t3-setup at any time as a health check. It validates the existing installation, checks for broken symlinks, verifies hook wording, and reports what needs fixing.

The directory structure after setup

~/
├── .teatree                    # Config file (sourced by shell)
├── .local/share/teatree/       # Runtime data (ticket cache, dashboard, MR reminders, cache)
├── .claude/                    # Claude Code example (adapt paths for your agent)
│   ├── CLAUDE.md               # Agent instructions (skill-loading block)
│   ├── settings.json           # Hooks, statusline
│   └── skills/
│       ├── t3-ticket -> ~/workspace/teatree/t3-ticket
│       ├── t3-code -> ~/workspace/teatree/t3-code
│       ├── ...
│       └── ac-acme -> ~/workspace/acme-overlay
└── workspace/
    ├── teatree/                # Teatree clone (or fork)
    ├── acme-overlay/           # Project overlay
    ├── acme-backend/           # Main repo clone
    ├── acme-frontend/          # Main repo clone
    └── ac/                     # Ticket worktrees
        ├── 1234/
        │   ├── acme-backend/   # Worktree
        │   ├── acme-frontend/  # Worktree
        │   └── .env.worktree   # Shared env
        └── 5678/
            └── ...

The symlinks ensure that skill files always resolve to the live git clone. This is important for the retrospective — when the agent writes improvements to skill files, the changes land in a real git repository where they can be committed and pushed.


When it helps (and when it doesn't)

It helps most with: structured, repeatable processes that span multiple repos or require project-specific knowledge. Ticket intake, worktree setup, TDD cycles, code review, MR creation, CI debugging. The kind of work that eats hours but follows a pattern.

It helps less with: one-off creative decisions, highly ambiguous tasks, or projects simple enough that a single repo with npm start covers everything. If your development workflow is "edit a file and push," teatree is overkill.

The sweet spot is when you have enough friction that encoding it pays off through repetition. The project works for my workflow but hasn't been tested beyond that. If something doesn't click for your setup, open an issue or a PR. Or point your AI agent at the problem and let it fix things until it works for you — that's kind of the point.

A note on security

Teatree skills are prompt instructions — they control what your AI agent does. That makes the supply chain a security surface. The defaults are conservative: self-improvement is off (T3_CONTRIBUTE=false), pushing is disabled (T3_PUSH=false), and there is no auto-update mechanism. You opt in to each level of automation explicitly. If you use a fork from someone else, you're trusting that person's skill files as agent instructions — review changes before pulling.

Why "teatree"?

TEA's Extensible Architecture for work*tree* management. Also, tea tree oil cuts through grime, which felt fitting.

GitHub | MIT License
