I'm a Customer Success Engineer at Oper Credits. My daily work involves a multi-repo project — backend, frontend, translations, configuration — and I use AI coding agents constantly. The friction isn't writing code; agents handle that well. It's everything surrounding it: following different conventions across codebases, coordinating changes across services, managing local environments that diverge from what's in git, and encoding the workflow patterns we could all benefit from.
The agent can figure out most of these things, but it struggles with the specifics — it loops on troubleshooting, tries approaches that don't match the project's actual setup, and burns tokens on trial and error. I started putting together teatree to write down that knowledge so the agent doesn't have to rediscover it every session. It's also a way to define and automate your personal workflow without imposing it on your team — build it on your own, then push for adoption once it works.
This post walks through the architecture, the design choices I landed on, and how the pieces fit together. It's long because there's a lot of ground to cover. If you just want the quick pitch, the README has that.
Table of Contents
- What it looks like
- The problem
- Skills as markdown and scripts
- The lifecycle graph
- Multi-repo worktree management
- The overlay and extension system
- Auto-loading hooks
- The retrospective loop
- Companion skills
- Getting started
- When it helps (and when it doesn't)
What it looks like
Tell your AI agent what you want. Teatree skills guide it through the entire lifecycle:
> https://gitlab.com/org/repo/-/issues/1234

The agent fetches the ticket, creates synchronized worktrees, provisions isolated databases and ports, implements the feature with TDD, writes a test plan, runs E2E tests, self-reviews, then pushes and creates the merge request.

> Fix PROJ-5678

The agent fetches the failed test report from CI, reproduces locally, fixes, pushes, and monitors the pipeline until green.

> Review https://gitlab.com/org/repo/-/merge_requests/456

The agent fetches the ticket for context, inspects every commit individually, and posts draft review comments inline on the correct file and line.

> Run the test plan for !789

The agent generates a test plan from the MR changes, runs E2E tests, and posts evidence screenshots on the MR.

> Follow up on my open tickets

The agent batch-processes your assigned tickets, checks CI statuses, nudges stale MRs, and starts work on anything that's ready.
The problem
AI coding agents can do a lot — reason about architecture, run tests, create merge requests. But without your project's specific context, they spend tokens and time rediscovering things you already know. Your repo layout, your CI conventions, your team's practices, your local tooling — none of that is in training data.
The friction is especially pronounced with:
- Multi-repo setups — creating branches across 3+ repos for a single ticket, provisioning isolated databases, allocating non-conflicting ports
- Atypical local environments — personal tooling that differs from what's in git, dev configurations the team hasn't adopted yet
- Operational workflows — self-reviewing before pushing, creating properly formatted merge requests, monitoring pipelines, running retrospectives
The agent can attempt all of these. But without explicit guidance, it either asks twenty questions or confidently does the wrong thing — and when something fails, it loops instead of applying the fix you already know.
I tried shell scripts and aliases first, sometimes Python scripts too. They worked for the happy path but couldn't handle edge cases — the database import that fails because VPN is down, the port conflict because another worktree is still running, the CI format check that rejects your MR title. A shell script can't say "if the test fails, check if it's a known flake — here are the patterns." An AI agent can.
So I started writing this stuff down — as markdown instructions with tested Python and shell scripts for the mechanical parts. The markdown gives the agent enough context to handle edge cases; the scripts handle deterministic operations where you don't want the agent improvising.
Skills as markdown and scripts
A teatree skill starts with a markdown file (SKILL.md) with YAML frontmatter, but the heavy lifting often happens in scripts that ship alongside it. Teatree currently has 15 Python executables, 9 library modules, and 3 shell scripts — backed by 26 test files. Here's a simplified example of the markdown side:
```markdown
---
name: t3-code
description: Writing code with TDD methodology.
requires:
  - t3-workspace
metadata:
  version: 0.0.1
---

# Writing Code (TDD)

## Dependencies

- **t3-workspace** (required) — provides dev servers for live reload.

## Workflow

### 1. Plan First (Non-Negotiable)

Always make a plan before writing code. Never jump straight to coding.

- Identify scope: which files, modules, and repos are affected.
- Review existing patterns in the codebase before writing new code.

### 2. TDD Cycle

Write failing test → Implement → Green → Refactor

### 3. Follow Conventions

- Language/framework conventions from the project's convention skills.
- Repository-specific patterns take precedence over generic guidance.
```
A few things to note:
Skills contain both instructions and scripts. The markdown tells the agent when and why to do things. The Python scripts handle deterministic operations: worktree creation, port allocation, database provisioning, branch finalization. A script the agent calls is more robust than a 15-step procedure in a markdown file. Instructions for judgment calls, scripts for mechanical work.
Skills declare dependencies. The requires: field in the frontmatter tells the loading system which other skills need to be present. When t3-code is loaded, t3-workspace comes along automatically. This eliminates wasted round-trips where the agent reads a skill, sees "Load /t3-workspace now", and then has to make a second call.
Skills use progressive disclosure. Most SKILL.md files are 80–160 lines, with detailed procedures in references/ files that the agent reads on demand. This keeps the typical skill set well within a reasonable context budget.
Skills have rules marked (Non-Negotiable). These are things I've had to learn the hard way. "Always verify services respond via HTTP before declaring running" sounds obvious, but without it, the agent will say "servers started" without checking whether anything actually came up.
The lifecycle graph
Teatree organizes development into phases, each handled by a dedicated skill:
The flow is: ticket → code → test → review → ship → retro, with t3-workspace providing infrastructure to all phases and t3-debug available whenever something breaks.
Here's what each skill does:
| Skill | Phase | What it handles |
|---|---|---|
| `t3-setup` | Bootstrapping | Interactive setup wizard, health checks, overlay scaffolding |
| `t3-workspace` | Infrastructure | Multi-repo worktrees, port allocation, DB provisioning, env files, dev servers, cleanup |
| `t3-ticket` | Intake | Fetch the issue, extract acceptance criteria, detect affected repos, detect tenant/variant, create worktrees |
| `t3-code` | Implementation | Plan-first workflow, TDD cycle, convention enforcement, feature flag checks |
| `t3-test` | Verification | Test execution, CI interaction, E2E test plans, quality gates |
| `t3-debug` | Troubleshooting | Systematic 5-phase debugging protocol, user-hint-first investigation |
| `t3-review` | Code review | Self-review checklist, giving review, receiving feedback |
| `t3-ship` | Delivery | Commit formatting, branch finalization, MR creation, pipeline monitoring |
| `t3-review-request` | Notifications | Post MR links to review channels, check for duplicate requests |
| `t3-retro` | Improvement | Conversation audit, root cause analysis, skill updates, privacy scans |
| `t3-contribute` | Contribution | Push skill improvements to fork, open upstream issues |
| `t3-followup` | Batch ops | Process assigned tickets, check CI statuses, nudge stale MRs |
The skills mirror how development actually works. Implementing a ticket touches intake, coding, testing, review, and delivery — often across multiple repos. Making the skills fully independent would mean duplicating knowledge across every one of them, which always diverges over time.
The follow-up dashboard
One skill worth highlighting is t3-followup. It runs your daily routine: batch-processing new tickets, checking CI statuses, advancing tickets through their lifecycle, and nudging reviewers about stale MRs.
As it works, it builds a persistent cache (followup.json) of all in-flight work — tickets, merge requests, pipeline statuses, review request states, and review comment tracking. From that cache, it generates an HTML dashboard:
The dashboard gives you a single view of everything that's in flight: ticket lifecycle status, pipeline results (color-coded pills), review request state, and tracked review comments. Everything is a clickable link — tickets, MRs, CI pipelines, Slack messages — so you can jump directly into any conversation.
The cache is a plain JSON file, so project overlays can inject extra fields (external tracker status, deployment state, tenant info) via the followup_enrich_data extension point. Stale tickets are purged automatically after their MRs have been merged for 14 days (configurable via T3_FOLLOWUP_PURGE_DAYS).
Multi-repo worktree management
This is where I started, and it's the feature I use most.
Suppose your project has three repos: acme-backend, acme-frontend, and acme-translations. You're about to work on ticket PROJ-1234. Running t3_ticket PROJ-1234 creates this structure:
Each ticket gets its own directory containing one git worktree per affected repo — lightweight checkouts that share the .git directory with the main clone but have their own branch and working tree. A shared .env.worktree file provides allocated ports, database name, and variant configuration.
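Port allocation is exactly the kind of mechanical work that belongs in a script rather than agent improvisation. A sketch of one possible deterministic scheme — derive a port block from the ticket number so two tickets never collide (the function name, base port, and block size are assumptions for illustration, not teatree's actual code):

```python
def allocate_ports(ticket: int, base: int = 20000, span: int = 10) -> dict[str, int]:
    """Map a ticket number to a dedicated block of ports (hypothetical scheme)."""
    offset = base + (ticket % 1000) * span
    return {"backend": offset, "frontend": offset + 1, "db": offset + 2}

# Two tickets get disjoint blocks, so their dev servers can run side by side:
a, b = allocate_ports(1234), allocate_ports(5678)
print(a)  # -> {'backend': 22340, 'frontend': 22341, 'db': 22342}
assert not set(a.values()) & set(b.values())
```

Because the mapping is a pure function of the ticket number, re-running setup for the same ticket always yields the same ports — no state file needed for the common case.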
After creating the worktrees, t3_setup provisions the environment:
- **Symlinks** — `.venv`, `node_modules`, `.python-version`, and configurable shared directories are symlinked from the main repo (so you don't reinstall dependencies for every worktree)
- **Environment files** — `.env.worktree` with unique ports, database URL, variant-specific overrides
- **Database** — creates an isolated DB, imports from a snapshot or dump, runs migrations
- **direnv** — auto-loads environment variables when you `cd` into the worktree
- **Frontend dependencies** — installs if the lockfile changed
Then t3_start brings everything up: Docker services, migrations, backend server, frontend dev server. Each worktree is fully isolated — its own database, its own ports, its own services. You can have ticket 1234 and ticket 5678 running simultaneously without conflicts.
Why this matters
Without isolation, the most common failure is contamination between tickets. You're working on ticket A, make a database change, then switch to ticket B which expected the old schema — migrations fail, the frontend shows stale data, and you spend time figuring out what went wrong. Worktree isolation avoids this. Each ticket is a clean room.
The other benefit is parallelism. While waiting for CI on ticket A, start working on ticket B in a completely separate environment. No branch switching, no stashing, no "wait, which database am I pointing at?"
Multi-tenant awareness
If your project serves multiple tenants — each with their own configuration, feature flags, and sometimes database — teatree handles that too. The variant system (wt_detect_variant) auto-detects the target tenant from ticket labels, descriptions, or external trackers, then provisions tenant-specific databases, environment variables, and configuration. Feature flag checks during code review ensure changes are properly scoped per tenant.
The project overlay wires in your tenant-to-variant mapping; teatree handles the rest. This means "set up a worktree for ticket X" automatically produces an environment configured for the correct tenant — no manual env file editing, no guesswork about which tenant you're in.
Why t3_ticket instead of raw git commands
The convention is <ticket>/<repo>/ — a ticket directory containing worktrees. Raw git worktree add creates flat worktrees at whatever path you give it, which breaks the ticket-directory structure that every other tool expects. t3_ticket enforces the convention, handles branch naming (with your prefix), and creates worktrees across all affected repos in one call. The skill file marks this as (Non-Negotiable) because flat worktrees cause subtle breakage downstream.
The overlay and extension system
Teatree knows how to create worktrees, allocate ports, and orchestrate a development lifecycle. It doesn't know how to start your backend, import your database, or create your merge requests. That project-specific knowledge lives in a project overlay.
The three-layer architecture
When teatree needs to do something project-specific (start the backend, import a database, create an MR), it calls an extension point through a registry. The registry resolves the implementation using a 3-layer priority:
| Priority | Layer | Source | Example |
|---|---|---|---|
| Highest | Project | Your overlay's `project_hooks.py` | `t3_start` that runs Docker + Django + Angular |
| Middle | Framework | Framework integration (e.g., Django) | `wt_post_db` that runs `manage.py migrate` |
| Lowest | Default | Teatree core fallback | Usually a no-op or "not configured" message |
The registry itself is simple — 45 lines of Python:
```python
from collections.abc import Callable

_LAYERS = ("default", "framework", "project")
_LAYER_RANK = {layer: i for i, layer in enumerate(_LAYERS)}
_registry: dict[str, list[tuple[str, Callable]]] = {}


def register(point: str, fn: Callable, layer: str = "default") -> None:
    entries = _registry.setdefault(point, [])
    entries[:] = [(lyr, func) for lyr, func in entries if lyr != layer]
    entries.append((layer, fn))
    entries.sort(key=lambda x: _LAYER_RANK[x[0]])


def get(point: str) -> Callable | None:
    entries = _registry.get(point)
    if not entries:
        return None
    return entries[-1][1]  # highest priority = last entry


def call(point: str, *args, **kwargs):
    fn = get(point)
    if fn is None:
        raise KeyError(f"No handler registered for extension point {point!r}")
    return fn(*args, **kwargs)
```
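To see the layering concretely, here's the registry exercised end to end (a condensed copy so the snippet runs standalone; the handler return strings are made up for illustration):

```python
from collections.abc import Callable

# Condensed copy of the registry logic, plus a usage example.
_LAYERS = ("default", "framework", "project")
_LAYER_RANK = {layer: i for i, layer in enumerate(_LAYERS)}
_registry: dict[str, list[tuple[str, Callable]]] = {}

def register(point: str, fn: Callable, layer: str = "default") -> None:
    entries = _registry.setdefault(point, [])
    entries[:] = [(lyr, func) for lyr, func in entries if lyr != layer]
    entries.append((layer, fn))
    entries.sort(key=lambda x: _LAYER_RANK[x[0]])

def call(point: str, *args, **kwargs):
    entries = _registry.get(point)
    if not entries:
        raise KeyError(point)
    return entries[-1][1](*args, **kwargs)  # highest-priority handler wins

# A project-layer handler silently overrides the default one:
register("wt_run_backend", lambda: "not configured", "default")
register("wt_run_backend", lambda: "docker compose up backend", "project")
result = call("wt_run_backend")
print(result)  # -> docker compose up backend
```

Registration order doesn't matter: the sort by layer rank means the project handler ends up last (highest priority) even if the default was registered afterward.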
Registering a handler at the "project" layer automatically overrides anything at "framework" or "default". The framework layer is there so teatree can ship framework integrations (Django is the first) that work out of the box but can still be overridden by project-specific needs.
What an overlay looks like
A project overlay is a directory with this structure:
```
acme-overlay/
├── SKILL.md                        # Skill description + loading order
├── scripts/
│   └── lib/
│       ├── bootstrap.sh            # Shell wrappers (sourced after teatree)
│       ├── shell_helpers.sh        # Env loading, variant detection
│       └── project_hooks.py        # Extension point overrides
├── hook-config/
│   ├── context-match.yml           # Patterns that trigger this overlay
│   └── reference-injections.yml    # References to load per lifecycle phase
└── references/
    ├── prerequisites-and-setup.md
    ├── troubleshooting.md
    └── playbooks/
        └── README.md
```
The project_hooks.py file registers your overrides:
```python
from lib.registry import register


def register_acme():
    def wt_env_extra(envfile):
        with open(envfile, "a") as f:
            f.write("ACME_API_KEY=dev-key\n")

    def wt_db_import(db_name, variant, main_repo):
        # Import from your team's shared dump
        from lib.db import db_restore
        db_restore(db_name, f"{main_repo}/dumps/{variant}_latest.sql")
        return True

    def wt_run_backend(*args):
        import subprocess
        subprocess.run(["python", "manage.py", "runserver", "0.0.0.0:8000"],
                       check=False)

    register("wt_env_extra", wt_env_extra, "project")
    register("wt_db_import", wt_db_import, "project")
    register("wt_run_backend", wt_run_backend, "project")
```
The teatree core scripts call registry.call("wt_run_backend"), and your project handler runs instead of the default "not configured" stub. You only override what you need — everything else falls through to the framework or default layer.
There are 25 extension points
They cover the full lifecycle:
| Category | Extension Points |
|---|---|
| Workspace setup | `wt_symlinks`, `wt_env_extra`, `wt_services`, `wt_detect_variant` |
| Database | `wt_db_import`, `wt_post_db`, `wt_restore_ci_db`, `wt_reset_passwords` |
| Dev servers | `wt_run_backend`, `wt_run_frontend`, `wt_build_frontend`, `wt_start_session` |
| Testing | `wt_run_tests`, `wt_trigger_e2e`, `wt_quality_check` |
| Delivery | `wt_create_mr`, `wt_monitor_pipeline`, `wt_send_review_request`, `wt_fetch_failed_tests`, `wt_fetch_ci_errors` |
| Ticket management | `ticket_check_deployed`, `ticket_update_external_tracker`, `ticket_get_mrs` |
| Follow-up | `followup_enrich_data`, `followup_enrich_dashboard` |
The /t3-setup wizard can scaffold an overlay for you. Tell it your repos, your backend framework, and your database, and it generates the skeleton with commented-out examples for each relevant extension point. From there, fill in the blanks — or ask your AI agent to fill them in if it already knows your codebase (e.g., after working in the repos for a while).
The sourcing chain
Shell functions are loaded in order:
```shell
# In .zshrc:
source ~/.teatree                                # load config
source "$T3_REPO/scripts/lib/bootstrap.sh"       # teatree core functions
source "$T3_OVERLAY/scripts/lib/bootstrap.sh"    # project overlay overrides
```
The overlay's bootstrap has a guard — it checks that teatree was sourced first (_T3_SCRIPTS_DIR must be set). This prevents confusing errors from running the overlay standalone.
Inside Python scripts, the pattern is similar:
```python
import lib.init
lib.init.init()  # registers defaults + auto-detects framework

from lib.project_hooks import register_project
register_project()  # registers project overrides at 'project' layer

from lib.registry import call as ext
ext("wt_post_db", project_dir)  # calls highest-priority handler
```
Auto-loading hooks
Skills don't help if the agent doesn't load them. I got tired of manually telling it which skill to read, so I added a hook that suggests the right skills automatically based on what you're doing.
The mechanism is ensure-skills-loaded.sh, a hook that runs before every message (in Claude Code, this is a UserPromptSubmit hook; other agent platforms would use their own equivalent). It does three things:
1. Project context detection
The hook scans all skill directories for hook-config/context-match.yml files. If any pattern in the file matches the current working directory or the active-repo tracker, that skill is identified as the project overlay. This is how teatree knows you're working in a specific project without you having to say so.
```yaml
# hook-config/context-match.yml
cwd_patterns:
  - "acme-backend"
  - "acme-frontend"
```
If your $PWD contains acme-backend, the hook knows you're in the acme project and will suggest loading the ac-acme overlay alongside whatever lifecycle skill you need.
2. Intent detection
The hook parses the prompt to figure out which lifecycle phase you're in. It checks for:
- **URL patterns** — a GitLab issue URL triggers `t3-ticket`, a Sentry URL triggers `t3-debug`
- **Keyword patterns** — "implement" triggers `t3-code`, "push" triggers `t3-ship`, "broken" triggers `t3-debug`
- **End-of-session phrases** — "done", "all set", "that's it" trigger `t3-retro` (only if at least one other skill was loaded this session)
- **Bare imperative verbs** — "Fix the login page" triggers `t3-code`
If nothing matches and you're in project context, it defaults to t3-code — because most prompts in a project directory are about coding.
3. Dependency resolution and suggestion
Once the hook knows which skill you need, it:
- Parses the skill's `requires:` frontmatter to find dependencies
- Checks which skills are already loaded (tracked in a session file)
- Builds a suggestion list of skills that need loading
- Adds companion skills (e.g., `ac-django` for backend work in a Django project)
- Adds reference file injections from `reference-injections.yml`
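The resolution step amounts to a small graph walk over the `requires:` fields. A sketch, assuming the frontmatter has already been parsed into a dict (names are illustrative):

```python
def skills_to_load(target: str, requires: dict[str, list[str]], loaded: set[str]) -> list[str]:
    """Return target plus its transitive requires:, minus what's already loaded."""
    needed, stack = [], [target]
    while stack:
        skill = stack.pop()
        if skill in loaded or skill in needed:
            continue
        needed.append(skill)
        stack.extend(requires.get(skill, []))
    return needed

requires = {"t3-code": ["t3-workspace"], "t3-ship": ["t3-workspace"]}
print(skills_to_load("t3-code", requires, loaded=set()))             # -> ['t3-code', 't3-workspace']
print(skills_to_load("t3-code", requires, loaded={"t3-workspace"}))  # -> ['t3-code']
```

The `needed` check also guards against cycles, so a mutual `requires:` between two skills can't loop the hook forever.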
The output looks like:
```
LOAD THESE SKILLS NOW: /t3-workspace, /t3-code, /ac-acme.
ACME references to read: references/prerequisites-and-setup.md
```
The agent sees this as a system message and loads the skills before doing anything else. The wording is intentionally forceful ("LOAD THESE SKILLS NOW") — softer phrasing ("Consider loading...") gets ignored by models.
Symlink health checks
The hook also runs a once-per-session health check on skills that you maintain (determined by an ownership config):
- Verifies skill symlinks are actual symlinks (not stale copies)
- Checks that the source is a real git repository (not a downloaded zip)
- Validates that symlinks point into git repos (so retrospective commits work)
If anything is broken, it either auto-fixes (re-running the installer) or warns with a specific remediation.
The retrospective loop
After every non-trivial session, t3-retro runs a retrospective — a systematic audit of the conversation that produces concrete skill improvements and optionally contributes them upstream.
What the audit catches
The retrospective categorizes issues into specific types:
| Category | What went wrong | Example |
|---|---|---|
| False completion | Claimed "done" without full verification | Said feature was complete but didn't run the test suite |
| Skill not loaded | A relevant skill existed but wasn't loaded | Worked in project context without the overlay |
| Playbook miss | A playbook covered the task but wasn't consulted | Didn't check the deployment playbook before pushing |
| Over-engineering | Did unnecessary work | Built a migration when admin config would have sufficed |
| Under-engineering | Missed required work | Updated the backend but forgot the frontend changes |
| Hook gap | Auto-loading should have triggered but didn't | Hook didn't detect intent from "fix the flaky test" |
| Stale guidance | Followed outdated instructions | Playbook referenced pre-refactoring patterns |
For each issue, the retrospective determines the root cause and writes the fix directly into the skill system — a new guardrail, an updated playbook, a troubleshooting entry, a hook pattern.
Where improvements go
The retrospective respects a clear hierarchy:
1. **Project overlay** (`$T3_OVERLAY`) — receives project-specific improvements (troubleshooting, playbooks, guardrails). This is the default target when `T3_CONTRIBUTE` is `false`.
2. **Core skills** (`$T3_REPO`) — only modified when `T3_CONTRIBUTE=true`, and only for generic improvements (missing verification steps, hook gaps, stale core guidance).
3. **Personal config** (memory files, agent config like `AGENTS.md`) — for user preferences and environment-specific facts. Also serves as a fallback location when the overlay isn't maintained by the user.
The contribution model
When you enable T3_CONTRIBUTE=true:
- The retrospective creates a local commit on the current branch in your fork. It never pushes automatically.
- A privacy scan checks for emails, home directory paths, API keys, internal hostnames, and any terms in `$T3_BANNED_TERMS`.
- When you're ready, `/t3-contribute` reviews what will be pushed, checks for fork divergence, and optionally opens an issue on the upstream repo.
The idea is that every user's failures make the system better for all users — but only through an explicit, reviewed contribution path. Nothing happens without your consent. The default is T3_CONTRIBUTE=false, which means the retrospective only improves your project overlay and personal config.
A concrete example
Suppose during a session, the agent set up a multi-repo worktree and claimed it was ready, but the backend server failed to start due to port conflicts with a previous worktree. The agent didn't verify that the infrastructure was actually running before declaring complete.
The retrospective would:
1. **Audit** — identify this as "false completion": claimed infrastructure ready without verification evidence
2. **Root cause** — the `t3-workspace` script runs through all setup steps but has no way for projects to define and verify health checks before the agent declares the worktree usable
3. **Fix (core)** — add a new extension point `wt_health_check` to `t3-workspace` that projects can implement
4. **Fix (overlay)** — implement `wt_health_check` in the project's `project_hooks.py` to curl the backend, check the frontend dev server, verify the database is accessible
5. **Verify** — check that the skill file parses, the extension point is registered correctly, and the overlay hook runs without errors
6. **Commit** — if `T3_CONTRIBUTE=true`, commit the core extension point to the fork's teatree core skills; overlay changes go to the project overlay repo
Next time the agent sets up a worktree, t3-workspace runs the project's health checks before finishing — the core provides the mechanism, the project overlay provides the specifics. Both are enforced going forward.
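A possible overlay-side implementation of such a health-check hook, using only the standard library (the signature and return shape are illustrative assumptions — in practice it would live in `project_hooks.py` and be registered at the project layer):

```python
from urllib.request import urlopen
from urllib.error import URLError

def wt_health_check(ports: dict[str, int], timeout: float = 2.0) -> dict[str, bool]:
    """Probe each service over HTTP; a service is healthy only if it answers below 500."""
    results = {}
    for name, port in ports.items():
        try:
            with urlopen(f"http://127.0.0.1:{port}/", timeout=timeout) as resp:
                results[name] = resp.status < 500
        except (URLError, OSError):
            # Connection refused / timed out -> service is not up.
            results[name] = False
    return results

# Nothing listens on port 1, so the check reports the service as down:
print(wt_health_check({"backend": 1}))  # -> {'backend': False}
```

The point is that "servers started" becomes a claim backed by an actual HTTP round-trip, which is exactly the guardrail the retrospective added.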
It adds up
A single retrospective might fix one guardrail. After enough sessions, you've accumulated a lot of them — each one from a specific failure that actually happened.
Companion skills
Teatree handles the lifecycle — ticket intake, worktree management, TDD, review, delivery. It doesn't know about your programming language's conventions or your framework's best practices. That's what companion skills are for.
Companion skills are standalone skills that live in separate repos and are loaded alongside teatree when relevant. I maintain a few (souliane/skills) covering Django and Python conventions, but the best companion skill for your stack is one you find (or build) yourself. I wrote a separate post about skill-driven development and the skills I'm open-sourcing.
The project overlay's hook-config/context-match.yml wires companion skills to repo patterns:
```yaml
companion_skills:
  ac-django:
    - "acme-backend"
  ac-python:
    - "acme-backend"
```
When the hook detects you're working in acme-backend, it suggests loading ac-django and ac-python alongside the lifecycle skill. You get framework conventions without cluttering the core lifecycle skills with language-specific details.
This separation matters. Django conventions change on a different cadence than worktree management. Keeping them in separate skills means you can update one without touching the other, and teams using Flask or Express aren't burdened with Django-specific guidance.
Companion skills vs framework layer
These are different things. The framework layer is teatree's built-in middle priority in the 3-layer extension point registry — it ships stock implementations for common frameworks (e.g., a Django integration that auto-registers manage.py migrate as the post-DB hook). Companion skills are external standalone skills that teach the agent coding conventions — they don't register extension points, they provide guidelines. The framework layer handles infrastructure (how to run migrations); companion skills handle conventions (how to write good Django code).
Getting started
Prerequisites
- An AI coding agent (the auto-loading hooks currently target Claude Code, but the skills and scripts work with any agent that can read files and run commands)
- Python 3.12+
- uv (Python package manager)
Installation
Teatree requires a local git clone — it has shared infrastructure (scripts/, references/, integrations/) that lives outside the individual skill directories, so npx skills add alone isn't enough.
Fork the repo on GitHub (or just clone it directly if you don't plan to contribute back), then:
```shell
git clone git@github.com:YOUR_USERNAME/teatree.git ~/workspace/teatree
cd ~/workspace/teatree
./scripts/install_skills.sh
```
The install script creates symlinks from your agent's skills directory to the clone. Then open your agent and run /t3-setup — it handles config, shell integration, hooks, and optionally scaffolds a project overlay for your repos.
If you want the retrospective loop to write improvements back into skill files, set T3_CONTRIBUTE=true in ~/.teatree (created by /t3-setup). This requires a fork — the agent pushes to your fork, not to the upstream repo.
The setup wizard:
- **Checks prerequisites** — verifies all required tools are installed, reports a summary table
- **Creates `~/.teatree`** — asks for workspace path, branch prefix, issue tracker, chat platform
- **Scaffolds a project overlay (optional)** — ask it about your repos, framework, and database, and it generates the skeleton
- **Configures shell integration** — adds sourcing lines to `.zshrc` or `.bashrc`
- **Installs skill symlinks** — creates the symlink chain from the agent's skills directory to your clone
- **Configures hooks** — sets up `ensure-skills-loaded.sh` and the statusline (Claude Code-specific; other agents would configure their own hooks)
- **Runs a smoke test** — verifies hooks parse, statusline runs, Python imports work
After setup, restart your agent (or start a new conversation). Try: "start working on ticket PROJ-1234" — the hook should suggest /t3-ticket + /t3-workspace, and the agent will take it from there.
You can re-run /t3-setup at any time as a health check. It validates the existing installation, checks for broken symlinks, verifies hook wording, and reports what needs fixing.
The directory structure after setup
```
~/
├── .teatree                   # Config file (sourced by shell)
├── .local/share/teatree/      # Runtime data (ticket cache, dashboard, MR reminders)
├── .claude/                   # Claude Code example (adapt paths for your agent)
│   ├── CLAUDE.md              # Agent instructions (skill-loading block)
│   ├── settings.json          # Hooks, statusline
│   └── skills/
│       ├── t3-ticket -> ~/workspace/teatree/t3-ticket
│       ├── t3-code -> ~/workspace/teatree/t3-code
│       ├── ...
│       └── ac-acme -> ~/workspace/acme-overlay
└── workspace/
    ├── teatree/               # Teatree clone (or fork)
    ├── acme-overlay/          # Project overlay
    ├── acme-backend/          # Main repo clone
    ├── acme-frontend/         # Main repo clone
    └── ac/                    # Ticket worktrees
        ├── 1234/
        │   ├── acme-backend/  # Worktree
        │   ├── acme-frontend/ # Worktree
        │   └── .env.worktree  # Shared env
        └── 5678/
            └── ...
```
The symlinks ensure that skill files always resolve to the live git clone. This is important for the retrospective — when the agent writes improvements to skill files, the changes land in a real git repository where they can be committed and pushed.
When it helps (and when it doesn't)
It helps most with: structured, repeatable processes that span multiple repos or require project-specific knowledge. Ticket intake, worktree setup, TDD cycles, code review, MR creation, CI debugging. The kind of work that eats hours but follows a pattern.
It helps less with: one-off creative decisions, highly ambiguous tasks, or projects simple enough that a single repo with npm start covers everything. If your development workflow is "edit a file and push," teatree is overkill.
The sweet spot is when you have enough friction that encoding it pays off through repetition. The project works for my workflow but hasn't been tested beyond that. If something doesn't click for your setup, open an issue or a PR. Or point your AI agent at the problem and let it fix things until it works for you — that's kind of the point.
A note on security
Teatree skills are prompt instructions — they control what your AI agent does. That makes the supply chain a security surface. The defaults are conservative: self-improvement is off (T3_CONTRIBUTE=false), pushing is disabled (T3_PUSH=false), and there is no auto-update mechanism. You opt in to each level of automation explicitly. If you use a fork from someone else, you're trusting that person's skill files as agent instructions — review changes before pulling.
Why "teatree"?
TEA's Extensible Architecture for work*tree* management. Also, tea tree oil cuts through grime, which felt fitting.