Rick Fleming

Posted on Apr 29 • Originally published at taskeract.com

Past the Worktree

#aiagents #developerworkflow #checkpoints #gitworktrees

If you have used a desktop AI coding tool in the last several months, you have probably noticed they all solve the same problem the same way. You start a session for a task. The app spins up an isolated copy of your repo. The agent runs there. When you are done, you push, merge, or throw it away.

The thing under the hood doing the isolation, in nearly every case, is git worktree.

Pull up the docs for any of the desktop tools that wrap Claude Code, Codex, or similar agents in a multi-session GUI shell. You will find a worktrees tab in the settings, or a setup-script editor that targets per-thread environments, or a help article explaining that each agent runs in its own checked-out copy of the repo. The product names rotate. The mechanism does not.

This is what isolation looks like across the category. Worktrees are how it works.

The thing is, worktrees were not designed for this.

What worktrees were for

git worktree shipped in Git 2.5, in July 2015. The original use case was a developer with one machine, one editor, and one annoying problem: you are hours into a feature branch, an urgent bug comes in, and you do not want to commit your half-baked work or stash it. So you spin up a second working tree on main in another directory, fix the bug, push, and come back. The shape of the feature is human, occasional, short-lived. Two or three working trees, hours to a couple of days alive, paired with one developer's attention.

You can squint and tell yourself multi-agent AI work is the same shape. You would be missing what is different about it.

A modern AI session might fork five attempts at the same task in a minute, or restore to a state from twenty checkpoints ago, or pick up a session another teammate started on a different machine. The cadence is wrong for worktrees. The number of branches is wrong. The lifecycle is wrong. And a few sharp limitations that humans only hit occasionally turn into near-constant friction for AI workloads.

What worktrees can't carry

A working AI session is more than a copy of your repo. The agent is having a conversation with you. It took a step at minute 18 that turned out to be wrong, you steered it back at minute 22, and at minute 40 you want to walk back to minute 17 to try a different approach entirely. None of that lives in git.

Worktrees only carry what git tracks. The agent's conversation history: not in git. The DVR-style recording of the session, with command output and intermediate file states scrubbable on a timeline: not in git. The decision trail: every moment you considered a different approach, every detour you abandoned, every save point you marked because the next change felt risky: not in git.

Tools that are worktree-based have to bolt these onto the side. They write conversation logs to disk somewhere outside the worktree. They track session metadata in their own database. They wire up cleanup logic so a deleted worktree also deletes the conversation file that referenced it. Each piece works individually. The shape of the seams between them is where the friction lives.

The fundamental version of the same point: every operation an agent or a user takes inside a worktree-based tool has to round-trip through git. Save your work? Make a commit. Revert? git reset or git checkout something. Switch to another attempt? Switch worktrees. Throw it away? Delete the worktree and clean up the branch it left behind. Decide you wanted that work after all? Hope you did not already delete it. Humans tolerate this because git is the price of admission for working in a team. AI agents do not have a stake in that price. They just want to write files, run tests, and try again.
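To make the round-trip concrete, here is a rough sketch of how a worktree-based tool might translate session actions into git invocations. The action names and the git_commands_for helper are hypothetical, invented for illustration. The git subcommands themselves are standard, but we are not claiming any particular tool uses exactly this mapping.

```python
from typing import Dict, List


def git_commands_for(action: str, worktree: str, branch: str) -> List[List[str]]:
    """Return the git invocations a session action would plausibly need.

    Hypothetical mapping: the action names are made up for this sketch,
    and the worktree path and branch name are placeholders.
    """
    table: Dict[str, List[List[str]]] = {
        # "Save your work? Make a commit."
        "save": [
            ["git", "-C", worktree, "add", "--all"],
            ["git", "-C", worktree, "commit", "-m", "WIP"],
        ],
        # "Revert? git reset ..."
        "revert": [["git", "-C", worktree, "reset", "--hard", "HEAD~1"]],
        # "Switch to another attempt? Switch worktrees."
        "new_attempt": [["git", "worktree", "add", "-b", branch, worktree]],
        # "Throw it away? Delete the worktree" and the branch it used.
        "discard": [
            ["git", "worktree", "remove", "--force", worktree],
            ["git", "branch", "-D", branch],
        ],
    }
    return table[action]


# Print what discarding one attempt would cost in git terms.
for cmd in git_commands_for("discard", "../attempt-5", "attempt-5"):
    print(" ".join(cmd))
```

Nothing here is executed against a repository. The point is only that every user-facing verb fans out into one or more git commands, each with its own failure modes.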

Branching from a moment, not a ref

Worktrees branch off git refs. You ask for a new worktree and you point it at a branch, commit, or tag. The fork is a thing in git's mental model.

A real session works on a finer grain than that. You want to fork off the moment when the agent was about to try a different approach but you told it to keep going. You want to fork off the state right before a risky refactor, regardless of whether that state was ever a commit. You want to fork off "save point twenty minutes ago, before any of this conversation went where it went." Those moments are not refs. There is nothing for git to fork from.

In Taskeract, every checkpoint is a moment, and every moment is forkable. Click an older node in the history graph, work forward, save: a new line peels off the timeline at exactly that point. The base node is forkable. The first checkpoint is forkable. The hundredth checkpoint is forkable. Multiple lines can fan out from any single node. The graph is not a tree of refs grafted onto a parent branch. It is a tree of moments, and the entire tree is browsable, restorable, and branchable from any vertex.

This is what we mean when we say it is not tied to a linear base point. Worktree models tend to assume one base, one feature branch off it, occasionally a stash or a temporary detour. Real AI work fans out. Five attempts at the same starting state. Three explorations from three different mid-session checkpoints. A fork off a fork off a fork. The graph holds all of it without any of it becoming a real branch on the remote.
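As a data structure, a tree of moments can be very small. The sketch below is an illustrative model only, not Taskeract's implementation: the Node and MomentGraph names, and the save, restore, and tips methods, are assumptions made up for this example. It shows five attempts fanning out from one starting state, plus a fork off a fork, with no git ref involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    id: int
    title: str
    parent: Optional[int]  # None only for the base node
    children: List[int] = field(default_factory=list)


class MomentGraph:
    """Hypothetical checkpoint graph: a tree of moments, not of git refs."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {0: Node(0, "base", None)}
        self.active = 0  # the node the workspace currently sits on

    def save(self, title: str) -> int:
        """Add a checkpoint under the active node and move onto it."""
        new_id = len(self.nodes)
        self.nodes[new_id] = Node(new_id, title, self.active)
        self.nodes[self.active].children.append(new_id)
        self.active = new_id
        return new_id

    def restore(self, node_id: int) -> None:
        """Move to any existing node, whether or not it is a tip."""
        if node_id not in self.nodes:
            raise KeyError(node_id)
        self.active = node_id

    def tips(self) -> List[int]:
        """Nodes with no children: the end of each line of work."""
        return [n.id for n in self.nodes.values() if not n.children]


graph = MomentGraph()
start = graph.save("before risky refactor")

# Five attempts from the same starting state.
attempts = []
for i in range(5):
    graph.restore(start)
    attempts.append(graph.save(f"attempt {i + 1}"))

# A fork off a fork: continue from the first attempt.
graph.restore(attempts[0])
graph.save("attempt 1, variant")

print(graph.nodes[start].children)  # [2, 3, 4, 5, 6]
print(graph.tips())                 # [3, 4, 5, 6, 7]
```

Saving from a non-tip node is the whole trick: the new node simply becomes another child, so forking needs no separate operation.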

What we built instead

Taskeract's isolation model is not git worktrees. It is not commits, not branches, not anything else from the git repertoire. It is a separate primitive we call a checkpoint.

A checkpoint is a save point. Not a commit. Whenever there is uncommitted work, a button in the session header shows the number of changes; click it (or press Mod+K) and a composer opens with a title field (pre-filled with the timestamp) and an optional description. Type something and hit Enter, or just hit Enter. You can also tell the agent to take the checkpoint, in which case the composer opens with a meaningful title and description the agent thought were worth recording, ready for you to accept, edit, or discard. Either way, the checkpoint records the state of the workspace, the agent's conversation, and the position in the session's recording at that moment, and a node is added to the session's history graph.

You do not write commit messages. You do not pollute the branch with WIP entries. You do not think about whether to stash or commit before switching contexts. Saves are cheap, the cost is invisible, and you can take fifty of them without any of them ever showing up in the git branch reviewers will eventually look at.

Restore is the inverse. Click any node in the graph and click Restore. The workspace materializes there, the conversation jumps to the matching turn, the recording rewinds to the same point, and the session resumes as if the intervening time had not happened. No detached-HEAD warning. No git reset --hard. No risk of losing uncommitted work, because anything you had unsaved gets auto-captured as a WIP checkpoint on the line you walked away from, available with a simple restore if you change your mind.
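One way to picture that restore behavior is the sketch below. It is a simplified, in-memory model under our own assumptions, not the actual implementation: Checkpoint, Session, and their fields are hypothetical names, files are plain strings in a dict, and "unsaved work" is detected only by comparing file contents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Checkpoint:
    id: int
    parent: Optional[int]
    title: str
    files: Dict[str, str]  # workspace snapshot: path -> contents
    turn: int              # index into the agent conversation
    recording_ms: int      # position in the session recording


class Session:
    """Hypothetical session holding workspace, conversation, and recording."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = dict(files)
        self.turn = 0
        self.recording_ms = 0
        self.checkpoints: List[Checkpoint] = []
        self.active: Optional[int] = None
        self.checkpoint("base")

    def checkpoint(self, title: str) -> int:
        """Record all three pieces of state as a child of the active node."""
        cp = Checkpoint(len(self.checkpoints), self.active, title,
                        dict(self.files), self.turn, self.recording_ms)
        self.checkpoints.append(cp)
        self.active = cp.id
        return cp.id

    def dirty(self) -> bool:
        """True if the workspace differs from the active checkpoint."""
        return self.files != self.checkpoints[self.active].files

    def restore(self, target: int) -> Optional[int]:
        """Jump to a checkpoint; unsaved work is captured as a WIP node first."""
        wip = self.checkpoint("WIP (auto)") if self.dirty() else None
        cp = self.checkpoints[target]
        self.files = dict(cp.files)
        self.turn = cp.turn
        self.recording_ms = cp.recording_ms
        self.active = cp.id
        return wip  # id of the auto-captured checkpoint, or None


session = Session({"app.py": "v1"})
session.files["app.py"] = "v2"
session.turn, session.recording_ms = 5, 60_000
session.checkpoint("works")

session.files["app.py"] = "v3 (unsaved)"  # edits with no checkpoint yet
wip_id = session.restore(0)               # walk back to the base node

print(session.files["app.py"], session.turn)  # v1 0
session.restore(wip_id)                       # change your mind
print(session.files["app.py"])                # v3 (unsaved)
```

The WIP node is attached to the line you walked away from, so nothing is overwritten and the detour stays visible in the graph.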

The graph itself is the source of truth. Not a branch listing. Not a reflog. Not a stash entry that you might forget about. The graph shows every checkpoint as a node, every fork as a divergent line, and the active node as wherever the workspace currently is. Worktree-based tools cannot really render this view because git does not carry enough of the necessary state.

Multi-attempt is a workflow now

This is what makes "give me five different ways to do X" actually work. The agent itself can take checkpoints, navigate them, and produce sibling tips on the timeline graph for you to pick from. Restore to a starting state, run an attempt, save the tip, restore back, run a different attempt, save another tip. Five forks off the same starting point, all visible in the same graph, all selectable, all comparable.

There is no worktree shuffle. There is no per-attempt branch ceremony. The agent does not have to convince git that this fifth experimental idea deserves a real branch name on the remote. The save points are the structure.

Diffs that read like code

Every checkpoint knows what changed in it, all the way back to the base branch. Select a checkpoint, click View diff, and the panel swaps for a full-screen diff of the cumulative changes that node represents. Close, and you are back at the graph with your selection intact.
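A cumulative view like this can be computed by walking from a node back to the base and replaying each checkpoint's changes in order. The sketch below assumes, purely for illustration, that each checkpoint stores a per-file delta; the Delta type and the cumulative_changes function are hypothetical, and the sketch says nothing about how the diff is rendered.

```python
from typing import Dict, List, Optional, Tuple

# path -> new contents, or None if the file was deleted at that checkpoint
Delta = Dict[str, Optional[str]]

# checkpoint id -> (parent id, delta); the base node has parent None
Graph = Dict[int, Tuple[Optional[int], Delta]]


def cumulative_changes(graph: Graph, node: int) -> Delta:
    """Fold every delta on the path from the base node down to `node`."""
    path: List[Delta] = []
    current: Optional[int] = node
    while current is not None:
        parent, delta = graph[current]
        path.append(delta)
        current = parent

    merged: Delta = {}
    for delta in reversed(path):  # oldest first, so later edits win
        merged.update(delta)
    return merged


graph: Graph = {
    0: (None, {}),                          # base
    1: (0, {"a.py": "x"}),
    2: (1, {"a.py": "y", "b.py": "z"}),     # one line of work
    3: (1, {"a.py": None}),                 # a sibling line that deletes a.py
}

print(cumulative_changes(graph, 2))  # {'a.py': 'y', 'b.py': 'z'}
print(cumulative_changes(graph, 3))  # {'a.py': None}
```

A known limitation of this simplification: a file that was changed and later changed back would still be listed, because the fold never compares against the base contents.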

The diff itself is semantic, not a wall of plus and minus signs. It understands the language you are reading: it lines up code by structure rather than by character offset, recognizes when a function moved instead of changed, and ignores the noise that text-only diff tools surface. Renaming a variable does not look like a hundred-line rewrite. Reformatting a block does not bury the one real change underneath. The diff reads more like a code review and less like a patch.

You can compare two checkpoints by hopping between them. You can read what a long line of work looks like in aggregate without ever materializing it on disk. You can spot the moment a regression went in by walking the timeline. Worktree-based tools cannot really do any of this on top of git. Most do not try.

Save versus publish

Saving and publishing are different operations.

On Pro, your teammates are already inside the session with you. The timeline syncs live, they can open it on their own machine, take their own checkpoints on it, branch off any node in the graph. They do not have to wait for anything to be "ready" to see what is going on. The session is the shared workspace.

Publish is the separate step for everything outside that boundary: pushing the work out to git as a branch that reviewers, CI, and the merge process can act on. Click Publish and your work goes out as one clean commit, ready for review or merge. Take fifty checkpoints to get there; the people looking at the PR see one tidy commit. Republish after more work and the branch updates. No rebase choreography, no squash-merge ritual, no interactive cleanup before pushing.

The fifty checkpoints are still in your timeline. They are still browsable, still restorable, still part of the session. They just are not in what got published, because nobody reviewing the PR needs to read fifty WIP entries to understand what changed. Worktree-based tools do not have a way to think about save and publish as separate things; in those models, every save is a commit, and every commit is something the git side will eventually see.

Sessions that travel

Pro adds the part that no worktree-based tool can really offer: the timeline travels.

Every checkpoint is end-to-end encrypted on the device that took it before it leaves the machine. The encrypted timeline syncs to your other devices, and to teammates in your organization. Open the same session from a different laptop and the workspace materializes at the latest saved checkpoint, the conversation history is there, the recording plays back, the graph is the same shape. Take a new checkpoint anywhere in that group and the rest see it within seconds. Only the devices on your account hold the keys; the cloud cannot read the contents.

The team angle is the strongest one. When you review a pull request, you can open the actual session that wrote it and watch the agent's full conversation, the timeline of decisions, and the recording. When you pick up an issue someone else started, the agent picker shows their existing session as a one-click attach. The work continues on your machine from where they left off, with the full graph intact. Cross-device is not a feature bolted on after the fact. It is what falls out of having a save primitive that is not tied to git's local-machine semantics.

This is genuinely hard to do with worktrees. Worktrees live on a single machine's filesystem. Their state lives in the git ref database, the working tree, and untracked files that git refuses to acknowledge. Reproducing one on a second machine means cloning, branch tracking, manual setup, and hoping the result matches. Reproducing the agent's conversation that goes with it is an entirely separate problem. Reproducing the recording is a third one. As far as we know, nobody has built all of it, because it is not really doable inside the model.

Where this goes

The worktree-based tools are good. They made parallel agent work tractable for the first time, and we have used them. The point of this article is not that they are bad. It is that they are a clever borrow from a different era, and the borrow is showing its limits.

A real AI session is more than a checkout of your repo. It is the workspace, the agent's conversation, the recording, and the graph of moments worth coming back to. None of that fits inside git worktree cleanly, because none of that was what git worktree was for.

We built Taskeract on top of git worktrees first. We hit every limit in this article, repeatedly, and a few we did not write down. So we ripped that layer out and rebuilt around the checkpoint graph instead. The save points carry everything: code, conversation, and recording, all in one node. The graph is the source of truth, every node in it is forkable, and the timeline travels across your devices and your team. Worktrees are no longer in the picture.

The future state of AI development is here. The question is whether your tooling lets you reach it.
