I have been using Git since my first job. Branches, rebases, cherry-picks, stash: I use them all the time. Worktrees existed the whole time and I basically ignored them. No big reason. I just never needed them badly enough to stop and learn how they work.
That changed a few months ago. The reason was unexpected: I was trying to make AI agents more useful for maintenance work, not just for writing code.
## some background on the problem
Our team maintains a FHIR converter. It pulls health data from local government systems, immunization records, maternal care programs, TB treatment data, and converts everything into FHIR R4 format for a national health platform. We cover two districts, with around 30 different data source types between them. Each source has its own data structure and its own issues.
One district had been getting less attention for a while. Not because we ignored it, but because the most urgent problems always came first and the rest kept piling up.
I track conversion success rates in BigQuery. One afternoon I ran a query to see how things actually looked:
```sql
SELECT
  table_name,
  COUNTIF(is_success = TRUE) AS success_count,
  COUNTIF(is_success = FALSE) AS failure_count,
  COUNT(*) AS total_count,
  ROUND(COUNTIF(is_success = FALSE) / COUNT(*) * 100, 2) AS failure_rate_pct
FROM `your-project.raw_data.convert_to_fhir_report`
GROUP BY table_name
ORDER BY failure_rate_pct DESC
```
The result was worse than I expected:
```
100.0% | tb_tracker_table_a     |   7,753 failures
 99.4% | nutrition_app_table_a  |  23,602 failures
 77.2% | hepatitis_app_table_a  |  22,437 failures
 43.2% | health_service_table_c |   8,096 failures
 28.1% | health_service_table_a | 168,713 failures
```
168,000 failed conversions on one table. In total, thirteen tables were above a 25% failure rate.
The normal way to fix this is to go through them one by one. Pull some failed records, find the error, fix the template or the utility function, test it, then move to the next one. One table takes maybe 2–3 hours if you know the codebase well. Thirteen tables means a lot of days.
## where agents help, and where they get stuck
Over the last few months I have been experimenting with AI agents for development work. Not just code suggestions, more like giving a full task to an agent and letting it work. Something like: here is the broken thing, here is the test setup, go fix it and tell me when you are done.
For converter failures, the workflow is:
- Pull 100 failed UUIDs from BigQuery
- Check data quality, fill rates, date format problems, anything structural
- Test UUIDs one at a time to find the actual error
- Fix the template or the shared utility function
- Run all 100, make sure at least 90% pass
- Delete the failed rows from the report table so the system picks them up again
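The six steps above are essentially one loop per table. Here is a sketch of that loop as an orchestration function; the injected callables (`fetch_failed_uuids`, `run_batch`, `apply_fix`, `reset_report`) are hypothetical stand-ins for the real BigQuery pull, test runner, and report cleanup, and only the control flow comes from the checklist:

```python
# Sketch of the per-table workflow. The helper callables are hypothetical
# stand-ins; only the control flow mirrors the checklist above.
def fix_table(table_name, fetch_failed_uuids, run_batch, apply_fix,
              reset_report, pass_threshold=0.90):
    uuids = fetch_failed_uuids(table_name, limit=100)  # step 1: pull failed UUIDs
    results = run_batch(uuids)                         # steps 2-3: find the error
    apply_fix(table_name, results)                     # step 4: fix template/util
    results = run_batch(uuids)                         # step 5: re-run all 100
    pass_rate = sum(results.values()) / len(results)
    if pass_rate < pass_threshold:
        raise RuntimeError(f"{table_name}: only {pass_rate:.0%} passed")
    reset_report(table_name, uuids)                    # step 6: re-queue records
    return pass_rate
```

The 90% gate is the only hard rule: if the re-run does not clear it, the fix is not done and nothing gets deleted from the report table.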
An agent can do this whole process. To show what that looks like in practice: one table was failing because a date field was coming in as `2024-01-15 00:00:00` instead of `2024-01-15`. Our `convertStringToDate` function expected the format `%Y-%m-%d` and returned empty when it received the datetime format with the time part. An empty date meant a required FHIR field was missing. The record got rejected.
The agent found this by testing failed records one at a time. The third UUID showed the error. The agent traced it back to the utility function, added a fallback to handle the extra time part, then ran the full test batch again to confirm the fix worked. The whole process took about 20 minutes of agent work. When I looked at the code change, it was 4 lines in one function, with a clear explanation of what was wrong.
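As a sketch of what that fallback can look like (the function name matches the article, but the body here is an illustrative assumption, not the actual 4-line change):

```python
from datetime import datetime

def convertStringToDate(value: str) -> str:
    # Try strict %Y-%m-%d first, then fall back to the datetime form
    # that was arriving from the source ("2024-01-15 00:00:00").
    for fmt in ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return ""  # empty result: the required FHIR date stays missing
```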
Another table had a different issue. A field that should contain a boolean `true` was arriving as the text string `"true"`. The FHIR validator rejected it with `expected boolean: found "true"`. Same process: the agent found it, traced it to a function returning `["true"]` instead of `[True]`, fixed it, and checked that no other templates were using the same function for non-boolean fields before making the change.
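A minimal sketch of that kind of normalization, with a hypothetical function name (the real fix changed one return value; this shows the general coercion):

```python
def to_fhir_boolean(value):
    # Normalize string "true"/"false" from the source system into real
    # booleans before they reach the FHIR payload. Name is hypothetical.
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
    raise ValueError(f"expected boolean-like value, got {value!r}")
```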
The per-table work was not the problem. The problem was that I could only run one agent at a time.
If I had one agent working on nutrition_app_table_a and I started another on tb_tracker_table_a, they would conflict. Both agents change `tests/test_specific_uuids.py` to set up their test. Both might also change shared functions in `extra_logics/general.py` if they find a common bug. On the same branch, in the same directory, they would be overwriting each other all the time.
So I was running them one after another, which made the whole thing much slower.
## what worktrees actually do
A git worktree lets you check out multiple branches in separate folders at the same time. Each folder has its own files and its own changes, but they all share the same .git folder.
```bash
git worktree add ../converter-fix-1 -b fix/nutrition-app-a
git worktree add ../converter-fix-2 -b fix/tb-tracker-a
git worktree add ../converter-fix-3 -b fix/hepatitis-app-a
git worktree add ../converter-fix-4 -b fix/health-service-a
```
Four folders, four branches, one repository. The extra storage is small because you are not copying the whole .git folder, just the working files.
I knew roughly what worktrees do. What I did not realize until recently is that this is exactly what you need for parallel agents. Each agent gets its own folder. They never see each other's files. Agent 1 can be halfway through testing nutrition_app_table_a while Agent 2 is just starting on tb_tracker_table_a. No conflicts, no waiting for each other.
## how I organized the actual run
With 13 failing tables, I sorted them by how many failed conversions they had. I also checked which tables share the same template file, because fixing one template sometimes fixes several tables at once.
The first batch had four worktrees running at the same time:
| worktree | table | failed conversions |
|---|---|---|
| 1 | health_service_table_a | 168,713 |
| 2 | health_service_table_b | 45,707 |
| 3 | nutrition_app_table_b | 28,977 |
| 4 | hepatitis_app_table_a | 22,437 |
Each agent received a prompt like this:
```
please create a new worktree for fixing health_service_table_a,
then run the fix-converter-failures workflow.
TABLE_NAME=health_service_table_a
CODE=health_service_code_a
BQ project=your-project, credentials=credentials.json
After the process is READY TO DEPLOY, merge back to the main branch,
commit and push, then delete the worktree.
```
The agents ran separately. While worktree 1 was checking fill rates on the patient service table, worktree 2 was already testing its third UUID on the lab table. When one agent found a bug in a shared utility function, it fixed the bug in its own branch and the other agents were not affected.
After the first batch finished and merged, I started the next four. The whole process felt more like reviewing work than doing it.
## the other half: template coverage
Fixing failures is one problem. The other quality problem was coverage: converter templates that were working without errors but quietly skipping fields that exist in the source data and should be mapped to FHIR.
We have a coverage check: compare each template against its data dictionary and count what percentage of the fields are actually mapped. If it is below 90%, the template needs more work.
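That check is essentially set arithmetic between the template's mapped fields and the data dictionary. A sketch, with illustrative field names:

```python
def template_coverage(mapped_fields, dictionary_fields):
    # Compare the fields a template actually maps against the data
    # dictionary; return the coverage percentage and what is missing.
    dictionary = set(dictionary_fields)
    mapped = set(mapped_fields) & dictionary
    missing = sorted(dictionary - mapped)
    pct = 100.0 * len(mapped) / len(dictionary) if dictionary else 100.0
    return pct, missing

pct, missing = template_coverage(
    ["patient_id", "visit_date", "weight"],
    ["patient_id", "visit_date", "weight", "height", "muac"],
)
# 60% coverage, missing ["height", "muac"]: below 90%, needs more work
```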
With 15 templates to check, I used the same approach: four worktrees, grouped by data source type.
- worktree 1: maternal death reporting templates
- worktree 2: nutrition monitoring (three templates)
- worktree 3: community health screening (four templates)
- worktree 4: immunization + TB treatment
Agents run coverage analysis on each template. If it passes 90%, move on. If not, add the missing fields, test, validate, done. Merge and clean up the worktree.
This step found a different kind of problem than the failure fixes. A template can have a 0% failure rate and still be missing half the fields it should be mapping. Records were converting successfully but losing clinical data (lab results, risk factor flags, procedure details) because the template never mapped those fields at all. The failure rate query does not show this problem. Coverage analysis does.
Some templates were already above 90% and needed no changes. Others were well below. One community health screening template was correctly mapping vital signs but was missing several important fields (blood glucose, cholesterol, abdominal circumference), all present in the source data and listed in the data dictionary, just never connected. The agent added them, ran the tests, and validated against the FHIR server. That kind of work would normally sit in a backlog for weeks.
One thing I did not expect: some data dictionaries were split into multiple CSV files. One screening app reference, for example, came in four separate files with around 1,400 rows total. The agent had to combine them before running the analysis. A small issue, but good to know before you start.
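Combining split dictionary files is simple as long as the parts share a header row. A sketch using in-memory CSV text (in practice each part would be read from its own file):

```python
import csv
import io

def combine_dictionaries(csv_texts):
    # Merge data-dictionary parts that share one header row into a single
    # (header, rows) pair, refusing to mix mismatched parts.
    header, rows = None, []
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        part_header = next(reader)
        if header is None:
            header = part_header
        elif part_header != header:
            raise ValueError("data dictionary parts have mismatched headers")
        rows.extend(reader)
    return header, rows
```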
## a few things worth knowing if you try this
Worktrees share your git history and remotes. A commit you make in `../converter-fix-2` will show up in `git log` from your main folder. `git push` works normally from any worktree.
Config files that are not committed need to exist in each worktree folder separately. Credential files, .env, anything in .gitignore, each worktree needs its own copy. I keep a short setup note for this.
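My setup note is short enough to be a script. A sketch (the file list is an example; adjust it to whatever your `.gitignore` hides):

```python
import shutil
from pathlib import Path

UNTRACKED = ["credentials.json", ".env"]  # example list of gitignored files

def seed_worktree(main_dir, worktree_dir, files=UNTRACKED):
    # Copy gitignored config files from the main checkout into a fresh
    # worktree, returning the names that were actually copied.
    copied = []
    for name in files:
        src = Path(main_dir) / name
        if src.is_file():
            shutil.copy2(src, Path(worktree_dir) / name)
            copied.append(name)
    return copied
```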
Name your worktrees after what they are doing, not just numbers. `../fix-nutrition-app` is much easier to work with than `../worktree-2` when you have four terminals open at the same time.
Cleanup is one command:
```bash
git worktree remove ../converter-fix-1
```
The folder is removed and the worktree reference is cleaned up. The branch and its commits stay in the repository.
I did not find worktrees useful for a long time because I did not have a problem that needed them. Switching branches was enough for the work I was doing. It was only when I started using agents, and wanted more than one running at the same time, that the limitation became clear.
If you are working with agents on tasks that are similar and do not depend on each other, this combination is worth trying. The setup takes a few minutes. The main change is thinking in groups of tasks instead of doing them one at a time, which feels a little different at first but gets natural quickly.