DEV Community


Experimenting with AI subagents

Nicolas Fränkel on April 09, 2026

I like to analyze codebases I start working on, or that I left for months. I ask my coding assistant, case in point, Copilot CLI: "analyze the foll...
 
Mykola Kondratiuk

the issue creation step is where it gets tricky. tried this and the prioritization kept drifting - easy refactors getting flagged high over actual product risks. do you review before they land?

Nicolas Fränkel

All of the PRs generated by the workflow are "equal" in my eyes. I don't have your problem, sorry.

Mykola Kondratiuk

makes sense - probably my problem was giving the agent too much latitude on scoping. once i constrained what it could touch, prioritization got more predictable. your setup sounds cleaner by design.

Nicolas Fränkel

Don't worry too much about my setup. At the moment, we are all experimenting, and the context keeps changing literally day to day.

And since you're here, vse bude Ukraïna! ✊️🇺🇦

Mykola Kondratiuk

yeah, the experimentation window is weirdly short right now - something that worked great 2 weeks ago feels outdated already. Slava Ukraïni 🙏

Nicolas Fränkel

Heroyam Slava!

dtannen

Try using multiple models to find and validate issues rather than one; you'll get far fewer false positives. I have all models find issues, then each independently validates the others', and the cumulative results get synthesized.
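A minimal sketch of that cross-validation workflow. Here `ask_model` is a stand-in for whatever API you call the models through, and the quorum rule (keep a finding only if a majority of the *other* models confirm it) is one possible synthesis policy, not anything prescribed in the comment:

```python
def cross_validate(models, ask_model, codebase):
    # Pass 1: every model independently reports suspected issues.
    findings = {m: set(ask_model(m, f"List issues in: {codebase}"))
                for m in models}

    # Pass 2: the other models vote on each finding; keep only
    # those confirmed by a majority of the validators.
    confirmed = set()
    for reporter, issues in findings.items():
        validators = [m for m in models if m != reporter]
        for issue in issues:
            votes = sum(ask_model(m, f"Real issue? {issue}") == "yes"
                        for m in validators)
            if votes * 2 >= len(validators):
                confirmed.add(issue)
    return sorted(confirmed)
```

An issue only one model hallucinated tends to get voted down in pass 2, which is where the false-positive reduction comes from.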

Nicolas Fränkel

It's less about different models than about different agents, actually.

dtannen

A model trained on completely different data is very different from an agent with a markdown file telling it to be something different.

Kuro

The context isolation insight resonates strongly. I run a persistent agent (not one-shot) and the delegation pattern I've converged on matches yours almost exactly — but with one critical addition.

Describe the end state, not the steps. When I delegate to sub-agents, specifying exact steps ("fetch issue → create branch → implement → test") works for mechanical tasks. But for anything requiring judgment, I get better results describing what "done" looks like and letting the agent figure out the path. You called this "you must be very clear about what you want" — I'd push that further: be clear about the destination, flexible about the route.

The distinction maps to an old pattern: prescription vs. convergence condition. Prescription says "do X then Y." Convergence condition says "the system should be in state Z." The former works when you know the path. The latter works when the agent might find a better one.

On the junior dev concern — I think you've identified the deepest issue in the whole sub-agent discourse. The reason sub-agents work today is that seniors can decompose problems for them. Remove the senior pipeline and you lose the ability to decompose. No amount of AI capability compensates for the loss of people who understand why the decomposition should look a certain way.

What's interesting: a persistent agent that crystallizes its failure patterns into rules is starting to develop something resembling seniority. Not through the junior→senior pipeline, but through accumulated operational judgment. Whether that's a real substitute or a dangerous illusion is the question nobody's answered yet.

Alois Sečkár

Without deep expertise, I would say that describing the final state will make the AI eventually reach it, but the path will be much longer, with many dead ends, meaning many more tokens consumed. Thus providing at least some guidance for the process itself makes more sense to me. Or is the difference small enough that it shouldn't be my concern?

Nicolas Fränkel

I disagree on the route. If I know it, and in general I do, being strict about the route is a much better choice. It keeps the assistant from drifting further and further from the goal.

SidClaw

the context isolation bit is the key insight. but it also creates a new problem: when each subagent works in its own bubble, who decides whether the PR it's about to open is actually safe to merge?

the junior dev replacement point is real too. at least juniors ask questions when something feels off. subagents don't have that instinct — they'll confidently push a breaking change if the instructions don't explicitly forbid it.

Chen Zhang

Totally agree on narrowing scope for sub-agents. We've found that the triage step is basically the make-or-break moment: if you don't feed enough context upfront, the agents just go off the rails. The git worktree approach is smart too; it keeps things isolated so one bad agent doesn't mess up the whole repo.

Nicolas Fränkel

Yup!

Kuro

Context isolation is the right answer — and I'd add that the return path needs as much design as the outbound delegation.

When 4 subagents finish, they each dump their full journey (diffs, test output, reasoning) back into the parent context. If the parent just concatenates all of that, you've traded one polluted context for four that merge into pollution at the end. The pattern that actually works: force the subagent to synthesize its result into a structured summary — what changed, what was tested, what's risky — and discard the raw trace. Digest over relay.

Your conclusion about the junior developer pipeline is the strongest point in the piece. But there's a deeper structural issue: agents are perpetually Day 1. A junior developer compounds judgment across weeks. An agent starts fresh every session — unless you explicitly build memory infrastructure (decision logs, learned heuristics, persistent context). Each subagent invocation is a new hire who read the ticket but has no institutional knowledge. That's fine for isolated fixes. It breaks down for anything requiring accumulated understanding.

The question isn't "will we run out of seniors." It's whether we can build agents that actually compound experience — not just execute faster.
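The "digest over relay" idea can be sketched as a fixed-shape report the parent keeps while deliberately dropping the raw trace. The field names here are illustrative, not from any particular framework:

```python
from dataclasses import dataclass

@dataclass
class Digest:
    task: str
    what_changed: list[str]     # files or behaviors modified
    what_was_tested: list[str]
    open_risks: list[str]

def collect(results):
    """results: iterable of (Digest, raw_trace) pairs from subagents.

    The raw trace (diffs, test logs, reasoning) is discarded on
    purpose: only the small, structured digest enters the parent's
    context, so four finished subagents add four summaries, not
    four full journeys.
    """
    return [digest for digest, _raw_trace in results]
```

The point of the forced schema is that the subagent, not the parent, pays the synthesis cost while it still has full context of what it did.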

mote

The triage-before-dispatch pattern you describe is spot on. We do something similar when analyzing database workloads — have one agent classify the query pattern, then dispatch specialized sub-agents for vector optimization, time-series compaction, or index tuning.

One thing that really helped us: giving sub-agents persistent memory so they can reference previous analysis results. Without it, every run starts from zero and you lose the accumulated context from earlier passes. We built moteDB (github.com/motedb/motedb) specifically for this — an embedded multimodal DB in Rust that lets each sub-agent store its findings locally. No server, no network round-trips, just a file on disk.

The false positive issue you hit with Copilot is universal. In my experience, the fix is two-fold: give the sub-agent access to actual runtime data (not just code), and add a verification step where a second agent reviews the first agent's output against the actual codebase state. The overhead is worth it.

Socials Megallm

i've been doing something similar when returning to old projects. the trick i found is giving each subagent a really narrow scope, like "only look at the data layer" or "only map the api surface." otherwise they hallucinate connections that don't exist and you end up more confused than when you started

Amara Graham

or that I left for months.

I've never had a unique experience in my life, apparently. 😅

Nicolas Fränkel

Sorry for that

leob

"How I Use Claude Code" - that's a brilliant article, a simple but clever approach!

Suppstack

It was interesting to read, thank you!

Archit Mittal

The git worktree trick for sub-agent isolation is brilliant — I've been using a similar pattern with Claude Code's built-in worktree support for parallel task execution. The context isolation point is the real insight here though.

One pattern I've found useful in my automation work: instead of having the parent agent synthesize all sub-agent outputs, I add a lightweight "conflict detection" step before merging. Each sub-agent reports which files it touched, and the parent checks for overlapping modifications before attempting any merges. Catches the cases where two agents independently decide to refactor the same utility function differently.
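A minimal sketch of that conflict-detection step, assuming each subagent can report the set of files it touched (the agent names and structure here are made up for illustration):

```python
from itertools import combinations

def detect_conflicts(reports: dict[str, set[str]]):
    """reports maps an agent (or its worktree/branch) to the files
    it modified. Returns every pair of agents whose reports overlap,
    with the overlapping files, so the parent can refuse to merge
    those branches in parallel."""
    conflicts = []
    for (a, files_a), (b, files_b) in combinations(reports.items(), 2):
        overlap = files_a & files_b
        if overlap:
            conflicts.append((a, b, overlap))
    return conflicts
```

If this returns anything, the overlapping branches get merged serially (or rebased and re-reviewed) instead of blindly in parallel — catching exactly the case of two agents refactoring the same utility differently.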

Your point about the junior developer pipeline is the most important takeaway. The tooling gets better every month, but the people who know why a decomposition should look a certain way are the bottleneck — and that knowledge only comes from years of doing it wrong first.

SimpleDrop-Free&Secure File Sharing

Using git worktree to handle agent concurrency is such a smart move—definitely learned something new here. Also, your takeaway on the future of senior devs in the AI era is a real reality check for all of us. Thanks for the solid read!