Urvil Joshi

Posted on May 27 • Originally published at Medium on Apr 28

I Built an Orchestrator AI Agent That Takes My GitHub Issues to Pull Requests

#springboot #agents #claudecode #softwaredevelopment

A Claude Code workflow with one orchestrator, five subagents, and three human gates running on a real Spring Boot project, end to end

This is my minimal dev setup in 2026.

Pixel Agents in VS Code to monitor my agents. Claude Code in the terminal. Together, they take a GitHub issue and turn it into a pull request, with three approvals from me along the way.

🍥 The project: LinkStash

LinkStash is a Spring Boot URL shortener I built last week. To be upfront: this is not a serious production repo. It’s a demo I created specifically to show this workflow on a realistic codebase.

The repo has one open issue:

That’s the issue I want my workflow to handle. I’m going to invoke an agent, and it will resolve the issue with my input and reviews.

✨ Step 1 → CLAUDE.md sets the rules

Every Claude Code session reads CLAUDE.md first. It's the file you keep at the repo root that tells Claude how your project works: conventions, what not to do, project structure, basically all of it.

For LinkStash, mine includes things like:

Constructor injection only — never field injection
Records for DTOs
Don’t add Lombok
Don’t push to main
Always write a Flyway migration, never use ddl-auto: update

If you’ve never written one, run /init in Claude Code and it'll generate a starting point you can edit. The trick is keeping it tight: long CLAUDE.md files dilute attention. Forty lines of clear rules beats two hundred lines of vague guidance.

If you use multiple coding agents (Claude, Codex, Copilot), you can create an AGENTS.md for the conventions shared across all of them, and keep an agent-specific file such as CLAUDE.md for anything that applies to only one agent.
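To make two of those rules concrete, here is a minimal sketch of what "records for DTOs" and "constructor injection only" look like in plain Java. Every name in it (CreateLinkRequest, LinkRepository, LinkService) is hypothetical and not taken from the LinkStash repo, and the Spring annotations are left out so the snippet compiles on its own.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical names throughout; Spring annotations omitted so this compiles standalone.
final class ConventionsSketch {

    // "Records for DTOs": an immutable request payload, no Lombok needed.
    record CreateLinkRequest(String targetUrl) {
        CreateLinkRequest {
            Objects.requireNonNull(targetUrl, "targetUrl");
            URI.create(targetUrl); // throws IllegalArgumentException on malformed input
        }
    }

    // Stand-in for a repository abstraction.
    interface LinkRepository {
        void save(String code, String targetUrl);
        String find(String code);
    }

    // In-memory implementation used only for this sketch.
    static final class InMemoryLinkRepository implements LinkRepository {
        private final Map<String, String> links = new HashMap<>();
        @Override public void save(String code, String targetUrl) { links.put(code, targetUrl); }
        @Override public String find(String code) { return links.get(code); }
    }

    // "Constructor injection only": the dependency is a final field set once in the constructor.
    // In Spring this class would carry @Service; a single constructor needs no @Autowired.
    static final class LinkService {
        private final LinkRepository repository;

        LinkService(LinkRepository repository) {
            this.repository = Objects.requireNonNull(repository, "repository");
        }

        String shorten(CreateLinkRequest request) {
            // Toy code derivation: not collision-safe, for illustration only.
            String code = Integer.toHexString(request.targetUrl().hashCode());
            repository.save(code, request.targetUrl());
            return code;
        }

        String resolve(String code) {
            return repository.find(code);
        }
    }

    public static void main(String[] args) {
        LinkService service = new LinkService(new InMemoryLinkRepository());
        String code = service.shorten(new CreateLinkRequest("https://example.com/docs"));
        System.out.println(code + " -> " + service.resolve(code));
    }
}
```

The practical payoff of the constructor rule is visible in main: the service can be built with a fake repository in a plain unit test, with no Spring context involved.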

✨ Step 2 → The orchestrator agent

Here’s where it gets interesting. My main agent is called issue-resolver, and it lives in .claude/agents/.

It does three things on its own and pauses three times for me. The high-level flow:

Fetches the GitHub issue (via my MCP server — more on this below)
Spawns a subagent that explores the codebase and writes ARCHITECTURE.md
Spawns a subagent that drafts plan.md
Pauses for me to approve the plan
Spawns a subagent that implements the plan
Spawns a subagent that runs /ultrareview for self-critique
Pauses for me to triage findings
Spawns a subagent that applies accepted findings
Pauses for me to do a final review of the changes
Pushes and opens the PR

The key rule baked into the agent prompt: never modify code yourself, always delegate to subagents. The orchestrator only orchestrates. Each subagent has one job.
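The real orchestrator is a prompt file, not Java, but the shape of the flow is easy to see as data. The sketch below is only a model of the ten steps above: the step names are hypothetical labels I chose for illustration, and a rejected gate simply halts the run here, whereas in the actual workflow a rejection sends the work back for another pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative model of the flow only; step names are hypothetical labels.
final class OrchestratorSketch {

    // Who performs a step: the orchestrator itself, a delegated subagent, or the human.
    enum Kind { ORCHESTRATOR, SUBAGENT, GATE }

    record Step(String name, Kind kind) {}

    // Mirrors the ten steps listed above, in order.
    static final List<Step> FLOW = List.of(
        new Step("fetch-issue", Kind.ORCHESTRATOR),
        new Step("explore", Kind.SUBAGENT),
        new Step("plan", Kind.SUBAGENT),
        new Step("approve-plan", Kind.GATE),
        new Step("implement", Kind.SUBAGENT),
        new Step("review", Kind.SUBAGENT),
        new Step("triage-findings", Kind.GATE),
        new Step("apply-findings", Kind.SUBAGENT),
        new Step("final-review", Kind.GATE),
        new Step("push-and-open-pr", Kind.ORCHESTRATOR)
    );

    // Walks the flow in order. Simplification: a rejected gate stops the run
    // instead of looping back to the previous subagent.
    static List<String> run(Predicate<String> humanApproves) {
        List<String> log = new ArrayList<>();
        for (Step step : FLOW) {
            if (step.kind() == Kind.GATE && !humanApproves.test(step.name())) {
                log.add("stopped at " + step.name());
                return log;
            }
            log.add(step.kind() + ": " + step.name());
        }
        return log;
    }

    public static void main(String[] args) {
        // Approve every gate and print the resulting trace.
        run(gate -> true).forEach(System.out::println);
    }
}
```

Counting the entries gives the numbers from the subtitle: five SUBAGENT steps and three GATE steps, with the orchestrator touching only the issue fetch and the final push.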

🔍 A note on the MCP server

For fetching the issue, I’m using a custom MCP server I built in a previous video. You don’t have to do this: the official GitHub MCP server includes a tool for fetching an issue that does the same thing. Or you could use Claude Code skills.

I’m using my own because I built it for a related workflow already. Pick whichever fits your workflow.

✨ Step 3 → Kicking off the loop

The whole invocation is one line:

@issue-resolver fetch and resolve issue #1

That’s it. The orchestrator goes to work. First it fetches the issue from my MCP server. Then the explore subagent reads the codebase and writes ARCHITECTURE.md: entities, endpoints, data flow, testing conventions.

Then the plan subagent runs and produces plan.md. This is where I get the first interesting moment.

✨ Step 4 → Gate 1: Plan approval

The plan came back with two open questions the agent flagged for me:

Which rate-limiting algorithm — token bucket or fixed window?
Should creating a link with a past expiresAt return 400?

This is exactly what I want. The agent isn’t guessing. It’s handing the engineering decisions back to me.

I told it: greedy token bucket (Bucket4j default behavior), and yes, return 400 on a past expiry, validated against server time.
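For anyone weighing the same choice: a fixed window resets its counter at each window boundary, while a greedy token bucket credits tokens continuously as time passes. Below is a minimal hand-rolled sketch of the greedy variant in plain Java. It is a stand-in to show the idea, not Bucket4j's API and not the code the agent wrote; the class name and constructor parameters are my own assumptions, and time is passed in explicitly so the behavior is easy to test.

```java
import java.time.Duration;

// Hand-rolled illustration of a greedy token bucket; NOT Bucket4j's API.
final class TokenBucketSketch {
    private final long capacity;
    private final long nanosPerToken;
    private long available;
    private long lastRefillNanos;

    // refillTokens are spread evenly across refillPeriod ("greedy" refill).
    TokenBucketSketch(long capacity, long refillTokens, Duration refillPeriod, long startNanos) {
        if (capacity < 1 || refillTokens < 1) {
            throw new IllegalArgumentException("capacity and refillTokens must be positive");
        }
        long perToken = refillPeriod.toNanos() / refillTokens;
        if (perToken < 1) {
            throw new IllegalArgumentException("refill rate too high for nanosecond resolution");
        }
        this.capacity = capacity;
        this.nanosPerToken = perToken;
        this.available = capacity; // the bucket starts full
        this.lastRefillNanos = startNanos;
    }

    // Takes one token if available; the caller supplies the current time in nanoseconds.
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (available == 0) {
            return false;
        }
        available--;
        return true;
    }

    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed <= 0) {
            return;
        }
        // Credit each token as soon as its share of the period has elapsed.
        long earned = elapsed / nanosPerToken;
        if (earned == 0) {
            return;
        }
        // Cap at capacity without risking overflow on very long idle periods.
        available = (earned >= capacity - available) ? capacity : available + earned;
        // Advance only by the time actually converted into tokens, keeping the remainder.
        lastRefillNanos += earned * nanosPerToken;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket =
            new TokenBucketSketch(5, 5, Duration.ofSeconds(1), System.nanoTime());
        for (int i = 1; i <= 7; i++) {
            System.out.println("request " + i + " allowed: " + bucket.tryConsume(System.nanoTime()));
        }
    }
}
```

In a servlet filter, a false return from tryConsume is the point where the request would be rejected instead of passed down the chain.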

I sent the plan back. Plan came back updated. Both open questions resolved. Approved.

✨ Step 5 → Implementation runs

The implement subagent takes over. New tables for API keys and link expiration. The Bucket4j filter. Updated controllers. Tests added. Tests run after each major change. All green.
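One of the decisions from Gate 1, rejecting a link whose expiresAt is already in the past, comes down to a very small check. This is a sketch under my own assumptions rather than the generated code: the method name is hypothetical, a missing expiresAt is treated as "never expires", and an expiry exactly equal to the current server time is treated as past. Injecting a Clock is what makes "validated against server time" testable.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper; the controller would map a true result to HTTP 400.
final class ExpiryValidationSketch {

    static boolean isPastExpiry(Instant expiresAt, Clock serverClock) {
        if (expiresAt == null) {
            return false; // assumption: no expiresAt means the link never expires
        }
        // assumption: an expiry exactly equal to "now" also counts as past
        return !expiresAt.isAfter(serverClock.instant());
    }

    public static void main(String[] args) {
        // A fixed clock stands in for server time so the result does not depend on when this runs.
        Clock fixed = Clock.fixed(Instant.parse("2026-01-01T00:00:00Z"), ZoneOffset.UTC);
        System.out.println(isPastExpiry(Instant.parse("2025-12-31T23:59:59Z"), fixed));
        System.out.println(isPastExpiry(Instant.parse("2026-01-02T00:00:00Z"), fixed));
    }
}
```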

✨ Step 6 → Gate 2: /ultrareview findings

After implementation, the orchestrator spawns the review subagent. /ultrareview is Claude Code's high-effort self-critique mode.

Findings come back as a numbered list with severity, file location, and suggested fixes. I get a structured prompt:

Reply: "accept all" / "accept all except [numbers]" / "accept only [numbers]"

This is the second gate. I read each finding, decide which are real, which are nitpicks, which are wrong. If I disagree with one, I exclude it.
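The three reply forms are regular enough that turning a reply into a set of accepted finding numbers is mechanical. In my setup the agent interprets the reply itself, so the following is only a sketch of the equivalent logic; the class and method names are hypothetical, and findings are assumed to be numbered from 1.

```java
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the three triage reply forms; findings are numbered from 1.
final class TriageReplySketch {
    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final String EXCEPT_PREFIX = "accept all except ";
    private static final String ONLY_PREFIX = "accept only ";

    // Returns the finding numbers that should be applied.
    static SortedSet<Integer> accepted(String reply, int findingCount) {
        String normalized = reply.trim().toLowerCase(Locale.ROOT);
        if (normalized.equals("accept all")) {
            return all(findingCount);
        }
        if (normalized.startsWith(EXCEPT_PREFIX)) {
            SortedSet<Integer> result = all(findingCount);
            result.removeAll(numbers(normalized.substring(EXCEPT_PREFIX.length()), findingCount));
            return result;
        }
        if (normalized.startsWith(ONLY_PREFIX)) {
            return numbers(normalized.substring(ONLY_PREFIX.length()), findingCount);
        }
        throw new IllegalArgumentException("Unrecognized triage reply: " + reply);
    }

    private static SortedSet<Integer> all(int findingCount) {
        SortedSet<Integer> result = new TreeSet<>();
        for (int i = 1; i <= findingCount; i++) {
            result.add(i);
        }
        return result;
    }

    // Extracts every integer in the text and checks it refers to an existing finding.
    private static SortedSet<Integer> numbers(String text, int findingCount) {
        SortedSet<Integer> result = new TreeSet<>();
        Matcher matcher = NUMBER.matcher(text);
        while (matcher.find()) {
            int n = Integer.parseInt(matcher.group());
            if (n < 1 || n > findingCount) {
                throw new IllegalArgumentException("No finding numbered " + n);
            }
            result.add(n);
        }
        if (result.isEmpty()) {
            throw new IllegalArgumentException("No finding numbers in: " + text);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(accepted("accept all except 2, 5", 6));
    }
}
```

Failing loudly on an unrecognized reply or an out-of-range number matters here: a gate that silently guesses what the human meant is not much of a gate.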

The honest part of this workflow is right here: even when /ultrareview is correct in principle, I’m the one who is responsible for the commit. I don’t accept findings blindly. I read them.

This time I accepted all of them as they were all fair calls.

✨ Step 7 → Gate 3: Post-fix review

After the apply-findings subagent resolves the accepted findings, the orchestrator pauses one more time before pushing.

I scroll through the diff. If there are additional changes I want, even though /ultrareview didn’t flag them, I describe them and the agent runs apply-findings again. If the diff looks clean, I type push.

You might say three gates is excessive, that it slows the workflow down. For a demo, sure, you can argue that. But for production code, code that ships to clients, code that runs at scale, you should know what your AI is shipping.

The gates aren’t friction. They’re the part of the workflow that keeps you accountable.

✨ Step 8 → The PR

The PR has a clean summary, “Generated by Claude Code” attribution, all the commits, all the file changes. Ready for review.

🧰 What I’d take from this if I were building it myself

A few things I learned that I’d recommend if you’re trying to set up your own version:

Keep your CLAUDE.md tight. Forty lines of clear rules beats two hundred lines of vague guidance.

Subagents for one job each. The temptation is to make smart, multi-purpose agents. Resist it. Each subagent does one thing: explore, plan, implement, review, or apply findings. Predictable and easy to debug.

Human gates are non-negotiable for real work. Demo? Skip them if you want. Production? Human gates. That’s the floor, not the ceiling.

🏁 Closing

The point of this workflow isn’t “AI does my job.” It’s the opposite. AI does the typing. I do the deciding. Critical decisions, in the right places. That’s the modern dev workflow if you’re trying to use these tools seriously instead of as a novelty :)