DEV Community

rheorix

I built a multi-agent AI workflow with Claude Code + Java/Spring Boot (real-world experiment)

I’ve been experimenting with Claude Code to go beyond “AI as a copilot” and instead simulate a small team of AI agents working on software development tasks.

The idea was simple:
Instead of asking Claude to help with isolated snippets, I structured it into a workflow where different “agents” handle:

  • code generation
  • review & validation
  • architecture decisions
  • cost/governance constraints

All orchestrated through a Java / Spring Boot backend.
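
As a rough illustration of that shape (not the code from the repository linked below), the workflow can be pictured as a chain of role-specific agents, each receiving the previous agent's output. Every name here is hypothetical, and each lambda stands in for what would really be a Claude Code call:

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical names throughout; none of this is taken from the linked repository.
final class AgentPipeline {

    /** One named step in the workflow; the operator stands in for a call to Claude. */
    record Agent(String role, UnaryOperator<String> step) {}

    private final List<Agent> agents;

    AgentPipeline(List<Agent> agents) {
        this.agents = List.copyOf(agents);
    }

    /** Runs the agents in order, feeding each one the previous agent's output. */
    String run(String task) {
        String artifact = task;
        for (Agent agent : agents) {
            artifact = agent.step().apply(artifact);
        }
        return artifact;
    }

    public static void main(String[] args) {
        AgentPipeline pipeline = new AgentPipeline(List.of(
            new Agent("architect", t -> t + " -> design"),
            new Agent("coder", t -> t + " -> code"),
            new Agent("reviewer", t -> t + " -> reviewed")));
        System.out.println(pipeline.run("add login endpoint"));
    }
}
```

In a Spring Boot backend each agent would more plausibly be a bean behind a service, but the ordering and hand-off logic stays the same.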

What I found interesting is that the real challenge wasn’t generating code. It was coordination, governance, and control over the system’s behavior.

In practice, the hard problems became:

  • preventing agents from diverging in logic
  • maintaining consistency across outputs
  • controlling cost and iteration loops
  • introducing human decision points at the right time
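
Two of these, bounding cost and iteration loops and deciding when to hand over to a human, can be sketched together. This is a minimal sketch under assumptions: the class, the limits, and the flat per-call token cost are all hypothetical, and the generator and reviewer are stand-ins for real agent calls:

```java
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch; names and limits are assumptions, not the author's code.
final class GovernedLoop {

    enum Outcome { APPROVED, NEEDS_HUMAN }

    record Result(Outcome outcome, String draft, int iterations, int tokensSpent) {}

    private final int maxIterations;
    private final int tokenBudget;

    GovernedLoop(int maxIterations, int tokenBudget) {
        this.maxIterations = maxIterations;
        this.tokenBudget = tokenBudget;
    }

    /**
     * Revises the draft until the reviewer approves it. Stops and escalates to a
     * human once the iteration cap is hit or the next call would exceed the budget.
     */
    Result run(String task, UnaryOperator<String> generator,
               Predicate<String> reviewer, int tokensPerCall) {
        String draft = task;
        int spent = 0;
        int iterations = 0;
        while (iterations < maxIterations && spent + tokensPerCall <= tokenBudget) {
            draft = generator.apply(draft);
            spent += tokensPerCall;
            iterations++;
            if (reviewer.test(draft)) {
                return new Result(Outcome.APPROVED, draft, iterations, spent);
            }
        }
        // Neither approved nor allowed to continue: this is the human decision point.
        return new Result(Outcome.NEEDS_HUMAN, draft, iterations, spent);
    }
}
```

The point of returning `NEEDS_HUMAN` rather than throwing is that running out of budget is an expected outcome of the workflow, so the caller can route the latest draft to a person instead of treating it as a failure.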

I documented the full setup, architecture, and lessons learned here:
https://www.rheorix.com/en/2026/05/19/how-i-built-a-team-of-ai-agents-with-claude-code/

→ Full code and repository on GitHub: https://github.com/rheorix/agentic-company

Curious if anyone else is experimenting with similar multi-agent setups, especially in production or near-production environments.

What patterns are you using for orchestration and governance?

Top comments (2)

Kyle Carriedo

The "preventing agents from diverging in logic" and "introducing human decision points at the right time" pain points you describe are the two hardest parts of multi-agent Claude Code workflows, and they're related. Logic divergence surfaces because agents lack a shared state checkpoint to reference before each turn; the human-decision-point problem surfaces because there's no structured way for an agent to say "I need a decision before I proceed" and pause without either blocking or losing context.

One pattern that's helped: an external coordination file that every agent reads on startup and writes to at task boundaries. Agents record their current goal, last completed step, and any unresolved decisions. A coordinator reads it to decide which session needs attention. This surfaces divergence early rather than at integration time.
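
A minimal sketch of that pattern in Java, assuming a single `.properties` file shared by all agents. The class name, key names, and file format are hypothetical choices for illustration, not a description of any existing tool:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical sketch of a shared coordination file; key names and format are assumptions.
final class CoordinationFile {

    private static final String PENDING = ".pendingDecision";
    private final Path path;

    CoordinationFile(Path path) {
        this.path = path;
    }

    /** Loads the current state, or an empty state if the file does not exist yet. */
    Properties read() throws IOException {
        Properties state = new Properties();
        if (Files.exists(path)) {
            try (Reader in = Files.newBufferedReader(path)) {
                state.load(in);
            }
        }
        return state;
    }

    /** Called by an agent at a task boundary. An empty pendingDecision means "not blocked". */
    void checkpoint(String agentId, String goal, String lastStep, String pendingDecision)
            throws IOException {
        Properties state = read();
        state.setProperty(agentId + ".goal", goal);
        state.setProperty(agentId + ".lastStep", lastStep);
        state.setProperty(agentId + PENDING, pendingDecision);
        try (Writer out = Files.newBufferedWriter(path)) {
            state.store(out, "agent coordination state");
        }
    }

    /** Called by the coordinator: which agents are waiting on a decision? */
    List<String> blockedAgents() throws IOException {
        Properties state = read();
        TreeSet<String> blocked = new TreeSet<>();
        for (String key : state.stringPropertyNames()) {
            if (key.endsWith(PENDING) && !state.getProperty(key).isBlank()) {
                blocked.add(key.substring(0, key.length() - PENDING.length()));
            }
        }
        return List.copyOf(blocked);
    }
}
```

The read-modify-write in `checkpoint` is not safe if several agents write at the same moment; a real setup would need file locking or one file per agent.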

I'm building something specifically for this: a control plane that gives visibility into which agents are working, blocked, or idle and sequences handoffs. If you're hitting the diverging-logic wall regularly, happy to share notes at claudeverse.ai.

rheorix

Thanks for the insight. I haven't hit this problem yet, but it's good to know there are solid approaches for dealing with it. I'll definitely keep it in mind and reach out if I run into it.