Jaesang Lee

Building an AI Orchestration System

TL;DR

  1. Built an AI Orchestration system to improve both development speed and code quality.
  2. Ported the existing team development process (Jira Ticket → Dev → PR → Review) to AI using Rules, Workflow, Agents, and Skills.
  3. Integrated with Jira Automation so that when a human defines requirements, AI automatically handles development, PR creation, and review, with automatic status transitions.

Introduction

With the rapid advancement of AI technology, the performance of AI Assistants has improved significantly, and the amount of code I write manually has noticeably decreased. While looking for ways to better utilize AI, I came across the idea of AI Orchestration—using Sub Agents like team members and having a Main Agent act as a PM to form a single development team.

Since my team already had a well-defined development process, I thought, "Why not teach this process directly to the AI?" This post summarizes the structure and design intent of the AI Orchestration system I built during this process.

Previous Methods and Problems

Before building AI Orchestration, I worked by writing spec documents, having an AI Agent plan the development, and then letting it implement the code. Waiting for the AI to finish was time-consuming, so I would clone the same repository multiple times to work in parallel. While the AI worked in Clone A, I would start another task in Clone B.

The reason was simple: I didn't want to waste time waiting, and running multiple agents in a single repository caused code conflicts and made context management difficult.

However, this approach had major issues:

  1. Severe Token Waste: Every time I started a new conversation, the AI had to read the code from scratch, consuming millions of tokens a day and quickly hitting usage limits.
  2. Difficult Context Management: Working across multiple clones made it hard to track what was done where.
  3. Inconsistent Quality: Since the AI wrote code freely, it often failed to follow coding conventions or team rules.

In the end, what I wanted was clear: I wanted the AI to execute the entire flow just like a developer in my team—receiving a Jira ticket, analyzing code, developing, running builds/tests, creating a PR, and getting reviews. I wanted to teach the AI the process humans use so it could work like a team member.

Main Body

To solve these problems, I built my own AI Orchestration system based on the Cursor IDE. The entire system is divided into the essential Core Orchestration and the Advanced Optimization that scales and optimizes it.


Part 1: Core Orchestration (The Foundation)

This is the fundamental backbone of AI Orchestration. Just building this part allows the AI to write and review code while adhering to team rules.

1. Docs Organization

The first step was organizing documentation so the AI could quickly understand the repository. Since the biggest cause of token waste was the AI reading source code every time to grasp context, I created documents that allowed it to understand the project without reading the code.

The most important principle here was AI Readable. Instead of descriptive text good for humans, I wrote in structures easy for AI to parse. Tech stacks were organized in tables, architecture in text diagrams, and execution methods in command lists.

Since it was a Monorepo structure, I created an overview document covering the entire project and organized the roles and dependencies for each Application and Library. This allowed the AI to understand the whole project by referencing just a few documents instead of reading code, significantly reducing token usage.
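
As a sketch, an AI-readable overview document might look like this (all contents below are invented placeholders; only the idea of an overview doc with tables, a text diagram, and command lists comes from the system itself):

```markdown
# Project Overview

## Tech Stack
| Area     | Technology        |
|----------|-------------------|
| Backend  | Kotlin, gRPC      |
| Frontend | TypeScript, React |

## Architecture
client → api-gateway → [user-service | order-service] → database

## Commands
- Build: `./gradlew build`
- Test: `./gradlew test`
- Lint: `./gradlew lint`
```

The point is that every fact is in a structure the AI can parse in one pass, rather than descriptive prose it has to reason about.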

2. Rules System

If Docs taught the project structure, Rules taught how to work. Cursor IDE has a feature where AI automatically references rule files placed in the .cursor/rules/ directory. I used this to document team coding conventions, Git rules, PR rules, test rules, security rules, etc.

The rule files are divided into three categories:

  • Global Rules: Rules applied to all tasks, such as code style, Git branch naming, commit messages, and PR writing guidelines.
  • Workflow Rules: The entire work flow from ticket receipt to PR creation.
  • Domain-specific Rules: Rules that differ by work area, such as gRPC, REST API, Frontend, etc.

Strategy for Token Efficiency

The part I focused on most while designing Rules was token efficiency. With over 10 rule files, loading all of them would waste the Context Window. So, I classified the rules into two types:

  • Always Applied: Only core rules essential for all tasks (like Global and Workflow rules) are always loaded.
  • Conditionally Applied: Domain-specific rules are automatically loaded only when modifying matching file patterns.

For example, gRPC rules are loaded only when gRPC-related files are modified, and Frontend rules are loaded only when Frontend files are modified. This prevented Frontend rules from being unnecessarily loaded during Backend work.
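
In Cursor, this conditional loading can be expressed in a rule file's frontmatter. A sketch (the description, glob patterns, and rule text below are placeholders, and the exact frontmatter fields may differ across Cursor versions):

```markdown
---
description: gRPC conventions (service naming, error handling)
globs: "**/*.proto, **/grpc/**"
alwaysApply: false
---

- One service per proto file; name services `<Domain>Service`.
- Map domain errors to gRPC status codes instead of leaking internal errors.
```

With `alwaysApply: false` and a `globs` pattern, the rule only enters the context when a matching file is being edited.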

3. Workflow

If Docs teach the project and Rules teach the regulations, Workflow teaches the order of work. This was the core of AI Orchestration.

I mapped the flow of how developers in my team work directly to the AI workflow. I defined the entire process from ticket receipt to PR creation and code review in 9 steps.

There are two key design points in this workflow:

First, it maps 1:1 with the team process.

| What Humans Did | What AI Does |
| --- | --- |
| Check Jira ticket, analyze requirements | Retrieve the ticket, then analyze missing requirements in Plan mode |
| Identify related code | Code analysis using Skills |
| Write code | Implementation adhering to Rules |
| Verify build/test locally | Sequential verification of Build, Test, Lint |
| Self-check before review request | Sub Agent analyzes code quality |
| Create PR | Automatically create PR on GitHub |
| Peer code review | Review Agent writes inline comments |

Second, it does not proceed to the next step if the previous step fails. If the build breaks, it doesn't run tests; if tests fail, it doesn't create a PR. The AI analyzes the failure cause, fixes it, and retries. If it fails more than 3 times, it asks for human help. This is identical to how humans work.
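
The fail-fast loop above can be sketched in a few lines of Python (step names and the pipeline shape are illustrative, not the actual orchestration code):

```python
# Each step must pass before the next one runs; a failing step is retried
# at most 3 times, after which the pipeline stops and asks a human.
MAX_RETRIES = 3

def run_pipeline(steps):
    """steps: ordered list of (name, callable) pairs; callables return bool."""
    for name, step in steps:
        for attempt in range(1, MAX_RETRIES + 1):
            if step():
                break  # step passed, move on to the next one
            print(f"{name} failed (attempt {attempt}), analyzing and retrying...")
        else:
            # 3 consecutive failures: escalate, just as a teammate would
            raise RuntimeError(f"{name} failed {MAX_RETRIES} times; human help needed")
    return True

# Usage: build -> test -> lint, mirroring the team's local verification order
if __name__ == "__main__":
    run_pipeline([
        ("build", lambda: True),
        ("test", lambda: True),
        ("lint", lambda: True),
    ])
```

The key property is the ordering guarantee: a broken build can never reach the test step, and a failing test can never reach PR creation.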

4. Agents (Main & Review)

While Workflow defines the overall flow, Agents are responsible for professional judgment within that flow.

Main Agent acts like a PM, coordinating the entire process and executing tasks according to the Workflow.

Review Agents write inline comments directly on the GitHub PR after it is created, just as humans do in code reviews. There are three:

  • Frontend Review Agent: Reviews component design, state management patterns, rendering optimization, etc.
  • Backend Review Agent: Reviews API design, DB query efficiency, error handling, security, etc.
  • Spec Review Agent: Reviews whether all initially defined requirements are met.

When designing Review Agents, I intentionally used a different model than the one used for development. I found that if the same model writes code and reviews it, it tends to miss its own mistakes—just like humans often miss things in self-reviews. Different models have different reasoning patterns, so one can catch what the other missed, increasing review efficiency.
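
As an illustrative sketch (the glob patterns and model ids here are assumptions, not the real configuration), routing a changed file to a domain Review Agent that always uses a model different from the development model might look like:

```python
from fnmatch import fnmatch

DEV_MODEL = "dev-model"  # the model that wrote the code (placeholder id)

# (agent name, file patterns it owns, review model) — all placeholders
REVIEW_AGENTS = [
    ("Frontend Review Agent", ["**/*.tsx", "**/*.css"], "review-model-a"),
    ("Backend Review Agent", ["**/*.go", "**/api/**"], "review-model-a"),
]

def pick_reviewer(path: str):
    """Route a changed file to a domain reviewer; fall back to the Spec reviewer."""
    for name, patterns, model in REVIEW_AGENTS:
        if any(fnmatch(path, p) for p in patterns):
            assert model != DEV_MODEL  # different reasoning patterns catch more
            return name, model
    return "Spec Review Agent", "review-model-a"
```

The invariant worth enforcing is the assertion: the reviewing model is never the authoring model.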


Part 2: Advanced Optimization (Scaling Up)

If Core guarantees the quality of a single task, the Advanced stage is about scaling for volume, speed, and sustainability.

5. Supervisor / Worker & Git Worktree

To handle multiple tasks simultaneously, I defined two roles, Supervisor and Worker, and adopted a Git Worktree strategy to physically support them.

Supervisor is a PM role that coordinates multiple tasks: it checks 'Ready' tasks in Jira, runs non-overlapping tasks in parallel, and queues overlapping ones to run sequentially. Each Worker then executes a single ticket's workflow end-to-end in its own working directory.
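
The Supervisor's scheduling rule can be sketched as a small batching function (the task shape here, a ticket id plus the set of paths it touches, is an assumption for illustration):

```python
def schedule(tasks):
    """tasks: list of (ticket_id, set_of_paths).
    Returns batches; tickets in the same batch touch disjoint paths and
    can run in parallel, while batches run one after another."""
    batches = []
    for ticket, paths in tasks:
        for batch in batches:
            # join the first batch whose tickets don't touch any of our paths
            if all(paths.isdisjoint(p) for _, p in batch):
                batch.append((ticket, paths))
                break
        else:
            # overlaps with every existing batch: defer to a new sequential batch
            batches.append([(ticket, paths)])
    return batches
```

Two tickets touching the same files end up in different batches, so they never run at the same time.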

Git Worktree allows multiple working directories to be connected to a single Git repository. Unlike cloning, it shares the .git object, making it lightweight and saving disk space. I created independent Worktrees for each Jira ticket, ensuring each task is worked on in a separate directory. This eliminated code conflict issues even when running multiple tasks in parallel.
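
In practice this boils down to a couple of git commands per ticket, run from inside the main repository checkout (the ticket key PROJ-123 is illustrative):

```shell
# Create an isolated working directory for one Jira ticket on its own branch.
# Unlike a clone, the worktree shares the same .git object store.
git worktree add ../PROJ-123 -b feature/PROJ-123

# List all worktrees attached to this repository
git worktree list

# After the PR is merged, clean up the worktree and its branch
git worktree remove ../PROJ-123
git branch -d feature/PROJ-123
```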

6. Skills & MCP

If Agents handle judgment and analysis, Skills are the tools. They are reusable functional units with clearly defined inputs and outputs that perform set tasks without reasoning.

I specifically mapped existing team Code Generator Scripts to Skills. This ensured that even when the AI wrote code, it perfectly adhered to the team's boilerplate and conventions.
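
Conceptually, a Skill is just a deterministic function with a fixed input/output contract. A minimal Python sketch (the template and naming below are invented examples, not the team's actual generator):

```python
# An invented boilerplate template standing in for a team generator script.
STUB_TEMPLATE = """\
// Code generated by the {skill} skill. DO NOT EDIT.
service {name} {{
  rpc Get{name}(Get{name}Request) returns (Get{name}Response);
}}
"""

def grpc_stub_skill(name: str) -> str:
    """Skill contract: service name in, convention-compliant boilerplate out.
    No reasoning involved: the same input always yields the same scaffold."""
    return STUB_TEMPLATE.format(skill="grpc-stub", name=name)

print(grpc_stub_skill("User"))
```

Because the output is fixed by the template, the AI cannot drift from the team's boilerplate even when generating new files.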

I also integrated MCP (Model Context Protocol) to allow the AI to directly handle external services like GitHub and Jira. Through GitHub MCP, it can create PRs, write comments, and review. Through Jira MCP, it can retrieve tickets and change statuses.

With this integration, the AI could finally execute the existing team process End-to-End. By connecting Jira Automation rules, status transitions are also automated, creating a structure where humans only need to organize requirements and give final approval.
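
One possible shape of this wiring is Cursor's `.cursor/mcp.json` (the server packages, placeholder tokens, and environment variable names below are assumptions; check each MCP server's own documentation):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    },
    "jira": {
      "command": "npx",
      "args": ["-y", "<your-jira-mcp-server>"],
      "env": { "JIRA_API_TOKEN": "<token>" }
    }
  }
}
```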

7. Docs Update Workflow

To build a sustainable AI Orchestration, documents must change as code changes. If documents become outdated, the AI works based on incorrect Context, leading to poor quality.

To solve this, I defined document updates as a workflow itself. The key is that the AI self-detects when a document update is needed and suggests it.

For example, if the AI discovers a new pattern not in the existing rules during development or code review, it suggests, "Should I add this to the documentation?" If the user approves, a separate lightweight workflow runs. Thanks to this flow, Docs and Rules could remain "living documents" alongside the code.

Conclusion

Changes After Implementation

The biggest change after introducing AI Orchestration is that the role of humans has changed. Previously, I spent time writing code or checking AI-written code line by line. Now, I just need to write requirements in detail on a Jira ticket and pass the link.

Here is a summary of the perceived changes:

| Category | Before | After |
| --- | --- | --- |
| Token Usage | Read entire code every time | Read only necessary code based on overview.md |
| Code Quality | AI often broke conventions, requiring fixes | Consistent quality via enforced Rules & pre-verification |
| CI Optimization | PR CI execution time 16-20 min | AI analyzed logs & optimized to 3-4 min |
| Migration | ESLint → Biome took days | Delegated to AI, completed in 2-3 hours |

Limitations and Future Direction

Of course, it's not perfect yet. It performed well when developing new features, but it sometimes struggled with refactoring existing code or handling domain-specific business logic. I'm improving these areas by continuously reinforcing the Context the AI references, using the Docs Update Workflow introduced earlier.

Ultimately, what I felt through this work is that the core of AI Orchestration is not instructing the AI to "write code," but "teaching it how our team works." If you systematically organize the team's process, rules, and quality standards and port them to AI, the AI can operate as a member of the team, not just a simple code generation tool.
