TL;DR
I built AutoMaintainer, a multi-agent AI system that transforms GitHub issues into production-ready pull requests during the Qwen Cloud AI Hackathon. It coordinates specialized agents (Issue Analyst, Developer, QA, Security, Documentation, Reviewer) to solve problems like a real engineering teamβall while keeping humans in control. Here's what I learned.
The Problem
Open-source maintainers face a brutal reality:
- π Overwhelming issue backlogs
- π Repetitive bug fixes and documentation gaps
- β±οΈ Code review bottlenecks
- π΄ Burnout from handling everything solo
Existing AI tools help write code, but they don't orchestrate the entire workflow: planning, development, testing, security review, documentation, and human approval.
What if we could build an AI engineering team that collaborates like real developers?
The Solution: AutoMaintainer
AutoMaintainer is a multi-agent orchestration system that mirrors a real software company:
- Issue Analyst β Reads GitHub issues, extracts requirements, assesses severity
- Architect β Analyzes repo structure, designs the implementation approach
- Developer β Writes code, updates files, creates new modules
- QA Tester β Generates tests, validates fixes, checks edge cases
- Security Agent β Scans for vulnerabilities, prevents dangerous patterns
- Documentation β Updates changelogs, PR summaries, release notes
- Reviewer β Scores code quality, recommends improvements
- Human Approval Gateway β Final human review before merge
The result? A pull request that's analyzed, built, tested, secured, documented, and reviewedβall before a human ever sees it.
Tech Stack
Frontend
- Next.js β React framework for the dashboard UI
- Tailwind CSS β Rapid, utility-first styling
- TypeScript β Type safety for the frontend layer
Backend
- FastAPI (Python) β Lightweight, async-first API
- Qwen-compatible LLM API β AI model integration for all agents
- SQLite + Async (aiosqlite) β Persistent pipeline and memory storage
- Redis-ready architecture β Prepared for distributed queuing
Integrations
- GitHub API β Issue fetching, context retrieval, PR creation
- Docker sandbox β Isolated code execution environment
Deployment
- Vercel β Frontend hosting
- FastAPI + Uvicorn β Backend server
- Prepared for Alibaba Cloud β Scalable backend infrastructure
Key Learnings
1. Agent Orchestration is Hard (But Worth It)
Managing multiple AI agents requires careful state management. Each agent needs:
- Clear system prompts defining their role
- Access to shared context (repo info, previous agent outputs)
- Structured output formats
- Error handling and recovery
Lesson: Use TypedDict/Pydantic models for inter-agent communication. Don't let agents operate in silos.
class BaseAgent(ABC):
async def execute(self, pipeline: PipelineRun, context: Dict[str, Any]) -> Dict[str, Any]:
# Each agent receives pipeline state + shared context
pass
2. Human Approval is Non-Negotiable
Full automation sounds great until your AI agent merges a breaking change. The approval gatewayβwhere humans review before mergeβis the most important feature.
Lesson: Build trust through transparency. Show humans exactly what changed, why, and what the risks are.
3. Context is Everything
Agents perform better when they have:
- Repository structure and file tree
- README and documentation
- Previous PRs and patterns used before
- Code analysis from GitHub
Lesson: Invest in a memory system early. Store patterns, decisions, and failed attempts. Your second PR fix will be 10x better.
4. LLM Integration Requires Fallbacks
Qwen API calls can fail. Responses might be malformed JSON. Rate limits matter.
Lesson:
- Implement retry logic with exponential backoff
- Validate all LLM outputs
- Have graceful degradation for non-critical agents
- Monitor token usage and costs carefully
5. FastAPI's Async Model is Perfect for Agent Pipelines
Since agents run sequentially and each calls an LLM (I/O bound), async/await handles pipeline execution beautifully.
async def _run_pipeline(self, pipeline: PipelineRun, issue_body: str):
# Run agents sequentially with async I/O
await self.agents[AgentRole.ISSUE_ANALYST].execute(pipeline, context)
await self.agents[AgentRole.ARCHITECT].execute(pipeline, context)
# ... and so on
Lesson: Async from the start. Don't retrofit it later.
6. Testing AI Agents is Different
You can't just assert outputs are correct. Instead:
- Mock the LLM responses for deterministic tests
- Test that agents properly update pipeline state
- Verify error handling and fallback paths
- Use demo/example pipelines to showcase behavior
7. GitHub Integration is Underrated
GitHub's API is powerful but requires:
- Proper pagination for large repos
- Tree fetching for architecture context
- Webhook handling for issue triggers
- PR creation with proper formatting
Lesson: Read the GitHub API docs thoroughly. Small mistakes break your integration.
8. Qwen Models Are Excellent
Qwen-plus and Qwen-turbo performed well for:
- Code analysis and generation
- Structured reasoning
- Following system prompts
- Cost-effectiveness
Lesson: Qwen-compatible APIs are a solid alternative to closed models. Fast iteration during hackathons.
What Surprised Me
1. The value of specialized roles
I initially thought one "super agent" could do everything. Nope. Splitting concerns (analysis β planning β development β review) creates a pipeline where each step can fail gracefully and be improved independently.
2. Memory is a game-changer
After the first PR, the system learned the repo's patterns. The second issue fix was significantly better because the agents had context about previous decisions.
3. Security agents caught real issues
I expected security scanning to be a formality. It flagged actual vulnerabilities and dependency risks that other agents missed.
4. Humans still want to understand the "why"
Even if the AI is 99% confident, reviewers want to see:
- What problem was solved?
- Why was this approach chosen?
- What alternatives were considered?
This means good documentation and explainability matter more than raw automation.
Challenges
1. Rate Limiting & Cost
Running multiple LLM calls per issue adds up fast. Caching agent responses and batching requests helps.
2. Handling Complex Repos
Large codebases with thousands of files are hard to summarize for LLM context windows. Need better summarization + semantic search.
3. Error Recovery
When an agent fails (malformed code, test failures), how do we recover? Current approach: retry with more context. Future: ask a different agent for help.
4. Async Debugging
Multi-agent pipelines running concurrently are hard to debug. Structured logging is essential.
Results
β
Issues β PRs: Can process a GitHub issue from analysis to ready-for-review PR
β
Human-in-the-loop: Maintainers always approve before merge
β
Multi-agent coordination: Agents share context and build on each other's work
β
Qwen integration: Smooth LLM API integration with fallback handling
β
Deployed: Frontend on Vercel, backend ready for cloud deployment
What's Next
- Improve context summarization β Better handling of massive codebases
- Add caching layer β Reduce LLM calls for repeated patterns
- Deploy to Alibaba Cloud β Scale the backend for concurrent pipelines
- Contributor onboarding β Use agents to generate learning roadmaps for new contributors
- Multi-repo learning β Train on patterns across multiple open-source projects
Key Takeaway
Building an AI engineering team isn't about replacing developersβit's about amplifying maintainers.
AutoMaintainer shows that by combining specialized agents, human oversight, and smart integrations, we can tackle the real bottleneck in open-source: turning ideas into tested, reviewed, production-ready code.
The hackathon forced me to ship something real in a short time. That constraint was valuable. It pushed me to focus on the core workflow (issue β plan β code β test β review β approve) instead of getting lost in feature creep.
If you're an open-source maintainer drowning in issues, or an AI engineer curious about multi-agent systems, check out the repo: https://github.com/okuhlecharlieman/AutoMaintainer
Thank You
Huge thanks to:
- Qwen & Alibaba Cloud for the hackathon and excellent LLM API
- GitHub for the robust API
- FastAPI & Next.js teams for excellent frameworks
- Open-source maintainers everywhere for the inspiration
Let me know what you build next. π
Want to discuss multi-agent systems, LLM integration, or open-source automation? Drop a comment below!
Tags: #ai #hackathon #agents #opensource #qwen #fastapi #nextjs
Top comments (0)