This is a submission for the Agentic Postgres Challenge with Tiger Data
What I Built
Code Cleanup Agents - A production-ready code analysis system that uses Tiger Data's zero-copy database forks to enable true multi-agent architecture. Four specialized AI agents work in parallel across isolated database forks, analyzing code for security vulnerabilities, quality issues, performance problems, and best practice violations.
The Problem
Modern codebases are complex, and AI-generated code is everywhere. Developers need comprehensive analysis that goes beyond simple linting - they need security scanning, performance optimization, quality checks, and best practices enforcement. Traditional tools run sequentially or require complex infrastructure to parallelize.
The Solution
By leveraging Tiger Data's fast, zero-copy database forks, I built a system where each AI agent gets its own isolated database workspace. They analyze code simultaneously without interference, then merge their findings into the main database.
Live Demo: https://code-cleanup-agents.onrender.com
Repository: https://github.com/LuminArk-AI/code-cleanup-agents.git
Key Features
✅ 4 Specialized AI Agents working in parallel
- 🔒 Security Agent: Detects SQL injection, hardcoded secrets, dangerous functions
- ✨ Quality Agent: Finds code smells, duplicates, missing documentation
- ⚡ Performance Agent: Identifies N+1 queries, missing indexes, inefficient patterns
- 📏 Best Practices Agent: Enforces language idioms, naming conventions, coding standards
✅ Real-time Analysis - Upload code, get comprehensive results in seconds
✅ Semantic Code Search - Find code patterns using natural language queries
✅ Beautiful Web Interface - Clean, intuitive UI with live feedback
✅ Production-Ready - Deployed and battle-tested on real codebases
Demo
Architecture Diagram
```
        ┌─────────────┐
        │ Code Upload │
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │ Coordinator │
        │  (Main DB)  │
        └─┬───┬───┬──┬┘
          │   │   │  │
 ┌────────▼───▼───▼──▼─────────┐
 │ 🔒 Security       (Fork 1)  │
 │ ✨ Quality        (Fork 2)  │
 │ ⚡ Performance    (Fork 3)  │
 │ 📏 Best Practices (Main)    │
 └──────────────┬──────────────┘
                │ Merge Results
           ┌────▼────┐
           │ Display │
           └─────────┘
```
Screenshots
1. Tiger Dashboard - Real Database Forks
https://imgur.com/a/u3lData
Four database services: main + 3 isolated forks for agent workspaces
2. Live Analysis
https://imgur.com/a/2divPTx
Real-time analysis finding 60+ issues across 4 specialized agents
3. Parallel Execution Proof
https://imgur.com/a/hXWqfDn
All agents working simultaneously in their own database forks
4. Semantic Code Search
Find code patterns using natural language with pg_trgm
5. Real-World Results
https://imgur.com/a/MA4PBol
Finding actual issues in production code - 15 improvements identified
How I Used Agentic Postgres
1. Fast, Zero-Copy Forks - The Foundation
This is the killer feature. Each agent gets its own database fork created instantly:
```python
from concurrent.futures import ThreadPoolExecutor
from sqlalchemy import create_engine

# Agent isolation via forks: each engine points at its own zero-copy fork
security_engine = create_engine(SECURITY_FORK_URL)
quality_engine = create_engine(QUALITY_FORK_URL)
performance_engine = create_engine(PERFORMANCE_FORK_URL)

# Parallel execution: one worker per agent
with ThreadPoolExecutor(max_workers=3) as executor:
    security_future = executor.submit(analyze_security, security_engine, code)
    quality_future = executor.submit(analyze_quality, quality_engine, code)
    performance_future = executor.submit(analyze_performance, performance_engine, code)

    # Collect results once all three agents finish
    results = [f.result() for f in (security_future, quality_future, performance_future)]
```
Why this matters: Traditional approaches would require complex locking, transactions, or separate database instances. With Tiger's zero-copy forks, agents can't interfere with each other's work, and we get true parallel processing for free.
Proof it works: Tiger monitoring shows simultaneous activity spikes across all three forks during analysis (see the parallel execution screenshot above).
2. pg_trgm Text Search for Semantic Code Discovery
Implemented fuzzy text matching using PostgreSQL's pg_trgm extension:
```sql
CREATE EXTENSION pg_trgm;

CREATE INDEX code_content_trgm_idx
    ON code_submissions
    USING gin (code_content gin_trgm_ops);

-- Semantic search query
SELECT filename, code_content,
       similarity(code_content, 'authentication logic') AS score
FROM code_submissions
WHERE similarity(code_content, 'authentication logic') > 0.2
ORDER BY score DESC;
```
This enables natural language queries like:
- "Show me all database connection code"
- "Find authentication logic"
- "Where do we handle passwords?"
The similarity scores guide developers to relevant code even when exact keywords don't match.
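Under the hood, pg_trgm scores similarity as the ratio of trigrams (three-character sequences) two strings share to the total distinct trigrams across both. A simplified pure-Python sketch of that scoring (pg_trgm's actual padding and word-boundary handling differ slightly):

```python
def trigrams(text: str) -> set[str]:
    # Simplified version of pg_trgm's trigram extraction:
    # lowercase the string and pad it so word starts count as trigrams.
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    # Shared trigrams divided by total distinct trigrams of both strings
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0
```

This is why a query like `authentication` can surface `authenticate_user` even though the exact keyword never appears in the code.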
3. Hybrid Search Combining Multiple Methods
The system combines:
- Trigram similarity for fuzzy text matching
- Metadata filtering for structured searches
- Regex patterns for precise code pattern matching
```python
def hybrid_search(query):
    # Semantic search via pg_trgm
    semantic_results = search_by_similarity(query)
    # Pattern matching
    pattern_results = search_by_regex(query)
    # Combine and rank results
    return merge_and_rank(semantic_results, pattern_results)
```
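The post doesn't show `merge_and_rank` itself; a minimal sketch, assuming both result lists are `(filename, score)` pairs and that exact regex hits should outrank fuzzy trigram matches, could look like:

```python
def merge_and_rank(semantic_results, pattern_results):
    # Hypothetical ranking: sum scores per file, weighting precise
    # regex matches twice as heavily as fuzzy trigram matches.
    scores: dict[str, float] = {}
    for name, score in semantic_results:
        scores[name] = scores.get(name, 0.0) + score
    for name, score in pattern_results:
        scores[name] = scores.get(name, 0.0) + 2.0 * score
    # Highest combined score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The weighting is a design choice, not the repo's actual formula; any monotonic combination of the two signals would fit the same architecture.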
4. Intelligent Resource Allocation
I didn't just fork everything blindly. The architecture demonstrates understanding of when to use forks:
Critical analyses in isolated forks:
- Security scanning (Fork 1) - Needs isolation for sensitive data
- Quality analysis (Fork 2) - Heavy processing, many writes
- Performance checking (Fork 3) - Complex queries on code structure
Lightweight checks on main DB:
- Best Practices agent - Simple pattern matching, low overhead
This shows that forks are a tool for specific use cases, not a hammer for every nail.
5. Real-time Collaborative Analysis
Using PostgreSQL's connection pooling with asyncpg, the system supports concurrent writes from multiple agents. Each agent stores findings independently, then the coordinator merges results:
```python
def merge_findings(submission_id):
    """Merge agent findings from forks into the main database."""
    with main_db.connect() as conn:
        # Gather from all forks
        security_findings = get_from_fork(security_fork, submission_id)
        quality_findings = get_from_fork(quality_fork, submission_id)
        # Merge into main
        for finding in security_findings + quality_findings:
            conn.execute(insert_merged_finding(finding))
```
This demonstrates Tiger's Fluid Storage - data flows seamlessly between forks and the main database.
6. Tiger MCP Ready Architecture
While I focused on the core functionality first, the system is architected for Tiger MCP integration. Each agent is structured as an independent service that communicates through the database:
```python
class Agent:
    def __init__(self, fork_engine, agent_type):
        self.db = fork_engine
        self.type = agent_type

    def analyze(self, code, submission_id):
        """Agent operates independently in its fork."""
        issues = self._scan(code)
        self._save_to_fork(issues, submission_id)
        return issues
```
This agent pattern is perfect for MCP-based orchestration in future iterations.
Overall Experience
What Worked Exceptionally Well
1. Zero-Copy Forks Are a Game Changer
Coming from traditional databases, the instant fork creation was mind-blowing. No copying data, no waiting, no complex setup. Just:
```bash
# Create fork via Tiger CLI
tiger db fork main-db --name security-agent-fork
# Boom. Ready to use.
```
This enabled true multi-agent architecture without the usual infrastructure headaches.
2. PostgreSQL's Rich Feature Set
Using pg_trgm for semantic search felt like discovering a superpower. No external search engine needed - just Postgres doing what it does best.
3. Developer Experience
Tiger's dashboard, monitoring, and CLI made development smooth. Seeing real-time activity graphs across forks was incredibly satisfying.
What Surprised Me
The Performance
I expected some overhead from multiple database connections, but the system analyzes a 200-line file with all 4 agents in under 2 seconds. In practice, zero-copy forks added no measurable overhead.
Finding Real Issues
When I ran my own code through it, the Best Practices agent found 10 legitimate improvements I hadn't noticed. This went from "hackathon project" to "tool I'll actually use" real quick.
How Natural It Felt
The fork-based architecture just makes sense. Each agent having its own workspace is intuitive - it mirrors how human teams work.
Challenges & Learnings
Challenge 1: Getting Fork Credentials
Initially struggled with the Tiger CLI on Windows PowerShell. Solution: Used the web dashboard to create forks and grab connection strings manually. Worked perfectly.
Learning: Have fallbacks. The coordinator gracefully handles missing fork URLs by using the main DB.
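That fallback is simple to express in code. A sketch, assuming environment-variable configuration (the variable names here are illustrative, not necessarily the repo's actual ones):

```python
import os

# Hypothetical env var names; the real app may use different ones.
MAIN_DB_URL = os.environ.get("DATABASE_URL", "postgresql://localhost/main")

def fork_url(env_var: str) -> str:
    # If a fork's connection string is missing, fall back to the main DB
    # so the agent still runs, just without isolation.
    return os.environ.get(env_var) or MAIN_DB_URL
```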
Challenge 2: Balancing Fork Usage
With Tiger's free tier (4 services total: 1 main + 3 forks), I had to think strategically about which agents needed isolation. This constraint actually led to better architecture.
Learning: Forks aren't free infrastructure - use them intentionally where they provide real value.
Challenge 3: Merge Conflicts
Initially had agents overwriting each other's results. Fixed by:
- Agent-specific tables in each fork
- Timestamp-based ordering
- Clear merge strategy in coordinator
Learning: Even with isolated forks, you need a clean data flow design.
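Those three fixes combine into a small merge routine. A sketch, assuming each finding is a dict with `agent`, `line`, `rule`, and `ts` keys (field names are illustrative):

```python
def merge_agent_findings(*agent_result_sets):
    # Deduplicate on (agent, line, rule) so each agent owns its rows,
    # and keep only the newest finding per key (timestamp-based ordering).
    merged = {}
    for findings in agent_result_sets:
        for f in findings:
            key = (f["agent"], f["line"], f["rule"])
            if key not in merged or f["ts"] > merged[key]["ts"]:
                merged[key] = f
    # Stable output order for the coordinator to insert into the main DB
    return sorted(merged.values(), key=lambda f: (f["line"], f["agent"]))
```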
What I'd Do Differently
1. Earlier Integration Testing
I built each agent independently, then integrated. Should have tested the full pipeline sooner - caught some edge cases late.
2. More Language Support
Currently focused on Python. With more time, I'd add robust support for JavaScript, Java, Go, etc. The agent architecture makes this straightforward.
3. Async from the Start
The ThreadPoolExecutor works great, but full async/await would be even cleaner:
```python
import asyncio

async def analyze_code(code):
    # Run all agents concurrently and gather their findings
    results = await asyncio.gather(
        security_agent.analyze(code),
        quality_agent.analyze(code),
        performance_agent.analyze(code),
    )
    return results
```
Key Takeaways
1. Database Forks Enable New Architectures
Before Tiger, true multi-agent systems required microservices, message queues, or complex coordination. Forks make it trivial.
2. PostgreSQL is Underrated for AI Systems
We reach for vector databases and specialized tools, but Postgres + extensions handles 90% of use cases beautifully.
3. Start Simple, Scale Smart
My initial plan had 6 agents and complex orchestration. Focusing on 4 well-implemented agents was the right call.
Future Roadmap
With the foundation solid, here's what's next:
- AI-Powered Fix Generation: Use Claude API to automatically generate code fixes
- GitHub Integration: Analyze entire repositories, generate PR comments
- Historical Tracking: Use TimescaleDB features to track code quality over time
- Team Features: Multi-user support, project management, quality dashboards
- CI/CD Integration: Run as part of automated pipelines
- More Languages: JavaScript, TypeScript, Java, Go, Rust support
Try It Yourself
Code Cleanup Agents
Live Demo: https://code-cleanup-agents.onrender.com
GitHub: https://github.com/LuminArk-AI/code-cleanup-agents.git
Setup Instructions (running locally):
1. Get a Tiger Data account
2. Create the main database and forks
3. Clone this repo
4. Add your database URLs to `.env`
5. Run:

```bash
python app.py
```
Conclusion
Building Code Cleanup Agents taught me that the right infrastructure unlocks new possibilities. Tiger Data's Agentic Postgres features - especially zero-copy forks - enabled an architecture that would have been impractical with traditional databases.
This isn't just a hackathon project. I'm using it on my own code, it's finding real issues, and the agent-based architecture is extensible for countless improvements.
The future of development tooling is multi-agent systems. Agentic Postgres makes that future practical today.
Built with: Python, Flask, PostgreSQL, Tiger Data, SQLAlchemy, and lots of coffee ☕
GitHub: https://github.com/LuminArk-AI
Thanks to Tiger Data and the DEV community for an amazing challenge!
