TL;DR: I built an AI agent system that automates my entire GitHub workflow — from discovering issues to submitting PRs to managing reviews. After 100+ hours and 50+ PRs, here's the complete architecture, real code, honest failures, and lessons that no tutorial will tell you.
The Problem: Open Source Contribution is Broken
Let me paint you a picture. It's 2026, and the open source ecosystem has a paradox:
- Millions of open issues across GitHub
- Thousands of developers looking for contribution opportunities
- Hundreds of paid bounties sitting untouched
- Yet most developers spend hours just finding a good first issue
I was one of those developers. I'd spend 2-3 hours searching for issues, reading codebases, understanding context — before writing a single line of code. The ratio was terrible: 80% searching, 20% coding.
So I built an AI agent to flip that ratio.
What We're Building
By the end of this guide, you'll have a system that:
- Discovers relevant issues across GitHub automatically
- Evaluates each opportunity (difficulty, competition, payout potential)
- Clones the repository and analyzes the codebase
- Generates a fix or feature implementation
- Submits a professional PR with proper description and tests
- Monitors review feedback and addresses comments
This isn't theoretical. This system has submitted 50+ real PRs across GitHub, with 21 merged (41% acceptance rate).
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│                    GitHub Workflow Agent                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│   │  Scout   │───▶│  Triage  │───▶│  Worker  │              │
│   │  Module  │    │  Engine  │    │  Module  │              │
│   └──────────┘    └──────────┘    └──────────┘              │
│        │               │               │                    │
│        ▼               ▼               ▼                    │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│   │  GitHub  │    │ Scoring  │    │   Code   │              │
│   │  Search  │    │ Algorithm│    │ Generator│              │
│   │   API    │    │          │    │          │              │
│   └──────────┘    └──────────┘    └──────────┘              │
│                                        │                    │
│                                        ▼                    │
│                                   ┌──────────┐              │
│                                   │    PR    │              │
│                                   │ Submitter│              │
│                                   └──────────┘              │
│                                        │                    │
│                                        ▼                    │
│                                   ┌──────────┐              │
│                                   │  Review  │              │
│                                   │  Monitor │              │
│                                   └──────────┘              │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Step 1: The Scout Module — Discovering Opportunities
The scout module uses GitHub's Search API to find issues that match our criteria.
Setting Up GitHub CLI
First, authenticate with GitHub:
# Install GitHub CLI
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
sudo apt update && sudo apt install gh
# Authenticate
gh auth login
The Search Strategy
Here's the key insight: don't search for "bounty". Everyone does that, and the competition is brutal. Instead, search for:
# search_queries.py
SEARCH_QUERIES = [
    # High-value, low-competition
    '"good first issue" label:bug -label:"reserved"',
    '"help wanted" language:python created:>2026-05-01',
    '"bounty" -org:SecureBananaLabs -org:ClankerNation',
    # Specific technology stacks
    'label:"good first issue" language:typescript stars:>100',
    'label:bug language:rust -is:pr',
    # Abandoned PRs (patience harvesting)
    'is:pr is:open label:"help wanted" updated:<2026-05-15',
]
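The date qualifiers in the list above are hard-coded, so they go stale as the calendar moves on. A minimal sketch of computing the cutoff at run time might look like the following; the helper name `with_created_after` and its parameters are hypothetical and not part of the original scout.

```python
from datetime import date, timedelta
from typing import Optional


def with_created_after(base_query: str, days: int, today: Optional[date] = None) -> str:
    """Append a `created:>YYYY-MM-DD` qualifier for issues newer than `days` days.

    `today` is injectable so the result is reproducible; it defaults to the
    current date.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=days)
    return f"{base_query} created:>{cutoff.isoformat()}"


# Example: build the "help wanted" query relative to a fixed reference date.
query = with_created_after('"help wanted" language:python', days=30, today=date(2026, 5, 31))
print(query)
```

The same idea would apply to the `updated:<...` qualifier used for stale PRs.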
Implementation
# scout.py
import json
import subprocess
from typing import Dict, List

from search_queries import SEARCH_QUERIES


class BountyScout:
    def __init__(self, blacklist_path: str = "blacklist.txt"):
        self.blacklist = self._load_blacklist(blacklist_path)

    def _load_blacklist(self, path: str) -> set:
        """Load repos to avoid (scam repos, banned hunters)."""
        try:
            with open(path) as f:
                return {line.strip() for line in f if line.strip() and not line.startswith('#')}
        except FileNotFoundError:
            return set()

    def search_issues(self, query: str, limit: int = 50) -> List[Dict]:
        """Search GitHub issues using gh CLI."""
        cmd = [
            "gh", "search", "issues", query,
            "--state", "open",
            "--sort", "created",
            "--limit", str(limit),
            # gh search exposes the comment count as "commentsCount";
            # "body" is requested because the code generator reads it later.
            "--json", "number,title,body,repository,labels,createdAt,commentsCount",
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"Search failed: {result.stderr}")
            return []
        issues = json.loads(result.stdout)
        # Filter out blacklisted repos
        return [
            issue for issue in issues
            if issue["repository"]["nameWithOwner"] not in self.blacklist
        ]

    def find_opportunities(self) -> List[Dict]:
        """Run all search queries and aggregate results."""
        all_issues = []
        for query in SEARCH_QUERIES:
            issues = self.search_issues(query, limit=30)
            all_issues.extend(issues)
        # Deduplicate by "owner/repo#number"
        seen = set()
        unique = []
        for issue in all_issues:
            key = f"{issue['repository']['nameWithOwner']}#{issue['number']}"
            if key not in seen:
                seen.add(key)
                unique.append(issue)
        return unique
Step 2: The Triage Engine — Evaluating Opportunities
Not all issues are worth pursuing. The triage engine scores each opportunity based on multiple factors.
The Scoring Algorithm
# triage.py
from datetime import datetime, timezone
from typing import Dict, List


class BountyTriage:
    def __init__(self):
        # Relative importance of each factor. Note: score_issue() below uses
        # fixed point increments and does not read these weights.
        self.weights = {
            "stars": 0.15,
            "competition": -0.25,  # Negative = more competition = lower score
            "recency": 0.20,
            "label_match": 0.15,
            "bounty_value": 0.25,
        }

    def score_issue(self, issue: Dict) -> float:
        """Score an issue from 0-100."""
        score = 50  # Base score

        # Stars (higher = better ecosystem). If the search result carries no
        # stargazerCount for the repository, this factor contributes nothing.
        stars = issue.get("repository", {}).get("stargazerCount", 0)
        if stars > 1000:
            score += 15
        elif stars > 100:
            score += 10
        elif stars > 10:
            score += 5

        # Competition (fewer comments = less competition). Accept either the
        # integer "commentsCount" field or a {"totalCount": n} object.
        comments = issue.get("commentsCount")
        if comments is None:
            raw = issue.get("comments", 0)
            comments = raw.get("totalCount", 0) if isinstance(raw, dict) else raw
        if comments == 0:
            score += 20  # No competition!
        elif comments < 3:
            score += 10
        elif comments < 10:
            score += 0
        else:
            score -= 15  # High competition

        # Recency (newer = better)
        created = issue.get("createdAt", "")
        if created:
            created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
            age_days = (datetime.now(timezone.utc) - created_dt).days
            if age_days < 1:
                score += 15  # Fresh!
            elif age_days < 7:
                score += 10
            elif age_days < 30:
                score += 5

        # Label matching
        labels = {l["name"].lower() for l in issue.get("labels", [])}
        if "good first issue" in labels:
            score += 10
        if "bounty" in labels:
            score += 15
        if "help wanted" in labels:
            score += 5

        # Clamp to 0-100
        return max(0, min(100, score))

    def triage_issues(self, issues: List[Dict]) -> List[Dict]:
        """Score and sort issues by priority."""
        for issue in issues:
            issue["triage_score"] = self.score_issue(issue)
        return sorted(issues, key=lambda x: x["triage_score"], reverse=True)
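The `weights` dictionary is declared in the triage class, but the scoring method adds fixed point increments instead of using it. One way the weights could be applied is sketched below, assuming each factor has first been normalized to a signal between 0 and 1. The function name `weighted_score` and the normalization scheme are assumptions for illustration, not the scoring the system actually ran.

```python
from typing import Dict

# Same weights as in the triage class; negative means the factor lowers the score.
WEIGHTS: Dict[str, float] = {
    "stars": 0.15,
    "competition": -0.25,
    "recency": 0.20,
    "label_match": 0.15,
    "bounty_value": 0.25,
}


def weighted_score(signals: Dict[str, float]) -> float:
    """Combine per-factor signals (each assumed to be in [0, 1]) into a 0-100 score."""
    raw = sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    # Lowest and highest reachable raw values, used to rescale onto 0-100.
    lowest = sum(w for w in WEIGHTS.values() if w < 0)
    highest = sum(w for w in WEIGHTS.values() if w > 0)
    return round(100 * (raw - lowest) / (highest - lowest), 1)


# A fresh, uncontested issue with a bounty label in a mid-sized repo.
print(weighted_score({"stars": 0.5, "competition": 0.0, "recency": 1.0,
                      "label_match": 1.0, "bounty_value": 1.0}))
```

Compared with fixed increments, this keeps the tuning knobs in one place, at the cost of having to define a normalization for every factor.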
The Blacklist
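Some repos are traps. Here's my blacklist after 100+ hours. Note that the scout loads blacklist.txt by default, so either save the list under that name or pass the path to BountyScout: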
# scam-repos.txt
# Repos that are fake, auto-generated, or ban hunters
SecureBananaLabs/bug-bounty
ClankerNation/OpenAgents
UnsafeLabs/Bounty-Hunters
OFFER-HUB/offer-hub-monorepo
Step 3: The Worker Module — Analyzing and Fixing
This is where the magic happens. The worker module:
- Clones the repository
- Analyzes the issue and codebase
- Generates a fix
- Runs tests
Repository Analysis
# worker.py
import subprocess
from pathlib import Path
from typing import Dict


class RepoAnalyzer:
    def __init__(self, repo_url: str, issue_number: int):
        self.repo_url = repo_url
        self.issue_number = issue_number
        self.clone_dir = Path(f"/tmp/repos/{repo_url.split('/')[-1]}")

    def clone(self) -> Path:
        """Clone the repository, or update it if it was cloned before."""
        if self.clone_dir.exists():
            subprocess.run(["git", "pull"], cwd=self.clone_dir, check=True)
        else:
            subprocess.run(["git", "clone", self.repo_url, str(self.clone_dir)], check=True)
        return self.clone_dir

    def analyze_structure(self) -> Dict:
        """Analyze repository structure and conventions."""
        analysis = {
            "language": self._detect_language(),
            "test_framework": self._detect_test_framework(),
            "linting": self._detect_linting(),
            "ci": self._detect_ci(),
            "conventions": self._read_contributing_guide(),
        }
        return analysis

    def _detect_language(self) -> str:
        """Detect primary language from files."""
        if (self.clone_dir / "package.json").exists():
            return "javascript/typescript"
        elif (self.clone_dir / "requirements.txt").exists() or (self.clone_dir / "pyproject.toml").exists():
            return "python"
        elif (self.clone_dir / "Cargo.toml").exists():
            return "rust"
        elif (self.clone_dir / "go.mod").exists():
            return "go"
        return "unknown"

    def _detect_test_framework(self) -> str:
        """Detect testing framework."""
        if (self.clone_dir / "jest.config.js").exists():
            return "jest"
        elif (self.clone_dir / "pytest.ini").exists():
            return "pytest"
        elif (self.clone_dir / "vitest.config.ts").exists():
            return "vitest"
        return "unknown"

    def _detect_linting(self) -> str:
        """Placeholder (not shown in the original): look for common linter configs."""
        for name in (".eslintrc.json", ".flake8", "ruff.toml"):
            if (self.clone_dir / name).exists():
                return name
        return "unknown"

    def _detect_ci(self) -> str:
        """Placeholder (not shown in the original): look for GitHub Actions workflows."""
        if (self.clone_dir / ".github" / "workflows").is_dir():
            return "github-actions"
        return "unknown"

    def _read_contributing_guide(self) -> str:
        """Read CONTRIBUTING.md if it exists."""
        contributing = self.clone_dir / "CONTRIBUTING.md"
        if contributing.exists():
            return contributing.read_text()[:2000]  # First 2000 chars
        return ""
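The test-framework check above only looks for a `pytest.ini` file, yet many Python projects configure pytest in `pyproject.toml` or ship only a `conftest.py`. A slightly broader check could be sketched as follows; the function name `uses_pytest` is hypothetical, and the plain substring match on `[tool.pytest` is a deliberate simplification rather than a full TOML parse.

```python
from pathlib import Path


def uses_pytest(repo_dir: Path) -> bool:
    """Best-effort guess at whether a repository is set up for pytest."""
    # Dedicated config file or a top-level conftest.py are strong signals.
    if (repo_dir / "pytest.ini").exists() or (repo_dir / "conftest.py").exists():
        return True
    # Otherwise look for a [tool.pytest...] table in pyproject.toml.
    pyproject = repo_dir / "pyproject.toml"
    if pyproject.exists():
        return "[tool.pytest" in pyproject.read_text(encoding="utf-8", errors="ignore")
    return False
```

A result of `False` only means none of these markers were found; the project may still run pytest through some other configuration.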
Code Generation
Here's where I use AI (Claude/GPT) to generate the fix:
# code_generator.py
from pathlib import Path
from typing import Dict, List

from anthropic import Anthropic


class CodeGenerator:
    def __init__(self):
        # Reads ANTHROPIC_API_KEY from the environment
        self.client = Anthropic()

    def _format_files(self, relevant_files: List[str]) -> str:
        """Placeholder (not shown in the original): inline each file's path and contents.

        Assumes relevant_files holds readable file paths.
        """
        parts = []
        for path in relevant_files:
            parts.append(f"### {path}\n{Path(path).read_text()[:4000]}")
        return "\n\n".join(parts)

    def generate_fix(self, issue: Dict, repo_analysis: Dict, relevant_files: List[str]) -> str:
        """Generate a fix for the issue."""
        prompt = f"""You are an expert open source contributor. Generate a fix for this GitHub issue.
## Issue
Title: {issue['title']}
Body: {issue.get('body', 'No description')}
## Repository Analysis
Language: {repo_analysis['language']}
Test Framework: {repo_analysis['test_framework']}
Conventions: {repo_analysis['conventions'][:500]}
## Relevant Files
{self._format_files(relevant_files)}
## Requirements
1. Follow the repository's coding conventions exactly
2. Include tests if the repo has a test framework
3. Keep changes minimal and focused
4. Use proper commit message format: `type(scope): description`
5. Do NOT change files unrelated to the issue
Generate the complete fix with all necessary file changes.
"""
        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.content[0].text
Step 4: PR Submission — Professional Descriptions
The PR description is often more important than the code. Here's my template:
# pr_submitter.py
import subprocess
from typing import Dict, List


class PRSubmitter:
    PR_TEMPLATE = """## Summary
{summary}
## Changes
{changes}
## Testing
{testing}
## Related Issues
Fixes #{issue_number}
## Screenshots (if UI changes)
{screenshots}
"""

    def create_pr(self, repo: str, title: str, body: str, branch: str, base: str = "main"):
        """Create a PR using gh CLI and return its URL."""
        cmd = [
            "gh", "pr", "create",
            "--repo", repo,
            "--title", title,
            "--body", body,
            "--head", branch,
            "--base", base
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(f"PR creation failed: {result.stderr}")
        return result.stdout.strip()

    def format_pr_body(self, issue: Dict, changes: List[str], tests: str) -> str:
        """Format a professional PR body."""
        changes_text = "\n".join(f"- {change}" for change in changes)
        return self.PR_TEMPLATE.format(
            summary=f"This PR addresses #{issue['number']} by {changes[0].lower()}.",
            changes=changes_text,
            testing=tests,
            issue_number=issue['number'],
            # Substring check: also matches titles such as "build" or "guide"
            screenshots="N/A" if "ui" not in issue['title'].lower() else "TODO: Add screenshots"
        )
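The screenshot placeholder in the template is triggered by a plain substring test, so any title containing the letters "ui" (for example "build" or "guide") would be flagged as a UI change. A stricter check could match "ui" only as a whole word. This is a minimal sketch; the helper name `needs_screenshots` is hypothetical.

```python
import re

# Match "ui" only as a standalone word, case-insensitively.
UI_WORD = re.compile(r"\bui\b", re.IGNORECASE)


def needs_screenshots(issue_title: str) -> bool:
    """Return True when the title mentions UI as its own word."""
    return UI_WORD.search(issue_title) is not None


print(needs_screenshots("UI: button overlaps text"))    # whole-word match
print(needs_screenshots("Fix build failure on Windows"))  # "ui" only inside "build"
```

Title matching is still a heuristic; label names such as a repository's own frontend label are likely a more reliable signal where they exist.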
Step 5: Review Monitor — Addressing Feedback
PRs don't get merged automatically. You need to respond to reviews:
# review_monitor.py
import json
import subprocess
from typing import Dict, List


class ReviewMonitor:
    def check_reviews(self, repo: str, pr_number: int) -> List[Dict]:
        """Check for new review comments."""
        cmd = [
            "gh", "api", f"repos/{repo}/pulls/{pr_number}/reviews",
            # Wrap the filter in [...] so the output is one JSON array. A bare
            # `.[] | {...}` prints one object per line, which json.loads rejects.
            "--jq", '[.[] | {state: .state, user: .user.login, body: .body}]'
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return []
        return json.loads(result.stdout)

    def check_ci_status(self, repo: str, pr_number: int) -> Dict:
        """Check whether the PR is mergeable."""
        cmd = [
            "gh", "api", f"repos/{repo}/pulls/{pr_number}",
            # The REST API names this field mergeable_state
            # (mergeStateStatus is the GraphQL name).
            "--jq", '{mergeable: .mergeable, mergeStateStatus: .mergeable_state}'
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return {}
        return json.loads(result.stdout)

    def ping_maintainer(self, repo: str, pr_number: int):
        """Ping maintainer after 2+ days of no review."""
        comment = """Hi! 👋 This PR is ready to merge: all CI checks pass, no conflicts.
Would appreciate a review when you get a chance. Thanks! 🙏"""
        subprocess.run([
            "gh", "pr", "comment", str(pr_number),
            "--repo", repo,
            "--body", comment
        ], check=True)
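The monitor is meant to ping only after two or more days without a review, but the waiting rule itself is not shown. A minimal sketch of that rule follows, assuming the caller already knows the timestamp of the last activity on the PR; the function name `should_ping` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def should_ping(last_activity: datetime, now: datetime, min_days: int = 2) -> bool:
    """Return True once at least `min_days` have passed since the last activity.

    Both datetimes are expected to be timezone-aware so the subtraction is valid.
    """
    return now - last_activity >= timedelta(days=min_days)


opened = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
print(should_ping(opened, datetime(2026, 5, 2, 12, 0, tzinfo=timezone.utc)))
print(should_ping(opened, datetime(2026, 5, 3, 12, 0, tzinfo=timezone.utc)))
```

Recording when a ping was last sent and treating that as activity would keep the agent from commenting on every cycle.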
Real Results: 50+ PRs, 21 Merged
Here are my actual results after running this system:
Success Stories
| Repository | PR | Description | Result |
|---|---|---|---|
| Aigen-Protocol | #40, #42, #43 | C# client, Japanese translation | ✅ Merged |
| HELPDESK.AI | 7 PRs | Various fixes and features | ✅ Merged |
| mobile-money | 9 PRs | Provider integrations, validation | ✅ Merged |
| better-auth | #9811 | Kysely adapter fix | 🔄 In Review |
| microsoft/markitdown | #1961 | Unused import fix | 🔄 In Review |
| cloudflare/speedtest | #106 | Double '?' fix | 🔄 In Review |
Failure Analysis
Not everything works. Here's what I learned from failures:
Why PRs Get Rejected:
- Too broad — PRs that change too many files
- Wrong style — Not following repo conventions
- Missing tests — Most repos require tests
- Duplicate work — Someone else already submitted a fix
- Scam repos — Some repos never merge any PRs
Acceptance Rate by Strategy:
- Credibility repos (repos with prior merges): 75%
- High-star repos (>1K stars): 25%
- Random bounty repos: 5%
- Scam repos: 0% (blacklisted)
The Economics: Is It Worth It?
Let me be brutally honest about the economics:
Costs
- API calls (Claude/GPT): ~$0.50 per PR attempt
- Time (automated): ~15 minutes per PR
- Time (manual review): ~5 minutes per PR
Revenue (Potential)
- Merged PRs: 21 × $50-500 per bounty = $1,050 - $10,500
- Audience building: 29 articles, growing readership
- Reputation: Credibility on 3+ repos
ROI Calculation
Total investment: ~$25 in API costs + ~20 hours manual time
Potential return: $1,050 - $10,500 in bounties
ROI: 42x - 420x (if bounties pay out)
The catch: Most bounties don't pay immediately. Payment depends on:
- PR getting merged
- Maintainer actually paying
- Bounty platform processing payment
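The ROI figures above can be reproduced with a few lines of arithmetic. The sketch assumes exactly 50 PR attempts (the article says "50+") and, like the original calculation, counts only API spend as the investment and leaves the manual hours out.

```python
# Inputs taken from the figures above; PR_ATTEMPTS = 50 is an assumption.
PR_ATTEMPTS = 50
API_COST_PER_ATTEMPT = 0.50  # dollars
MERGED_PRS = 21
BOUNTY_LOW, BOUNTY_HIGH = 50, 500  # dollars per merged PR

api_cost = PR_ATTEMPTS * API_COST_PER_ATTEMPT
revenue_low = MERGED_PRS * BOUNTY_LOW
revenue_high = MERGED_PRS * BOUNTY_HIGH

# Return multiple relative to API spend only.
roi_low = revenue_low / api_cost
roi_high = revenue_high / api_cost

print(f"API cost: ${api_cost:.2f}")
print(f"Potential return: ${revenue_low:,} - ${revenue_high:,}")
print(f"ROI: {roi_low:.0f}x - {roi_high:.0f}x")
```

Pricing the roughly 20 manual hours at any hourly rate would lower these multiples considerably, and they only hold if every merged PR actually pays out.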
Lessons Learned (The Hard Way)
1. Quality Over Quantity
I started with a "spray and pray" approach — submit as many PRs as possible. Result: 0% acceptance rate outside credibility repos.
Fix: Focus on repos where you already have merged PRs. Build credibility first.
2. Comment First, Code Second
Before writing any code, comment on the issue with your proposed approach. This:
- Gets maintainer buy-in early
- Prevents wasted effort if the approach is wrong
- Shows you understand the problem
3. Read the CONTRIBUTING.md
Every repo has different conventions. Some require:
- Specific commit message format
- Test coverage thresholds
- Documentation updates
- Signed commits
4. Speed Matters (Sometimes)
For bounties, speed is critical. But for regular contributions, quality matters more.
5. Automated Reviews Are Real Reviews
Many repos use bots like CodeRabbit, cubic-dev-ai, or GitGuardian. Treat their feedback like human reviews — address every comment.
Complete Code: The Orchestrator
Here's the main orchestrator that ties everything together:
# orchestrator.py
import time
from datetime import datetime

from scout import BountyScout
from triage import BountyTriage
from worker import RepoAnalyzer
from code_generator import CodeGenerator
from pr_submitter import PRSubmitter
from review_monitor import ReviewMonitor


class GitHubWorkflowAgent:
    # The helper methods called below (_has_existing_pr, _find_relevant_files,
    # _create_branch, _apply_fix, _run_tests, _monitor_existing_prs) are not
    # shown in this guide and must be implemented before this class will run.

    def __init__(self):
        self.scout = BountyScout()
        self.triage = BountyTriage()
        self.worker = CodeGenerator()
        self.submitter = PRSubmitter()
        self.monitor = ReviewMonitor()

    def run_cycle(self):
        """Run one complete cycle of the workflow."""
        print(f"[{datetime.now()}] Starting workflow cycle...")

        # 1. Discover opportunities
        print("Step 1: Discovering opportunities...")
        issues = self.scout.find_opportunities()
        print(f"Found {len(issues)} issues")

        # 2. Triage and prioritize
        print("Step 2: Triaging...")
        prioritized = self.triage.triage_issues(issues)
        top_issues = prioritized[:5]  # Top 5 opportunities

        # 3. Process each opportunity
        for issue in top_issues:
            print(f"\nProcessing: {issue['title'][:60]}...")

            # Check if we already have a PR for this issue
            if self._has_existing_pr(issue):
                print("Already have a PR, skipping...")
                continue

            # Analyze repository. The clone URL is built from nameWithOwner,
            # the repository field the scout already relies on.
            repo_name = issue['repository']['nameWithOwner']
            repo_url = f"https://github.com/{repo_name}"
            analyzer = RepoAnalyzer(repo_url, issue['number'])
            repo_path = analyzer.clone()
            analysis = analyzer.analyze_structure()

            # Generate fix
            relevant_files = self._find_relevant_files(repo_path, issue)
            fix = self.worker.generate_fix(issue, analysis, relevant_files)

            # Create branch and apply fix
            branch_name = f"fix/issue-{issue['number']}"
            self._create_branch(repo_path, branch_name)
            self._apply_fix(repo_path, fix)

            # Run tests
            if analysis['test_framework'] != 'unknown':
                test_result = self._run_tests(repo_path, analysis['test_framework'])
                if not test_result:
                    print("Tests failed, skipping...")
                    continue

            # Submit PR
            pr_body = self.submitter.format_pr_body(issue, [fix[:100]], "All tests pass")
            pr_url = self.submitter.create_pr(
                repo_name,
                f"fix: {issue['title'][:50]}",
                pr_body,
                branch_name
            )
            print(f"PR submitted: {pr_url}")

        # 4. Monitor existing PRs
        print("\nStep 4: Monitoring existing PRs...")
        self._monitor_existing_prs()

    def run_forever(self, interval_minutes: int = 30):
        """Run the workflow continuously."""
        while True:
            try:
                self.run_cycle()
            except Exception as e:
                print(f"Error in cycle: {e}")
            print(f"\nSleeping for {interval_minutes} minutes...")
            time.sleep(interval_minutes * 60)


if __name__ == "__main__":
    agent = GitHubWorkflowAgent()
    agent.run_forever()
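The orchestrator calls a `_run_tests` helper that the guide does not show. One plausible shape for it is sketched below, mapping the framework names returned by the analyzer to a command and treating a zero exit code as success. The names `TEST_COMMANDS`, `command_for`, and `run_tests` are hypothetical, and the commands are common defaults that a given repository may override with its own scripts.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Common default invocations; individual repos may define their own scripts.
TEST_COMMANDS = {
    "pytest": ["python", "-m", "pytest", "-q"],
    "jest": ["npx", "jest"],
    "vitest": ["npx", "vitest", "run"],
}


def command_for(framework: str) -> Optional[List[str]]:
    """Return the test command for a detected framework, or None if unknown."""
    command = TEST_COMMANDS.get(framework)
    return list(command) if command else None


def run_tests(repo_path: Path, framework: str, timeout_s: int = 900) -> bool:
    """Run the repo's tests; True means the command exited with status 0."""
    command = command_for(framework)
    if command is None:
        return False
    try:
        result = subprocess.run(command, cwd=repo_path, timeout=timeout_s)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # Hung test suite or missing tool: treat as a failed run.
        return False
    return result.returncode == 0
```

The timeout matters for an unattended loop, since a hanging test suite would otherwise block every later cycle.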
Running as a Service
To run this 24/7, I use a systemd service:
# /etc/systemd/system/github-agent.service
[Unit]
Description=GitHub Workflow Agent
After=network.target
[Service]
Type=simple
User=agent
WorkingDirectory=/opt/github-agent
ExecStart=/usr/bin/python3 orchestrator.py
Restart=always
RestartSec=30
[Install]
WantedBy=multi-user.target
Then enable and start the service:
sudo systemctl enable github-agent
sudo systemctl start github-agent
Security Considerations
IMPORTANT: Never hardcode API keys. Use environment variables:
# .env
GITHUB_TOKEN=ghp_xxxxxxxxxxxxx
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxx
And load them securely:
import os
from dotenv import load_dotenv
load_dotenv()
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
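`os.getenv` returns `None` when a variable is missing, so a misconfigured service would only fail later, at the first API call. A small guard that fails at startup could look like this; the helper name `require_env` is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Calling `require_env("GITHUB_TOKEN")` and `require_env("ANTHROPIC_API_KEY")` once at startup, after `load_dotenv()`, surfaces configuration mistakes before the first cycle runs. The `.env` file itself should stay out of version control.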
What's Next?
This system is still evolving. Here's what I'm working on:
- Multi-language support — Currently optimized for Python/TypeScript
- Smarter triage — Using ML to predict merge probability
- Automated rebasing — Handling merge conflicts automatically
- Payment tracking — Monitoring bounty payouts
Conclusion
Building an AI-powered GitHub workflow agent is possible, but it's not magic. The key insights:
- Start with credibility — Build trust before chasing bounties
- Quality over quantity — One great PR beats ten mediocre ones
- Read the room — Every repo has different conventions
- Be patient — Results take time
- Stay ethical — Don't spam, don't submit to scam repos
The code in this guide is real. The results are real. The failures are real. If you're willing to put in the work, you can build a system that contributes to open source while earning money.
Have you built something similar? I'd love to hear about your experience in the comments.
Follow me for more posts about AI agents, open source, and automation.