I built an MCP server that finds you mergeable open source issues in 30 seconds
The problem
I'm a CS student at IIT Guwahati. A few months ago I decided I wanted to contribute to open source. The advice was always the same: "look for the good first issue label."
So I did. And every single time:
- The issue was already assigned
- Someone had opened a PR yesterday
- The repo hadn't been touched in 8 months
- It needed a language I knew on paper but not in practice
After three weekends of this loop, I gave up and went back to building side projects.
Then MCP (Model Context Protocol) launched and I realized: this is exactly the kind of problem an AI agent should solve. Not by generating anything — just by filtering GitHub data better than I can scroll through it.
So I built OpenCollab MCP.
What it does
You ask Claude (or Cursor, or any MCP client):
"Find me a Python good-first-issue I can finish this weekend. Make sure nobody's working on it."
OpenCollab exposes 22 tools to the AI. The model picks the right ones, chains them together:
- match_me — reads my GitHub, picks my strongest language
- find_issues — searches good-first-issues in that language
- check_issue_availability — for each candidate, verifies no one's working on it
- issue_complexity — rates difficulty 1-10
I get back 5 actually-mergeable issues in under 30 seconds.
Then:
"Plan a PR for issue #456 in owner/repo."
It calls generate_pr_plan which fetches the issue body, comments, CONTRIBUTING.md, repo structure, and default branch — handing the AI everything needed to draft real code.
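Those lookups are independent, so they can all be fanned out concurrently. A rough sketch of that shape (the helper name, endpoint paths, and stub are my illustration, not the server's actual code):

```python
import asyncio

# Stub standing in for a real authenticated GitHub helper; it just
# echoes the requested path so the fan-out shape is visible.
async def github_get(path: str) -> dict:
    return {"path": path}

async def fetch_pr_context(owner: str, repo: str, issue_number: int) -> dict:
    """Fetch everything a PR plan needs in one parallel round trip."""
    base = f"/repos/{owner}/{repo}"
    issue, comments, contributing, tree, repo_info = await asyncio.gather(
        github_get(f"{base}/issues/{issue_number}"),
        github_get(f"{base}/issues/{issue_number}/comments"),
        github_get(f"{base}/contents/CONTRIBUTING.md"),
        github_get(f"{base}/git/trees/HEAD"),  # repo structure
        github_get(base),                      # includes default_branch
    )
    return {"issue": issue, "comments": comments, "contributing": contributing,
            "tree": tree, "repo": repo_info}

ctx = asyncio.run(fetch_pr_context("owner", "repo", 456))
```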
The 22 tools
🔍 Discovery (6): find_issues, trending_repos, similar_repos, find_mentor_repos, weekend_issues, match_me
📊 Evaluation (7): repo_health, contribution_readiness, impact_estimator, repo_activity_pulse, compare_repos, repo_languages, dependency_check
🎯 Issue Intel (6): check_issue_availability, issue_complexity, stale_issue_finder, label_explorer, recent_prs, generate_pr_plan
👤 Profile (3): analyze_profile, first_timer_score, contributor_leaderboard
Design choices that mattered
It's a data bridge, not an AI
Zero AI inference happens on my end. Your client (Claude/Cursor) does all the reasoning. OpenCollab just feeds it clean GitHub data.
This means: zero cost to me, fully private (runs on your machine), and it inherits whatever model your client uses.
Pydantic for every input
Every tool's input is a Pydantic model with extra="forbid" and str_strip_whitespace=True. LLMs sometimes pass stray fields or whitespace — Pydantic catches it before any logic runs.
```python
class IssueInput(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True, extra="forbid")

    owner: str = Field(..., min_length=1)
    repo: str = Field(..., min_length=1)
    issue_number: str = Field(..., min_length=1)
```
The issue_number: str is intentional — clients pass numbers as strings, and a permissive parser handles '#123', ' 123 ', and '123' uniformly.
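A minimal sketch of what that permissive normalization might look like (the function name is my own, not the server's):

```python
def parse_issue_number(raw: str) -> int:
    """Normalize '#123', ' 123 ', or '123' to the integer 123."""
    cleaned = raw.strip().lstrip("#").strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a valid issue number: {raw!r}")
    return int(cleaned)
```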
Async + parallel for the heavy tools
match_me was originally 3 sequential API calls. Now it's asyncio.gather:
```python
user, repos_raw = await asyncio.gather(
    github_get(f"/users/{username}"),
    github_get(f"/users/{username}/repos", {...}),
)
```
Same trick in repo_health (3 calls), compare_repos (4 calls per repo), dependency_check (8 file lookups), generate_pr_plan (5 endpoints). 3-5x latency improvement on the heavy paths.
In-memory TTL cache for rate limits
GitHub's unauthenticated rate limit is brutal. Even authenticated, five chained tool calls per question burn through the quota fast. I added a 5-minute in-memory cache:
```python
def _cache_get(key: str) -> Any | None:
    entry = _cache.get(key)
    if not entry:
        return None
    expires_at, value = entry
    if time.monotonic() > expires_at:
        _cache.pop(key, None)
        return None
    return value
```
Hand-rolled because functools.lru_cache doesn't do TTL and I didn't want a cachetools dependency for 30 lines.
MockTransport for fast tests
All 45 tests run in 0.12 seconds. No real network. httpx.MockTransport lets me return arbitrary status codes per path, which mattered for testing the case where GitHub returns 202 while repository stats are still being computed.
```python
def _handler(request: httpx.Request) -> httpx.Response:
    path = request.url.path
    if path not in routes:
        return httpx.Response(404, json={"message": "Not Found (mock)"})
    spec = routes[path]
    status, body = spec if isinstance(spec, tuple) else (200, spec)
    return httpx.Response(status, json=body)
```
Install in 60 seconds
```shell
pip install opencollab-mcp
# or
uvx opencollab-mcp
```
Then add to your Claude Desktop config:
```json
{
  "mcpServers": {
    "opencollab": {
      "command": "uvx",
      "args": ["opencollab-mcp"],
      "env": { "GITHUB_TOKEN": "your_token_here" }
    }
  }
}
```
Restart Claude. Done.
What's next
- first_pr_generator — one-shot find + plan + draft my first PR
- track_my_prs — dashboard with staleness nudges
- skill_gap — compare your skills vs a target repo's stack
If you've been wanting to contribute to open source but couldn't find the right issue, give it a shot. And if it helps you land a PR — a ⭐ on the repo would genuinely make my week.
GitHub: https://github.com/prakhar1605/Opencollab-mcp
PyPI: https://pypi.org/project/opencollab-mcp/