It started as a small annoyance
PR reviews are always a chore. On a small team — or a side project I run alone — the "someone has to look at this" person is always me. And if you're pushing straight to main, code review effectively disappears.
I started by stacking pylint and flake8 on top of GitHub Actions. But those don't answer the questions that actually matter: did this change do what I meant it to? Does the commit message actually describe what changed? Static analysis catches grammar and style; it can't read intent.
So I had Claude review the same diffs, fused both signals together, scored the result out of 100, and pushed it to Telegram. That became SCAManager.
GitHub: https://github.com/xzawed/SCAManager
What it does
When a GitHub Webhook fires for a Push or PR event, the following runs in parallel:
Static analysis — pylint, flake8, bandit
AI code review — Claude Haiku 4.5
Commit message evaluation — Claude AI
Results map to a 100-point score and an A–F grade, then ship to whichever of the nine channels you've configured: Telegram, GitHub PR Comment, GitHub Commit Comment, GitHub Issue, Discord, Slack, Email, Generic Webhook, n8n.
For PRs, the score drives the gate automatically:
Auto mode — Above threshold → GitHub APPROVE. Below → REQUEST_CHANGES.
Semi-auto mode — Inline buttons in Telegram for manual approval.
Auto-merge — Above a separate threshold → squash merge.
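The decision logic behind those modes reduces to threshold comparisons. A minimal sketch — the function name and default thresholds are mine for illustration (the real ones are configurable), and the GitHub API call that would consume the result is left out:

```python
def gate_decision(score: int, approve_at: int = 80, merge_at: int = 90) -> tuple[str, bool]:
    """Map a 0-100 score to (GitHub review event, auto-merge?) in auto mode.

    approve_at / merge_at are illustrative defaults, not SCAManager's real values.
    """
    if score >= approve_at:
        # Above the approval threshold: APPROVE, and squash-merge only
        # past the separate, higher auto-merge threshold.
        return "APPROVE", score >= merge_at
    return "REQUEST_CHANGES", False
```

In semi-auto mode the same decision would be surfaced as Telegram buttons instead of being submitted directly.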
The scoring system — why these weights
| Item | Points | Evaluator |
| --- | --- | --- |
| Code quality | 25 | pylint + flake8 |
| Security | 20 | bandit |
| Commit message | 15 | Claude AI |
| Implementation direction | 25 | Claude AI |
| Test coverage | 15 | Claude AI |
| Total | 100 | |
Things machines see well go to machines (pylint, bandit). Things that need human judgment go to AI. AI evaluations come back on a 0–10 or 0–20 scale, then get re-weighted into the final score.
If ANTHROPIC_API_KEY isn't set, the AI items default to neutral values, and static analysis alone can still reach up to 89 points (a B grade). The tool isn't useless without API spend.
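As a rough sketch of how the weights from the table combine: each item comes in as a 0.0–1.0 ratio (the AI items arrive on 0–10 or 0–20 scales and are divided down first) and is multiplied by its weight. The 0.5 neutral fallback and the grade cutoffs below are my placeholders — SCAManager's actual fallback values are more generous, which is how the 89-point ceiling above arises:

```python
# Weights from the scoring table (these match the post's table).
WEIGHTS = {
    "code_quality": 25,    # pylint + flake8
    "security": 20,        # bandit
    "commit_message": 15,  # Claude AI
    "direction": 25,       # Claude AI
    "test_coverage": 15,   # Claude AI
}

def calculate_score(ratios: dict[str, float]) -> tuple[int, str]:
    """Combine per-item 0.0-1.0 ratios into a 0-100 score and a letter grade.

    Missing items (e.g. AI items with no API key) fall back to 0.5 here --
    an illustrative placeholder, not the project's real default.
    """
    total = round(sum(w * ratios.get(k, 0.5) for k, w in WEIGHTS.items()))
    # Grade cutoffs assumed for illustration.
    grade = "A" if total >= 90 else "B" if total >= 80 else \
            "C" if total >= 70 else "D" if total >= 60 else "F"
    return total, grade
```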
Architecture — the parts that were interesting to build
- asyncio.gather() for parallelism. Running static analysis and AI review serially makes per-PR analysis time miserable. Wrapping them in asyncio.gather() collapses total wall-clock time to that of the slowest task. I use asyncio.gather(return_exceptions=True) for the nine notification channels too — but there the goal is isolation, not speed. If Telegram is down, that shouldn't block Slack.
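A minimal sketch of the two gather patterns — the analyzer and reviewer bodies are placeholders, not the project's real implementations:

```python
import asyncio

async def analyze_file(path: str) -> dict:
    # Placeholder: would run pylint/flake8/bandit in a subprocess.
    await asyncio.sleep(0.01)
    return {"file": path, "issues": []}

async def review_code(diff: str) -> dict:
    # Placeholder: would call the Claude API.
    await asyncio.sleep(0.01)
    return {"review": "ok"}

async def run_analysis(files: list[str], diff: str):
    # Speed: static analysis and AI review run concurrently, so total
    # wall-clock time is roughly that of the slowest task, not the sum.
    static, review = await asyncio.gather(
        asyncio.gather(*(analyze_file(f) for f in files)),
        review_code(diff),
    )
    return static, review

async def notify_all(senders, payload):
    # Isolation: return_exceptions=True means one failing channel comes
    # back as an exception object instead of cancelling its siblings.
    return await asyncio.gather(
        *(send(payload) for send in senders), return_exceptions=True
    )
```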
- Idempotency — same SHA, no double work. GitHub Webhooks get retransmitted (response timeouts, retries, etc.). Running the same commit SHA twice costs money and produces no new information, so I dedupe by SHA at the DB layer. The full pipeline:

```
GitHub Push/PR
 └─ POST /webhooks/github (HMAC-SHA256 verification)
     └─ BackgroundTask: run_analysis_pipeline()
          ├─ Repo register · SHA dedup (idempotency)
          ├─ asyncio.gather() ── parallel
          │    ├─ analyze_file() × N (pylint · flake8 · bandit)
          │    └─ review_code() (Claude AI)
          ├─ calculate_score() → grade
          ├─ run_gate_check() [PR only]
          └─ asyncio.gather(return_exceptions=True) → notification channels
```
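Both halves of that entry point fit in a few lines. The signature check follows GitHub's X-Hub-Signature-256 scheme; the in-memory set is a stand-in for the DB-level dedup (names are illustrative):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """GitHub sends X-Hub-Signature-256: sha256=<hexdigest of the raw body>."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking the digest via timing.
    return hmac.compare_digest(expected, signature_header)

_seen: set[str] = set()  # stand-in for a DB unique constraint on commit SHA

def should_analyze(sha: str) -> bool:
    """Return False for a SHA we've already processed (webhook retry)."""
    if sha in _seen:
        return False
    _seen.add(sha)
    return True
```

In the real service the dedup has to live in the database, not process memory, so it survives restarts and works across workers.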
- Two ways to use the AI. Same review, two call paths:
Server mode — Anthropic API. Needs ANTHROPIC_API_KEY. Costs money.
Local hook mode — Claude Code CLI (claude -p). Runs locally, no API key needed.
Local hook mode runs as a pre-push git hook. Output goes to terminal and to the dashboard. Environments without the CLI (Codespaces, mobile) silently skip the hook — exit 0 always, never blocks the push.
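A pre-push hook with that "never block" contract might look like the following — the prompt, timeout, and diff range are illustrative, and since git hooks can be any executable, plain Python works:

```python
#!/usr/bin/env python3
"""pre-push sketch: run a local Claude Code review, but never block the push."""
import shutil
import subprocess
import sys

def main() -> int:
    claude = shutil.which("claude")
    if claude is None:
        # No CLI available (Codespaces, mobile, CI): silently skip.
        return 0
    try:
        diff = subprocess.run(
            ["git", "diff", "@{push}..HEAD"],
            capture_output=True, text=True, check=False,
        ).stdout
        if diff.strip():
            # claude -p runs a one-shot prompt against the local CLI.
            result = subprocess.run(
                [claude, "-p", "Review this diff briefly:\n" + diff],
                capture_output=True, text=True, timeout=120, check=False,
            )
            print(result.stdout)
    except Exception as exc:
        print(f"review skipped: {exc}", file=sys.stderr)
    return 0  # exit 0 always: the hook informs, it never gates the push

# Installed as .git/hooks/pre-push (executable), git would invoke it as:
#     sys.exit(main())
```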
- DB failover. I built a FailoverSessionFactory that switches to a fallback PostgreSQL instance when the primary dies; /health reports which DB is currently active. Honestly, this is probably over-engineered. Whether a small side project actually needs failover is a separate question — building it was largely a learning exercise.
Limits and trade-offs
This tool isn't going to fit every team. Being honest about it:
Python-only — Static analysis is pylint/flake8/bandit. For non-Python repos, only the AI review piece gives you value.
AI score consistency — LLM output isn't fully deterministic. Treat the score as a trend indicator, not a hard, trustworthy number.
API cost — Teams shipping big PRs frequently can rack up Claude API spend fast. File filters and thresholds give you some control, but it's a real cost line.
Auto-merge risk — Score-driven squash merge is convenient and dangerous. Validate your threshold settings before turning it on. Start in semi-auto mode.
If you want to try it
Repo: https://github.com/xzawed/SCAManager
License: MIT
Required: Python 3.13 · PostgreSQL · GitHub OAuth App
Optional: ANTHROPIC_API_KEY · Telegram Bot Token · SMTP
Easiest deploy: Railway with the PostgreSQL plugin and your env vars filled in. For on-prem, uvicorn + nginx + systemd works fine.
Feedback, issues, and "wait, is this actually how it should behave?" reports are all welcome.