Three weeks ago, I shipped a bug to production. Nothing catastrophic — a race condition in a Node.js endpoint, an edge case nobody on the team caught during review. The kind of mistake that slips through when you're in a hurry, tired, or just trusting a diff that's a little too long.
That bug is what made me build CodeClaw — an AI-powered code review assistant, built on OpenClaw, integrated directly into my GitHub Actions pipeline.
## The Problem: Code Review Is a Bottleneck
In my 4-person engineering team, every PR waits between 6 and 14 hours before getting a first review. Not because we're lazy — but because we're busy, and a good review takes context, focus, and time.
Static tools like ESLint or SonarCloud do a great job on syntax and known anti-patterns. But they don't understand intent. They can't spot business logic bugs, subtle regressions, or questionable architectural decisions.
That's exactly where OpenClaw comes in.
## How CodeClaw Works
```
PR opened on GitHub
        │
        ▼
GitHub Actions triggered
        │
        ▼
Python script fetches the diff via GitHub API
        │
        ▼
Diff + repo context sent to OpenClaw
        │
        ▼
OpenClaw analyzes and generates a structured report
        │
        ▼
Automated comment posted on the PR
```
The core of the system is under 80 lines of Python. Here's the essential part:
```python
import os

import httpx
from github import Github

OPENCLAW_API_KEY = os.environ["OPENCLAW_API_KEY"]
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]


def get_pr_diff(repo_name: str, pr_number: int) -> str:
    """Fetch the PR diff, file by file, via the GitHub API."""
    g = Github(GITHUB_TOKEN)
    repo = g.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    diff_lines = []
    for file in pr.get_files():
        diff_lines.append(f"### {file.filename}\n{file.patch or ''}")
    return "\n\n".join(diff_lines)


def analyze_with_openclaw(diff: str, context: str) -> dict:
    """Send the diff plus project context to OpenClaw for analysis."""
    prompt = f"""
You are a senior code reviewer. Analyze this diff and provide:
- A summary of the changes (2-3 sentences)
- Potential risks (bugs, security, performance)
- Concrete improvement suggestions
- An overall score from 1 to 5

Project context: {context}

Diff:
{diff}
"""
    response = httpx.post(
        "https://api.openclaw.dev/v1/analyze",
        headers={"Authorization": f"Bearer {OPENCLAW_API_KEY}"},
        json={"prompt": prompt, "model": "claw-3-opus"},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()


def post_review_comment(repo_name: str, pr_number: int, analysis: dict):
    """Post the structured analysis back to the PR as a single comment."""
    g = Github(GITHUB_TOKEN)
    repo = g.get_repo(repo_name)
    pr = repo.get_pull(pr_number)
    comment = f"""\
## CodeClaw — Automated Review

**Summary:** {analysis['summary']}

**Detected Risks:**
{analysis['risks']}

**Suggestions:**
{analysis['suggestions']}

**Overall Score:** {'⭐' * analysis['score']} ({analysis['score']}/5)

*Generated by CodeClaw · Powered by OpenClaw*
"""
    pr.create_issue_comment(comment)


if __name__ == "__main__":
    repo_name = os.environ["REPO_NAME"]
    pr_number = int(os.environ["PR_NUMBER"])
    diff = get_pr_diff(repo_name, pr_number)
    analysis = analyze_with_openclaw(diff, context=os.environ.get("PROJECT_CONTEXT", ""))
    post_review_comment(repo_name, pr_number, analysis)
```
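One wrinkle the script glosses over: a large PR can blow past the model's context window. A small guard helps keep requests bounded — this is a hypothetical `truncate_diff` helper with an assumed character budget, not part of the script above:

```python
MAX_DIFF_CHARS = 60_000  # assumed budget; tune to your model's context window


def truncate_diff(diff: str, limit: int = MAX_DIFF_CHARS) -> str:
    """Trim an oversized diff at a file boundary and flag the omission."""
    if len(diff) <= limit:
        return diff
    # Cut at the last complete per-file section (sections start with "### filename")
    cut = diff.rfind("\n\n### ", 0, limit)
    if cut == -1:
        cut = limit  # no section boundary found; hard-cut at the budget
    return diff[:cut] + "\n\n[diff truncated: remaining files omitted for length]"
```

Cutting at a file boundary keeps every section the model sees syntactically complete, which matters more than squeezing in a few extra partial lines.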
And the GitHub Actions workflow:
```yaml
name: CodeClaw Review

on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write  # required for the job to comment on the PR

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install httpx PyGithub
      - name: Run CodeClaw
        env:
          OPENCLAW_API_KEY: ${{ secrets.OPENCLAW_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO_NAME: ${{ github.repository }}
        run: python codeclaw.py
```
## What It Looks Like in Practice
For the past three weeks, every PR in our repository has automatically received a CodeClaw comment within 90 seconds of being opened. Here are some real examples of what OpenClaw caught:
- A memory leak in a React event handler — an `addEventListener` with no corresponding `removeEventListener`
- A SQL injection vulnerability in a query built through string concatenation, in legacy code nobody had touched in 8 months
- Tight coupling between two modules that was going to make future unit testing painful
That last one surprised me the most. ESLint would never have flagged it. SonarCloud wouldn't either. It's an architectural observation that requires actually understanding the code, not just scanning it.
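The SQL injection finding, by contrast, is a mechanical pattern worth spelling out. Here's an illustrative sketch (a `sqlite3` stand-in, not the actual legacy code) of the concatenation bug and its parameterized fix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_input = "alice' OR '1'='1"  # attacker-controlled value

# Vulnerable: concatenation lets the input rewrite the query itself
query = "SELECT * FROM users WHERE name = '" + user_input + "'"
leaked = conn.execute(query).fetchall()  # the OR clause matches every row

# Safe: a parameterized query treats the input as data, never as SQL
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()  # no user is literally named "alice' OR '1'='1", so no match
```

The placeholder version never splices the string into the SQL text, which is why the injection payload becomes inert.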
## What I Learned
OpenClaw is not a replacement for human review. It's an intelligent first filter that arrives before your teammates do. It catches the obvious stuff, raises the right questions, and lets human reviewers focus on what really matters: product vision alignment, long-term architectural choices, mentorship.
The real win isn't speed; it's quality of attention. When your colleague opens your PR, the low-hanging fruit has already been sorted. The conversation can go deeper from the start.
Context injection is everything. Early versions of CodeClaw produced generic feedback. It wasn't until I started injecting project-specific context (tech stack, team conventions, business domain) that the analyses became genuinely useful. OpenClaw is a precision tool. It deserves precise instructions.
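Concretely, "injecting context" just means assembling a short structured preamble and passing it to the analysis call. A minimal sketch — the helper name and the field values here are placeholders, not our real config:

```python
def build_project_context(stack, conventions, domain):
    """Assemble a compact context preamble for the review prompt."""
    lines = [
        f"Tech stack: {', '.join(stack)}",
        f"Team conventions: {'; '.join(conventions)}",
        f"Business domain: {domain}",
    ]
    return "\n".join(lines)


context = build_project_context(
    stack=["Node.js", "React", "PostgreSQL"],  # placeholder values
    conventions=["async/await over callbacks", "no default exports"],
    domain="B2B invoicing",
)
```

A few dozen words of this kind of framing moved the output from boilerplate to feedback that actually referenced our conventions.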
## What's Next
I'm working on CodeClaw v2 with:
- A review history, so OpenClaw learns recurring patterns in our codebase
- A Slack integration to alert the team about critical risks in real time
- A metrics dashboard to track code quality trends over time
The source code is on GitHub. If you want to try it on your own repo, everything is in the README; expect about 15 minutes to get a working integration up and running.
That production bug three weeks ago? It wouldn't have made it past CodeClaw.