Last month, our team was paying $600/month for an AI code review tool. 10 developers × $60/seat.
The tool was fine. But I started wondering: these tools just call OpenAI's API and post comments on PRs. The API costs ~$0.02 per review. We were paying $60/seat for a glorified API wrapper.
So I built our own. It took 5 minutes. It costs ~$8/month for the entire team.
Here's the exact setup.
## The 30-Line Workflow

Create `.github/workflows/ai-review.yml`:
```yaml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get changed code
        id: diff
        run: |
          DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- \
            '*.py' '*.js' '*.ts' '*.go' '*.java' '*.rs' \
            ':!*.min.js' ':!*lock*' ':!*.generated.*')
          # Truncate to avoid token limits
          echo "diff<<EOF" >> "$GITHUB_OUTPUT"
          echo "$DIFF" | head -c 12000 >> "$GITHUB_OUTPUT"
          echo "EOF" >> "$GITHUB_OUTPUT"

      - name: AI Review
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          DIFF: ${{ steps.diff.outputs.diff }}
        run: |
          # Build the JSON body with jq so quotes/newlines in the diff
          # can't break the payload (or inject into the shell)
          PAYLOAD=$(jq -n --arg diff "$DIFF" '{
            model: "gpt-4o-mini",
            messages: [
              {role: "system", content: "You are a senior code reviewer. Review for: 1) Bugs and logic errors 2) Security vulnerabilities (OWASP Top 10) 3) Performance issues 4) Code style violations. Be specific with line numbers. Use GitHub markdown."},
              {role: "user", content: $diff}
            ],
            max_tokens: 2000
          }')
          REVIEW=$(curl -s https://api.openai.com/v1/chat/completions \
            -H "Authorization: Bearer $OPENAI_API_KEY" \
            -H "Content-Type: application/json" \
            -d "$PAYLOAD" | jq -r '.choices[0].message.content')
          gh pr comment "${{ github.event.pull_request.number }}" --body "## 🤖 AI Code Review

          $REVIEW

          ---
          *Powered by gpt-4o-mini · [How this works](https://1611833802030.gumroad.com/l/ai-code-review-cheat-sheet)*"
```
That's it. Every PR now gets an AI review automatically.
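One quoting subtlety worth calling out: splicing the raw diff into the JSON body by hand breaks the moment the diff contains a double quote or backslash. Building the payload with `jq -n` handles the escaping for you. A quick local sketch (no API call, just the payload construction):

```shell
# A diff containing characters that would break naive string interpolation
DIFF='diff --git a/app.py b/app.py
+query = "SELECT * FROM users WHERE id = " + user_id'

# jq -n builds the payload and escapes the diff correctly
PAYLOAD=$(jq -n --arg diff "$DIFF" '{
  model: "gpt-4o-mini",
  messages: [
    {role: "system", content: "You are a senior code reviewer."},
    {role: "user", content: $diff}
  ],
  max_tokens: 2000
}')

# Valid JSON: the embedded double quotes and newlines survive intact
echo "$PAYLOAD" | jq -r '.messages[1].content'
```

Running this prints the diff back out unchanged, which confirms the round-trip through JSON is lossless.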
## What This Actually Catches
After 3 months and ~400 PRs, here's what our AI reviewer found that humans missed:
| Issue Type | Caught | Example |
|---|---|---|
| SQL injection | 12 | Raw string interpolation in queries |
| Missing null checks | 47 | `user.profile.name` without optional chaining |
| N+1 queries | 8 | Loop inside a loop hitting the database |
| Hardcoded secrets | 3 | API keys in test files |
| Race conditions | 5 | Concurrent map access in Go |
The AI doesn't replace human reviewers. But it catches the "obvious" stuff that humans skim over when they're reviewing 15 PRs before lunch.
## Cost Breakdown: DIY vs. Commercial Tools
| Solution | 10-Dev Team Cost | Per-Review Cost | Setup Time |
|---|---|---|---|
| This workflow | ~$8/mo | $0.02 | 5 minutes |
| CodeRabbit | $600/mo | Included | 10 minutes |
| Bito | $400/mo | Included | 15 minutes |
| Greptile | $500/mo | Included | 20 minutes |
| Sourcery | $480/mo | Included | 10 minutes |
Why is it so cheap? GPT-4o-mini costs $0.15/1M input tokens and $0.60/1M output tokens. A typical PR diff is ~2,000 tokens. That's $0.0003 for the input plus at most ~$0.0012 for the output, even if the review uses the full 2,000-token budget. Well under $0.01 per review.
Even if you review 50 PRs/day, that's $15/month. Not $600.
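The arithmetic is easy to sanity-check in a few lines of shell. The prices below are assumptions carried over from the text, so verify them against OpenAI's current pricing page before relying on them:

```shell
# Back-of-envelope per-review and monthly cost (worst case: full-length reply)
awk -v input_price=0.15  -v output_price=0.60 \
    -v input_tokens=2000 -v output_tokens=2000 \
    -v reviews_per_month=1500 '
  BEGIN {
    per_review = input_tokens/1e6*input_price + output_tokens/1e6*output_price
    printf "per review: $%.4f\n", per_review
    printf "per month:  $%.2f\n", per_review * reviews_per_month
  }'
```

At 50 PRs a day this lands in the low single digits per month even assuming every review burns the full `max_tokens`, comfortably under the $15 ceiling above.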
## 3 Upgrades Worth Adding
### 1. Security-Focused Prompt

Replace the system message for repos that handle user data:

```text
Focus exclusively on security:
- SQL injection (parameterized queries?)
- XSS (output encoding?)
- Authentication bypass
- Hardcoded credentials
- Insecure deserialization
- SSRF potential

Flag severity as 🔴 Critical / 🟡 Warning / 🟢 Info
```
### 2. Skip Bot-Generated Code

Add a path filter to avoid reviewing auto-generated files:

```yaml
      - name: Get changed code
        run: |
          DIFF=$(git diff origin/${{ github.base_ref }}...HEAD -- \
            '*.py' '*.js' '*.ts' \
            ':!*.generated.*' ':!*.min.*' ':!*lock*' \
            ':!*migration*' ':!*__snapshots__*' \
            ':!vendor/*' ':!node_modules/*')
```
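If you're not sure what a pathspec actually excludes, you can check it in a throwaway repo before wiring it into CI. A minimal sketch using a subset of the filters above:

```shell
# Throwaway repo to show which changed files survive the filter
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m base
printf 'print("hi")\n'  > app.py
printf '// generated\n' > api.generated.ts
printf 'var x=1\n'      > app.min.js
git add -A
git -c user.email=ci@example.com -c user.name=ci commit -q -m change

# Same style of pathspec as the workflow step: generated/minified files drop out
FILES=$(git diff --name-only HEAD~1 -- \
  '*.py' '*.js' '*.ts' \
  ':!*.generated.*' ':!*.min.*')
echo "$FILES"   # → app.py
```

Only `app.py` survives: `api.generated.ts` and `app.min.js` match the include globs but are knocked out by the `:!` excludes.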
### 3. Use Claude for Complex Reviews

For architecture reviews or large PRs, Claude Sonnet gives more nuanced feedback:

```bash
# Build the body with jq so quotes and newlines in $DIFF can't break the JSON
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d "$(jq -n --arg diff "$DIFF" '{
    model: "claude-sonnet-4-20250514",
    max_tokens: 2000,
    messages: [{
      role: "user",
      content: ("Review this code diff for architectural concerns, not just bugs:\n\n" + $diff)
    }]
  }')"
```
## Common Pitfalls
"The diff is too big" — Truncate to 12K chars (line 14 above). For bigger PRs, split by file and review each separately. The full guide covers a chunking strategy.
"Reviews are too generic" — Add your team's coding standards to the system prompt. "We use Python 3.12+, prefer dataclasses over dicts, and require type hints on all public functions."
"It comments on every PR including dependabot" — Add a condition:
```yaml
if: github.actor != 'dependabot[bot]'
```
"Rate limiting" — OpenAI's rate limits are generous (500 RPM for gpt-4o-mini). If you hit them, add a retry with exponential backoff.
## Want the Full Playbook?
This article covers the basics. I wrote a 208-page guide that goes deep into:
- 🔧 Auto-fix PRs — AI doesn't just find bugs, it opens fix PRs automatically
- 🔒 Security automation — OWASP Top 10 scanning on every commit
- 🏢 Multi-repo strategies — monorepo and cross-repo review patterns
- 💰 Cost optimization — from $0.02/review down to $0.005/review
- 📊 ROI quantification — prove the value to your manager (real case: 1,018% ROI)
- 👥 Team standards as code — encode your style guide into AI rules
📥 Grab the free cheat sheet first — it's a one-page PDF with the workflow above plus model selection guide and cost calculator.
If you want everything: AI Code Review: The Practical Guide — $9.99