Charlie Li

I Replaced Our $300/month Code Review Tool with a 50-Line GitHub Action

Last month, our 8-person team was paying $288/month for CodeRabbit across two repos.

The reviews were good. But $3,456/year for "this function is too long" and "consider adding error handling"? I started wondering what we were actually paying for.

So I built a replacement using GitHub Actions + OpenAI API. Total cost: $4.20 last month. It catches real bugs, not just style nits.

Here's the exact setup.

The 50-Line Config

Create .github/workflows/ai-review.yml:

name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  ai-review:
    runs-on: ubuntu-latest
    if: github.event.pull_request.draft == false
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get changed files
        id: diff
        run: |
          git diff origin/${{ github.event.pull_request.base.ref }}...HEAD \
            --name-only --diff-filter=ACMR \
            -- '*.ts' '*.tsx' '*.js' '*.jsx' '*.py' '*.go' '*.rs' \
            | head -20 > changed_files.txt
          echo "count=$(wc -l < changed_files.txt)" >> $GITHUB_OUTPUT

      - name: Get diff content
        if: steps.diff.outputs.count > 0
        run: |
          git diff origin/${{ github.event.pull_request.base.ref }}...HEAD \
            -- $(cat changed_files.txt | tr '\n' ' ') > pr_diff.txt

      - name: AI Review
        if: steps.diff.outputs.count > 0
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          PR_TITLE: ${{ github.event.pull_request.title }}
          PR_BODY: ${{ github.event.pull_request.body }}
        run: |
          DIFF=$(head -c 30000 pr_diff.txt)
          RESPONSE=$(curl -s https://api.openai.com/v1/chat/completions \
            -H "Authorization: Bearer $OPENAI_API_KEY" \
            -H "Content-Type: application/json" \
            -d "$(jq -n \
              --arg diff "$DIFF" \
              --arg title "$PR_TITLE" \
              --arg body "$PR_BODY" \
              '{
                model: "gpt-4o-mini",
                messages: [{
                  role: "system",
                  content: "You are a senior engineer doing code review. Be concise. Focus on: bugs, security issues, performance problems, and logic errors. Skip style/formatting. For each issue, cite the exact file and line. If the code is fine, say so briefly."
                }, {
                  role: "user",
                  content: "PR: \($title)\n\nDescription: \($body)\n\nDiff:\n\($diff)"
                }],
                max_tokens: 1500,
                temperature: 0.1
              }')")
          echo "$RESPONSE" | jq -r '.choices[0].message.content // "No review content returned (check the API response)"' > review.md

      - name: Post Review Comment
        if: steps.diff.outputs.count > 0
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          {
            echo "## 🤖 AI Code Review"
            echo ""
            cat review.md
            echo ""
            echo "---"
            echo "*Powered by GPT-4o-mini · [How this works](https://1611833802030.gumroad.com/l/ai-code-review-guide)*"
          } > comment.md
          gh pr comment ${{ github.event.pull_request.number }} --body-file comment.md

That's it. Push this file, and every PR gets an AI review.
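
One prerequisite the workflow assumes: an `OPENAI_API_KEY` repository secret. If you use the gh CLI, it's a one-liner (the repo slug below is a placeholder):

```shell
# One-time setup: store your OpenAI key where the workflow can read it.
# "your-org/your-repo" is a placeholder; gh prompts for the secret value.
gh secret set OPENAI_API_KEY --repo your-org/your-repo
```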

What This Actually Catches

After 3 weeks and ~120 PRs, here's what surprised me:

Real bugs it caught:

  • A Promise.all that silently swallowed rejections (would have caused data loss)
  • An SQL query with string interpolation instead of parameterized queries (SQL injection)
  • A race condition in a cache invalidation handler
  • An off-by-one error in a pagination endpoint (returning 11 items for ?limit=10)

What it (correctly) ignores:

  • Variable naming preferences
  • Import ordering
  • "Consider using optional chaining" type suggestions

The key is the system prompt. The single instruction "Skip style/formatting" eliminates 80% of the noise that makes commercial tools annoying.
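
Since the prompt is where the value lives, it's worth making it easy to tune. One way (my sketch, not part of the workflow above): keep the prompt in a versioned file and inject it into the payload with jq's `--rawfile` (jq 1.6+), so prompt changes go through code review like everything else.

```shell
# Keep the review prompt in a file the team can edit and review,
# then inject it into the API payload with --rawfile
cat > review_prompt.txt <<'EOF'
You are a senior engineer doing code review. Be concise.
Focus on: bugs, security issues, performance problems, and logic errors.
Skip style/formatting.
EOF

jq -n --rawfile sys review_prompt.txt \
  '{model: "gpt-4o-mini", messages: [{role: "system", content: $sys}]}' \
  > payload.json

# First line of the injected system prompt, read back out of the payload
jq -r '.messages[0].content' payload.json | head -1
# → You are a senior engineer doing code review. Be concise.
```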

Cost Breakdown

Here's our actual usage for March 2026:

| Metric | Value |
| --- | --- |
| PRs reviewed | 124 |
| Avg tokens/review | ~3,200 input + ~800 output |
| Model | GPT-4o-mini |
| Cost per review | ~$0.034 |
| Monthly total | $4.20 |

For comparison:

  • CodeRabbit: $19/seat/month → $152/month (8 devs)
  • Bito: $15/seat/month → $120/month
  • Greptile: $24/seat/month → $192/month
  • This setup: $4.20/month (flat, not per-seat)

That's a 98.5% reduction from what we were paying, and over 96% cheaper than any of the alternatives above.

5 Optimizations That Actually Matter

1. Filter files aggressively

The --diff-filter=ACMR flag already skips deleted files. But also skip files that don't need review:

-- '*.ts' '*.tsx' '*.js' '*.jsx' '*.py' '*.go' '*.rs'

This skips .md, .json, .yaml, config files, and lock files. Saves ~40% on tokens.
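
If the extension allowlist still lets generated files through (minified bundles, for example), git's `:!` exclude pathspec can drop them. A self-contained sketch in a throwaway repo (assumes Git 2.28+ for `init -b`):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git -c user.email=a@b -c user.name=t commit -q --allow-empty -m base
git checkout -qb feature
printf 'x\n' > app.ts
printf 'x\n' > app.min.ts   # generated: matches *.ts, but we exclude it
printf 'x\n' > notes.md     # not code: never matches the allowlist
git add . && git -c user.email=a@b -c user.name=t commit -qm change

# ':!' excludes a pattern even when another pathspec matches it
git diff main...HEAD --name-only --diff-filter=ACMR \
  -- '*.ts' ':!*.min.ts'
# → app.ts
```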

2. Cap the diff size

DIFF=$(head -c 30000 pr_diff.txt)

30KB covers most PRs. For mega-PRs (1000+ lines), you'd need a chunking strategy — but honestly, mega-PRs are a process problem, not a tooling problem.
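
If you do need one, a minimal chunking sketch: split the diff at each file boundary (the `diff --git` headers) and review the pieces independently. This uses GNU `csplit`, so it's Linux-runner specific:

```shell
cd "$(mktemp -d)"
cat > pr_diff.txt <<'EOF'
diff --git a/a.ts b/a.ts
+const a = 1
diff --git a/b.ts b/b.ts
+const b = 2
EOF

# -z drops the empty piece before the first match; one chunk per file
csplit -qz -f chunk_ pr_diff.txt '/^diff --git /' '{*}'
ls chunk_*
# each chunk_* file is now small enough for its own API call
```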

3. Use GPT-4o-mini, not GPT-4o

For code review, gpt-4o-mini catches 90%+ of what gpt-4o catches, at 1/15th the cost. The quality difference is mostly in nuanced architectural suggestions — not in bug detection.

Save gpt-4o for security-critical repos.

4. Set temperature to 0.1

"temperature": 0.1

Low temperature means consistent, near-deterministic reviews. You don't want creative writing in code review.

5. Skip draft PRs

if: github.event.pull_request.draft == false

Don't waste API calls on work-in-progress.

What This Doesn't Do (And Whether You Should Care)

Honest limitations:

| Feature | Commercial Tools | This Setup |
| --- | --- | --- |
| Inline comments on specific lines | ✅ | ❌ (PR-level comment) |
| Auto-fix PRs | Some | ❌ (but achievable*) |
| Learning from your codebase | ✅ | ❌ |
| Dashboard / analytics | ✅ | ❌ |
| SOC 2 / enterprise compliance | ✅ | ❌ |

*Auto-fix is doable — you generate a patch and create a PR. I cover this in the full guide (Chapter 6).
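
The core of the auto-fix idea reduces to: treat the model's output as a unified diff, apply it on a branch, and open a PR. A toy end-to-end sketch in a throwaway repo (this is my illustration of the flow, not the guide's code; the final `gh pr create` is left as a comment because it needs a real remote):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q -b main
git -c user.email=a@b -c user.name=bot commit -q --allow-empty -m base
printf 'hello\n' > app.txt
git add app.txt && git -c user.email=a@b -c user.name=bot commit -qm add

# Pretend the model returned this unified diff as its "fix"
printf 'hello world\n' > app.txt
git diff > fix.patch
git checkout -q -- app.txt        # back to the original state

git checkout -qb ai-fix           # apply the model's patch on a branch
git apply fix.patch
cat app.txt
# → hello world
# git -c user.email=a@b -c user.name=bot commit -am "AI suggested fix"
# gh pr create --fill
```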

My take: If you're a 5-50 person team, the basic setup above covers 80% of the value. Inline comments are nice but not essential. If you're 100+ engineers with compliance requirements, CodeRabbit is probably worth it.

Switching to Claude (Alternative)

Prefer Anthropic? Swap the API call:

RESPONSE=$(curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d "$(jq -n \
    --arg diff "$DIFF" \
    --arg title "$PR_TITLE" \
    '{
      model: "claude-sonnet-4-5-20250929",
      max_tokens: 1500,
      system: "You are a senior engineer doing code review. Be concise. Focus on bugs, security issues, performance problems, and logic errors. Skip style/formatting.",
      messages: [{
        role: "user",
        content: "PR: \($title)\n\nDiff:\n\($diff)"
      }]
    }')")
echo "$RESPONSE" | jq -r '.content[0].text' > review.md

Claude tends to be more thorough on security issues. GPT-4o-mini is faster and cheaper for general review.

The Result

3 weeks in:

  • $4.20/month vs $288/month (saved $283.80)
  • 2 real bugs caught that humans missed (the SQL injection alone justified the effort)
  • Review comments dropped from an average of 12 (CodeRabbit, mostly noise) to 3-4 (focused, actionable)
  • Setup time: 5 minutes

The biggest win wasn't cost — it was signal-to-noise ratio. When every AI comment is worth reading, developers actually read them.


Want the Full Playbook?

This article covers the basics. The complete 208-page guide includes:

  • Auto-fix PRs — AI generates patches, not just suggestions (Chapter 6)
  • Security automation — OWASP Top 10 scanning on every PR (Chapter 5)
  • Team standards as code: a .review-rules.yml file that AI enforces (Chapter 7)
  • Multi-repo & monorepo strategies (Chapter 8)
  • Cost optimization — caching, batching, incremental review (Chapter 9)
  • ROI templates for convincing your manager (real case: 1,018% ROI)

📘 AI Code Review: The Practical Guide — $9.99 on Gumroad

New to AI code review? Start with the free cheat sheet — one-page config + prompt templates.
