# AI-Powered Code Scanning in PRs with OpenAI & GitHub Actions
Ensure every pull request in your codebase is clean, tested, and production-ready with the power of OpenAI APIs integrated into GitHub Actions.
## 🚀 Why Automate Code Scanning?
Modern software teams merge dozens of pull requests every week. Manual reviews often miss critical issues like:
- Unused variables or memory leaks
- Missing test coverage
- SSR hydration mismatches
- Inefficient component rendering
AI code scanning helps automate these checks at scale using OpenAI's GPT models.
## 🛠️ What We'll Build
An automated GitHub Actions workflow that:
- Scans code in PRs using OpenAI APIs (e.g., GPT-4o)
- Detects issues like SSR bugs, test coverage gaps, memory leaks, or bad patterns
- Posts inline GitHub annotations with severity for each issue or fails the build on critical issues
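For reference, the inline annotations the workflow will post follow the shape GitHub's Checks API expects. A single entry looks like this (the path and message are made-up examples; `annotation_level` must be `notice`, `warning`, or `failure`):

```python
import json

# Illustrative annotation entry for the Checks API (values are examples).
annotation = {
    "path": "client/components/Example.tsx",  # file the finding applies to
    "annotation_level": "warning",            # notice | warning | failure
    "message": "Issue: missing cleanup in useEffect",
    "start_line": 1,
    "end_line": 1,
}

print(json.dumps([annotation], indent=2))
```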
## 📁 Project Structure
Your repo should include these two files:

```
.github/workflows/ai-code-scan.yml
scripts/ai_ssr_scan.py
```
We'll write both the GitHub Action YAML and the Python script using OpenAI.
## 🧠 Step 1: Create the AI Scan Script (Python)
Create `scripts/ai_ssr_scan.py`:
```python
#!/usr/bin/env python3
import argparse
import glob
import json
import os
import re
import subprocess
from pathlib import Path

from openai import OpenAI

# ---------------- CONFIG ----------------
FILES_TO_SCAN = [
    "client/components/**/*.js",
    "client/components/**/*.tsx",
    "client/helpers/**/*.js",
    "client/modules/**/*.jsx",
    "client/modules/**/*.tsx",
]
USE_GIT_DIFF = True  # Scan only the files changed in the PR
MAX_CODE_CHARS = 8000  # Truncate each file before sending it to the model
MAX_RESPONSE_TOKENS = 4096  # Cap on the model's response
REPORT_DIR = ".github/ai-scan-reports"  # Directory for storing reports
ANNOTATION_FILE = f"{REPORT_DIR}/annotations.json"
COMMENT_FILE = f"{REPORT_DIR}/comment.md"
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o")
CONTEXT_WINDOW = 128_000  # GPT-4o context window, in tokens

# ------------- System prompt -------------
# Modify this to change the model's behavior for your language and framework.
SYSTEM_PRIME = {
    "role": "system",
    "content": (
        "You are a world-class expert in React SSR with Koa.js. "
        "Analyze the code for these critical issues:\n"
        "1. Memory leaks in effects/subscriptions\n"
        "2. Hydration mismatches between server/client\n"
        "3. Performance bottlenecks in SSR/data fetching\n"
        "4. Improper error handling in async flows\n"
        "5. Security vulnerabilities in SSR context\n"
        "6. State management anti-patterns\n"
        "7. Accessibility violations\n"
        "\nFor each finding provide:\n"
        "- Concise problem description\n"
        "- Severity (`low`, `medium`, `high`, or `critical`)\n"
        "- SSR-specific solution with code example\n"
        "- Performance impact analysis\n"
        "\nFormat requirements:\n"
        "- Begin each finding with [FILE: filename]\n"
        "- Mark Severity: clearly\n"
        "- Include code examples with proper SSR considerations\n"
        "- Prioritize critical/high severity issues first"
    ),
}

# --------- Helper functions ----------
def insert_badge(issue_title, severity):
    """Return a markdown heading with a badge matching the severity."""
    badge = {
        "critical": "🚨 Critical",
        "failure": "🔴 High",
        "warning": "🟠 Medium",
        "notice": "🟢 Low",
    }.get(severity, "⚪ Unknown")
    return f"### {badge} | {issue_title}"

def extract_issues_with_badges(text):
    """Re-render the model's findings with severity badges for the PR comment."""
    output = []
    current_file = None
    current_issue = ""
    current_severity = "notice"
    buffer = []
    for line in text.splitlines():
        if line.startswith("[FILE:"):
            if buffer:
                output.append("\n".join(buffer))
                buffer = []
            current_file = line.strip()
            buffer.append(current_file)
        elif "**Issue:**" in line:
            if buffer and current_issue:
                output.append("\n".join(buffer))
                buffer = [current_file] if current_file else []
            current_issue = line.split("**Issue:**")[1].strip()
        elif "**Severity:**" in line:
            severity_text = line.split("**Severity:**")[1].strip().lower()
            if "critical" in severity_text:
                current_severity = "critical"
            elif "high" in severity_text:
                current_severity = "failure"
            elif "medium" in severity_text:
                current_severity = "warning"
            else:
                current_severity = "notice"
            buffer.append(insert_badge(current_issue, current_severity))
        else:
            buffer.append(line)
    if buffer:
        output.append("\n".join(buffer))
    return "\n\n".join(output)

def prepare_annotations(text):
    """Convert findings into the annotation objects the GitHub Checks API expects."""
    annotations = []
    current_file = None
    current_severity = "warning"
    for line in text.splitlines():
        if line.startswith("[FILE:"):
            current_file = line.split("[FILE:")[1].split("]")[0].strip()
        elif "**Severity:**" in line:
            severity_text = line.split("**Severity:**")[1].strip().lower()
            if "critical" in severity_text or "high" in severity_text:
                current_severity = "failure"
            elif "medium" in severity_text:
                current_severity = "warning"
            else:
                current_severity = "notice"
        elif line.strip().startswith("**Issue:**") and current_file:
            issue_msg = re.sub(r"\*\*Issue:\*\*", "Issue:", line.strip())
            annotations.append({
                "path": current_file,
                "annotation_level": current_severity,
                "message": issue_msg,
                "start_line": 1,
                "end_line": 1,
            })
    return annotations

def run_git_diff():
    """Return the changed JS/TS files between the PR head and its merge base."""
    base_branch = os.environ.get("GITHUB_BASE_REF")
    if not base_branch:
        raise ValueError("GITHUB_BASE_REF not set")
    base_commit = subprocess.check_output(
        ["git", "merge-base", f"origin/{base_branch}", "HEAD"],
        stderr=subprocess.DEVNULL,
    ).decode().strip()
    changed = subprocess.check_output(
        ["git", "diff", "--name-only", base_commit, "HEAD"],
        stderr=subprocess.DEVNULL,
    ).decode().splitlines()
    return [f for f in changed if f.endswith((".js", ".jsx", ".ts", ".tsx")) and os.path.exists(f)]

def call_openai_for_file(path):
    """Send one file to the model and return its review text."""
    code = Path(path).read_text(encoding="utf-8", errors="ignore")[:MAX_CODE_CHARS]
    fence = "```"
    user_msg = {
        "role": "user",
        "content": f"--- FILE: {path} ---\n{fence}js\n{code}\n{fence}",
    }
    messages = [SYSTEM_PRIME, user_msg]
    prompt_text = "".join(m.get("content", "") for m in messages)
    est_tokens = len(prompt_text) // 4  # Rough chars-to-tokens estimate
    max_tok = min(MAX_RESPONSE_TOKENS, CONTEXT_WINDOW - est_tokens)
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    resp = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=messages,
        temperature=0,
        max_tokens=max_tok,
        stream=False,
    )
    return resp.choices[0].message.content.strip()

# --------- Main execution ----------
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()

    files = run_git_diff() if USE_GIT_DIFF else sum(
        (glob.glob(p, recursive=True) for p in FILES_TO_SCAN), []
    )
    print(f"Total files to scan: {len(files)}")
    os.makedirs(REPORT_DIR, exist_ok=True)
    if args.dry_run:
        return

    all_results = []
    all_annotations = []
    for batch_num, path in enumerate(files, 1):
        print(f"\n=== Scanning {path} ===")
        try:
            result = call_openai_for_file(path)
            all_results.append(result)
            all_annotations.extend(prepare_annotations(result))
            Path(f"{REPORT_DIR}/batch_{batch_num}.txt").write_text(result, encoding="utf-8")
        except Exception as e:
            print(f"Error processing {path}: {e}")

    with open(COMMENT_FILE, "w") as c:
        c.write("## Code Review Report\n\n")
        c.write("### Key Findings\n")
        c.write("This report identifies potential issues in your implementation.\n\n")
        for res in all_results:
            c.write(f"{extract_issues_with_badges(res)}\n\n")

    with open(ANNOTATION_FILE, "w") as a:
        json.dump(all_annotations, a, indent=2)

    print("\n=== SSR Code Review Complete ===")
    print(f"Report generated at: {REPORT_DIR}")

if __name__ == "__main__":
    main()
```
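Before wiring the script into CI, you can sanity-check the severity parsing locally. The sketch below is a standalone mirror of the parsing pattern in `prepare_annotations`, run against a canned model response (the file name and finding are made up, and it assumes the model follows the `[FILE:]` / `**Severity:**` format from the system prompt):

```python
SAMPLE = """[FILE: client/components/Cart.tsx]
**Severity:** High
**Issue:** Subscription is never cleaned up in useEffect
"""

def parse_findings(text):
    """Minimal mirror of prepare_annotations: map severities to annotation levels."""
    levels = {"critical": "failure", "high": "failure", "medium": "warning"}
    findings, current_file, level = [], None, "notice"
    for line in text.splitlines():
        if line.startswith("[FILE:"):
            current_file = line.split("[FILE:")[1].split("]")[0].strip()
        elif "**Severity:**" in line:
            sev = line.split("**Severity:**")[1].strip().lower()
            level = next((v for k, v in levels.items() if k in sev), "notice")
        elif line.startswith("**Issue:**") and current_file:
            findings.append({"path": current_file, "message": line, "level": level})
    return findings

print(parse_findings(SAMPLE))
```

If the model drifts from the expected format, this kind of quick check makes it obvious before annotations silently come out empty in CI.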
## 🔐 Step 2: Add OpenAI Key to GitHub Secrets
Go to your GitHub repo:
- **Settings → Secrets and variables → Actions → New repository secret**
- Name: `OPENAI_API_KEY`
- Value: your OpenAI API key
## ⚙️ Step 3: Add GitHub Actions Workflow
Create `.github/workflows/ai-code-scan.yml`:
```yaml
name: AI Code Scan

on:
  pull_request:
    paths:
      - "client/**/*.tsx"
      - "server/**/*.ts"

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0 # Full history so `git merge-base` can find the PR base

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install Dependencies
        run: pip install openai

      - name: Run AI Code Scan
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: python3 scripts/ai_ssr_scan.py

      - name: Read annotations JSON
        id: annotations
        run: |
          content=$(jq -c . .github/ai-scan-reports/annotations.json)
          echo "ANNOTATIONS_JSON=$content" >> "$GITHUB_ENV"

      - name: Report inline annotations
        uses: LouisBrunner/checks-action@v1.6.0
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          name: SSR AI Scan
          conclusion: success
          output: |
            {
              "title": "SSR AI Scan Report",
              "summary": "Annotated inline issues found by AI"
            }
          annotations: ${{ env.ANNOTATIONS_JSON }}

      - name: Comment summary on PR
        if: always()
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: .github/ai-scan-reports/comment.md
```
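One caveat with the annotation step: the GitHub Checks API accepts at most 50 annotations per request, so on large PRs a single payload can be rejected. A small helper (a sketch, not part of the script above) can split the list into postable batches:

```python
def chunk_annotations(annotations, batch_size=50):
    """Split annotations into batches of at most batch_size (the Checks API limit)."""
    return [annotations[i:i + batch_size] for i in range(0, len(annotations), batch_size)]

# 120 dummy annotations split into batches of 50, 50, and 20
batches = chunk_annotations([{"path": f"f{i}.tsx"} for i in range(120)])
print([len(b) for b in batches])  # → [50, 50, 20]
```

Each batch would then be posted as a separate update to the same check run.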
## 🧪 Optional: Fail Build or Post PR Comment
You can enhance the scan to:
- Fail build on critical output keywords (e.g., "memory leak")
- Use GitHub's REST API to post comments or annotations
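For example, a small gate script (a hypothetical `scripts/fail_on_critical.py`, run as an extra workflow step) could exit non-zero whenever the generated report contains a critical finding:

```python
import sys
from pathlib import Path

def has_critical(report_text):
    """Return True if any severity line in the report is marked critical."""
    return any(
        "critical" in line.lower()
        for line in report_text.splitlines()
        if "severity:" in line.lower()
    )

if __name__ == "__main__":
    report = Path(".github/ai-scan-reports/comment.md")
    if report.exists() and has_critical(report.read_text(encoding="utf-8")):
        print("Critical issues found; failing the build.")
        sys.exit(1)  # A non-zero exit fails the GitHub Actions job
```

Keyword matching on model output is brittle, so treat this as a convenience gate rather than a hard quality guarantee.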
## 📊 Example Output
This report identifies potential issues in your SSR implementation with React and Koa.js.
🔴 [FILE: src/PlanComparisonTable.tsx]
**Issue:** Memory Leaks in Effects
**Severity:** high
**Impact:** Potential memory leaks due to missing cleanup in useEffect
**Fix:**

```jsx
useEffect(() => {
  const storedPlan = sessionStorage.getItem('selectedPlan');
  if (storedPlan !== null) {
    setSelectedPlan(parseInt(storedPlan, 10));
  } else {
    fetchCurrentPlan().then((plan) => setSelectedPlan(plan));
  }
  // Cleanup function to prevent memory leaks
  return () => {
    setSelectedPlan(null);
  };
}, [data]);
```
## ✅ Benefits Over Traditional Linters or SonarQube
| Feature | ESLint/SonarQube | OpenAI Scan |
|---|---|---|
| SSR-specific insights | ❌ | ✅ |
| Memory leak detection | ❌ | ✅ (via prompt) |
| Natural language summary | ❌ | ✅ |
| Fix suggestions | 🚫 | ✅ |
## 💡 Final Thoughts
This OpenAI-powered setup turns every PR into a mini code review assistant.
You'll catch bugs earlier, speed up reviews, and elevate code quality effortlessly.