DEV Community

Mohammad Awwaad

Building a Local, Zero-Cost AI Pull Request Reviewer with LangGraph and Ollama

Agentic Software Engineering

Building an Autonomous Pull Request Reviewer

Enterprise software engineering is at an inflection point. Everyone wants "AI Agents", but corporate security policies heavily restrict sending proprietary source code to external APIs like OpenAI.

If you want to build Agentic workflows in the enterprise, you have to build them locally.

In this tutorial, we will construct a robust, autonomous Pull Request (Merge Request) Reviewer on macOS. We will use a local Dockerized GitLab, local LLM inference via Ollama (Qwen 2.5 Coder), and deterministic orchestration via LangGraph.

Here is an architect's step-by-step guide to building this exact stack from scratch, including the traps and failures you'll hit along the way.


Step 1: The Inference Engine (Ollama)

First, we need our local AI. We will use Ollama to run qwen2.5-coder:7b, which is specifically tuned for codebase analysis.

# Install Ollama on macOS
brew install --cask ollama

# Start the Ollama daemon (or open the app)
# Then pull and test the model:
ollama run qwen2.5-coder:7b
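Before moving on, it's worth confirming the daemon is actually serving. Ollama exposes a local HTTP API on port 11434; a quick sketch that checks whether our model is pulled, assuming Ollama's default port and its `/api/tags` endpoint (the `model_available` helper is ours, not part of any SDK):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local API port

def model_available(tags: dict, name: str) -> bool:
    """Check the JSON from GET /api/tags for a model whose name starts with `name`."""
    return any(m.get("name", "").startswith(name) for m in tags.get("models", []))

if __name__ == "__main__":
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=3) as resp:
            print(model_available(json.load(resp), "qwen2.5-coder"))
    except OSError:
        print("Ollama daemon is not reachable on :11434")
```

If this prints False, re-run `ollama run qwen2.5-coder:7b` to pull the model.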

Step 2: The Local GitLab Infrastructure

We need a safe sandbox to test our Agent without risking live or production repositories. We will deploy GitLab Community Edition right on our Mac using Docker.

1. Prepare the Host Directories:
Docker needs strict folder mapping to persist Git repository databases.

export GITLAB_HOME=$HOME/gitlab
mkdir -p "$GITLAB_HOME"/config "$GITLAB_HOME"/logs "$GITLAB_HOME"/data

2. The Docker Compose File:
Create docker-compose.yml and run docker-compose up -d:

version: '3.6'
services:
  web:
    image: 'gitlab/gitlab-ce:latest'
    restart: always
    hostname: 'localhost'
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'http://localhost'
    ports:
      - '80:80'
    volumes:
      - '$GITLAB_HOME/config:/etc/gitlab'
      - '$GITLAB_HOME/logs:/var/log/gitlab'
      - '$GITLAB_HOME/data:/var/opt/gitlab'
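GitLab takes several minutes to initialize after `docker-compose up -d`. Rather than refreshing the browser, you can probe its built-in health endpoint; this sketch assumes GitLab's `/-/health` probe, which is unauthenticated and allowed from localhost by default:

```python
import urllib.request

def gitlab_ready(base_url: str = "http://localhost") -> bool:
    # GitLab exposes an unauthenticated /-/health probe; any error means "not up yet"
    try:
        with urllib.request.urlopen(f"{base_url}/-/health", timeout=3) as resp:
            return resp.status == 200
    except OSError:
        return False

if __name__ == "__main__":
    print("GitLab is up!" if gitlab_ready() else "GitLab is still booting...")
```

Wrap the call in a loop with a `time.sleep(15)` if you want to block until boot completes.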

3. Authenticating & Generating a Token:
Once GitLab boots up (it takes a few minutes), you need the default admin password to log into http://localhost.

# Retrieve the auto-generated root password
docker exec -it <your-gitlab-container-name> grep 'Password:' /etc/gitlab/initial_root_password

Log in as root. Navigate to Edit Profile -> Access Tokens. Create a new Personal Access Token with the api scope. Save this token.
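You can sanity-check the token before wiring it into the agent. GitLab's REST API reads a personal access token from the `PRIVATE-TOKEN` header, and `GET /user` echoes back the authenticated account; the `whoami_request` helper below is our own illustration, built on the stdlib only:

```python
import json
import urllib.request

GITLAB_API_URL = "http://localhost/api/v4"

def whoami_request(token: str) -> urllib.request.Request:
    # GitLab reads the personal access token from the PRIVATE-TOKEN header
    return urllib.request.Request(
        f"{GITLAB_API_URL}/user", headers={"PRIVATE-TOKEN": token}
    )

if __name__ == "__main__":
    try:
        with urllib.request.urlopen(whoami_request("glpat-YOUR_TOKEN_HERE"), timeout=5) as resp:
            print(json.load(resp).get("username"))  # expect "root"
    except OSError as exc:
        print(f"GitLab not reachable or token rejected: {exc}")
```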


Step 3: The macOS Python Constraint Trap

The engine of our agent relies on Python.

🚨 The Creation Trap: You run python3 -m venv .venv, activate it, and confidently try to install the newest MCP and LangChain SDKs. Your terminal throws errors like ERROR: No matching distribution found...

The Fix: macOS aggressively binds the python3 command to Apple's default 3.9 installation. Modern AI packages (like mcp) enforce a strict requirement of Python >= 3.10. To fix this permanently on a Mac:

# Bypass Apple's Python completely
brew install python@3.12

# Explicitly command the Homebrew binary to build the environment
/opt/homebrew/bin/python3.12 -m venv .venv

# Activate and install
source .venv/bin/activate
pip install langchain langgraph langchain-community langchain-ollama mcp fastapi uvicorn python-dotenv requests

Create a .env file in your project root to securely store your token so we don't leak it in source code:

GITLAB_API_URL="http://localhost/api/v4"
GITLAB_PERSONAL_ACCESS_TOKEN="glpat-YOUR_TOKEN_HERE"

Step 4: The Model Context Protocol (MCP) Illusion

We need our Python agent to fetch code from GitLab. The modern approach is using the Model Context Protocol (MCP) standard server.

🚨 The Open-Source Trap: You try to use the official @modelcontextprotocol/server-gitlab bridge to fetch PR changes. But when you execute it, your terminal crashes with McpError: Unknown tool.

Why? Because the official Open-Source MCP standards are still being built! Currently, the open-source GitLab MCP Server supports creating branches and issues, but MR Diff reading and Note posting are not natively implemented yet (see GitLab Issue #561564).

The REST Fallback (mcp_client.py):
Do not let open-source limitations block your POC! We write a custom Python fallback using the requests library to fetch the /merge_requests/X/changes REST endpoint directly. This abstracts the data layer cleanly.

# mcp_client.py
import os
import requests
from dotenv import load_dotenv

load_dotenv()
GITLAB_API_URL = os.environ.get("GITLAB_API_URL", "http://localhost/api/v4")
GITLAB_PERSONAL_ACCESS_TOKEN = os.environ.get("GITLAB_PERSONAL_ACCESS_TOKEN", "")

def get_merge_request_diff(project_id: str, merge_request_iid: str) -> str:
    print(f"--> Fetching diff for Project {project_id}, MR #{merge_request_iid}")
    url = f"{GITLAB_API_URL}/projects/{project_id}/merge_requests/{merge_request_iid}/changes"
    headers = {"PRIVATE-TOKEN": GITLAB_PERSONAL_ACCESS_TOKEN}

    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    changes = response.json().get("changes", [])

    diff_text = ""
    for change in changes:
        diff_text += f"\n--- a/{change.get('old_path')} \n+++ b/{change.get('new_path')}\n"
        diff_text += change.get("diff", "")
    return diff_text

def create_merge_request_note(project_id: str, merge_request_iid: str, body: str) -> str:
    print(f"--> Posting review to Project {project_id}, MR #{merge_request_iid}")
    url = f"{GITLAB_API_URL}/projects/{project_id}/merge_requests/{merge_request_iid}/notes"
    headers = {"PRIVATE-TOKEN": GITLAB_PERSONAL_ACCESS_TOKEN}

    response = requests.post(url, headers=headers, data={"body": body}, timeout=30)
    response.raise_for_status()
    return "Successfully posted note."
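To see what `get_merge_request_diff` actually stitches together, here is a hypothetical, abridged `/changes` response (the file paths and diff hunk are invented for illustration) and a standalone copy of the assembly loop:

```python
# Hypothetical, abridged shape of GET .../merge_requests/<iid>/changes
sample_response = {
    "changes": [
        {
            "old_path": "app/routes.py",
            "new_path": "app/routes.py",
            "diff": "@@ -1,3 +1,4 @@\n+raise Exception('boom')\n",
        }
    ]
}

def assemble_diff(changes: list) -> str:
    # Mirrors the loop in mcp_client.get_merge_request_diff
    diff_text = ""
    for change in changes:
        diff_text += f"\n--- a/{change.get('old_path')} \n+++ b/{change.get('new_path')}\n"
        diff_text += change.get("diff", "")
    return diff_text

print(assemble_diff(sample_response["changes"]))
```

The result is one unified-diff-style string per MR, which is exactly the shape an LLM reviews well.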

Step 5: The Deterministic Graph (agent.py)

We avoid probabilistic "Agent" frameworks like CrewAI and instead use the rigid, state-machine determinism of LangGraph.

  • The Shared Memory: We define an AgentState.
  • The Model: We bind qwen2.5-coder:7b into the graph at temperature 0.1. For a PR reviewer, creativity is a liability: high temperatures invite hallucinations, while 0.1 keeps the output strictly analytical.

# agent.py
from typing import TypedDict, Optional
from langchain_ollama import ChatOllama
from langgraph.graph import StateGraph, END
import mcp_client

# Initialize LLM
llm = ChatOllama(model="qwen2.5-coder:7b", temperature=0.1)

class AgentState(TypedDict):
    project_id: str
    mr_id: str
    code_diff: Optional[str]
    review_comment: Optional[str]
    error: Optional[str]

def fetch_code(state: AgentState) -> AgentState:
    try:
        diff = mcp_client.get_merge_request_diff(state["project_id"], str(state["mr_id"]))
        return {"code_diff": diff}
    except Exception as e:
        return {"error": f"Failed to fetch diff: {e}"}

def review_code(state: AgentState) -> AgentState:
    if state.get("error"): return {}

    prompt = f"""You are a Lead AI Architect reviewing a GitLab Pull Request.
Perform a strict logical and security review of this code diff:
{state.get('code_diff', '')}
"""
    try:
        response = llm.invoke(prompt)
        return {"review_comment": response.content}
    except Exception as e:
        return {"error": f"LLM Review failed: {e}"}

def post_review(state: AgentState) -> AgentState:
    if state.get("error") or not state.get("review_comment"): return {}

    try:
        mcp_client.create_merge_request_note(state["project_id"], str(state["mr_id"]), state["review_comment"])
        print("✅ Review successfully posted!")
        return {}
    except Exception as e:
        return {"error": str(e)}

# Compile the LangGraph
workflow = StateGraph(AgentState)
workflow.add_node("Fetch", fetch_code)
workflow.add_node("Review", review_code)
workflow.add_node("Comment", post_review)

workflow.set_entry_point("Fetch")
workflow.add_edge("Fetch", "Review")
workflow.add_edge("Review", "Comment")
workflow.add_edge("Comment", END)

app = workflow.compile()
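Notice that each node returns only the keys it changed, and short-circuits with `{}` when a previous node recorded an error. LangGraph merges each partial dict into the shared AgentState (last write wins per key). A plain-Python sketch of that merge semantics, with no LangGraph required — `apply_node_update` is our own illustrative helper, not a LangGraph API:

```python
def apply_node_update(state: dict, update: dict) -> dict:
    # LangGraph merges the partial dict a node returns into the shared state;
    # keys the node didn't mention are left untouched.
    merged = dict(state)
    merged.update(update)
    return merged

state = {"project_id": "1", "mr_id": "2", "code_diff": None, "error": None}
state = apply_node_update(state, {"code_diff": "--- a/app.py ..."})  # Fetch succeeded
state = apply_node_update(state, {})                                 # Review short-circuited
print(state)
```

You can also test-drive the compiled graph directly with `app.invoke({"project_id": "1", "mr_id": "1"})` before any webhook exists.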

Step 6: The Ghost Thread Automation (webhook_server.py)

To make it fully autonomous, GitLab needs to trigger the agent automatically via Webhooks.

🚨 The Webhook Timeout Trap:
You open a PR. GitLab pings the FastAPI server. The LLM takes 30 seconds to generate a review. GitLab registers a "Connection Failed" timeout error and aborts the webhook.

The Solution: Use FastAPI's BackgroundTasks. We instantly reply "200 OK" to GitLab so the webhook succeeds, and hand the heavy LLM lifting to a ghost thread.

# webhook_server.py
from fastapi import FastAPI, Request, BackgroundTasks
import uvicorn
from agent import app as ai_agent

app = FastAPI()

def execute_pr_review(project_id: str, mr_iid: str):
    print(f"\n[Background] Executing AI Review for Project: {project_id}, MR: {mr_iid}")
    ai_agent.invoke({"project_id": str(project_id), "mr_id": str(mr_iid)})

@app.post("/webhook")
async def gitlab_webhook(request: Request, background_tasks: BackgroundTasks):
    payload = await request.json()

    if payload.get("object_kind") == "merge_request":
        attributes = payload.get("object_attributes", {})
        if attributes.get("action") in ["open", "update"]:
            project_id = payload.get("project", {}).get("id")
            mr_iid = attributes.get("iid")

            # Start the AI in a ghost thread!
            background_tasks.add_task(execute_pr_review, project_id, mr_iid)
            return {"status": "success", "message": "AI Review started in background."}

    return {"status": "ignored"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
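The filtering logic inside the webhook handler is easy to verify in isolation. Here is a hypothetical, abridged merge_request payload (real GitLab payloads carry many more fields) and a standalone copy of the extraction logic — `extract_review_target` is our own helper for illustration:

```python
# Hypothetical, abridged merge_request webhook payload
sample_payload = {
    "object_kind": "merge_request",
    "project": {"id": 1},
    "object_attributes": {"iid": 3, "action": "open"},
}

def extract_review_target(payload: dict):
    """Return (project_id, mr_iid) when this payload should trigger a review, else None."""
    if payload.get("object_kind") != "merge_request":
        return None
    attributes = payload.get("object_attributes", {})
    if attributes.get("action") not in ("open", "update"):
        return None
    return payload.get("project", {}).get("id"), attributes.get("iid")

print(extract_review_target(sample_payload))  # → (1, 3)
```

Push events, MR closes, and merges all fall through to `None`, matching the server's `{"status": "ignored"}` branch.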

Final GitLab Config: Run python webhook_server.py. In GitLab, navigate to your project's Settings > Webhooks. Set the URL to http://host.docker.internal:8000/webhook and uncheck "Enable SSL verification". Finally, in the Admin Area under Settings > Network > Outbound requests, enable "Allow requests to the local network from webhooks and integrations" — otherwise GitLab refuses to call a local URL.


Step 7: The Grand Finale (Testing the AI)

It is time to see your architecture in action:

  1. In your local GitLab, create a new branch in your sandbox repository.
  2. Open a source code file (like a typical Controller or Route) and intentionally write a terrible bug, such as throwing a raw Exception out of nowhere.
  3. Commit the change and instantly open a Merge Request.
  4. Switch to your terminal running the FastAPI server. You will immediately see it print: [Background] Executing AI Review for Project: 1, MR: X.
  5. Wait roughly 30 seconds for Ollama to process the code, then refresh your GitLab Merge Request page in the browser.

You will see Qwen 2.5 has autonomously posted a professional markdown comment catching your flaw, evaluating the risk, and providing a clean solution!

Conclusion

You have successfully bypassed Python limitations, navigated immature open-source standards, and outsmarted webhook timeouts. You now possess a locally hosted, entirely private AI PR Reviewer!
