
Deny Herianto

Building an AI Code Reviewer for GitLab CI with Google Gemini

This is a submission for the Built with Google Gemini: Writing Challenge

What I Built with Google Gemini

Niteni, Javanese for "to observe carefully", is an AI-powered code review tool for GitLab CI pipelines, powered by the Gemini REST API.

GitLab's Free tier doesn't have a built-in AI review feature. I wanted something that would run inside a standard CI job, post inline diff comments (not a wall of text), and provide one-click "Apply suggestion" buttons, all without pulling in any npm runtime dependencies.

That last constraint was deliberate. CI environments are ephemeral. Every npm install is wasted time and a potential failure point. So Niteni uses only Node.js built-ins: https, fs, path, os, and url. Gemini does the heavy lifting via a direct REST call.

How Gemini fits in

Niteni sends the full MR diff to Gemini and asks it to return a structured list of findings, each with a severity level (CRITICAL, HIGH, MEDIUM, LOW), file path, line number, description, and optional suggestion. Those findings get posted as inline GitLab discussion comments with suggestion blocks the reviewer can apply in one click.

async review(diffContent: string): Promise<ReviewResult> {
  const apiResult = await this.reviewWithAPI(diffContent);
  if (apiResult && this.isValidStructuredReview(apiResult)) {
    return apiResult;
  }
  throw new Error('Review failed: empty or malformed structured response.');
}
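On the GitLab side, each finding becomes a discussion note whose body uses GitLab's suggestion-block syntax, which is what renders the one-click "Apply suggestion" button. A minimal sketch (the formatFinding helper is hypothetical, not Niteni's actual code):

```typescript
// Build a GitLab comment body; a "suggestion" fenced block inside a diff
// note is rendered by GitLab with an "Apply suggestion" button.
function formatFinding(severity: string, description: string, suggestion?: string): string {
  let body = `**[${severity}]** ${description}`;
  if (suggestion) {
    body += "\n\n```suggestion\n" + suggestion + "\n```";
  }
  return body;
}
```

Posting that body via the MR discussions endpoint with a diff position is what anchors it inline rather than as a top-level comment.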

A direct HTTP call to generativelanguage.googleapis.com, no CLI, no extensions, no sandbox issues. Just an API key and a network connection.
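That call can be sketched as follows. The model name and prompt wording are placeholders, not Niteni's actual values; only the host and the generateContent endpoint shape follow the public REST API:

```typescript
// Sketch: assemble the options and body for a direct POST to the Gemini
// REST API. MODEL is a placeholder; Niteni's real prompt and model differ.
const MODEL = "gemini-1.5-flash";

function buildGeminiRequest(diff: string, apiKey: string) {
  return {
    hostname: "generativelanguage.googleapis.com",
    path: `/v1beta/models/${MODEL}:generateContent?key=${apiKey}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{ parts: [{ text: `Review this merge request diff:\n${diff}` }] }],
      generationConfig: { temperature: 0.2, responseMimeType: "application/json" },
    }),
  };
}
// Hand hostname/path/method/headers to https.request() and write `body`.
```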


Demo

The tool runs as a GitLab CI job. After a push, it posts inline comments like this:

(Screenshot: a sample Niteni review comment posted inline on a merge request)

You can try it yourself: github.com/denyherianto/niteni


What I Learned

1. Gemini CLI ≠ CI-friendly

My first approach used a cascading strategy: Gemini CLI /code-review extension → CLI prompt → REST API. The CLI approaches failed every time in CI because Gemini CLI restricts its toolset in non-interactive (non-TTY) mode; the /code-review extension can't even run git diff. No Docker image change fixes this. The simplest solution turned out to be the best: just call the API directly.

2. Structured output beats regex parsing

Early versions asked Gemini for markdown and parsed it with regex. This was fragile: optional brackets, format drift between model versions, and lastIndex state bugs in exec() loops:

// Handles both: **[CRITICAL]** and **CRITICAL**
const findingRegex = /\*\*\[?(CRITICAL|HIGH|MEDIUM|LOW)\]?\*\*\s*`([^`]+)`/g;

Migrating to Gemini's structured output (responseMimeType: "application/json" + responseSchema) eliminated the entire parsing layer. Findings come back as typed JSON objects. No regex, no format drift, no silent data loss.
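A sketch of the generationConfig that makes this work. The field names responseMimeType and responseSchema follow the Gemini REST API; the property names inside the schema are my reconstruction of the finding shape described above, not a copy of Niteni's actual schema:

```typescript
// Structured-output config: Gemini is constrained to return a JSON array
// of finding objects matching this schema -- no markdown, no regex.
const generationConfig = {
  responseMimeType: "application/json",
  responseSchema: {
    type: "ARRAY",
    items: {
      type: "OBJECT",
      properties: {
        severity: { type: "STRING", enum: ["CRITICAL", "HIGH", "MEDIUM", "LOW"] },
        file: { type: "STRING" },
        line: { type: "INTEGER" },
        description: { type: "STRING" },
        suggestion: { type: "STRING" },
      },
      required: ["severity", "file", "line", "description"],
    },
  },
};
```

With this in place, JSON.parse on the response text yields objects you can type directly in TypeScript.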

3. GitLab CI variable "pass-through" is a trap

This looks innocent:

variables:
  GEMINI_API_KEY: $GEMINI_API_KEY  # ← circular reference

GitLab expands this to the literal string $GEMINI_API_KEY instead of the secret value. The fix: don't re-declare project-level CI/CD variables. They're available in every job automatically.
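In other words, a project-level variable needs no declaration at all; the job script reads it straight from the environment. A minimal sketch (the job name and entrypoint are hypothetical):

```yaml
review:
  script:
    # GEMINI_API_KEY is a project-level CI/CD variable; it is already
    # present in the job environment -- no `variables:` block needed.
    - node niteni.js
```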

4. execFileSync over execSync for security

The original code built shell commands as strings. Branch names come from user input in CI, so a branch named "main; rm -rf /" is a shell injection waiting to happen. Switching to execFileSync('git', ['diff', '--merge-base', `origin/${targetBranch}`]) passes arguments as an array straight to the binary, with no shell interpretation.
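To see why the argument array is safe, here's a minimal demonstration using printf as a stand-in for git (the branch name is the hypothetical malicious input):

```typescript
import { execFileSync } from "child_process";

// A malicious branch name arriving from CI user input.
const targetBranch = "main; rm -rf /";

// With execFileSync the string is passed as one literal argument to the
// binary -- no shell ever parses the semicolon, so nothing else runs.
const out = execFileSync("printf", ["%s", targetBranch], { encoding: "utf8" });
// out is exactly "main; rm -rf /"
```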

5. URL-encoding every path parameter

GitLab project IDs can be namespaced paths (my-group/my-project). Without encodeURIComponent(), a branch name like feature/auth injects an extra path segment into the API URL, silently changing which resource the request targets. Every parameter (project ID, MR IID, file path, branch ref) gets encoded.
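The difference is easy to see when building a GitLab API URL by hand (the host below is a placeholder):

```typescript
const projectId = "my-group/my-project"; // namespaced project path
const branch = "feature/auth";           // branch name containing a slash

// Encoding each path parameter keeps the slash from becoming a URL path
// separator: "my-group/my-project" -> "my-group%2Fmy-project".
const url =
  `https://gitlab.example.com/api/v4/projects/${encodeURIComponent(projectId)}` +
  `/repository/branches/${encodeURIComponent(branch)}`;
```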


Google Gemini Feedback

What worked well:

  • Structured output / JSON mode is the standout feature. Once I switched from markdown + regex to responseSchema, reliability jumped dramatically. The schema enforcement means I can trust the shape of the response and write typed TypeScript around it.
  • temperature: 0.2 produces consistent, deterministic reviews. This matters in CI: you don't want findings to appear and disappear between pipeline runs on the same code.
  • 65k output token limit means Niteni can review large diffs without truncation. This was a real concern early on.
  • The REST API itself is clean and well-documented. Direct HTTP calls from Node's https module work without friction.

Where I hit friction:

  • Error messages from the API are sometimes opaque. A JSON parse failure mid-response (unterminated string at character 1326) gave no signal about why the output was truncated. More structured error payloads would make debugging easier.
  • Rate limits during testing are easy to hit when you're running the tool repeatedly against the same MR. Clearer rate limit headers in responses would let the client back off gracefully instead of just failing.
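Until then, a client-side workaround is a small retry wrapper. This is a generic sketch (not Niteni's code) that backs off exponentially when a call fails with HTTP 429:

```typescript
// Retry an async call with exponential backoff on 429 (rate limit) errors.
// Any other error, or exhausting the retry budget, is rethrown as-is.
async function withBackoff<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (attempt >= retries || err?.status !== 429) throw err;
      // baseDelayMs, 2x, 4x, ... between successive attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```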

Overall, Gemini's structured output capability was the key unlock for making Niteni reliable enough to trust in automated CI pipelines. The shift from "parse whatever the LLM returns" to "enforce a schema and get typed objects" is something I'd apply to any future LLM integration.


Code: github.com/denyherianto/niteni
