Air-gapped code review with Ollama: when the diff never leaves the machine

#privacy #llm #selfhosted #go

The previous post was about scanning your diff for secrets before it leaves your machine. This one is about not letting it leave at all. Every API provider CommitBrief supports (Anthropic, OpenAI, Gemini, and the rest) sends your code to someone else's server for review. Point it at Ollama and the diff goes to a process on localhost instead. For code that is under an NDA, lives in a regulated shop, or that you'd simply rather not hand to a vendor, that's the difference between "a scanner guards the upload" and "there is no upload."

TL;DR

- `commitbrief --provider ollama` reviews your diff against a model running on your own machine. Zero third-party egress.
- No API key, no per-token cost: the pricing table is literally empty for Ollama.
- It's not a special "offline mode." It's the same `Provider` interface as every other backend, pointed at `http://localhost:11434`.
- `OLLAMA_HOST` repoints it at your own GPU box on the LAN, still off the public internet.
- **The limit.** A local model is a real second pass, not a frontier-model one. You're trading some review quality for maximum privacy, and the eval harness measures exactly how much.

The egress question

The only question that matters for a privacy-constrained team is: where does my diff go? CommitBrief's answer depends entirely on the provider you picked, and it's worth being precise rather than reassuring:

| Provider class | Where your diff goes |
| --- | --- |
| Anthropic / OpenAI / Gemini / DeepSeek / Mistral / Cohere | That vendor's HTTPS API |
| `claude-cli` / `gemini-cli` / `codex-cli` | Through the host CLI, still that vendor's backend |
| `ollama` | An Ollama server you run, `localhost` by default |

With an API provider, the secret scanner from the previous post is a guard on an upload that still happens. With Ollama, there's no upload to guard. That's the whole pitch, and it's an honest one only because it's true at the mechanism level, not the marketing level.

It isn't a mode — it's just a localhost endpoint

There's no "offline switch" in CommitBrief. Air-gapped review falls out of the provider abstraction: Ollama is one more implementation of the same Provider interface, and the only thing that makes it private is the URL it talks to. Notice what's missing from its constructor — there's no API key:

```go
const DefaultBaseURL = "http://localhost:11434"

func New(cfg config.ProviderConfig) (provider.Provider, error) {
    baseURL := cfg.BaseURL
    if baseURL == "" {
        baseURL = DefaultBaseURL
    }
    return &Client{
        baseURL: strings.TrimRight(baseURL, "/"),
        model:   cfg.Model,
        http:    &http.Client{Timeout: requestTimeout},
    }, nil
}
```

Compare that to the API providers, which reject an empty key outright. Ollama doesn't authenticate to anyone, because there's no one to authenticate to. The request is an HTTP POST to a port on your own machine.

Free, and the cost preflight knows it

Every paid provider has a per-model price table that feeds the pre-send cost estimate. Ollama's is a single line:

```go
func pricingFor(_ string) provider.Pricing {
    return provider.Pricing{} // every model, zero cost
}
```
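For a sense of why a zero-valued price table makes the estimate vanish, here is a rough sketch of how a pre-send estimate could be computed. The `Pricing` fields, the `estimateUSD` helper, and the dollar figures are hypothetical; the real `provider.Pricing` struct isn't shown in this post.

```go
package main

import "fmt"

// Pricing is a hypothetical stand-in for provider.Pricing,
// expressed as USD per million tokens.
type Pricing struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// estimateUSD projects the cost of one call from token counts.
// A zero-valued Pricing always yields 0.
func estimateUSD(p Pricing, inTokens, outTokens int) float64 {
	return float64(inTokens)/1e6*p.InputPerMTok +
		float64(outTokens)/1e6*p.OutputPerMTok
}

func main() {
	paid := Pricing{InputPerMTok: 3, OutputPerMTok: 15} // illustrative numbers only
	local := Pricing{}                                  // what a zero-cost table amounts to

	fmt.Printf("paid:  $%.4f\n", estimateUSD(paid, 20000, 1000))
	fmt.Printf("local: $%.4f\n", estimateUSD(local, 20000, 1000))
}
```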

So the cost preflight has nothing to warn about and waves the call through. But "free" doesn't mean "untracked": Ollama returns real token counts (prompt_eval_count, eval_count), so --verbose still shows you the input/output token footer — useful for spotting a diff that's about to blow past a local model's context window. And the SHA-256 cache works exactly as it does for any other provider, so re-running a review on an unchanged diff is a disk read, not a re-inference.
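The "disk read, not a re-inference" behaviour depends on a key that changes whenever anything affecting the review changes. Below is a minimal sketch of one way to build such a key with SHA-256. Which inputs CommitBrief actually hashes, and in what order, is an assumption here, and `cacheKey` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey hashes every input that can change the review's outcome.
// Each part is length-prefixed so ("ab", "c") and ("a", "bc") cannot collide.
func cacheKey(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		fmt.Fprintf(h, "%d:", len(p))
		h.Write([]byte(p))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	diff := "+ return nil\n"
	rules := "Flag ignored errors.\n"

	k1 := cacheKey("ollama", "qwen2.5-coder:14b", rules, diff)
	k2 := cacheKey("ollama", "qwen2.5-coder:14b", rules, diff)
	k3 := cacheKey("ollama", "qwen2.5-coder:14b", rules+"Prefer early returns.\n", diff)

	fmt.Println(k1 == k2) // true: same inputs, same key, so a cache hit
	fmt.Println(k1 == k3) // false: the rules changed, so a cache miss
}
```

Including the provider and model in the key means a cached Ollama review is never served in place of a review from a different backend.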

Air-gapped doesn't mean underpowered

"Local" doesn't have to mean "on the laptop running the review." OLLAMA_HOST repoints the client at any Ollama server you control:

```shell
export OLLAMA_HOST=http://gpu.lan:11434
commitbrief --provider ollama --staged
```

Now the 14B model runs on the GPU box in the corner of your office and the review still never touches the public internet. The default model is qwen2.5-coder:14b with a 32K-token context window; a 7B variant is wired in for leaner hardware. The trust boundary is your network, not 127.0.0.1.
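One plausible way to wire that override in is a small resolution function. This is a sketch under assumptions: the precedence (explicit config, then `OLLAMA_HOST`, then the default) and the bare `host:port` convenience are guesses, not something the constructor shown earlier confirms, and `resolveBaseURL` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

const defaultBaseURL = "http://localhost:11434"

// resolveBaseURL picks the first non-empty value among the configured URL,
// the OLLAMA_HOST value, and the localhost default. The precedence order is
// an assumption made for this sketch.
func resolveBaseURL(configured, envHost string) string {
	base := configured
	if base == "" {
		base = envHost
	}
	if base == "" {
		base = defaultBaseURL
	}
	// Accept a bare "host:port" by assuming plain HTTP.
	if !strings.Contains(base, "://") {
		base = "http://" + base
	}
	return strings.TrimRight(base, "/")
}

func main() {
	// With OLLAMA_HOST unset this prints the localhost default.
	fmt.Println(resolveBaseURL("", os.Getenv("OLLAMA_HOST")))
}
```

Whatever the real precedence is, it's worth printing the resolved URL once in `--verbose` output: on a privacy-constrained team, "where did that request go?" should never need guessing.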

The rest of the pipeline doesn't change

Switching to Ollama changes the endpoint and nothing else. The secret scanner still runs — defense-in-depth, and it's already in place for the day you point this repo at an API provider. Your COMMITBRIEF.md rules still shape the review. The model is still asked for structured findings (format: "json"), still parsed into the same contract, still degrades to Markdown if the local model returns something malformed. And --fail-on=high still gates a commit hook or CI the same way:

```shell
commitbrief setup                          # pick ollama; no key to paste
commitbrief providers use ollama --local   # make it the default for this repo
commitbrief --staged                        # review, fully local
```
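The parse-then-degrade step and the severity gate described above might look roughly like the sketch below. The `Finding` and `Review` shapes, the severity names other than `high`, and both function names are hypothetical; CommitBrief's real findings contract isn't shown in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding and Review are hypothetical shapes for the structured contract.
type Finding struct {
	Severity string `json:"severity"`
	File     string `json:"file"`
	Message  string `json:"message"`
}

type Review struct {
	Findings []Finding `json:"findings"`
	Markdown string    `json:"-"` // set only when structured parsing failed
}

// parseReview tries the structured contract first. If the model returned
// something that isn't valid JSON of that shape, the raw text is kept as
// Markdown so the user still sees a review.
func parseReview(raw string) Review {
	var r Review
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return Review{Markdown: raw}
	}
	return r
}

// rank orders severities; every name except "high" is an assumption.
var rank = map[string]int{"low": 1, "medium": 2, "high": 3, "critical": 4}

// shouldFail reports whether any finding is at or above the threshold.
func shouldFail(r Review, threshold string) bool {
	floor, ok := rank[threshold]
	if !ok {
		return false
	}
	for _, f := range r.Findings {
		if rank[f.Severity] >= floor {
			return true
		}
	}
	return false
}

func main() {
	good := parseReview(`{"findings":[{"severity":"high","file":"auth.go","message":"token logged"}]}`)
	bad := parseReview("## Review\nLooks fine to me.")

	fmt.Println(len(good.Findings), shouldFail(good, "high")) // 1 true
	fmt.Println(bad.Markdown != "", shouldFail(bad, "high"))  // true false
}
```

One design consequence worth noticing: in this sketch a degraded Markdown review has no structured findings, so it can never trip the gate. Whether a malformed response should pass or fail a commit hook is a policy decision, and this sketch simply picks "pass."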

One provider swap, and a tool that talked to a vendor now talks to nothing but your own hardware.

What it is not

Here's the trade, stated plainly: a qwen2.5-coder review is a real second pass and it beats no review, but it is not a Claude or GPT review. It will miss subtler findings and surface more false positives than a frontier model. CommitBrief doesn't pretend otherwise — the eval harness scores each model against a known-answer corpus so the gap is a number you can read, not a vibe:

```shell
COMMITBRIEF_EVAL_PROVIDER=ollama make eval-live
```
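To show what "a number you can read" could mean, here is a minimal sketch of scoring one model run against a known-answer case using precision and recall. This is an assumption about the kind of metric involved, not a description of the actual harness; the finding IDs and the `score` function are hypothetical.

```go
package main

import "fmt"

// score compares the finding IDs a model reported against the expected set.
// Precision: what share of reported findings were real.
// Recall: what share of real findings were reported.
func score(expected, reported []string) (precision, recall float64) {
	want := make(map[string]bool, len(expected))
	for _, id := range expected {
		want[id] = true
	}
	seen := make(map[string]bool, len(reported))
	hits := 0
	for _, id := range reported {
		if want[id] && !seen[id] {
			hits++ // count each expected finding at most once
		}
		seen[id] = true
	}
	if len(seen) > 0 {
		precision = float64(hits) / float64(len(seen))
	}
	if len(want) > 0 {
		recall = float64(hits) / float64(len(want))
	}
	return precision, recall
}

func main() {
	expected := []string{"sql-injection", "ignored-error", "data-race"}
	reported := []string{"sql-injection", "unused-variable"}

	p, r := score(expected, reported)
	fmt.Printf("precision=%.2f recall=%.2f\n", p, r) // precision=0.50 recall=0.33
}
```

Read against the trade described above, a local model "missing subtler findings" shows up as lower recall, and "more false positives" shows up as lower precision.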

For a repo you can't legally send to a third party, that trade is obvious: a local model's findings are the only findings you can have, and they're worth far more than none. For a throwaway script with no privacy constraint, a frontier API model will catch more. Picking the provider is picking that trade deliberately — which is the entire reason CommitBrief lets you pick at all.

Repo: github.com/CommitBrief/commitbrief.

Part 5 of **Building CommitBrief**. Next: the content-addressed cache, covering why re-running a review is a disk read and how editing one line of COMMITBRIEF.md invalidates exactly the right entries and nothing else.