Exposing a CLI as an MCP tool in standard-library Go

#go #ai #mcp #opensource

commitbrief mcp turns the review pipeline into a Model Context Protocol server, so an agent can run a code review as a tool call — typically a self-check before it submits the code it just wrote. Adding MCP support usually means pulling in an SDK. CommitBrief's server is encoding/json plus bufio, two files, and zero new dependencies — because the surface a stdio MCP server actually needs is small enough that hand-rolling it costs less than the dependency would.

TL;DR

commitbrief mcp speaks JSON-RPC 2.0 over line-delimited stdio. The advertised protocol revision is 2024-11-05.
The server is standard-library only: encoding/json for the envelopes, bufio for framing. No MCP SDK, no new dependency to license-audit.
It exposes one tool, review, which runs the exact same pipeline as commitbrief --json and returns schema-v1 findings plus a short text summary.
The limit. It's the stdio transport only, the review still costs a real provider call, and it's the same zeroth reviewer — now agent-invokable, not smarter.

The transport is a line and a flush

The whole framing decision is in the package doc, and it's a decision not to do something:

// The transport is line-delimited JSON: each JSON-RPC message is a single
// object written on its own line and flushed [...] We intentionally do
// NOT implement the optional Content-Length header framing — the line form is
// simpler, is what the reference hosts default to over stdio, and keeps the
// reader a plain bufio.Scanner.

So the read loop is a bufio.Scanner, one message per line, with the token cap raised because a single incoming message (a tools/call carrying an inline diff, for example) can outgrow the scanner's 64 KiB default:

func (s *Server) Serve(ctx context.Context, r io.Reader, w io.Writer) error {
    scanner := bufio.NewScanner(r)
    scanner.Buffer(make([]byte, 0, 64*1024), maxMessageBytes)
    writer := bufio.NewWriter(w)

    for scanner.Scan() {
        line := scanner.Bytes()
        if len(line) == 0 {
            continue // tolerate blank separator lines between messages
        }
        resp, emit := s.dispatch(ctx, line)
        if !emit {
            continue // notification: no answer on the wire
        }
        if err := writeMessage(writer, resp); err != nil {
            return err
        }
    }
    // ...
}

maxMessageBytes is 16 MiB — enough for the largest realistic review, bounded so a runaway peer can't exhaust memory. Every response is written and flushed immediately, because stdio is interactive and an unflushed buffer would deadlock the handshake.
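The write side is not shown above, so here is a minimal sketch of what a helper like writeMessage could look like under the framing rule just described: marshal, append a newline, flush. The body is an assumption rather than CommitBrief's actual code, and frameAll is a hypothetical helper added only to make the output easy to inspect.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// writeMessage is an assumed implementation of the helper called in Serve:
// one JSON object, one trailing newline, one flush.
func writeMessage(w *bufio.Writer, msg any) error {
	// json.Marshal escapes newlines inside strings, so the encoded
	// object always fits on a single line.
	data, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	if _, err := w.Write(data); err != nil {
		return err
	}
	if err := w.WriteByte('\n'); err != nil {
		return err
	}
	// Flush per message so the peer never waits on buffered bytes.
	return w.Flush()
}

// frameAll (hypothetical) frames each message and returns the raw bytes
// that would have gone out on stdout.
func frameAll(msgs ...any) (string, error) {
	var sb strings.Builder
	w := bufio.NewWriter(&sb)
	for _, m := range msgs {
		if err := writeMessage(w, m); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}

func main() {
	out, err := frameAll(
		map[string]string{"text": "line one\nline two"},
		map[string]int{"id": 2},
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

The embedded newline in the first message is emitted as the two-character escape \n, which is why a line-oriented reader on the other side can split messages safely.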

The methods that matter

MCP over stdio needs a handful of methods, and the dispatcher is a switch:

switch req.Method {
case "initialize":
    return s.handleInitialize(req)
case "tools/list":
    return s.handleListTools(req)
case "tools/call":
    return s.handleCallTool(ctx, req)
case "ping":
    resp, _ := newResult(req.ID, struct{}{})
    return resp, !req.isNotification()
default:
    // "notifications/initialized" and any other notification: ack by silence.
    if req.isNotification() {
        return response{}, false
    }
    return newError(req.ID, codeMethodNotFound, "method not found: "+req.Method), true
}

initialize answers with the protocol version, a tools-only capabilities object, and the server identity. Notifications — anything with no id, like notifications/initialized — are processed for side effects and never answered, which the JSON-RPC spec requires. The reserved error codes (-32700 parse error, -32601 method not found, and the rest) are the spec's, used verbatim.
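To make the envelope handling concrete, here is a sketch of the types such a dispatcher might sit on. The names request, response, isNotification, newResult, newError, and codeMethodNotFound appear in the excerpt above; their field layout and bodies below are assumptions, and handle and encode are hypothetical helpers that cover only ping and the default branch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reserved JSON-RPC 2.0 error codes.
const (
	codeParseError     = -32700
	codeMethodNotFound = -32601
)

// request keeps id as raw JSON so a number or string id is echoed back
// unchanged; an absent id leaves the field empty.
type request struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id,omitempty"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params,omitempty"`
}

// isNotification reports whether the message carried no id member.
func (r request) isNotification() bool { return len(r.ID) == 0 }

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

type response struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"` // a nil id marshals as null
	Result  json.RawMessage `json:"result,omitempty"`
	Error   *rpcError       `json:"error,omitempty"`
}

func newResult(id json.RawMessage, v any) (response, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return response{}, err
	}
	return response{JSONRPC: "2.0", ID: id, Result: raw}, nil
}

func newError(id json.RawMessage, code int, msg string) response {
	return response{JSONRPC: "2.0", ID: id, Error: &rpcError{Code: code, Message: msg}}
}

// encode renders a response as one line of JSON ("" if marshaling fails).
func encode(resp response) string {
	b, err := json.Marshal(resp)
	if err != nil {
		return ""
	}
	return string(b)
}

// handle is a cut-down dispatcher: it returns the reply and whether
// anything should be written to the wire at all.
func handle(line string) (string, bool) {
	var req request
	if err := json.Unmarshal([]byte(line), &req); err != nil {
		return encode(newError(nil, codeParseError, "parse error")), true
	}
	if req.isNotification() {
		return "", false // ack by silence
	}
	if req.Method == "ping" {
		if resp, err := newResult(req.ID, struct{}{}); err == nil {
			return encode(resp), true
		}
	}
	return encode(newError(req.ID, codeMethodNotFound, "method not found: "+req.Method)), true
}

func main() {
	for _, line := range []string{
		`{"jsonrpc":"2.0","id":1,"method":"ping"}`,
		`{"jsonrpc":"2.0","method":"notifications/initialized"}`,
		`{"jsonrpc":"2.0","id":2,"method":"nope"}`,
	} {
		if out, emit := handle(line); emit {
			fmt.Println(out)
		}
	}
}
```

Run as written, this prints two lines: the ping result and the method-not-found error. The notification in between produces nothing.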

A failed review is content, not a protocol error

Here's the design choice worth copying. When the review tool fails — no staged changes, an aborted secret-scan guard, a provider timeout — that is not a JSON-RPC error. It's a successful call whose result carries an isError flag:

summary, structured, err := handler(ctx, params.Arguments)
if err != nil {
    // Tool-level failure: surface as content with IsError, not a JSON-RPC
    // error. The model sees what went wrong (e.g. "no staged changes",
    // "secret scan aborted") and can adjust instead of the call collapsing.
    errResult := callToolResult{
        Content: []contentBlock{textContent(toolErrorText(err))},
        IsError: true,
    }
    resp, mErr := newResult(req.ID, errResult)
    // ...
}

The distinction is the difference between an agent that recovers and one that stalls. A JSON-RPC protocol error tears down the call; an isError result hands the model an actionable sentence — "no staged changes" — that it can read and act on. Protocol errors stay reserved for malformed envelopes; everything the model should learn from arrives as content.
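On the wire, the difference looks roughly like this. The sketch below reuses the names callToolResult, contentBlock, and textContent from the excerpt, but the struct tags are an assumption based on the MCP tool-result shape (a content array plus an isError flag). toolFailureLine is a hypothetical helper, not part of CommitBrief.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// contentBlock and callToolResult mirror the names used in the excerpt;
// the JSON tags are assumed from the MCP tool-result shape.
type contentBlock struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

type callToolResult struct {
	Content []contentBlock `json:"content"`
	IsError bool           `json:"isError,omitempty"`
}

func textContent(s string) contentBlock {
	return contentBlock{Type: "text", Text: s}
}

// toolFailureLine (hypothetical) wraps a tool-level failure in a
// successful JSON-RPC response: the envelope has "result", not "error".
func toolFailureLine(id int, msg string) (string, error) {
	env := struct {
		JSONRPC string         `json:"jsonrpc"`
		ID      int            `json:"id"`
		Result  callToolResult `json:"result"`
	}{
		JSONRPC: "2.0",
		ID:      id,
		Result: callToolResult{
			Content: []contentBlock{textContent(msg)},
			IsError: true,
		},
	}
	b, err := json.Marshal(env)
	return string(b), err
}

func main() {
	line, err := toolFailureLine(3, "no staged changes")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(line)
}
```

The envelope is indistinguishable from a successful call at the JSON-RPC layer. Only the isError flag inside the result tells the host to treat the text as a failure the model should read.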

The tool is the pipeline — not a copy of it

The temptation when wiring a second entry point into a tool is to reimplement a leaner version. CommitBrief doesn't: the MCP handler drives the same runReview function the terminal uses. The comment is explicit about the seam:

// Everything downstream — diff fetch, three-layer filtering, the pre-send
// guard + secret scan, token/cost preflight, cache, the provider call, the
// flaky pre-pass, and signal control — runs exactly as it does for a terminal
// review. No pipeline is duplicated.

It gets there by forcing the machine-output flags and capturing the rendered document:

global = globalFlags{color: "never"}
reviewScope = reviewScopeFlags{}
global.json = true
global.quiet = true
// ...
reviewErr := runReview(cmd, scope, args.Diff)

Two consequences fall out of this reuse. First, the MCP server is a thin consumer of the same locked JSON schema v1 that external scripts consume — it re-parses the rendered output rather than reaching into pipeline internals, so the agent and a shell --json pipeline see byte-identical contracts. Second, the host is non-interactive, so if the pre-send secret scan would prompt, the call aborts and surfaces as a tool error. An agent cannot click "yes, send the secret anyway" — the safe default holds even when a model is driving.

The tool's input schema mirrors the CLI flags — staged, unstaged, diff, provider, model, fail_on, min_severity, no_flaky — with additionalProperties: false, decoded with DisallowUnknownFields() so a host that sends failon instead of fail_on gets a clear error rather than a silent no-op.
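A minimal sketch of that strict decoding, assuming a flat argument struct: the field names follow the list above, but the Go types and the reviewArgs and decodeArgs names are illustrative guesses, not the repository's definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// reviewArgs is a hypothetical argument struct; the JSON keys come from
// the schema described above, the Go types are assumed.
type reviewArgs struct {
	Staged      bool   `json:"staged,omitempty"`
	Unstaged    bool   `json:"unstaged,omitempty"`
	Diff        string `json:"diff,omitempty"`
	Provider    string `json:"provider,omitempty"`
	Model       string `json:"model,omitempty"`
	FailOn      string `json:"fail_on,omitempty"`
	MinSeverity string `json:"min_severity,omitempty"`
	NoFlaky     bool   `json:"no_flaky,omitempty"`
}

// decodeArgs rejects any key that is not a field of reviewArgs instead
// of silently dropping it.
func decodeArgs(raw string) (reviewArgs, error) {
	var args reviewArgs
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&args); err != nil {
		return reviewArgs{}, err
	}
	return args, nil
}

func main() {
	// Correct key: decodes cleanly.
	args, err := decodeArgs(`{"staged":true,"fail_on":"high"}`)
	fmt.Println(args.FailOn, err)

	// Misspelled key: the decoder returns an error naming the field.
	_, err = decodeArgs(`{"staged":true,"failon":"high"}`)
	fmt.Println(err)
}
```

Without DisallowUnknownFields, the second call would succeed with FailOn left empty, which is exactly the silent no-op the schema is meant to rule out.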

What it is not

This is the stdio transport only — no HTTP, no SSE, no Content-Length framing — and a single tool served on a single connection, sequentially. That's a deliberate floor, not an unfinished one: there is exactly one host on the other end of stdin, and a review is a blocking call, so concurrency would buy nothing and complicate the guard prompts.

And exposing the review to an agent doesn't change what the review is. It still makes a real provider call that costs tokens and a few seconds; it still catches the obvious-but-easy-to-miss class and misses intent-level design problems. The MCP server makes the zeroth reviewer callable from inside an agent loop — a fast self-check before code gets submitted. It does not make it a substitute for the human review that comes after.

Repo: github.com/CommitBrief/commitbrief.

Part 7 of Building CommitBrief. Next: signal control, the baseline and inline-suppression layers that stop CommitBrief from flagging the same finding twice.