<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Muhammet ŞAFAK</title>
    <description>The latest articles on DEV Community by Muhammet ŞAFAK (@muhammetsafak).</description>
    <link>https://dev.to/muhammetsafak</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3995553%2F694f251e-0aec-487e-b18c-0e7f7af7a009.jpg</url>
      <title>DEV Community: Muhammet ŞAFAK</title>
      <link>https://dev.to/muhammetsafak</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/muhammetsafak"/>
    <language>en</language>
    <item>
      <title>Getting structured JSON out of five incompatible LLM APIs — and degrading when they ignore you</title>
      <dc:creator>Muhammet ŞAFAK</dc:creator>
      <pubDate>Fri, 26 Jun 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/muhammetsafak/getting-structured-json-out-of-five-incompatible-llm-apis-and-degrading-when-they-ignore-you-27jg</link>
      <guid>https://dev.to/muhammetsafak/getting-structured-json-out-of-five-incompatible-llm-apis-and-degrading-when-they-ignore-you-27jg</guid>
      <description>&lt;p&gt;&lt;strong&gt;CommitBrief renders a code review as cards, JSON schema v1, or a CI exit code — which means the LLM has to hand back structured findings, not prose.&lt;/strong&gt; Every provider can do that. The catch is that no two of them do it the same way, and some don't really do it at all.&lt;/p&gt;

&lt;p&gt;There's exactly one schema the whole system targets. Getting four native APIs to honor it takes four completely different mechanisms; getting three more is a matter of asking nicely and not trusting the answer. This is how that works, and what happens when a model ignores the contract anyway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One schema, many dialects.&lt;/strong&gt; Every provider targets the same &lt;code&gt;Finding&lt;/code&gt; shape, expressed through whatever structured-output mechanism that vendor offers — &lt;code&gt;tool_use&lt;/code&gt;, strict &lt;code&gt;json_schema&lt;/code&gt;, &lt;code&gt;responseSchema&lt;/code&gt;, or just &lt;code&gt;format: "json"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured output is a spectrum, not a guarantee.&lt;/strong&gt; It runs from "the API enforces the shape" down to "we asked in the prompt."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The real contract is your parser.&lt;/strong&gt; One &lt;code&gt;ParseFindings&lt;/code&gt; validates every provider's output the same way; failures retry once, then degrade to Markdown with a warning. The pipeline never crashes on a bad response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The limit.&lt;/strong&gt; A schema makes output &lt;em&gt;parseable&lt;/em&gt;, not &lt;em&gt;correct&lt;/em&gt;. It can't stop a model from inventing a plausible-but-wrong finding.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The one schema everyone targets&lt;/h2&gt;

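&lt;p&gt;A finding is a flat struct: five fields the parser insists on (&lt;code&gt;severity&lt;/code&gt;, &lt;code&gt;file&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;suggestion&lt;/code&gt;), a line number, three optional fields, and a severity drawn from a closed vocabulary:&lt;/p&gt;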

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Finding&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Severity&lt;/span&gt;    &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="s"&gt;`json:"severity"`&lt;/span&gt;     &lt;span class="c"&gt;// one of five, below&lt;/span&gt;
    &lt;span class="n"&gt;File&lt;/span&gt;        &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"file"`&lt;/span&gt;
    &lt;span class="n"&gt;Line&lt;/span&gt;        &lt;span class="kt"&gt;int&lt;/span&gt;      &lt;span class="s"&gt;`json:"line"`&lt;/span&gt;
    &lt;span class="n"&gt;LineEnd&lt;/span&gt;     &lt;span class="kt"&gt;int&lt;/span&gt;      &lt;span class="s"&gt;`json:"line_end,omitempty"`&lt;/span&gt;
    &lt;span class="n"&gt;Title&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"title"`&lt;/span&gt;
    &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"description"`&lt;/span&gt;
    &lt;span class="n"&gt;Suggestion&lt;/span&gt;  &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"suggestion"`&lt;/span&gt;
    &lt;span class="n"&gt;Language&lt;/span&gt;    &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"language,omitempty"`&lt;/span&gt;
    &lt;span class="n"&gt;Snippet&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;   &lt;span class="s"&gt;`json:"snippet,omitempty"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;SeverityCritical&lt;/span&gt; &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"critical"&lt;/span&gt;
    &lt;span class="n"&gt;SeverityHigh&lt;/span&gt;     &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"high"&lt;/span&gt;
    &lt;span class="n"&gt;SeverityMedium&lt;/span&gt;   &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"medium"&lt;/span&gt;
    &lt;span class="n"&gt;SeverityLow&lt;/span&gt;      &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"low"&lt;/span&gt;
    &lt;span class="n"&gt;SeverityInfo&lt;/span&gt;     &lt;span class="n"&gt;Severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"info"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The envelope is &lt;code&gt;{"findings": [ ... ]}&lt;/code&gt; and nothing else. That severity vocabulary is the wire contract with the model — deliberately English-only and fixed in code, so a user's custom &lt;code&gt;COMMITBRIEF.md&lt;/code&gt; can change the &lt;em&gt;rules&lt;/em&gt; of a review but never the &lt;em&gt;shape&lt;/em&gt; of its output. Everything downstream — the cards renderer, &lt;code&gt;--json&lt;/code&gt;, &lt;code&gt;--fail-on=high&lt;/code&gt; — depends on those five strings meaning exactly five things.&lt;/p&gt;
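
&lt;p&gt;To make the dependency on the fixed vocabulary concrete, here is a minimal sketch of how a &lt;code&gt;--fail-on&lt;/code&gt; gate could threshold on it. The &lt;code&gt;severityRank&lt;/code&gt; map and the &lt;code&gt;ShouldFail&lt;/code&gt; helper are hypothetical names, and reading &lt;code&gt;--fail-on=high&lt;/code&gt; as "fail on anything at or above high" is an assumption, not CommitBrief's actual code:&lt;/p&gt;

```go
package main

import "fmt"

// Severity mirrors the closed vocabulary from the article.
type Severity string

const (
	SeverityCritical Severity = "critical"
	SeverityHigh     Severity = "high"
	SeverityMedium   Severity = "medium"
	SeverityLow      Severity = "low"
	SeverityInfo     Severity = "info"
)

// severityRank is a hypothetical ordering; a larger number is more severe.
var severityRank = map[Severity]int{
	SeverityInfo:     1,
	SeverityLow:      2,
	SeverityMedium:   3,
	SeverityHigh:     4,
	SeverityCritical: 5,
}

// IsValid reports whether s is one of the five known severities.
func (s Severity) IsValid() bool {
	_, ok := severityRank[s]
	return ok
}

// ShouldFail is an assumed reading of --fail-on: it returns true when any
// finding is at or above the threshold severity.
func ShouldFail(found []Severity, threshold Severity) bool {
	limit, ok := severityRank[threshold]
	if !ok {
		return false // unknown threshold: never fail in this sketch
	}
	for _, s := range found {
		// An unknown severity has rank 0 and can never reach the limit.
		if severityRank[s] >= limit {
			return true
		}
	}
	return false
}

func main() {
	found := []Severity{SeverityLow, SeverityHigh}
	fmt.Println(ShouldFail(found, SeverityHigh))     // true
	fmt.Println(ShouldFail(found, SeverityCritical)) // false
	fmt.Println(Severity("urgent").IsValid())        // false
}
```

&lt;p&gt;Because the ranking is keyed on the five strings, a severity outside the vocabulary has no rank and can never trip the gate, which is one reason to reject it at parse time rather than let it through.&lt;/p&gt;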

&lt;h2&gt;Four native dialects for the same shape&lt;/h2&gt;

&lt;p&gt;The native API providers each enforce that schema through their own mechanism. Same target, four wire formats.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic — a forced tool call.&lt;/strong&gt; The findings schema is registered as a tool, and &lt;code&gt;tool_choice&lt;/code&gt; makes calling it non-optional:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tools&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToolUnionParam&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;buildReportTool&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;  &lt;span class="c"&gt;// schema as "report_findings"&lt;/span&gt;
&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToolChoice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ToolChoiceParamOfTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c"&gt;// must call it&lt;/span&gt;
&lt;span class="c"&gt;// tool description: "Emit the review as structured findings. Always call this tool."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;OpenAI — strict &lt;code&gt;json_schema&lt;/code&gt;.&lt;/strong&gt; With &lt;code&gt;Strict&lt;/code&gt; set, the Chat Completions API holds the response to the schema server-side — and refuses the request outright rather than fall through to a model that would ignore it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;buildResponseFormat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChatCompletionNewParamsResponseFormatUnion&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChatCompletionNewParamsResponseFormatUnion&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;OfJSONSchema&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;shared&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseFormatJSONSchemaParam&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;JSONSchema&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shared&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseFormatJSONSchemaJSONSchemaParam&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;        &lt;span class="n"&gt;schemaName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;Description&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Structured findings for a code review."&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;Strict&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="n"&gt;responseSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Responses-API-only models express the same schema through a &lt;code&gt;text.format&lt;/code&gt; json_schema config instead — one more dialect for the identical shape.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemini — a response schema plus a MIME type.&lt;/strong&gt; You hand the SDK a &lt;code&gt;*Schema&lt;/code&gt; value and tell it to return JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseMIMEType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;
&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseSchema&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;responseSchema&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c"&gt;// the Findings envelope as a *genai.Schema&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Ollama — &lt;code&gt;format: "json"&lt;/code&gt;, and that's all it promises.&lt;/strong&gt; A local model can be told to emit JSON, but the flag constrains &lt;em&gt;syntax&lt;/em&gt;, not &lt;em&gt;shape&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;Format&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c"&gt;// valid JSON guaranteed; the right keys are not&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That distinction matters. Anthropic, OpenAI, and Gemini constrain the &lt;em&gt;structure&lt;/em&gt;; Ollama only guarantees the output parses as &lt;em&gt;some&lt;/em&gt; JSON. The schema conformance has to come from somewhere else.&lt;/p&gt;

&lt;h2&gt;Three providers that don't enforce at all&lt;/h2&gt;

&lt;p&gt;DeepSeek, Mistral, and Cohere reach CommitBrief through the OpenAI-compatible SDK (covered in &lt;a href="https://dev.to/muhammetsafak/one-go-interface-ten-llms-three-transport-classes-3877"&gt;part 2&lt;/a&gt;), but their strict-schema support is uneven, so they don't request &lt;code&gt;response_format&lt;/code&gt; at all. Their JSON shape comes entirely from the prompt's contract block.&lt;/p&gt;

&lt;p&gt;So structured output across the seven API providers is a spectrum:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Constrains&lt;/th&gt;
&lt;th&gt;Providers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Forced tool / strict schema&lt;/td&gt;
&lt;td&gt;The exact shape&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;gemini&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;format: "json"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Syntax only&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ollama&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt instruction&lt;/td&gt;
&lt;td&gt;Nothing, at the API level&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;deepseek&lt;/code&gt;, &lt;code&gt;mistral&lt;/code&gt;, &lt;code&gt;cohere&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A pipeline that only trusted the strict-schema providers would work for three of seven. The other four need a backstop that doesn't care how the JSON was produced.&lt;/p&gt;

&lt;h2&gt;The real contract is your parser&lt;/h2&gt;

&lt;p&gt;That backstop is one function every provider's output funnels through. &lt;code&gt;ParseFindings&lt;/code&gt; decodes the envelope and validates each finding — not just "is it JSON" but "is it a &lt;em&gt;valid finding&lt;/em&gt;":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Findings&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Severity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsValid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parse findings: finding %d: unknown severity %q"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Severity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parse findings: finding %d: missing file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Title&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parse findings: finding %d: missing title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Suggestion&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An empty &lt;code&gt;findings&lt;/code&gt; array is a &lt;em&gt;clean review&lt;/em&gt;, returned as a non-nil empty slice — success, not an error. A made-up severity or a finding with no file is a parse failure, no matter which provider produced it. The strict-schema providers rarely trip it; the prompt-driven ones lean on it. Either way, the validation is identical, so a &lt;code&gt;--fail-on=high&lt;/code&gt; gate means the same thing whether you ran Claude or a local qwen.&lt;/p&gt;
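
&lt;p&gt;The excerpt above shows only the validation loop. As a self-contained sketch of the whole function, assuming a simplified &lt;code&gt;Finding&lt;/code&gt; with a plain string severity, the version below decodes the envelope and applies the same checks. The &lt;code&gt;envelope&lt;/code&gt; type and &lt;code&gt;knownSeverities&lt;/code&gt; map are hypothetical names, and rejecting a response with no &lt;code&gt;findings&lt;/code&gt; key at all is an assumption of this sketch; the article does not say how the real parser treats that case:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Finding is a trimmed copy of the article's struct (optional fields omitted).
type Finding struct {
	Severity    string `json:"severity"`
	File        string `json:"file"`
	Line        int    `json:"line"`
	Title       string `json:"title"`
	Description string `json:"description"`
	Suggestion  string `json:"suggestion"`
}

// envelope uses a pointer so a missing key can be told apart from an empty array.
type envelope struct {
	Findings *[]Finding `json:"findings"`
}

var knownSeverities = map[string]bool{
	"critical": true, "high": true, "medium": true, "low": true, "info": true,
}

// ParseFindings decodes the envelope and validates every finding.
// An empty array is a clean review: a non-nil empty slice and a nil error.
func ParseFindings(raw string) ([]Finding, error) {
	env := new(envelope)
	if err := json.Unmarshal([]byte(raw), env); err != nil {
		return nil, fmt.Errorf("parse findings: %w", err)
	}
	if env.Findings == nil {
		// Assumption: valid JSON with the wrong keys is not a clean review.
		return nil, errors.New("parse findings: missing findings key")
	}
	out := make([]Finding, 0, len(*env.Findings))
	for i, f := range *env.Findings {
		switch {
		case !knownSeverities[f.Severity]:
			return nil, fmt.Errorf("parse findings: finding %d: unknown severity %q", i, f.Severity)
		case f.File == "":
			return nil, fmt.Errorf("parse findings: finding %d: missing file", i)
		case f.Title == "":
			return nil, fmt.Errorf("parse findings: finding %d: missing title", i)
		case f.Description == "":
			return nil, fmt.Errorf("parse findings: finding %d: missing description", i)
		case f.Suggestion == "":
			return nil, fmt.Errorf("parse findings: finding %d: missing suggestion", i)
		}
		out = append(out, f)
	}
	return out, nil
}

func main() {
	clean, err := ParseFindings(`{"findings": []}`)
	fmt.Println(len(clean), clean != nil, err == nil)

	// Syntactically valid JSON, wrong shape: the kind of output format "json" allows.
	_, err = ParseFindings(`{"answer": 42}`)
	fmt.Println(err)
}
```

&lt;p&gt;The second call is the Ollama case in miniature: the payload parses as JSON, yet it is still rejected because it is not the envelope.&lt;/p&gt;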

&lt;h2&gt;When the model ignores all of it&lt;/h2&gt;

&lt;p&gt;A strict schema reduces malformed output; it doesn't eliminate it, and three of the providers have no schema at all. So the call is wrapped in retry-once-then-degrade:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;prov&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Review&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Usage&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parseErr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFindings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;parseErr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Usage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FormatJSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// First attempt unparseable — retry once (ADR-0014 §4).&lt;/span&gt;
&lt;span class="n"&gt;onRetry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;resp2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err2&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;prov&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Review&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err2&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Usage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FormatMarkdownFallback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
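&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// totalUsage is the sum of resp.Usage and resp2.Usage (computed outside this excerpt).&lt;/span&gt;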
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parseErr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseFindings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;parseErr&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;totalUsage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FormatJSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// Both attempts failed — degrade: render the raw text as Markdown, warn once.&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;totalUsage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FormatMarkdownFallback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things in that flow are deliberate. Token usage is &lt;strong&gt;summed across both attempts&lt;/strong&gt;, so the cost footer reflects what you actually spent, even on a degrade. The outcome is recorded as a &lt;strong&gt;format marker&lt;/strong&gt; (&lt;code&gt;FormatJSON&lt;/code&gt; or &lt;code&gt;FormatMarkdownFallback&lt;/code&gt;) and cached with the response, so a degraded review replays from cache silently instead of re-warning forever. And degrade means &lt;strong&gt;render the raw model text as Markdown and print one warning&lt;/strong&gt; — never crash, never show the user a stack trace because an LLM got creative. A &lt;code&gt;--fail-on&lt;/code&gt; gate is skipped on a degrade, with a note on stderr, because there are no structured findings to threshold.&lt;/p&gt;
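
&lt;p&gt;For readers who want to run the flow, here is a self-contained sketch of retry-once-then-degrade with the usage summation written out. It is an illustration under stated assumptions, not the project's code: the &lt;code&gt;Usage&lt;/code&gt; fields, the &lt;code&gt;Add&lt;/code&gt; method, the &lt;code&gt;scripted&lt;/code&gt; fake provider, the &lt;code&gt;parseOK&lt;/code&gt; stand-in for &lt;code&gt;ParseFindings&lt;/code&gt;, and the string values of the format markers are all hypothetical:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Usage and its fields are assumed; the real struct may differ.
type Usage struct{ InputTokens, OutputTokens int }

// Add returns the field-wise sum of two usage records.
func (u Usage) Add(o Usage) Usage {
	return Usage{
		InputTokens:  u.InputTokens + o.InputTokens,
		OutputTokens: u.OutputTokens + o.OutputTokens,
	}
}

type Response struct {
	Content string
	Usage   Usage
}

type Format string

const (
	FormatJSON             Format = "json"
	FormatMarkdownFallback Format = "markdown-fallback"
)

type envelope struct {
	Findings *[]json.RawMessage `json:"findings"`
}

// parseOK stands in for ParseFindings: it only checks for a findings envelope.
func parseOK(raw string) error {
	env := new(envelope)
	if err := json.Unmarshal([]byte(raw), env); err != nil {
		return err
	}
	if env.Findings == nil {
		return errors.New("missing findings key")
	}
	return nil
}

// reviewFunc stands in for a provider's Review call.
type reviewFunc func() (Response, error)

func reviewWithRetry(review reviewFunc) (string, Usage, Format, error) {
	first, err := review()
	if err != nil {
		return "", Usage{}, "", err
	}
	if parseOK(first.Content) == nil {
		return first.Content, first.Usage, FormatJSON, nil
	}
	second, err := review()
	if err != nil {
		// The retry itself failed: degrade with what the first attempt produced.
		return first.Content, first.Usage, FormatMarkdownFallback, nil
	}
	total := first.Usage.Add(second.Usage) // both attempts were billed
	if parseOK(second.Content) == nil {
		return second.Content, total, FormatJSON, nil
	}
	return first.Content, total, FormatMarkdownFallback, nil
}

// scripted returns a fake provider that replays canned responses in order.
func scripted(responses ...Response) reviewFunc {
	i := 0
	return func() (Response, error) {
		if i >= len(responses) {
			return Response{}, errors.New("no more scripted responses")
		}
		r := responses[i]
		i++
		return r, nil
	}
}

func main() {
	bad := Response{Content: "Looks fine to me!", Usage: Usage{InputTokens: 100, OutputTokens: 20}}
	good := Response{Content: `{"findings": []}`, Usage: Usage{InputTokens: 100, OutputTokens: 15}}

	_, usage, format, _ := reviewWithRetry(scripted(bad, good))
	fmt.Println(format, usage.InputTokens, usage.OutputTokens)

	_, usage, format, _ = reviewWithRetry(scripted(bad, bad))
	fmt.Println(format, usage.InputTokens, usage.OutputTokens)
}
```

&lt;p&gt;In both runs the reported usage covers two calls, so a cost footer built from it would match the bill whether the retry succeeded or the review degraded.&lt;/p&gt;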

&lt;h2&gt;What it is not&lt;/h2&gt;

&lt;p&gt;Structured output guarantees a response is &lt;em&gt;parseable&lt;/em&gt;. It does not guarantee it's &lt;em&gt;correct&lt;/em&gt;. A strict schema can't stop a model from inventing a line number, attaching a finding to the wrong file, or reporting a confident non-issue — which is why the prompt still carries an explicit "do not invent file paths or line numbers" directive, and why this is the zeroth reviewer, not the last one. The schema is what makes the output machine-readable; your judgment is what makes it trustworthy.&lt;/p&gt;

&lt;p&gt;If you want the measured version of "how often is it right," the eval harness scores precision and false-positive rate per model against a known-answer corpus:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;COMMITBRIEF_EVAL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;name&amp;gt; make eval-live
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repo: &lt;strong&gt;github.com/CommitBrief/commitbrief&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 3 of &lt;strong&gt;Building CommitBrief&lt;/strong&gt;. Next: the pre-send secret scanner (eight patterns, added-lines-only, and a match record that never stores the secret it just caught).&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>go</category>
      <category>json</category>
    </item>
    <item>
      <title>One Go interface, ten LLMs, three transport classes</title>
      <dc:creator>Muhammet ŞAFAK</dc:creator>
      <pubDate>Thu, 25 Jun 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/muhammetsafak/one-go-interface-ten-llms-three-transport-classes-3877</link>
      <guid>https://dev.to/muhammetsafak/one-go-interface-ten-llms-three-transport-classes-3877</guid>
      <description>&lt;p&gt;&lt;strong&gt;CommitBrief reviews your git diff with whatever LLM you point it at — Claude, GPT, Gemini, a local Ollama model, or the &lt;code&gt;claude&lt;/code&gt; CLI you already have installed.&lt;/strong&gt; Ten providers, one &lt;code&gt;Provider&lt;/code&gt; interface, zero special-casing in the review pipeline. This post is how that abstraction is built, because the providers are far less alike than "they're all LLMs" suggests.&lt;/p&gt;

&lt;p&gt;The ten split into three transport classes that share almost nothing at the wire level: native HTTPS APIs, OpenAI-compatible endpoints, and local subprocesses. Making them satisfy one interface — without the pipeline knowing which is which — is the whole trick.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One interface, ten implementations.&lt;/strong&gt; Every provider satisfies a 7-method &lt;code&gt;Provider&lt;/code&gt; interface; the pipeline never type-switches on a vendor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A &lt;code&gt;database/sql&lt;/code&gt;-style registry.&lt;/strong&gt; Each provider registers itself in &lt;code&gt;init()&lt;/code&gt;; a blank import in &lt;code&gt;main.go&lt;/code&gt; is all it takes to add one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three transport classes.&lt;/strong&gt; Native APIs, OpenAI-compatible endpoints (reusing one SDK via a base URL), and subprocess-backed CLIs; the CLI-backed providers opt out of the JSON contract through a marker interface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The limit.&lt;/strong&gt; Provider-agnostic is not provider-&lt;em&gt;equal&lt;/em&gt;. A local model gives you a real review, but not a frontier-grade one, and the eval harness measures the gap instead of hiding it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Providers&lt;/th&gt;
&lt;th&gt;Transport&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Native API&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;gemini&lt;/code&gt;, &lt;code&gt;ollama&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Each vendor's own HTTPS API (Ollama over &lt;code&gt;localhost&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI-compatible&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;deepseek&lt;/code&gt;, &lt;code&gt;mistral&lt;/code&gt;, &lt;code&gt;cohere&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;openai-go&lt;/code&gt; SDK pointed at the vendor's base URL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLI-backed&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;claude-cli&lt;/code&gt;, &lt;code&gt;gemini-cli&lt;/code&gt;, &lt;code&gt;codex-cli&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;A local subprocess; reuses the host CLI's auth&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;The interface every provider satisfies&lt;/h2&gt;

&lt;p&gt;This is the entire contract. Seven methods, no generics, no per-vendor escape hatch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Provider&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;DefaultModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;ContextWindow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;EstimateTokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;Pricing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Pricing&lt;/span&gt;
    &lt;span class="n"&gt;Review&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;TestConnection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Review&lt;/code&gt; does the work; the other six let the pipeline make decisions &lt;em&gt;before&lt;/em&gt; the call: it estimates tokens for the cost preflight, looks up &lt;code&gt;Pricing&lt;/code&gt; to warn over a threshold, checks &lt;code&gt;ContextWindow&lt;/code&gt; to catch an oversized prompt, and calls &lt;code&gt;TestConnection&lt;/code&gt; so that &lt;code&gt;commitbrief providers test &amp;lt;name&amp;gt;&lt;/code&gt; can ping a key without running a review. The pipeline holds a &lt;code&gt;Provider&lt;/code&gt; and never asks which concrete type it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  A registry you've already used
&lt;/h2&gt;

&lt;p&gt;If you've ever written &lt;code&gt;_ "github.com/lib/pq"&lt;/code&gt; to register a database driver, you know this pattern. Providers register themselves; nothing imports them by name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Factory&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ProviderConfig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt; &lt;span class="n"&gt;Factory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"provider: Register called with empty name"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"provider: Register called with nil factory for "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;registryMu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;registryMu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exists&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;exists&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"provider: duplicate registration for "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The panics are deliberate. An empty name, a nil factory, or a duplicate registration is a programmer error, and it should crash at startup, when an &lt;code&gt;init()&lt;/code&gt; runs, rather than silently shadow a provider that a user later selects. Each provider subpackage calls &lt;code&gt;Register&lt;/code&gt; in its own &lt;code&gt;init()&lt;/code&gt;, so wiring a new one into the binary is one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"github.com/CommitBrief/commitbrief/internal/provider/anthropic"&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"github.com/CommitBrief/commitbrief/internal/provider/deepseek"&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"github.com/CommitBrief/commitbrief/internal/provider/claude-cli"&lt;/span&gt;
    &lt;span class="c"&gt;// ...one blank import per provider&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;New(name, cfg)&lt;/code&gt; looks the factory up under a read lock and returns a typed error listing the known names when you ask for one that isn't there. That's the seam every transport class plugs into.&lt;/p&gt;

&lt;h2&gt;
  
  
  Class 1 — native APIs
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;anthropic&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, and &lt;code&gt;gemini&lt;/code&gt; each talk to their vendor's own SDK and use that vendor's native structured-output mechanism. &lt;code&gt;ollama&lt;/code&gt; is the same shape pointed at &lt;code&gt;http://localhost:11434&lt;/code&gt;: same interface, but the diff never leaves the machine and the cost is zero. For a contractor under an NDA, that last property is the entire point — &lt;code&gt;commitbrief --provider ollama&lt;/code&gt; is a real review with no third-party egress.&lt;/p&gt;

&lt;p&gt;(How each vendor is coerced into returning structured findings — &lt;code&gt;tool_use&lt;/code&gt;, strict &lt;code&gt;json_schema&lt;/code&gt;, &lt;code&gt;responseSchema&lt;/code&gt; — is its own story. That's the next post in the series.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Class 2 — OpenAI-compatible, for the cost of a base URL
&lt;/h2&gt;

&lt;p&gt;DeepSeek, Mistral, and Cohere all expose an OpenAI-shaped Chat Completions API. So instead of three new SDKs, they reuse the one already in the build — &lt;code&gt;github.com/openai/openai-go&lt;/code&gt; — pointed at a different host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ProviderConfig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIKey&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"deepseek: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ErrUnauthorized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BaseURL&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultBaseURL&lt;/span&gt; &lt;span class="c"&gt;// https://api.deepseek.com&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;sdk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;option&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithAPIKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIKey&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;option&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithBaseURL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three providers, one &lt;code&gt;option.WithBaseURL&lt;/code&gt;, no new dependency to license-audit or keep current. Cohere even ships its compatibility surface at &lt;code&gt;api.cohere.ai/compatibility/v1&lt;/code&gt;, so it slots in the same way. These three don't request a strict response format — support for it is uneven — so their JSON shape comes from the prompt contract plus a retry-once-then-degrade fallback, the same way Ollama works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Class 3 — subprocess CLIs, and a marker interface
&lt;/h2&gt;

&lt;p&gt;The third class is the unusual one. &lt;code&gt;claude-cli&lt;/code&gt;, &lt;code&gt;gemini-cli&lt;/code&gt;, and &lt;code&gt;codex-cli&lt;/code&gt; don't make HTTP calls at all — they shell out to a CLI you already have on your PATH and reuse &lt;em&gt;its&lt;/em&gt; auth. If you pay for a Claude or Gemini subscription, reviewing a diff through it costs nothing extra.&lt;/p&gt;

&lt;p&gt;These three differ only in their binary name and a few flags, so they share one &lt;code&gt;clireview.Backend&lt;/code&gt;. &lt;code&gt;claude-cli&lt;/code&gt;, for instance, pipes the prompt on stdin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;-p&lt;/span&gt; - &lt;span class="nt"&gt;--output-format&lt;/span&gt; text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two details are worth pulling out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They opt out of the JSON contract via a marker interface.&lt;/strong&gt; A CLI tool has already formatted its output; forcing it through JSON parsing and the cards renderer would mangle it. So the backend implements a marker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// PlainTextEmitter is the marker interface for providers whose&lt;/span&gt;
&lt;span class="c"&gt;// Review() returns formatted plain text instead of structured JSON.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;PlainTextEmitter&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Provider&lt;/span&gt;
    &lt;span class="n"&gt;EmitsPlainText&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Backend&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;EmitsPlainText&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c"&gt;// the whole implementation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pipeline asks once whether the provider has the capability — the same type-assertion idiom as &lt;code&gt;http.Flusher&lt;/code&gt; or &lt;code&gt;io.WriterTo&lt;/code&gt; — and reuses the answer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// claude-cli / gemini-cli / codex-cli get the plain-text prompt&lt;/span&gt;
&lt;span class="c"&gt;// contract instead of the JSON one — the host CLI's agentic system&lt;/span&gt;
&lt;span class="c"&gt;// prompt makes structured-output guarantees unreliable.&lt;/span&gt;
&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plainText&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;prov&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PlainTextEmitter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prompt&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;plainText&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BuildPlainText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loaded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lang&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numberedDiff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;archContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;global&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;withContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loaded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lang&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numberedDiff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;archContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single &lt;code&gt;plainText&lt;/code&gt; bool then drives the few places the two paths diverge: which prompt contract to build, whether to run the deterministic flaky-test pre-pass (skipped for CLI tools), and whether to stream the response verbatim or parse it as JSON. There's no &lt;code&gt;switch&lt;/code&gt; over provider names anywhere in the pipeline — routing is on the &lt;em&gt;capability&lt;/em&gt;, not the identity. A new plain-text provider implements &lt;code&gt;EmitsPlainText()&lt;/code&gt; and inherits all of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;DefaultModel()&lt;/code&gt; returns the binary's version, on purpose.&lt;/strong&gt; The cache key includes the model string. A CLI provider has no model name in the API sense, so it reports &lt;code&gt;binary + detected version&lt;/code&gt; — queried once with &lt;code&gt;--version&lt;/code&gt; and memoized behind a &lt;code&gt;sync.Once&lt;/code&gt;, because &lt;code&gt;DefaultModel()&lt;/code&gt; is on the hot path of every cache-key computation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Backend&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;DefaultModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;versionOrEmpty&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Binary&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Binary&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Backend&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;ContextWindow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="n"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c"&gt;// informational&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you upgrade the host CLI, the version string changes, so cached reviews from the old version cleanly invalidate. Correctness falls out of the cache key for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing one
&lt;/h2&gt;

&lt;p&gt;Selection is explicit — one provider per run, picked by flag or config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;commitbrief &lt;span class="nt"&gt;--provider&lt;/span&gt; openai          &lt;span class="c"&gt;# override the configured provider&lt;/span&gt;
commitbrief &lt;span class="nt"&gt;--cli&lt;/span&gt; claude               &lt;span class="c"&gt;# shorthand for --provider claude-cli&lt;/span&gt;
commitbrief providers use ollama &lt;span class="nt"&gt;--local&lt;/span&gt;   &lt;span class="c"&gt;# set the default for this repo&lt;/span&gt;
commitbrief providers &lt;span class="nb"&gt;test &lt;/span&gt;deepseek    &lt;span class="c"&gt;# ping the key, no review&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few combinations are rejected on purpose: &lt;code&gt;--provider&lt;/code&gt; and &lt;code&gt;--cli&lt;/code&gt; are mutually exclusive, and &lt;code&gt;--cli&lt;/code&gt; can't pair with &lt;code&gt;--json&lt;/code&gt; or &lt;code&gt;--markdown&lt;/code&gt; because a plain-text provider doesn't produce the structured output those formats render.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it is not
&lt;/h2&gt;

&lt;p&gt;Provider-agnostic is not provider-&lt;em&gt;equal&lt;/em&gt;. A local &lt;code&gt;qwen2.5-coder&lt;/code&gt; review is a real second pass and beats no review, but it won't match a frontier model on subtle findings. CommitBrief doesn't paper over that — the eval harness scores each model against a known-answer corpus so you can see the gap yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;COMMITBRIEF_EVAL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;name&amp;gt; make eval-live
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And there's no automatic vendor-to-vendor failover. CommitBrief talks to exactly one provider per run; switching is a flag or a config write, never a silent retry against a different company's API on your diff. That's a deliberate choice about who decides where your code goes — you, every time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 of &lt;strong&gt;Building CommitBrief&lt;/strong&gt;. Next: getting structured findings out of five incompatible LLM APIs (&lt;code&gt;tool_use&lt;/code&gt;, strict &lt;code&gt;json_schema&lt;/code&gt;, &lt;code&gt;responseSchema&lt;/code&gt;, prompt-driven JSON) and degrading gracefully when the model ignores all of them.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>llm</category>
      <category>architecture</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I built a local-first LLM code reviewer in Go. Here's the entire pipeline.</title>
      <dc:creator>Muhammet ŞAFAK</dc:creator>
      <pubDate>Wed, 24 Jun 2026 04:40:00 +0000</pubDate>
      <link>https://dev.to/muhammetsafak/i-built-a-local-first-llm-code-reviewer-in-go-heres-the-entire-pipeline-11g</link>
      <guid>https://dev.to/muhammetsafak/i-built-a-local-first-llm-code-reviewer-in-go-heres-the-entire-pipeline-11g</guid>
      <description>&lt;p&gt;&lt;strong&gt;CommitBrief is a local-first CLI that runs an LLM review over your git diff before a teammate — or your future self — sees it.&lt;/strong&gt; There's no server and no telemetry; the diff leaves your machine only for the provider &lt;em&gt;you&lt;/em&gt; chose, and with a local model like Ollama it never leaves at all.&lt;/p&gt;

&lt;p&gt;The interesting engineering isn't "call an LLM." It's everything that has to happen &lt;em&gt;around&lt;/em&gt; that call so the review stays cheap, safe, and reproducible. Here's the whole path from &lt;code&gt;commitbrief --staged&lt;/code&gt; to the findings on your screen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it is&lt;/strong&gt; — a CLI that reviews your staged diff (or any &lt;code&gt;git diff&lt;/code&gt; range) with the provider you pick: Claude, GPT, Gemini, or a fully local Ollama model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The non-obvious part&lt;/strong&gt; — the LLM call is one stage out of fourteen. Filtering, a pre-send secret scan, content-addressed caching, and a cost preflight do most of the work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The limit&lt;/strong&gt; — it's the &lt;em&gt;zeroth&lt;/em&gt; reviewer, not a replacement for a human one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key facts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go 1.25, GPL-3.0-or-later, no hosted service.&lt;/li&gt;
&lt;li&gt;10 providers + a mock: Anthropic, OpenAI, Gemini, Ollama (native APIs); DeepSeek, Mistral, Cohere (OpenAI-compatible); claude-cli, gemini-cli, codex-cli (subprocess-backed).&lt;/li&gt;
&lt;li&gt;The review path is &lt;strong&gt;read-only&lt;/strong&gt;. The single git-write command is &lt;code&gt;commitbrief commit&lt;/code&gt;, and even that only runs one &lt;code&gt;git commit&lt;/code&gt; of already-staged changes — it never edits a file.&lt;/li&gt;
&lt;li&gt;Install: &lt;code&gt;brew install CommitBrief/tap/commitbrief&lt;/code&gt;, &lt;code&gt;scoop install commitbrief&lt;/code&gt;, or &lt;code&gt;go install github.com/CommitBrief/commitbrief/cmd/commitbrief@latest&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The shape of a review
&lt;/h2&gt;

&lt;p&gt;Every review walks one linear pipeline. Here it is at altitude before we zoom in:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;th&gt;Why it's here&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Resolve context&lt;/td&gt;
&lt;td&gt;Walk up for &lt;code&gt;.git&lt;/code&gt;, merge config (built-in &amp;lt; global &amp;lt; repo), apply env + flags&lt;/td&gt;
&lt;td&gt;One deterministic config per run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Load rules&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;./COMMITBRIEF.md&lt;/code&gt; or the embedded default; validate the output template first&lt;/td&gt;
&lt;td&gt;Fail on a broken template &lt;em&gt;before&lt;/em&gt; spending a token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Acquire diff&lt;/td&gt;
&lt;td&gt;Hybrid go-git + &lt;code&gt;exec git&lt;/code&gt; fallback&lt;/td&gt;
&lt;td&gt;Worktree state is git's, not a reimplementation's&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Parse + filter&lt;/td&gt;
&lt;td&gt;Three ignore layers, then an optional allowlist&lt;/td&gt;
&lt;td&gt;Don't pay to review lock files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Pre-send guard&lt;/td&gt;
&lt;td&gt;Refuse to leak &lt;code&gt;.commitbrief/**&lt;/code&gt;; scan for secrets&lt;/td&gt;
&lt;td&gt;The diff is about to leave the machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6. Build prompt&lt;/td&gt;
&lt;td&gt;Four XML blocks + an immutability guard&lt;/td&gt;
&lt;td&gt;Structured and injection-resistant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7. Cache lookup&lt;/td&gt;
&lt;td&gt;SHA-256 of the exact inputs&lt;/td&gt;
&lt;td&gt;A re-run is a disk read, not a bill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8. Cost preflight&lt;/td&gt;
&lt;td&gt;Estimate tokens, warn over a threshold&lt;/td&gt;
&lt;td&gt;No surprise spend&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9. Provider call&lt;/td&gt;
&lt;td&gt;Structured JSON, or verbatim text for CLI providers&lt;/td&gt;
&lt;td&gt;The actual review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10. Render + gate&lt;/td&gt;
&lt;td&gt;Cards / JSON / Markdown, then &lt;code&gt;--fail-on&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Human output or a CI exit code&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Five of these carry most of the weight. Let's take them in order.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting the diff: go-git, with git as the source of truth
&lt;/h2&gt;

&lt;p&gt;You'd think reading a diff is trivial. It is — until you need staged-vs-unstaged, a worktree comparison, and &lt;code&gt;git diff main...feature&lt;/code&gt; to all behave &lt;em&gt;exactly&lt;/em&gt; like git, on Windows too.&lt;/p&gt;

&lt;p&gt;CommitBrief runs a hybrid: a primary &lt;code&gt;go-git&lt;/code&gt; implementation with a &lt;code&gt;git&lt;/code&gt; CLI fallback (ADR-0002). Range operations that go-git models cleanly (commit-vs-first-parent, merge-base range diffs, branch diffs) stay in-process. Staged, unstaged, and arbitrary &lt;code&gt;git diff &amp;lt;args&amp;gt;&lt;/code&gt; passthrough shell out to &lt;code&gt;git&lt;/code&gt; with &lt;code&gt;--no-color --no-ext-diff&lt;/code&gt; for stable parsing. The CLI stays the source of truth for index and worktree state; reimplementing that plumbing invites exactly the kind of subtle drift you don't want under a review tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;commitbrief &lt;span class="nt"&gt;--staged&lt;/span&gt;                 &lt;span class="c"&gt;# the index&lt;/span&gt;
commitbrief &lt;span class="nt"&gt;--unstaged&lt;/span&gt;               &lt;span class="c"&gt;# the working tree&lt;/span&gt;
commitbrief diff main...feature      &lt;span class="c"&gt;# args forwarded verbatim to git diff&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Filtering: three layers, so the model never reads noise
&lt;/h2&gt;

&lt;p&gt;A diff is some signal &lt;em&gt;and&lt;/em&gt; a pile of things no reviewer should read. Filtering runs as three composed layers (ADR-0006):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Built-ins&lt;/strong&gt; — around 65 patterns: lock files, &lt;code&gt;vendor/**&lt;/code&gt;, &lt;code&gt;node_modules/**&lt;/code&gt;, generated code, build artifacts, binaries, IDE/OS noise, and &lt;code&gt;.commitbrief/cache/**&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.commitbriefignore&lt;/code&gt;&lt;/strong&gt; — gitignore syntax, repo-root, team-shared. It composes &lt;em&gt;after&lt;/em&gt; the built-ins with last-match-wins, so a &lt;code&gt;!pattern&lt;/code&gt; line can re-include something a built-in dropped.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic prose&lt;/strong&gt; — natural-language exclusions in &lt;code&gt;COMMITBRIEF.md&lt;/code&gt; ("don't flag generated mocks"), applied by the model itself.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After the ignore layers, &lt;code&gt;--file&lt;/code&gt; / &lt;code&gt;--dir&lt;/code&gt; apply a narrower allowlist. If nothing survives, the run exits &lt;code&gt;0&lt;/code&gt; having spent nothing — an empty diff is a success, not an error.&lt;/p&gt;
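
&lt;p&gt;The last-match-wins composition of the first two layers can be illustrated with a short sketch. It is deliberately simplified: it matches each pattern against the file's base name with &lt;code&gt;path.Match&lt;/code&gt; instead of implementing full gitignore semantics, and the pattern lists and the &lt;code&gt;ignored&lt;/code&gt; helper are made up for the example.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ignored reports whether p is dropped. Patterns are evaluated in order and
// the last one that matches decides; a leading "!" re-includes the path.
func ignored(patterns []string, p string) bool {
	drop := false
	for _, pat := range patterns {
		negate := strings.HasPrefix(pat, "!")
		pat = strings.TrimPrefix(pat, "!")
		// Simplification: base-name glob only, not real gitignore rules.
		if ok, _ := path.Match(pat, path.Base(p)); ok {
			drop = !negate
		}
	}
	return drop
}

func main() {
	builtins := []string{"*.lock", "*.pb.go"} // illustrative built-ins
	repo := []string{"!api.pb.go"}            // hypothetical .commitbriefignore line
	patterns := append(builtins, repo...)     // repo rules come after built-ins

	files := []string{"Cargo.lock", "gen/api.pb.go", "gen/other.pb.go", "main.go"}
	for _, f := range files {
		fmt.Println(f, "ignored:", ignored(patterns, f))
	}
}
```

&lt;p&gt;Ordering carries the whole policy here: because the repo file's lines are appended after the built-ins, a negated pattern there overrides a built-in without any special-case code.&lt;/p&gt;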

&lt;h2&gt;
  
  
  The guard nobody else runs: don't send secrets to the model
&lt;/h2&gt;

&lt;p&gt;Stage 5 is the one I care about most, because it sits on the boundary where your code is about to leave the machine.&lt;/p&gt;

&lt;p&gt;Two checks run before the provider call. First, a guard refuses to quietly ship your own config: if any path in the diff starts with &lt;code&gt;.commitbrief/&lt;/code&gt;, it prompts, and auto-aborts when there's no TTY. Second, a secret scanner runs over &lt;strong&gt;added lines only&lt;/strong&gt; against eight built-in patterns — AWS access keys, GitHub/GitLab tokens, Anthropic/OpenAI keys, JWTs, Stripe live keys, PEM private keys — and you can add your own through &lt;code&gt;guard.secret_patterns&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;One detail I'm proud of: a match records only &lt;code&gt;{Line, Patterns}&lt;/code&gt; — never the matched substring. The scanner that exists to stop a leak can't itself become one through a log line, stderr, or a cache file.&lt;/p&gt;
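
&lt;p&gt;To make the "added lines only" and "never the matched substring" points concrete, here is a minimal sketch. The two regular expressions are illustrative stand-ins rather than the tool's actual built-ins, &lt;code&gt;scanAdded&lt;/code&gt; is a hypothetical name, and treating &lt;code&gt;Line&lt;/code&gt; as a 1-based position within the diff text is an assumption.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Match mirrors the {Line, Patterns} shape: where the hit was and which
// patterns fired. There is no field that could hold the secret itself.
type Match struct {
	Line     int
	Patterns []string
}

// pattern pairs a display name with its regexp (illustrative patterns only).
type pattern struct {
	name string
	re   *regexp.Regexp
}

var patterns = []pattern{
	{"aws-access-key", regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
	{"pem-private-key", regexp.MustCompile(`-----BEGIN [A-Z ]*PRIVATE KEY-----`)},
}

// scanAdded inspects only added lines: those starting with "+", excluding
// "+++" file headers. Removed and context lines are never scanned.
func scanAdded(diff string) []Match {
	var out []Match
	for i, line := range strings.Split(diff, "\n") {
		if !strings.HasPrefix(line, "+") || strings.HasPrefix(line, "+++") {
			continue
		}
		var hits []string
		for _, p := range patterns {
			if p.re.MatchString(line) {
				hits = append(hits, p.name)
			}
		}
		if len(hits) > 0 {
			out = append(out, Match{Line: i + 1, Patterns: hits})
		}
	}
	return out
}

func main() {
	// The key below is a fabricated placeholder in the AWS key format.
	diff := "+++ b/config.go\n" +
		"-key := \"\"\n" +
		"+key := \"AKIAABCDEFGHIJKLMNOP\"\n" +
		" old := \"AKIAABCDEFGHIJKLMNOP\"\n"
	for _, m := range scanAdded(diff) {
		fmt.Printf("line %d matched %v\n", m.Line, m.Patterns)
	}
}
```

&lt;p&gt;Only the added line is reported here; the context line carrying the same string is skipped, and nothing printed contains the key.&lt;/p&gt;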

&lt;p&gt;The bypass policy is deliberate, too. &lt;code&gt;--allow-secrets&lt;/code&gt; skips the scan; &lt;code&gt;--yes&lt;/code&gt; does &lt;strong&gt;not&lt;/strong&gt;. Auto-confirming prompts in a pipeline should never silently approve shipping a credential to a third party.&lt;/p&gt;

&lt;h2&gt;
  
  
  Caching is just content addressing
&lt;/h2&gt;

&lt;p&gt;Reviewing the same diff twice should be free. The cache key is a SHA-256 over the exact inputs that could change the answer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sha256( diff "::" systemPrompt "::" provider ":" model ":" lang ":" schemaVersion [":ctx"] [":mode:"+mode] )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every input earns its place. The fully-assembled system prompt is in the key, so editing &lt;code&gt;COMMITBRIEF.md&lt;/code&gt; invalidates stale reviews. The schema version is in the key, so bumping the output contract invalidates everything at once. &lt;code&gt;--with-context&lt;/code&gt; and the commit-message mode each get a suffix so they never collide with a plain review. Entries are written atomically — temp file, &lt;code&gt;0600&lt;/code&gt;, then rename — to &lt;code&gt;.commitbrief/cache/&amp;lt;key&amp;gt;.json&lt;/code&gt;: one file per response, no index.&lt;/p&gt;
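
&lt;p&gt;A minimal sketch of that key derivation, following the formula above. &lt;code&gt;KeyInput&lt;/code&gt; and &lt;code&gt;cacheKey&lt;/code&gt; are hypothetical names, and the provider, model, and mode values in &lt;code&gt;main&lt;/code&gt; are placeholders.&lt;/p&gt;

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// KeyInput holds every input that could change the model's answer.
type KeyInput struct {
	Diff          string
	SystemPrompt  string
	Provider      string
	Model         string
	Lang          string
	SchemaVersion string
	WithContext   bool   // adds the ":ctx" suffix
	Mode          string // empty for a plain review
}

// cacheKey concatenates the inputs in the documented order and hashes them.
func cacheKey(in KeyInput) string {
	s := in.Diff + "::" + in.SystemPrompt + "::" +
		in.Provider + ":" + in.Model + ":" + in.Lang + ":" + in.SchemaVersion
	if in.WithContext {
		s += ":ctx"
	}
	if in.Mode != "" {
		s += ":mode:" + in.Mode
	}
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	base := KeyInput{
		Diff:          "diff --git a/main.go b/main.go",
		SystemPrompt:  "You are a code reviewer.",
		Provider:      "example-provider", // placeholder
		Model:         "example-model",    // placeholder
		Lang:          "en",
		SchemaVersion: "1",
	}
	edited := base
	edited.SystemPrompt += " Do not flag generated mocks."

	fmt.Println(cacheKey(base))
	fmt.Println(cacheKey(edited)) // a different key, so the old entry is never read
}
```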

&lt;p&gt;The payoff lands at stage 8. Cost preflight estimates input tokens with a conservative &lt;code&gt;(len+3)/4&lt;/code&gt; heuristic, clamps expected output to &lt;code&gt;[200, 1500]&lt;/code&gt; tokens, multiplies by a per-model price table, and prompts only when the estimate clears your &lt;code&gt;cost.warn_threshold_usd&lt;/code&gt; (default &lt;code&gt;$0.50&lt;/code&gt;). A cache hit skips preflight entirely — re-running a review on an unchanged diff costs one disk read.&lt;/p&gt;

&lt;h2&gt;
  
  
  The call, and what happens when the model misbehaves
&lt;/h2&gt;

&lt;p&gt;For API providers, CommitBrief asks for structured findings and parses them into a fixed contract — &lt;code&gt;severity&lt;/code&gt;, &lt;code&gt;file&lt;/code&gt;, &lt;code&gt;line&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;suggestion&lt;/code&gt; — emitted as JSON schema v1. If the model returns something unparseable, it retries once; if it's still bad, it degrades to rendering the raw text as Markdown and prints a warning instead of crashing. CLI-backed providers (&lt;code&gt;claude-cli&lt;/code&gt;, &lt;code&gt;gemini-cli&lt;/code&gt;, &lt;code&gt;codex-cli&lt;/code&gt;) run as read-only subprocesses and stream their text verbatim.&lt;/p&gt;
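
&lt;p&gt;The retry-once-then-degrade behavior might look roughly like the sketch below. It assumes the model returns a bare JSON array of findings and that &lt;code&gt;line&lt;/code&gt; is an integer; the real envelope may differ. &lt;code&gt;call&lt;/code&gt; is a hypothetical stand-in for a provider request.&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding carries the six fields of the contract described above.
type Finding struct {
	Severity    string `json:"severity"`
	File        string `json:"file"`
	Line        int    `json:"line"`
	Title       string `json:"title"`
	Description string `json:"description"`
	Suggestion  string `json:"suggestion"`
}

// parseFindings decodes a JSON array of findings (assumed shape).
func parseFindings(raw string) ([]Finding, error) {
	out := new([]Finding)
	if err := json.Unmarshal([]byte(raw), out); err != nil {
		return nil, err
	}
	return *out, nil
}

// review asks the model up to twice. If both replies are unparseable it
// returns the last raw text as a fallback instead of an error.
func review(call func() (string, error)) (findings []Finding, fallback string, err error) {
	var raw string
	for attempt := 0; attempt != 2; attempt++ {
		raw, err = call()
		if err != nil {
			return nil, "", err // transport errors are not retried in this sketch
		}
		if fs, perr := parseFindings(raw); perr == nil {
			return fs, "", nil
		}
	}
	return nil, raw, nil
}

func main() {
	// A fake provider: prose on the first call, valid JSON on the retry.
	replies := []string{
		"Looks fine to me!",
		`[{"severity":"high","file":"main.go","line":12,"title":"Swallowed error"}]`,
	}
	i := 0
	call := func() (string, error) {
		r := replies[i]
		i++
		return r, nil
	}

	fs, raw, err := review(call)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if raw != "" {
		fmt.Println("warning: unstructured output, rendering raw text")
		fmt.Println(raw)
		return
	}
	for _, f := range fs {
		fmt.Printf("[%s] %s:%d %s\n", f.Severity, f.File, f.Line, f.Title)
	}
}
```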

&lt;p&gt;Output is Cards by default, or:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;commitbrief &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--fail-on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;high       &lt;span class="c"&gt;# CI gate: exit 1 on any high+ finding&lt;/span&gt;
commitbrief diff main...feature &lt;span class="nt"&gt;--markdown&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; review.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit codes stay simple: &lt;code&gt;0&lt;/code&gt; for success — including a clean review or a &lt;code&gt;--fail-on&lt;/code&gt; threshold that wasn't breached — and &lt;code&gt;1&lt;/code&gt; for any error or a breached gate.&lt;/p&gt;
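
&lt;p&gt;As a sketch of how a &lt;code&gt;--fail-on&lt;/code&gt; gate could map findings to an exit code: the severity names and their ordering below are assumptions (the text only implies there is at least one level above &lt;code&gt;high&lt;/code&gt;), and &lt;code&gt;exitCode&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```go
package main

import "fmt"

// rank orders severities from least to most serious (assumed names).
var rank = map[string]int{"low": 1, "medium": 2, "high": 3, "critical": 4}

// exitCode returns 1 when any finding is at or above failOn, else 0.
// An empty failOn means no gate; an unknown one is treated as an error.
func exitCode(severities []string, failOn string) int {
	if failOn == "" {
		return 0
	}
	threshold, ok := rank[failOn]
	if !ok {
		return 1
	}
	for _, s := range severities {
		if rank[s] >= threshold {
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(exitCode([]string{"low", "medium"}, "high"))   // gate not breached
	fmt.Println(exitCode([]string{"low", "critical"}, "high")) // gate breached
	fmt.Println(exitCode(nil, "high"))                         // clean review
}
```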

&lt;h2&gt;
  
  
  What it is not
&lt;/h2&gt;

&lt;p&gt;It's the zeroth reviewer, not a replacement for a human one. It catches the obvious-but-easy-to-miss class: injection, missing nil checks, swallowed errors, a guard clause that's now unreachable. It does not catch intent-level design problems, and it won't tell you whether the feature should exist. That conversation stays with your reviewer.&lt;/p&gt;

&lt;p&gt;I won't assert quality at you, either. There's a reproducible eval harness scoring real output against a known-answer corpus — run it yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;COMMITBRIEF_EVAL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;name&amp;gt; make eval-live
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want a second pair of eyes on your diff before anyone else gets one — locally, with the provider you already trust — that's the whole point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;commitbrief setup     &lt;span class="c"&gt;# pick a provider, paste a key, ping it&lt;/span&gt;
commitbrief init      &lt;span class="c"&gt;# optional: write project-specific rules&lt;/span&gt;
git add &lt;span class="nb"&gt;.&lt;/span&gt;
commitbrief &lt;span class="nt"&gt;--staged&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repo and install instructions: &lt;strong&gt;github.com/CommitBrief/commitbrief&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part 1 of &lt;strong&gt;Building CommitBrief&lt;/strong&gt;. Next: how one Go interface fans out to 10 LLMs across three transport classes: native APIs, OpenAI-compatible endpoints, and subprocess-backed CLIs.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>ai</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
