<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Owada Tomohiro</title>
    <description>The latest articles on DEV Community by Owada Tomohiro (@owada_tomohiro_28ec22f5ee).</description>
    <link>https://dev.to/owada_tomohiro_28ec22f5ee</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1995224%2F3ba8d7b7-2314-4844-8100-6dbaa6d7322a.png</url>
      <title>DEV Community: Owada Tomohiro</title>
      <link>https://dev.to/owada_tomohiro_28ec22f5ee</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/owada_tomohiro_28ec22f5ee"/>
    <language>en</language>
    <item>
      <title>wikigen: Auto-generate specification docs from your codebase</title>
      <dc:creator>Owada Tomohiro</dc:creator>
      <pubDate>Sun, 15 Mar 2026 00:26:28 +0000</pubDate>
      <link>https://dev.to/owada_tomohiro_28ec22f5ee/wikigen-auto-generate-specification-docs-from-your-codebase-30c9</link>
      <guid>https://dev.to/owada_tomohiro_28ec22f5ee/wikigen-auto-generate-specification-docs-from-your-codebase-30c9</guid>
      <description>&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Documentation gets written once and goes stale. Engineers don't update it. New team members keep asking "where is this API called from?" and "what's this table for?"&lt;/p&gt;

&lt;p&gt;I wanted documentation that stays in sync with code — generated directly from the source, not written by hand.&lt;/p&gt;

&lt;h2&gt;
  
  
  Existing solutions
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://devin.ai/" rel="noopener noreferrer"&gt;DeepWiki&lt;/a&gt;, from the team behind Devin, generates wiki-style documentation from repositories. The approach (using AI to read code and produce docs) was exactly what I was looking for.&lt;/p&gt;

&lt;p&gt;However, for my use case (batch generation across private repos, CI/CD integration), I needed something I could run from the command line without a UI.&lt;/p&gt;

&lt;p&gt;The open-source &lt;a href="https://github.com/AsyncFuncAI/deepwiki-open" rel="noopener noreferrer"&gt;DeepWiki-Open&lt;/a&gt; lets you self-host, but requires Docker + Ollama (for embedding) + an LLM backend. That's a lot of infrastructure for generating docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The realization
&lt;/h2&gt;

&lt;p&gt;DeepWiki-Open's pipeline is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clone → embedding → RAG search → LLM generates docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But Claude Code can already explore codebases. Give it &lt;code&gt;--add-dir&lt;/code&gt; and it uses Read, Grep, Glob, and Bash to find and read whatever files it needs. No embedding, no vector DB, no RAG.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clone → claude -p --add-dir ./repo → reads code directly → writes docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tomohiro-owada/wikigen" rel="noopener noreferrer"&gt;wikigen&lt;/a&gt; is a single-binary CLI that generates a GitHub Wiki from source code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./wikigen owner/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. It clones the repo, lets Claude Code explore the code, and outputs GitHub Wiki-compatible Markdown files.&lt;/p&gt;

&lt;h3&gt;
  
  
  What gets generated
&lt;/h3&gt;

&lt;p&gt;The document structure follows categories from ISO/IEC 12207 (software life cycle processes), filtered to what's actually derivable from code:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Factual (directly from code):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System overview, architecture, API specifications&lt;/li&gt;
&lt;li&gt;Data models (from migrations, ORM definitions)&lt;/li&gt;
&lt;li&gt;Config, environment variables, build/deploy procedures&lt;/li&gt;
&lt;li&gt;Test structure, auth flows, error handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;High-confidence inference (from code patterns):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processing flows (from function call chains)&lt;/li&gt;
&lt;li&gt;Security design (from middleware, validation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not generated:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Business requirements, risk assessments, SLAs — anything that would be speculation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The prompt explicitly says: "If there's no code evidence, don't write it. Don't even mention that you couldn't find it."&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-repo projects
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight codeowners"&gt;&lt;code&gt;&lt;span class="n"&gt;myproject&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="n"&gt;owner/frontend&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;myproject&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="n"&gt;owner/backend&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;myproject&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="n"&gt;owner/shared&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Multiple repos get merged into one wiki with cross-repository documentation — architecture pages that show how services interact.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parallel generation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./wikigen &lt;span class="nt"&gt;-f&lt;/span&gt; repos.txt &lt;span class="nt"&gt;-p&lt;/span&gt; 2 &lt;span class="nt"&gt;-pp&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;-p 2&lt;/code&gt; processes 2 repos in parallel, and &lt;code&gt;-pp 5&lt;/code&gt; generates 5 pages per repo at the same time.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Actions integration
&lt;/h3&gt;

&lt;p&gt;Wiki auto-updates when you push to main:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Generate wiki&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;CLAUDE_CODE_OAUTH_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;wikigen -lang en -pp 3 -local . my-project&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No clone needed in CI — the checkout action already has the code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error handling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Each page auto-retries up to 3 times&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;./wikigen -retry&lt;/code&gt; regenerates only failed pages&lt;/li&gt;
&lt;li&gt;Pages save immediately — partial results survive interruptions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Output format
&lt;/h2&gt;

&lt;p&gt;GitHub Wiki-compatible. Push directly to &lt;code&gt;{repo}.wiki.git&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wiki-output/project/
  Home.md
  _Sidebar.md
  System-Architecture.md
  API-Specification.md
  Data-Model.md
  ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cross-page links use &lt;code&gt;[Page Title](Page-Filename)&lt;/code&gt; format. &lt;code&gt;_Sidebar.md&lt;/code&gt; provides navigation.&lt;/p&gt;
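
&lt;p&gt;A minimal sketch of that link format, assuming a page filename is the title with spaces replaced by hyphens, as the file names above suggest. &lt;code&gt;pageFilename&lt;/code&gt; and &lt;code&gt;wikiLink&lt;/code&gt; are hypothetical helpers.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// pageFilename turns a page title into a filename stem.
// Assumed rule: trim the title and replace spaces with hyphens.
func pageFilename(title string) string {
	return strings.ReplaceAll(strings.TrimSpace(title), " ", "-")
}

// wikiLink renders a cross-page link in [Page Title](Page-Filename) form.
func wikiLink(title string) string {
	return fmt.Sprintf("[%s](%s)", title, pageFilename(title))
}

func main() {
	fmt.Println(wikiLink("System Architecture"))
	// Prints: [System Architecture](System-Architecture)
}
```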

&lt;p&gt;&lt;strong&gt;See a real example:&lt;/strong&gt; &lt;a href="https://github.com/tomohiro-owada/wikigen/wiki" rel="noopener noreferrer"&gt;github.com/tomohiro-owada/wikigen/wiki&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What went wrong along the way
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;claude -p output had commentary mixed in.&lt;/strong&gt; The generated docs would start with "Sure, I'll create the wiki page for you." Fixed by telling Claude to use the Write tool to save files directly, and having Go read the files instead of stdout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The dialect incident.&lt;/strong&gt; My Claude Code session was configured to respond in Kyoto dialect. The generated documentation came out saying things like "This API accepts POST requests, ya know." Added "formal technical language only, no dialects" to the prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Go 1.22+&lt;/li&gt;
&lt;li&gt;git (SSH or PAT)&lt;/li&gt;
&lt;li&gt;Claude Code CLI, authenticated
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/tomohiro-owada/wikigen.git
&lt;span class="nb"&gt;cd &lt;/span&gt;wikigen
go build &lt;span class="nt"&gt;-o&lt;/span&gt; wikigen &lt;span class="nb"&gt;.&lt;/span&gt;
./wikigen owner/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/tomohiro-owada/wikigen" rel="noopener noreferrer"&gt;github.com/tomohiro-owada/wikigen&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example output:&lt;/strong&gt; &lt;a href="https://github.com/tomohiro-owada/wikigen/wiki" rel="noopener noreferrer"&gt;github.com/tomohiro-owada/wikigen/wiki&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Inspired by &lt;a href="https://github.com/AsyncFuncAI/deepwiki-open" rel="noopener noreferrer"&gt;DeepWiki-Open&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>cli</category>
      <category>documentation</category>
    </item>
    <item>
      <title>I Rewrote Google's Gemini CLI in Go - 68x Faster Startup</title>
      <dc:creator>Owada Tomohiro</dc:creator>
      <pubDate>Sat, 24 Jan 2026 08:50:45 +0000</pubDate>
      <link>https://dev.to/owada_tomohiro_28ec22f5ee/i-rewrote-googles-gemini-cli-in-go-68x-faster-startup-30em</link>
      <guid>https://dev.to/owada_tomohiro_28ec22f5ee/i-rewrote-googles-gemini-cli-in-go-68x-faster-startup-30em</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Google's official Gemini CLI has ~1 second of Node.js startup overhead&lt;/li&gt;
&lt;li&gt;I rewrote it in Go → startup is now ~0.014 seconds (68x faster)&lt;/li&gt;
&lt;li&gt;Reuses auth from official CLI, so your free tier / Workspace quota just works&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/tomohiro-owada/gmn" rel="noopener noreferrer"&gt;https://github.com/tomohiro-owada/gmn&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Built This
&lt;/h2&gt;

&lt;p&gt;Google's official Gemini CLI is an amazing tool. Rich TUI, seamless Google authentication, excellent MCP support. I loved using it.&lt;/p&gt;

&lt;p&gt;But there was one issue for my use case: &lt;strong&gt;startup time&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;time &lt;/span&gt;gemini &lt;span class="nt"&gt;--version&lt;/span&gt;
0.22.2
gemini &lt;span class="nt"&gt;--version&lt;/span&gt;  1.00s user 0.24s system 129% cpu 0.951 total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;~1 second just to start. That's the Node.js runtime overhead. Fine for interactive use, but painful when you're calling it repeatedly in shell scripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: gmn
&lt;/h2&gt;

&lt;p&gt;So I rewrote the core functionality in Go. The result is &lt;strong&gt;gmn&lt;/strong&gt; (short for gemini-mini):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;time &lt;/span&gt;gmn &lt;span class="nt"&gt;--version&lt;/span&gt;
gmn version 0.2.0
gmn &lt;span class="nt"&gt;--version&lt;/span&gt;  0.00s user 0.00s system 47% cpu 0.014 total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;0.014 seconds. 68x faster.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmarks
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;gmn&lt;/th&gt;
&lt;th&gt;Official CLI&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Startup&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.014s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.951s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;68x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary&lt;/td&gt;
&lt;td&gt;5.6MB&lt;/td&gt;
&lt;td&gt;~200MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;35x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Node.js&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With API response time included:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;time &lt;/span&gt;gmn &lt;span class="s2"&gt;"hi"&lt;/span&gt;
Hello! How can I &lt;span class="nb"&gt;help &lt;/span&gt;you today?
gmn &lt;span class="s2"&gt;"hi"&lt;/span&gt;  0.01s user 0.02s system 0% cpu 3.205 total

&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;time &lt;/span&gt;gemini &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"hi"&lt;/span&gt;
I&lt;span class="s1"&gt;'m ready to help. What would you like to do?
gemini -p "hi"  2.13s user 0.53s system 24% cpu 10.933 total
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites (Important!)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;gmn doesn't have its own authentication.&lt;/strong&gt; You must authenticate once with the official Gemini CLI first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @google/gemini-cli
gemini  &lt;span class="c"&gt;# Choose "Login with Google"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;gmn reuses credentials from &lt;code&gt;~/.gemini/&lt;/code&gt;. Your free tier quota or Workspace Code Assist quota applies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install gmn
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Homebrew:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;tomohiro-owada/tap/gmn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Go:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/tomohiro-owada/gmn@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Binary:&lt;/strong&gt;&lt;br&gt;
Download from &lt;a href="https://github.com/tomohiro-owada/gmn/releases" rel="noopener noreferrer"&gt;Releases&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Usage
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Simple prompt&lt;/span&gt;
gmn &lt;span class="s2"&gt;"Explain quantum computing"&lt;/span&gt;

&lt;span class="c"&gt;# With file context&lt;/span&gt;
gmn &lt;span class="s2"&gt;"Review this code"&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; main.go

&lt;span class="c"&gt;# Pipe input&lt;/span&gt;
&lt;span class="nb"&gt;cat &lt;/span&gt;error.log | gmn &lt;span class="s2"&gt;"What's wrong?"&lt;/span&gt;

&lt;span class="c"&gt;# JSON output&lt;/span&gt;
gmn &lt;span class="s2"&gt;"List 3 colors"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; json

&lt;span class="c"&gt;# Different model&lt;/span&gt;
gmn &lt;span class="s2"&gt;"Write a poem"&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; gemini-2.5-pro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Discovering the API
&lt;/h3&gt;

&lt;p&gt;Initially, I tried using &lt;code&gt;generativelanguage.googleapis.com&lt;/code&gt; (the public Gemini API), but got 403 errors due to OAuth scope mismatch.&lt;/p&gt;

&lt;p&gt;Reading the official CLI source code, I discovered it actually uses the &lt;strong&gt;Code Assist API&lt;/strong&gt; (&lt;code&gt;cloudcode-pa.googleapis.com&lt;/code&gt;). This is an internal Google Cloud API, not publicly documented.&lt;/p&gt;
&lt;h3&gt;
  
  
  Auth Reuse
&lt;/h3&gt;

&lt;p&gt;The official CLI stores OAuth tokens in &lt;code&gt;~/.gemini/oauth_creds.json&lt;/code&gt;. gmn reads this file and refreshes tokens when needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsExpired&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authMgr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RefreshToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  MCP Support
&lt;/h3&gt;

&lt;p&gt;gmn also supports MCP (Model Context Protocol). It reads the same &lt;code&gt;~/.gemini/settings.json&lt;/code&gt; config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gmn mcp list
gmn mcp call my-server tool-name &lt;span class="nv"&gt;arg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;value
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's NOT Included
&lt;/h2&gt;

&lt;p&gt;gmn focuses on non-interactive use, so the following are out of scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interactive/TUI mode → use the official CLI&lt;/li&gt;
&lt;li&gt;OAuth flow → authenticate with the official CLI first&lt;/li&gt;
&lt;li&gt;API key / Vertex AI auth → not supported&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is a love letter to Google's official Gemini CLI. I just needed something faster for scripting.&lt;/p&gt;

&lt;p&gt;If you use Gemini in shell scripts or automation, give gmn a try:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;tomohiro-owada/tap/gmn
gmn &lt;span class="s2"&gt;"Hello, World!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/tomohiro-owada/gmn" rel="noopener noreferrer"&gt;https://github.com/tomohiro-owada/gmn&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Acknowledgments
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/google-gemini/gemini-cli" rel="noopener noreferrer"&gt;Google Gemini CLI&lt;/a&gt; — The incredible original&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ai.google.dev/" rel="noopener noreferrer"&gt;Google Gemini API&lt;/a&gt; — The underlying API&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cli</category>
      <category>gemini</category>
      <category>go</category>
      <category>performance</category>
    </item>
    <item>
      <title>Introducing Free RAG for Claude Code — Save Tokens &amp; Time</title>
      <dc:creator>Owada Tomohiro</dc:creator>
      <pubDate>Sat, 25 Oct 2025 01:09:22 +0000</pubDate>
      <link>https://dev.to/owada_tomohiro_28ec22f5ee/introducing-free-rag-for-claude-code-save-tokens-time-2mi9</link>
      <guid>https://dev.to/owada_tomohiro_28ec22f5ee/introducing-free-rag-for-claude-code-save-tokens-time-2mi9</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Tired of feeding docs to Claude Code every single time?&lt;/p&gt;

&lt;p&gt;With a locally running, free RAG tool (&lt;strong&gt;DevRag&lt;/strong&gt;), Claude Code can find the right documents for you via vector search. You no longer need to remember hundreds of filenames or locations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Completely free: no API, entirely local&lt;/li&gt;
&lt;li&gt;Simple setup: ~5 minutes&lt;/li&gt;
&lt;li&gt;Fast: in the comparison later in this post, token usage drops to 1/40 and responses are 15× faster&lt;/li&gt;
&lt;li&gt;Repository: &lt;a href="https://github.com/tomohiro-owada/devrag" rel="noopener noreferrer"&gt;https://github.com/tomohiro-owada/devrag&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Problems When Letting Claude Code Read Documents Directly
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Wasting context
&lt;/h3&gt;

&lt;p&gt;Claude Code’s context window is limited.&lt;/p&gt;

&lt;p&gt;Every time you have it read an entire document, you burn through a huge amount of tokens.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You: “Check the project’s API authentication scheme.”&lt;/li&gt;
&lt;li&gt;Claude reads &lt;code&gt;docs/auth.md&lt;/code&gt; (3,000 tokens)&lt;/li&gt;
&lt;li&gt;Claude: “We use JWT-based authentication.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those 3,000 tokens are now gone from your context budget.&lt;br&gt;&lt;br&gt;
Ask something else later → it reads the whole thing again.&lt;/p&gt;


&lt;h3&gt;
  
  
  2. It’s hard to know which file to look at
&lt;/h3&gt;

&lt;p&gt;As docs accumulate, &lt;strong&gt;you&lt;/strong&gt; don’t know where things are — and neither does Claude.&lt;/p&gt;

&lt;p&gt;You: “Tell me about our Redis caching strategy.”&lt;br&gt;
Claude tries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;docs/architecture.md&lt;/code&gt; (4,000 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/caching.md&lt;/code&gt; (2,000 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/redis.md&lt;/code&gt; (doesn’t exist)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But maybe you only needed 200 tokens of &lt;code&gt;docs/caching.md&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In a project with 10–100 documents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You don’t know where others documented things&lt;/li&gt;
&lt;li&gt;You can’t predict filenames&lt;/li&gt;
&lt;li&gt;Asking “Where did we write that again?” becomes daily routine&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  3. Repeated documentation reading
&lt;/h3&gt;

&lt;p&gt;You often refer to the same docs:&lt;/p&gt;

&lt;p&gt;Session 1 → &lt;code&gt;docs/auth.md&lt;/code&gt; (3,000 tokens)&lt;br&gt;&lt;br&gt;
Session 2 → again (3,000 tokens)&lt;br&gt;&lt;br&gt;
Session 3 → again (3,000 tokens)&lt;/p&gt;

&lt;p&gt;Same file, three times, 9,000 tokens.&lt;/p&gt;

&lt;p&gt;The file is read from the beginning every time, even if you only need a tiny piece.&lt;/p&gt;


&lt;h2&gt;
  
  
  RAG Solves All of These at Once
&lt;/h2&gt;
&lt;h3&gt;
  
  
  How RAG Works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Once&lt;/strong&gt; at the beginning: vectorize documents and index them&lt;/li&gt;
&lt;li&gt;At query time: retrieve only relevant chunks&lt;/li&gt;
&lt;li&gt;Claude reads only the necessary parts&lt;/li&gt;
&lt;/ol&gt;
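
&lt;p&gt;The retrieval step can be illustrated with a toy example: rank pre-computed chunk vectors by cosine similarity to the query vector and keep the top few. This is a generic sketch, not DevRag's implementation. The three-dimensional vectors are made up, whereas a real system would get them from an embedding model.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// chunk is one indexed piece of a document with its (made-up) embedding.
type chunk struct {
	Source string
	Vec    []float64
}

// cosine returns the cosine similarity of two vectors of equal length.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the k chunks most similar to the query, best first.
func topK(query []float64, index []chunk, k int) []chunk {
	ranked := append([]chunk(nil), index...) // copy so the index is not reordered
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vec) > cosine(query, ranked[j].Vec)
	})
	if len(ranked) > k {
		ranked = ranked[:k]
	}
	return ranked
}

func main() {
	index := []chunk{
		{Source: "docs/auth.md", Vec: []float64{0, 1, 0}},
		{Source: "docs/caching.md", Vec: []float64{0.9, 0.1, 0}},
		{Source: "docs/setup.md", Vec: []float64{0.1, 0.1, 0.9}},
	}
	query := []float64{1, 0, 0} // stands in for an embedded question about caching
	for _, c := range topK(query, index, 1) {
		fmt.Println(c.Source)
	}
}
```

&lt;p&gt;Only the chunks returned by &lt;code&gt;topK&lt;/code&gt; would be handed to the model, which is what keeps the token count low.&lt;/p&gt;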

&lt;p&gt;&lt;strong&gt;Traditional:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question → Read whole document (3,000 tokens) → Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;With RAG:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question → Vector search relevant part (200 tokens) → Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This cuts token usage significantly and improves the signal-to-noise ratio of what Claude reads.&lt;/p&gt;

&lt;p&gt;The biggest benefit:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Claude Code can find what you need even if you don’t know filenames.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  DevRag — A Simplified RAG for Claude Code
&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;DevRag&lt;/strong&gt; to make context retrieval simpler and faster for Claude Code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Single binary: no external DB, no Python&lt;/li&gt;
&lt;li&gt;Auto model download on first run&lt;/li&gt;
&lt;li&gt;MCP integration as a &lt;code&gt;search&lt;/code&gt; tool&lt;/li&gt;
&lt;li&gt;Fast: startup ~2 s, search &amp;lt;100 ms&lt;/li&gt;
&lt;li&gt;Multilingual support (JP/EN)&lt;/li&gt;
&lt;li&gt;No vendor lock-in&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Setup (~5 minutes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Download binary
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS (Apple Silicon)&lt;/span&gt;
curl -LO https://github.com/tomohiro-owada/devrag/releases/latest/download/devrag-macos-apple-silicon.tar.gz
&lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xzf&lt;/span&gt; devrag-macos-apple-silicon.tar.gz
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x devrag-macos-apple-silicon
&lt;span class="nb"&gt;sudo mv &lt;/span&gt;devrag-macos-apple-silicon /usr/local/bin/devrag
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Configure Claude Code
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;~/.claude.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"devrag"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/usr/local/bin/devrag"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Add some documents
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;documents
&lt;span class="nb"&gt;cp &lt;/span&gt;your-notes.md documents/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DevRag indexes the documents automatically when it launches.&lt;/p&gt;




&lt;h2&gt;
  
  
  Actual Usage Comparison
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before (No RAG)
&lt;/h3&gt;

&lt;p&gt;You: “What’s our DB migration method?”&lt;/p&gt;

&lt;p&gt;Claude reads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;README.md&lt;/code&gt; (5,000 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/database.md&lt;/code&gt; (4,000 tokens)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/setup.md&lt;/code&gt; (3,000 tokens)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;→ &lt;strong&gt;12,000 tokens&lt;/strong&gt;, ~30 seconds&lt;/p&gt;

&lt;p&gt;All of that because Claude had to guess filenames.&lt;/p&gt;




&lt;h3&gt;
  
  
  After (With DevRag)
&lt;/h3&gt;

&lt;p&gt;You: “What’s our DB migration method?”&lt;/p&gt;

&lt;p&gt;Claude:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs vector search&lt;/li&gt;
&lt;li&gt;Finds relevant 300-token snippet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude:&lt;br&gt;&lt;br&gt;
“Run &lt;code&gt;npm run migrate&lt;/code&gt;. For details see &lt;code&gt;docs/database.md:42&lt;/code&gt;.”&lt;/p&gt;

&lt;p&gt;→ &lt;strong&gt;300 tokens&lt;/strong&gt;, ~2 seconds&lt;/p&gt;
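
&lt;p&gt;The headline ratios follow directly from this before/after pair. A short sketch of the arithmetic, using only the numbers quoted in the two scenarios (&lt;code&gt;reduction&lt;/code&gt; is an illustrative helper):&lt;/p&gt;

```go
package main

import "fmt"

// reduction returns how many times smaller "after" is than "before".
func reduction(before, after int) int {
	return before / after
}

func main() {
	tokensBefore := 5000 + 4000 + 3000 // README.md + docs/database.md + docs/setup.md
	tokensAfter := 300                 // one retrieved snippet
	fmt.Printf("tokens: %d to %d (1/%d)\n",
		tokensBefore, tokensAfter, reduction(tokensBefore, tokensAfter))
	fmt.Printf("time: 30s to 2s (%dx faster)\n", reduction(30, 2))
}
```

&lt;p&gt;These ratios describe this one example, so the savings on another project will depend on document sizes and how well the search narrows things down.&lt;/p&gt;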




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Directly reading documents means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ Token waste&lt;/li&gt;
&lt;li&gt;❌ Hard to find the right file&lt;/li&gt;
&lt;li&gt;❌ Repeat full-reads every session&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RAG means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Token usage cut to 1/40&lt;/li&gt;
&lt;li&gt;✅ Responses 15× faster&lt;/li&gt;
&lt;li&gt;✅ Filename knowledge not required&lt;/li&gt;
&lt;li&gt;✅ Setup in ~5 minutes&lt;/li&gt;
&lt;li&gt;✅ Entirely local and free&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let Claude Code retrieve what you need automatically using vector search.&lt;/p&gt;




&lt;h2&gt;
  
  
  Repository
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/tomohiro-owada/devrag" rel="noopener noreferrer"&gt;https://github.com/tomohiro-owada/devrag&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;License: MIT&lt;br&gt;&lt;br&gt;
Feedback: via Issues&lt;br&gt;&lt;br&gt;
Try it out! 🚀&lt;/p&gt;

</description>
      <category>tooling</category>
      <category>llm</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
