<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AlexMikhalev</title>
    <description>The latest articles on DEV Community by AlexMikhalev (@alexmikhalev).</description>
    <link>https://dev.to/alexmikhalev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F615498%2F1d6840a1-b457-4a78-a93a-1fa431bfc53c.png</url>
      <title>DEV Community: AlexMikhalev</title>
      <link>https://dev.to/alexmikhalev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexmikhalev"/>
    <language>en</language>
    <item>
      <title>Plug Terraphim Search into Claude Code and opencode (CLI First, MCP When You Need It)</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Sat, 18 Apr 2026 08:53:07 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/plug-terraphim-search-into-claude-code-and-opencode-cli-first-mcp-when-you-need-it-4404</link>
      <guid>https://dev.to/alexmikhalev/plug-terraphim-search-into-claude-code-and-opencode-cli-first-mcp-when-you-need-it-4404</guid>
      <description>&lt;p&gt;Your AI coding agent already has a knowledge graph. It is just not yours yet. The model knows GitHub, Stack Overflow, and the public training corpus -- it has no idea that in your project &lt;code&gt;npm&lt;/code&gt; should be &lt;code&gt;bun&lt;/code&gt;, that &lt;code&gt;RFP&lt;/code&gt; is shorthand for &lt;code&gt;acquisition need&lt;/code&gt;, or that the email about the Stripe receipt for the Obsidian licence lives in your Fastmail mailbox. This post shows the smallest path to fixing that for both &lt;a href="https://claude.com/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; and &lt;a href="https://opencode.ai" rel="noopener noreferrer"&gt;opencode&lt;/a&gt;, using &lt;a href="https://terraphim.ai" rel="noopener noreferrer"&gt;Terraphim&lt;/a&gt; and the three roles we have published over the last week (Terraphim Engineer, &lt;a href="https://terraphim.ai/posts/personal-assistant-role-jmap-obsidian/" rel="noopener noreferrer"&gt;Personal Assistant&lt;/a&gt;, &lt;a href="https://terraphim.ai/posts/system-operator-logseq-knowledge-graph/" rel="noopener noreferrer"&gt;System Operator&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Two paths. CLI first.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "integrate" means here
&lt;/h2&gt;

&lt;p&gt;The host -- Claude Code or opencode -- needs a way to ask your role-aware Terraphim setup a question and get back ranked, source-attributed results. The model decides when to ask. The role decides which haystacks to search. Terraphim's &lt;code&gt;terraphim-graph&lt;/code&gt; ranker decides which results come back first.&lt;/p&gt;

&lt;p&gt;Concrete example. You are working in opencode and you type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/tsearch "System Operator" RFP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The slash command runs against the System Operator role. The role's knowledge graph normalises &lt;code&gt;RFP&lt;/code&gt; to its INCOSE-canonical form &lt;code&gt;acquisition need&lt;/code&gt;. The Aho-Corasick matcher walks the role's haystack (1,347 Logseq pages from the &lt;a href="https://github.com/terraphim/system-operator" rel="noopener noreferrer"&gt;terraphim/system-operator&lt;/a&gt; repository). The top hit comes back ranked 13 -- &lt;code&gt;Acquisition need.md&lt;/code&gt; -- with the &lt;code&gt;synonyms::&lt;/code&gt; line that mapped your query to it visible in the snippet. The model now has the right page in its context window and can answer your follow-up without a hallucinated INCOSE handbook reference.&lt;/p&gt;

&lt;p&gt;This works in both hosts because both speak the same two integration languages: shell-out slash commands and MCP servers. We are going to use both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Path A -- CLI via slash command
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why this is the recommended starting point
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;terraphim-agent&lt;/code&gt; already exists, takes &lt;code&gt;--role&lt;/code&gt; and &lt;code&gt;--limit&lt;/code&gt;, and writes ranked results to stdout. There is nothing to build. Both Claude Code and opencode let slash commands shell out via Bash. So a two-line command file is the entire integration.&lt;/p&gt;

&lt;h3&gt;
  
  
  One file, two hosts
&lt;/h3&gt;

&lt;p&gt;Drop this at &lt;code&gt;~/.claude/commands/tsearch.md&lt;/code&gt; (and an identical copy at &lt;code&gt;~/.config/opencode/command/tsearch.md&lt;/code&gt; -- both hosts read the same frontmatter shape):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Terraphim&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;across&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;configured&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;roles.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Usage:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/tsearch&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;[role]&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;query&amp;gt;"&lt;/span&gt;
&lt;span class="na"&gt;allowed-tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bash(terraphim-agent search:*), Bash(terraphim-agent-pa search:*)&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
Run &lt;span class="sb"&gt;`terraphim-agent search --role "&amp;lt;role&amp;gt;" --limit 5 "&amp;lt;query&amp;gt;"`&lt;/span&gt; (or
&lt;span class="sb"&gt;`terraphim-agent-pa search ...`&lt;/span&gt; if the role is "Personal Assistant" and
the query needs the JMAP haystack). Return the top results as a numbered
list with title, source path/URL, and a 120-char snippet.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is it. The &lt;code&gt;allowed-tools&lt;/code&gt; line auto-approves the two CLI invocations so the model does not have to ask permission per call. Restart the host (or reload commands) and &lt;code&gt;/tsearch&lt;/code&gt; is live.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it is fast enough
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;terraphim-agent&lt;/code&gt; reads its persisted role state at start (a few milliseconds), runs the query against the role's haystacks, and returns. For a typical knowledge-graph query against the Terraphim Engineer role on a laptop, the round trip from slash command to formatted output is well under a second. The agent already has the typed CLI -- &lt;code&gt;--role&lt;/code&gt;, &lt;code&gt;--limit&lt;/code&gt;, &lt;code&gt;--format json&lt;/code&gt; -- so there is nothing the MCP layer adds for the search-only flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three example queries
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/tsearch "Terraphim Engineer" rolegraph
/tsearch "System Operator" RFP
/tsearch "Personal Assistant" invoice    # uses terraphim-agent-pa wrapper for JMAP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Personal Assistant case is the most interesting because it crosses surfaces -- Obsidian notes interleave with &lt;code&gt;jmap:///email/&amp;lt;id&amp;gt;&lt;/code&gt; URLs from your Fastmail mailbox, ranked by the same &lt;code&gt;terraphim-graph&lt;/code&gt; scoring. The wrapper script injects &lt;code&gt;JMAP_ACCESS_TOKEN&lt;/code&gt; from 1Password at call time so the secret never lands on disk; the bare &lt;code&gt;terraphim-agent&lt;/code&gt; continues to work for the other five roles without paying the unlock cost.&lt;/p&gt;
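
&lt;p&gt;The wrapper itself is not reproduced in this post. A minimal sketch, assuming your token lives at a hypothetical &lt;code&gt;op://&lt;/code&gt; secret reference (&lt;code&gt;op run&lt;/code&gt; resolves &lt;code&gt;op://&lt;/code&gt; values in the environment at call time):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash
# ~/bin/terraphim-agent-pa -- sketch, not the published wrapper.
# The op:// path is illustrative; point it at your own 1Password item.
export JMAP_ACCESS_TOKEN="op://Personal/Fastmail/jmap-token"
exec op run -- terraphim-agent "$@"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;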

&lt;h2&gt;
  
  
  Path B -- MCP server (when you want typed tools)
&lt;/h2&gt;

&lt;p&gt;The CLI path is enough for search. If you want the model to call &lt;code&gt;search&lt;/code&gt; as a first-class tool with structured JSON parameters -- alongside &lt;code&gt;autocomplete_terms&lt;/code&gt;, &lt;code&gt;autocomplete_with_snippets&lt;/code&gt;, four flavours of fuzzy autocomplete, &lt;code&gt;build_autocomplete_index&lt;/code&gt;, and &lt;code&gt;update_config_tool&lt;/code&gt; -- that is what &lt;code&gt;terraphim_mcp_server&lt;/code&gt; exposes. It reads the same &lt;code&gt;~/.config/terraphim/embedded_config.json&lt;/code&gt;, so the role list is identical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Build and install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/terraphim/terraphim-ai
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; terraphim_mcp_server &lt;span class="nt"&gt;--features&lt;/span&gt; jmap
&lt;span class="nb"&gt;cp &lt;/span&gt;target/release/terraphim_mcp_server ~/.cargo/bin/terraphim_mcp_server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the Personal Assistant role, mirror the existing &lt;code&gt;terraphim-agent-pa&lt;/code&gt; wrapper at &lt;code&gt;~/bin/terraphim_mcp_server-pa&lt;/code&gt; so the JMAP token flows through &lt;code&gt;op run&lt;/code&gt; instead of being baked into config.&lt;/p&gt;

&lt;h3&gt;
  
  
  Register
&lt;/h3&gt;

&lt;p&gt;opencode -- add to &lt;code&gt;~/.config/opencode/opencode.json&lt;/code&gt; under &lt;code&gt;mcp&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="nl"&gt;"terraphim"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/Users/alex/.cargo/bin/terraphim_mcp_server"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"terraphim-pa"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/Users/alex/bin/terraphim_mcp_server-pa"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code -- one shell command per server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add terraphim    /Users/alex/.cargo/bin/terraphim_mcp_server
claude mcp add terraphim-pa /Users/alex/bin/terraphim_mcp_server-pa
claude mcp list      &lt;span class="c"&gt;# both should show as Connected&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model now sees &lt;code&gt;mcp__terraphim__search&lt;/code&gt; and &lt;code&gt;mcp__terraphim_pa__search&lt;/code&gt; (plus the autocomplete tools) in its tool list.&lt;/p&gt;
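
&lt;p&gt;For illustration, here is a typed call the model might emit against that tool surface -- the argument names are assumed from the CLI flags, not taken from the server's published schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ "tool": "mcp__terraphim__search",
  "arguments": { "query": "rolegraph", "role": "Terraphim Engineer", "limit": 5 } }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;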

&lt;h2&gt;
  
  
  SessionStart primer (both paths)
&lt;/h2&gt;

&lt;p&gt;Slash commands and MCP tools are useless if the model does not know the roles exist. Extend the SessionStart hook in &lt;code&gt;~/.claude/settings.json&lt;/code&gt; to print a one-screen role index when each session starts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'\n--- Terraphim search via /tsearch [role] &amp;lt;query&amp;gt; ---\n'&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'  Terraphim Engineer  (Rust/agent KG)\n'&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'  Personal Assistant  (Obsidian + Fastmail JMAP, use terraphim-agent-pa for email)\n'&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'  System Operator     (INCOSE/MBSE Logseq KG)\n'&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'  Context Engineering Author, Rust Engineer, Default\n'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
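
&lt;p&gt;To register this on the Claude Code side, save the snippet as an executable script (the path below is illustrative) and point a &lt;code&gt;SessionStart&lt;/code&gt; hook at it in &lt;code&gt;~/.claude/settings.json&lt;/code&gt; -- a sketch of the hook shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "hooks": {
    "SessionStart": [
      { "hooks": [ { "type": "command", "command": "~/.claude/hooks/terraphim-roles.sh" } ] }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;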



&lt;p&gt;Equivalent hook in opencode. Cost: one screen of context per session. Benefit: the model picks the right role on the first try instead of guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to pick which path
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;CLI (Path A)&lt;/th&gt;
&lt;th&gt;MCP (Path B)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;New binaries&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;terraphim_mcp_server&lt;/code&gt; plus wrapper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-call latency&lt;/td&gt;
&lt;td&gt;~50-200 ms per call&lt;/td&gt;
&lt;td&gt;~10-50 ms per call (long-lived process)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tools exposed&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;search&lt;/code&gt; only&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;search&lt;/code&gt; + 4 autocomplete + &lt;code&gt;build_autocomplete_index&lt;/code&gt; + &lt;code&gt;update_config_tool&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works in any host&lt;/td&gt;
&lt;td&gt;Yes -- anything that runs a slash command&lt;/td&gt;
&lt;td&gt;Only hosts that speak MCP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token handling&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;terraphim-agent-pa&lt;/code&gt; wrapper&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;terraphim_mcp_server-pa&lt;/code&gt; wrapper&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For the search-across-roles flow, CLI is enough. Add MCP when the model needs autocomplete-as-you-type, when you want it to manage role configuration without leaving the conversation, or when you are using a host where the typed-tool surface matters more than the cold-start cost.&lt;/p&gt;

&lt;p&gt;You do not have to choose. Wire both. The slash command above defaults to CLI and falls back to MCP if the binary is missing -- the two paths coexist cleanly because they read the same role config.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is the right shape
&lt;/h2&gt;

&lt;p&gt;Most "AI assistant + knowledge base" integrations end up tightly coupled to a specific host. Vendor X's plugin marketplace, Vendor Y's tool format. Terraphim takes the opposite stance: the role configuration lives in your filesystem, the haystacks live in your filesystem (or your mailbox), the ranker runs in a process you own, and the integration with the AI host is the thinnest possible shim -- a slash command or an MCP server, both of which are commodity surfaces.&lt;/p&gt;

&lt;p&gt;Yesterday the Personal Assistant role was a private setup on one laptop. Today it is callable from inside two different AI coding hosts via a one-file slash command. Tomorrow you can add Cursor or Aider with the same two-line wrapper because the integration surface is &lt;code&gt;terraphim-agent search&lt;/code&gt;, not &lt;code&gt;vendor-specific-tool-protocol-v3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The expensive part of context engineering is not the ranker. It is the vocabulary in the knowledge graph and the haystacks the role can reach. The integration layer should not be allowed to compete for that budget. CLI-first keeps it small.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build (or install the published crate when JMAP feature lands on crates.io)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/terraphim/terraphim-ai
cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--path&lt;/span&gt; crates/terraphim_agent &lt;span class="nt"&gt;--features&lt;/span&gt; jmap

&lt;span class="c"&gt;# Configure roles -- copy the snippets from the how-tos linked below&lt;/span&gt;
&lt;span class="nv"&gt;$EDITOR&lt;/span&gt; ~/.config/terraphim/embedded_config.json

&lt;span class="c"&gt;# Install the slash command&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/.claude/commands ~/.config/opencode/command
&lt;span class="nb"&gt;cp&lt;/span&gt; ~/projects/terraphim/terraphim-ai/docs/src/howto/mcp-integration-claude-opencode.md &lt;span class="se"&gt;\&lt;/span&gt;
   /tmp/tsearch.md  &lt;span class="c"&gt;# adapt to your slash command file shape&lt;/span&gt;

&lt;span class="c"&gt;# Reload roles&lt;/span&gt;
terraphim-agent config reload

&lt;span class="c"&gt;# Try&lt;/span&gt;
terraphim-agent search &lt;span class="nt"&gt;--role&lt;/span&gt; &lt;span class="s2"&gt;"Terraphim Engineer"&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 3 &lt;span class="s2"&gt;"rolegraph"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step-by-step in the docs: &lt;a href="https://docs.terraphim.ai/howto/mcp-integration-claude-opencode.html" rel="noopener noreferrer"&gt;Plug Terraphim Search into Claude Code and opencode&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For the underlying engine, start with &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt;. For the two roles this integration most cleanly exposes, see &lt;a href="https://terraphim.ai/posts/personal-assistant-role-jmap-obsidian/" rel="noopener noreferrer"&gt;Personal Assistant&lt;/a&gt; and &lt;a href="https://terraphim.ai/posts/system-operator-logseq-knowledge-graph/" rel="noopener noreferrer"&gt;System Operator&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>terraphim</category>
      <category>claudecode</category>
      <category>opencode</category>
      <category>mcp</category>
    </item>
    <item>
      <title>System Operator Demo: A Logseq Knowledge Graph Drives Enterprise MBSE Search</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Fri, 17 Apr 2026 18:31:00 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/system-operator-demo-a-logseq-knowledge-graph-drives-enterprise-mbse-search-1kb</link>
      <guid>https://dev.to/alexmikhalev/system-operator-demo-a-logseq-knowledge-graph-drives-enterprise-mbse-search-1kb</guid>
      <description>&lt;p&gt;Terraphim's &lt;a href="https://github.com/terraphim/system-operator" rel="noopener noreferrer"&gt;System Operator role&lt;/a&gt; is the demo we point people at when they want to see a real Logseq knowledge graph drive search. 1,347 Logseq pages, 52 of them carrying explicit &lt;code&gt;synonyms::&lt;/code&gt; lines, covering Model-Based Systems Engineering vocabulary -- requirements, architecture, verification, validation, life cycle concepts. This post walks the demo end-to-end and shows the piece people miss: the KG is doing real work, not just re-ranking text matches.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the demo is
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;terraphim/system-operator&lt;/code&gt; repository on GitHub is a &lt;strong&gt;Logseq vault&lt;/strong&gt; -- flat folder of markdown files under &lt;code&gt;pages/&lt;/code&gt;, one page per concept, with Logseq's bullet-tree syntax for structure and Terraphim-format &lt;code&gt;synonyms::&lt;/code&gt; lines for the knowledge-graph layer. Two things make it a useful demo rather than a toy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Real MBSE vocabulary.&lt;/strong&gt; The synonyms are not invented; they track the INCOSE Systems Engineering Handbook v4, the V-Model, and SEMP conventions. When you type &lt;code&gt;RFP&lt;/code&gt;, the automaton normalises it to &lt;code&gt;acquisition need&lt;/code&gt; because that is what the handbook calls it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real scale.&lt;/strong&gt; 1,347 markdown files is enough to expose cold-start behaviour (~5-10 seconds to index on a laptop) without being so large it obscures the ranking signal.&lt;/li&gt;
&lt;/ol&gt;
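
&lt;p&gt;The mechanism is a one-line Logseq page property. A hypothetical excerpt in that format (the synonym list on the real &lt;code&gt;Acquisition need.md&lt;/code&gt; page may differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Acquisition need
  synonyms:: RFP, request for proposal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;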

&lt;h2&gt;
  
  
  Run it
&lt;/h2&gt;

&lt;p&gt;There is an automated setup script in the repo. As of today it clones to a durable path under &lt;code&gt;~/.config/terraphim/system_operator&lt;/code&gt; instead of &lt;code&gt;/tmp&lt;/code&gt;, so the vault survives a reboot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./scripts/setup_system_operator.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then either drive the role via the server --&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo run &lt;span class="nt"&gt;--bin&lt;/span&gt; terraphim_server &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; terraphim_server/default/system_operator_config.json
curl &lt;span class="s2"&gt;"http://127.0.0.1:8000/documents/search?q=RFP&amp;amp;role=System%20Operator&amp;amp;limit=5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;-- or via the &lt;code&gt;terraphim-agent&lt;/code&gt; CLI after adding the role entry to &lt;code&gt;~/.config/terraphim/embedded_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraphim-agent config reload
terraphim-agent search &lt;span class="nt"&gt;--role&lt;/span&gt; &lt;span class="s2"&gt;"System Operator"&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 5 &lt;span class="s2"&gt;"RFP"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full config snippet and the &lt;code&gt;embedded_config.json&lt;/code&gt; entry are in the &lt;a href="https://github.com/terraphim/terraphim-ai/blob/main/terraphim_server/README_SYSTEM_OPERATOR.md" rel="noopener noreferrer"&gt;&lt;code&gt;README_SYSTEM_OPERATOR.md&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The piece people miss
&lt;/h2&gt;

&lt;p&gt;Search-over-notes tools usually describe ranking in terms of "it uses a knowledge graph". That sentence hides a lot. Is the graph actually consulted at query time? Is it just a post-hoc re-ranker on top of BM25? Does it expand synonyms? On what vocabulary?&lt;/p&gt;

&lt;p&gt;Terraphim exposes the answer directly. &lt;code&gt;validate --connectivity&lt;/code&gt; prints which words in your query the automaton matched and what canonical terms they normalised to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;terraphim-agent validate &lt;span class="nt"&gt;--role&lt;/span&gt; &lt;span class="s2"&gt;"System Operator"&lt;/span&gt; &lt;span class="nt"&gt;--connectivity&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"RFP business analysis life cycle model business requirements documentation tree"&lt;/span&gt;

Connectivity Check &lt;span class="k"&gt;for &lt;/span&gt;role &lt;span class="s1"&gt;'System Operator'&lt;/span&gt;:
  Connected: &lt;span class="nb"&gt;false
  &lt;/span&gt;Matched terms: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"acquisition need"&lt;/span&gt;, &lt;span class="s2"&gt;"business or mission analysis"&lt;/span&gt;,
                  &lt;span class="s2"&gt;"business requirements"&lt;/span&gt;, &lt;span class="s2"&gt;"documentation tree"&lt;/span&gt;,
                  &lt;span class="s2"&gt;"life cycle concepts"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five query fragments, five canonical matches. &lt;code&gt;RFP&lt;/code&gt; collapsed to &lt;code&gt;acquisition need&lt;/code&gt; (its synonym, from &lt;code&gt;Acquisition need.md&lt;/code&gt; in the vault). &lt;code&gt;business analysis&lt;/code&gt; collapsed to &lt;code&gt;business or mission analysis&lt;/code&gt; (INCOSE terminology). &lt;code&gt;life cycle model&lt;/code&gt; collapsed to &lt;code&gt;life cycle concepts&lt;/code&gt;. None of this is text matching -- the word &lt;code&gt;RFP&lt;/code&gt; does not appear in the canonical page body; it lives in the &lt;code&gt;synonyms::&lt;/code&gt; line.&lt;/p&gt;

&lt;p&gt;Once a query is normalised, the ranker walks the graph. A document that mentions &lt;code&gt;acquisition need&lt;/code&gt; directly outranks one that mentions it through three synonym hops, and both outrank a document that mentions none of the canonical terms at all. Ranks come back with concrete integer scores -- &lt;code&gt;[13]&lt;/code&gt; on a top result, not an opaque 0.87 cosine.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it compares to the Personal Assistant role
&lt;/h2&gt;

&lt;p&gt;We &lt;a href="https://terraphim.ai/posts/personal-assistant-role-jmap-obsidian/" rel="noopener noreferrer"&gt;wrote up the Personal Assistant role yesterday&lt;/a&gt;: a private per-user role that indexes a Fastmail mailbox plus an Obsidian vault. Same engine, same ranker, different haystacks. The knowledge graph there is a small &lt;code&gt;kg/&lt;/code&gt; folder inside the user's vault with 14 synonym files covering personal vocabulary (&lt;code&gt;bun&lt;/code&gt; with &lt;code&gt;npm&lt;/code&gt;/&lt;code&gt;yarn&lt;/code&gt;/&lt;code&gt;pnpm&lt;/code&gt; synonyms, &lt;code&gt;odilo&lt;/code&gt;, &lt;code&gt;invoice&lt;/code&gt;, &lt;code&gt;meeting&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The two roles expose the same pattern at two scales:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;System Operator&lt;/th&gt;
&lt;th&gt;Personal Assistant&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;KG size&lt;/td&gt;
&lt;td&gt;52 synonym files, 1,300-concept vocabulary&lt;/td&gt;
&lt;td&gt;14 synonym files, ~30-concept personal vocabulary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Haystacks&lt;/td&gt;
&lt;td&gt;1 (Logseq repo)&lt;/td&gt;
&lt;td&gt;2 (Obsidian vault + Fastmail JMAP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source&lt;/td&gt;
&lt;td&gt;Public GitHub repo&lt;/td&gt;
&lt;td&gt;Private user files and mailbox&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audience&lt;/td&gt;
&lt;td&gt;Demos, onboarding, public showcase&lt;/td&gt;
&lt;td&gt;One user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifetime&lt;/td&gt;
&lt;td&gt;Frozen per release&lt;/td&gt;
&lt;td&gt;Edited daily, rebuilt in 20 ms per edit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both use &lt;code&gt;terraphim-graph&lt;/code&gt; ranking. Both build an Aho-Corasick automaton once at role-load time. Both run in a 4 GB process on a laptop with no cloud round-trip. The only interesting difference is the vocabulary, which is exactly the separation of concerns a knowledge-graph-first design is supposed to deliver.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters for teams evaluating MBSE tooling
&lt;/h2&gt;

&lt;p&gt;If you are evaluating Terraphim for a systems engineering group, the System Operator role is the honest starting point. It runs on a laptop against a public vault; you can check that every synonym mapping traces back to a concrete page; you can diff the &lt;code&gt;pages/&lt;/code&gt; folder against the INCOSE handbook and argue about terminology. When your team's own vocabulary diverges (every organisation's does), you clone the repo, edit &lt;code&gt;synonyms::&lt;/code&gt; lines, and the graph rebuilds in 20 milliseconds without a retraining step.&lt;/p&gt;

&lt;p&gt;The expensive part of enterprise search is not the ranker. It is the vocabulary. A deterministic graph makes the vocabulary an asset you curate, not a black box you tune.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/terraphim/terraphim-ai
&lt;span class="nb"&gt;cd &lt;/span&gt;terraphim-ai
./scripts/setup_system_operator.sh
cargo run &lt;span class="nt"&gt;--bin&lt;/span&gt; terraphim_server &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; terraphim_server/default/system_operator_config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or cut to the CLI if you already have &lt;code&gt;terraphim-agent&lt;/code&gt; installed -- the &lt;code&gt;embedded_config.json&lt;/code&gt; snippet is in the &lt;a href="https://github.com/terraphim/terraphim-ai/blob/main/terraphim_server/README_SYSTEM_OPERATOR.md" rel="noopener noreferrer"&gt;README&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For the underlying engine, start with &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt;. For the personal-productivity analogue, see the &lt;a href="https://terraphim.ai/posts/personal-assistant-role-jmap-obsidian/" rel="noopener noreferrer"&gt;Personal Assistant role post&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>terraphim</category>
      <category>logseq</category>
      <category>knowledgegraph</category>
      <category>mbse</category>
    </item>
    <item>
      <title>Personal Assistant Role: One Search Across Email and Notes</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Fri, 17 Apr 2026 13:31:38 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/personal-assistant-role-one-search-across-email-and-notes-5691</link>
      <guid>https://dev.to/alexmikhalev/personal-assistant-role-one-search-across-email-and-notes-5691</guid>
      <description>&lt;p&gt;Most "personal AI" tools split your context across silos: one search box for email, another for notes, a third for your chat history. Terraphim treats every source as a haystack on the same role, so a single query crosses all of them. This post shows how to wire up the two most common personal sources -- email via JMAP and notes in an Obsidian vault -- under a new Personal Assistant role.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a unified role matters
&lt;/h2&gt;

&lt;p&gt;The mental tax of personal search is not the typing. It is the &lt;em&gt;deciding&lt;/em&gt;. "Did I read that in an email or write it in a note?" Each silo you skip is a context switch with no useful payload. Once Terraphim is the front door for both surfaces, the question collapses to "where is the thing about X" and the role's &lt;code&gt;terraphim-graph&lt;/code&gt; ranking serves whichever source actually has the strongest signal.&lt;/p&gt;

&lt;p&gt;The Personal Assistant role uses two haystacks under one role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Obsidian vault&lt;/strong&gt; indexed by the &lt;code&gt;Ripgrep&lt;/code&gt; service. Plain markdown, sub-millisecond local search, no daemon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fastmail mailbox&lt;/strong&gt; indexed by the &lt;code&gt;Jmap&lt;/code&gt; service (&lt;a href="https://www.rfc-editor.org/rfc/rfc8620" rel="noopener noreferrer"&gt;RFC 8620/8621&lt;/a&gt;). One HTTPS round trip per query, server-side full-text against your real mailbox, results returned with &lt;code&gt;jmap:///email/&amp;lt;id&amp;gt;&lt;/code&gt; URLs you can paste back into a mail client.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ranking is the same &lt;code&gt;terraphim-graph&lt;/code&gt; scoring as every other Terraphim role: an Aho-Corasick automaton built from the Obsidian vault contributes synonyms specific to your project vocabulary, then both haystacks share the same rank ladder. Notes and email interleave by relevance, not by source.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A single command: &lt;code&gt;terraphim-agent-pa search "&amp;lt;query&amp;gt;"&lt;/code&gt; returns mixed hits ordered by rank.&lt;/li&gt;
&lt;li&gt;Determinism. Every match traces back to a concrete edge in your knowledge graph -- no opaque embedding score.&lt;/li&gt;
&lt;li&gt;Privacy. The Obsidian vault never leaves disk; the JMAP query goes directly to your mail provider with your token. No Terraphim cloud component sits in the path.&lt;/li&gt;
&lt;li&gt;Composition. The role is just a JSON entry in &lt;code&gt;~/.config/terraphim/embedded_config.json&lt;/code&gt;. Add another haystack tomorrow (calendar, contacts, browser history) and the same query sweeps it too.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 4 GB process on your laptop holds the whole working set; queries return in single-digit milliseconds for the local side and a few hundred for the remote JMAP round trip.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wiring sketch
&lt;/h2&gt;

&lt;p&gt;The role config is roughly thirty lines of JSON: two haystacks, one knowledge-graph pointer, no LLM. The Fastmail token is &lt;em&gt;not&lt;/em&gt; in the config -- it is injected at runtime via &lt;code&gt;op run&lt;/code&gt; from 1Password into the &lt;code&gt;JMAP_ACCESS_TOKEN&lt;/code&gt; environment variable, so the secret never lands on disk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;exec &lt;/span&gt;op run &lt;span class="nt"&gt;--account&lt;/span&gt; my.1password.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--env-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'JMAP_ACCESS_TOKEN=op://VAULT/ITEM/credential'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--&lt;/span&gt; /Users/alex/.cargo/bin/terraphim-agent &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wrap that in &lt;code&gt;~/bin/terraphim-agent-pa&lt;/code&gt;, &lt;code&gt;chmod +x&lt;/code&gt;, and the JMAP haystack lights up only when you invoke the wrapper. The other roles keep using the bare &lt;code&gt;terraphim-agent&lt;/code&gt; and never pay for the 1Password unlock.&lt;/p&gt;
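The thirty-line role entry can be sketched like this. The field names below are illustrative, not the exact Terraphim schema (the how-to has the canonical shape); only the two haystack services, the ranking function, and the config path come from this post:

```json
{
  "Personal Assistant": {
    "relevance_function": "terraphim-graph",
    "haystacks": [
      { "service": "Ripgrep", "location": "~/Obsidian/vault" },
      { "service": "Jmap", "location": "https://api.fastmail.com/jmap/session" }
    ],
    "knowledge_graph": "~/Obsidian/vault"
  }
}
```

The JMAP token is deliberately absent: it arrives via the JMAP_ACCESS_TOKEN environment variable set by the wrapper above, so the secret never lands in the config file.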

&lt;h2&gt;
  
  
  Why graph embeddings make this practical
&lt;/h2&gt;

&lt;p&gt;The reason a unified role works at all -- not just for two haystacks but for any reasonable number -- is that Terraphim's graph-embeddings layer is sub-millisecond and deterministic. There is no per-query embedding API call to amortise across sources, no vector database to keep in sync, no opaque ranker that has to be retrained when you add a new haystack. The matching is byte-level Aho-Corasick traversal of an automaton built once at role-load time. We wrote up the engine in detail at &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt;; this Personal Assistant role is one application of that engine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The end-to-end how-to is in the docs: install the prerequisites, add the JSON snippet, write the wrapper, run three verification queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Read the how-to: &lt;a href="https://docs.terraphim.ai/howto/personal-assistant-role.html" rel="noopener noreferrer"&gt;Personal Assistant Role on docs.terraphim.ai&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One caveat worth surfacing up front: the published &lt;code&gt;terraphim-agent&lt;/code&gt; on crates.io does not yet ship with the JMAP haystack (the &lt;code&gt;haystack_jmap&lt;/code&gt; dependency is not published either). For email search you need to build from local source with &lt;code&gt;cargo build --release -p terraphim_agent --features jmap&lt;/code&gt;. The how-to walks through the two Cargo.toml edits required.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is next
&lt;/h2&gt;

&lt;p&gt;Personal Assistant is the smallest useful instance of "Terraphim as the front door for everything I read." Calendar (CalDAV), contacts (CardDAV), browser bookmarks, RSS, and AI session logs are all natural follow-ups -- each is a single haystack entry on the same role. The pattern composes; the cost stays linear in haystacks, not quadratic in cross-source queries.&lt;/p&gt;

&lt;p&gt;If you want the underlying engine, start with &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt;. If you want to wire knowledge-graph hooks into your AI coding agent on the same machine, &lt;a href="https://terraphim.ai/posts/teaching-ai-agents-with-knowledge-graphs/" rel="noopener noreferrer"&gt;Teaching AI Coding Agents with Knowledge Graph Hooks&lt;/a&gt; covers that side of the same engine.&lt;/p&gt;

</description>
      <category>terraphim</category>
      <category>personalassistant</category>
      <category>jmap</category>
      <category>obsidian</category>
    </item>
    <item>
      <title>Teaching AI Coding Agents with Knowledge Graph Hooks</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Fri, 17 Apr 2026 10:50:07 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/teaching-ai-coding-agents-with-knowledge-graph-hooks-497i</link>
      <guid>https://dev.to/alexmikhalev/teaching-ai-coding-agents-with-knowledge-graph-hooks-497i</guid>
      <description>&lt;p&gt;How we use Aho-Corasick automata and knowledge graphs to automatically enforce coding standards across AI coding agents like Claude Code, Cursor, and Aider.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;New:&lt;/strong&gt; see &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt; for the underlying engine that makes these hooks possible — sub-millisecond, deterministic, fully explainable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Anthropic Bought Bun. Claude Still Outputs &lt;code&gt;npm install&lt;/code&gt;.
&lt;/h2&gt;

&lt;p&gt;On December 3, 2025, &lt;a href="https://www.anthropic.com/news/anthropic-acquires-bun-as-claude-code-reaches-usd1b-milestone" rel="noopener noreferrer"&gt;Anthropic announced its first-ever acquisition&lt;/a&gt;: Bun, the blazing-fast JavaScript runtime. This came alongside Claude Code reaching &lt;a href="https://bun.com/blog/bun-joins-anthropic" rel="noopener noreferrer"&gt;$1 billion in run-rate revenue&lt;/a&gt; just six months after public launch.&lt;/p&gt;

&lt;p&gt;As Mike Krieger, Anthropic's Chief Product Officer, put it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Bun represents exactly the kind of technical excellence we want to bring into Anthropic... bringing the Bun team into Anthropic means we can build the infrastructure to compound that momentum."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude Code &lt;a href="https://simonwillison.net/2025/Dec/2/anthropic-acquires-bun/" rel="noopener noreferrer"&gt;ships as a Bun executable&lt;/a&gt; to millions of developers. Anthropic now owns the runtime their flagship coding tool depends on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;And yet...&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ask Claude to set up a Node.js project, and what do you get?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;express
yarn add lodash
pnpm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--save-dev&lt;/span&gt; jest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anthropic's own models still default to npm, yarn, and pnpm in their outputs: the training data predates the acquisition, and old habits die hard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So how do you teach your AI coding tools to consistently use Bun, regardless of what the underlying LLM insists on?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: LLMs Don't Know Your Preferences
&lt;/h2&gt;

&lt;p&gt;AI coding agents are powerful, but they're trained on the internet's collective habits—which means npm everywhere. Your team might have standardized on Bun for its speed (25% monthly growth, &lt;a href="https://devclass.com/2025/12/03/bun-javascript-runtime-acquired-by-anthropic-tying-its-future-to-ai-coding/" rel="noopener noreferrer"&gt;7.2 million downloads&lt;/a&gt; in October 2025), but every AI agent keeps suggesting the old ways.&lt;/p&gt;

&lt;p&gt;Manually fixing these inconsistencies is tedious. What if your knowledge graph could automatically intercept and transform AI outputs?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Knowledge Graph Hooks
&lt;/h2&gt;

&lt;p&gt;Terraphim provides a hook system that intercepts AI agent actions and applies knowledge graph-based transformations. The system uses:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Aho-Corasick automata&lt;/strong&gt; for efficient multi-pattern matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LeftmostLongest strategy&lt;/strong&gt; ensuring specific patterns match before general ones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Markdown-based knowledge graph&lt;/strong&gt; files that are human-readable and version-controlled&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input Text → Aho-Corasick Automata → Pattern Match → Knowledge Graph Lookup → Transformed Output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The knowledge graph is built from simple markdown files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# bun install&lt;/span&gt;

Fast package installation with Bun.

synonyms:: pnpm install, npm install, yarn install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the automata encounter any synonym, they replace it with the canonical term (the heading).&lt;/p&gt;
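The &lt;code&gt;synonyms::&lt;/code&gt; convention can be parsed in a few lines. This is an illustrative reimplementation, not the actual thesaurus builder in &lt;code&gt;terraphim_automata&lt;/code&gt;:

```python
import re

def parse_kg_file(text):
    """Parse one knowledge-graph markdown file into {synonym: canonical} pairs.

    The first '# heading' is the canonical term; the 'synonyms::' line lists
    the patterns that should be rewritten to it.
    """
    heading = re.search(r"^# (.+)$", text, re.MULTILINE)
    synonyms = re.search(r"^synonyms::\s*(.+)$", text, re.MULTILINE)
    if not heading or not synonyms:
        return {}
    canonical = heading.group(1).strip()
    return {s.strip(): canonical for s in synonyms.group(1).split(",")}

doc = """# bun install

Fast package installation with Bun.

synonyms:: pnpm install, npm install, yarn install
"""
print(parse_kg_file(doc))
```

Merging the dictionaries from every file in the kg directory gives the full synonym-to-canonical map the automata are built from.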

&lt;h2&gt;
  
  
  Real-World Example: npm → bun
&lt;/h2&gt;

&lt;p&gt;Let's prove it works. Here's a live test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"npm install"&lt;/span&gt; | terraphim-agent replace
bun &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"yarn install lodash"&lt;/span&gt; | terraphim-agent replace
bun &lt;span class="nb"&gt;install &lt;/span&gt;lodash

&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"pnpm install --save-dev jest"&lt;/span&gt; | terraphim-agent replace
bun &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--save-dev&lt;/span&gt; jest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LeftmostLongest matching ensures the longer, more specific &lt;code&gt;npm install&lt;/code&gt; pattern wins before standalone &lt;code&gt;npm&lt;/code&gt; can match.&lt;/p&gt;
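The priority rule can be sketched independently of the automaton. This is a naive illustrative scanner (longest pattern tried first at each position), not Terraphim's O(n) Aho-Corasick implementation:

```python
def leftmost_longest_replace(text, rules):
    """Replace synonyms with canonical terms; the longest pattern wins at each position.

    `rules` maps pattern -> canonical term. Terraphim does this with a single
    Aho-Corasick automaton in linear time; this scan is for exposition only.
    """
    patterns = sorted(rules, key=len, reverse=True)  # longest patterns first
    out, i = [], 0
    while i < len(text):
        for p in patterns:
            if text.startswith(p, i):   # first hit is the longest at this position
                out.append(rules[p])
                i += len(p)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

rules = {"npm install": "bun install", "npm": "bun"}
print(leftmost_longest_replace("npm install express && npm run build", rules))
# -> bun install express && bun run build
```

At position 0 both `npm install` and `npm` match; trying the longer pattern first is exactly the LeftmostLongest tie-break.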

&lt;h2&gt;
  
  
  Hook Integration Points
&lt;/h2&gt;

&lt;p&gt;Terraphim hooks integrate at multiple points in the development workflow:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Claude Code PreToolUse Hooks
&lt;/h3&gt;

&lt;p&gt;Intercept Bash commands before execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"terraphim-agent replace"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When Claude Code tries to run &lt;code&gt;npm install express&lt;/code&gt;, the hook transforms it to &lt;code&gt;bun install express&lt;/code&gt; before execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Git prepare-commit-msg Hooks
&lt;/h3&gt;

&lt;p&gt;Enforce attribution standards in commits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;COMMIT_MSG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;
&lt;span class="nv"&gt;ORIGINAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMMIT_MSG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;TRANSFORMED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ORIGINAL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | terraphim-agent replace&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TRANSFORMED&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMMIT_MSG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a knowledge graph entry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Terraphim AI&lt;/span&gt;

Attribution for AI-assisted development.

synonyms:: Claude Code, Claude, Anthropic Claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every commit message mentioning "Claude Code" becomes "Terraphim AI".&lt;/p&gt;

&lt;h3&gt;
  
  
  3. MCP Tools
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;replace_matches&lt;/code&gt; MCP tool exposes the same functionality to any MCP-compatible client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"replace_matches"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"arguments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Run npm install to setup"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The hook system is built on three crates:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Crate&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;terraphim_automata&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Aho-Corasick pattern matching, thesaurus building&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;terraphim_hooks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ReplacementService, HookResult, binary discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;terraphim_agent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CLI with &lt;code&gt;replace&lt;/code&gt; subcommand&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Performance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pattern matching&lt;/strong&gt;: O(n) where n is input length (not pattern count)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Startup&lt;/strong&gt;: ~50ms to load knowledge graph and build automata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: Automata are compact finite state machines&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Extending the Knowledge Graph
&lt;/h2&gt;

&lt;p&gt;Adding new patterns is simple. Create a markdown file in the mdBook source tree under &lt;code&gt;docs/src/kg/&lt;/code&gt; (published at &lt;a href="https://docs.terraphim.ai/src/kg/" rel="noopener noreferrer"&gt;https://docs.terraphim.ai/src/kg/&lt;/a&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# pytest&lt;/span&gt;

Python testing framework.

synonyms:: python -m unittest, unittest, nose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system automatically rebuilds the automata on startup.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern Priority
&lt;/h3&gt;

&lt;p&gt;The LeftmostLongest strategy means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;npm install&lt;/code&gt; matches before &lt;code&gt;npm&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;python -m pytest&lt;/code&gt; matches before &lt;code&gt;python&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Longer, more specific patterns always win&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Quick Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install all hooks&lt;/span&gt;
./scripts/install-terraphim-hooks.sh &lt;span class="nt"&gt;--easy-mode&lt;/span&gt;

&lt;span class="c"&gt;# Test the replacement&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"npm install"&lt;/span&gt; | ./target/release/terraphim-agent replace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Manual Setup
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Build the agent:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo build &lt;span class="nt"&gt;-p&lt;/span&gt; terraphim_agent &lt;span class="nt"&gt;--features&lt;/span&gt; repl-full &lt;span class="nt"&gt;--release&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;&lt;p&gt;Configure Claude Code hooks in &lt;code&gt;.claude/settings.local.json&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Install Git hooks:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp &lt;/span&gt;scripts/hooks/prepare-commit-msg .git/hooks/
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x .git/hooks/prepare-commit-msg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Replacement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Package manager standardization&lt;/td&gt;
&lt;td&gt;npm, yarn, pnpm&lt;/td&gt;
&lt;td&gt;bun&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI attribution&lt;/td&gt;
&lt;td&gt;Claude Code, Claude&lt;/td&gt;
&lt;td&gt;Terraphim AI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Framework migration&lt;/td&gt;
&lt;td&gt;React.Component&lt;/td&gt;
&lt;td&gt;React functional components&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API versioning&lt;/td&gt;
&lt;td&gt;/api/v1&lt;/td&gt;
&lt;td&gt;/api/v2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deprecated function replacement&lt;/td&gt;
&lt;td&gt;moment()&lt;/td&gt;
&lt;td&gt;dayjs()&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Claude Code Skills Plugin
&lt;/h2&gt;

&lt;p&gt;For AI agents that support skills, we provide a dedicated plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;terraphim-engineering-skills@terraphim-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;terraphim-hooks&lt;/code&gt; skill teaches agents how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the replace command correctly&lt;/li&gt;
&lt;li&gt;Extend the knowledge graph&lt;/li&gt;
&lt;li&gt;Debug hook issues&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Knowledge graph hooks provide a powerful, declarative way to enforce coding standards across AI agents. By defining patterns in simple markdown files, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standardize package managers across your team&lt;/li&gt;
&lt;li&gt;Ensure consistent attribution in commits&lt;/li&gt;
&lt;li&gt;Migrate deprecated patterns automatically&lt;/li&gt;
&lt;li&gt;Keep your knowledge graph version-controlled and human-readable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Aho-Corasick automata ensure efficient matching regardless of pattern count, making this approach scale to large knowledge graphs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;To wire knowledge-graph hooks into your own project, the &lt;a href="https://terraphim.ai/how-tos/command-rewriting-howto/" rel="noopener noreferrer"&gt;Command Rewriting How-to&lt;/a&gt; walks through the configuration end to end. To understand &lt;em&gt;why&lt;/em&gt; the matching is sub-millisecond and deterministic — and what that lets you promise to your users — read &lt;a href="https://terraphim.ai/posts/why-graph-embeddings-matter/" rel="noopener noreferrer"&gt;Why Graph Embeddings Matter&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/terraphim/terraphim-ai" rel="noopener noreferrer"&gt;Terraphim AI Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/terraphim/terraphim-claude-skills" rel="noopener noreferrer"&gt;Claude Code Skills Plugin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.terraphim.ai/hooks/" rel="noopener noreferrer"&gt;Hook Installation Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.terraphim.ai/knowledge-graph/" rel="noopener noreferrer"&gt;Knowledge Graph Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>terraphim</category>
      <category>ai</category>
      <category>hooks</category>
      <category>knowledgegraph</category>
    </item>
    <item>
      <title>Why Graph Embeddings Matter</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Fri, 17 Apr 2026 10:49:56 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/why-graph-embeddings-matter-epp</link>
      <guid>https://dev.to/alexmikhalev/why-graph-embeddings-matter-epp</guid>
      <description>&lt;p&gt;Vector databases are probabilistic and slow. Graph embeddings are deterministic and sub-millisecond. If you are building context for an AI coding agent — or any system where you need to know &lt;em&gt;why&lt;/em&gt; a result came back — the difference is not academic. It changes what your application is allowed to promise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pitch in One Paragraph
&lt;/h2&gt;

&lt;p&gt;Terraphim represents concepts as nodes in a knowledge graph and ranks them by how many synonyms and edges connect them. There is no embedding model, no GPU, no per-query distance computation in a 1024-dimensional space. There is an &lt;a href="https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm" rel="noopener noreferrer"&gt;Aho-Corasick&lt;/a&gt; automaton built once, queried in O(n + m + z) time, where n is the input length, m the total length of the patterns, and z the number of matches. The mechanism is described in detail on the &lt;a href="https://terraphim.ai/docs/graph-embeddings/" rel="noopener noreferrer"&gt;Graph Embeddings reference&lt;/a&gt; page; this post is about why it matters.&lt;/p&gt;
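To make the build-once/query-fast split concrete, here is a from-scratch sketch of the two phases: construct the automaton over the pattern set, then scan any input in one pass with no backtracking. This is illustrative Python, not Terraphim's engine (which is the Rust &lt;code&gt;terraphim_automata&lt;/code&gt; crate with leftmost-longest semantics):

```python
from collections import deque

class Automaton:
    """Toy Aho-Corasick: build once over the patterns, then match in O(n + z)."""

    def __init__(self, patterns):
        self.goto = [{}]   # trie transitions per state
        self.fail = [0]    # failure links (longest proper suffix still in the trie)
        self.out = [[]]    # patterns ending at each state
        for pat in patterns:              # build phase: O(m) total pattern length
            state = 0
            for ch in pat:
                if ch not in self.goto[state]:
                    self.goto.append({}); self.fail.append(0); self.out.append([])
                    self.goto[state][ch] = len(self.goto) - 1
                state = self.goto[state][ch]
            self.out[state].append(pat)
        q = deque(self.goto[0].values())  # depth-1 states already fail to root
        while q:                          # BFS wires the failure links
            r = q.popleft()
            for ch, s in self.goto[r].items():
                q.append(s)
                f = self.fail[r]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[s] = self.goto[f].get(ch, 0)
                self.out[s] += self.out[self.fail[s]]  # inherit suffix matches

    def find_all(self, text):
        state, hits = 0, []
        for i, ch in enumerate(text):     # query phase: one pass, no backtracking
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pat in self.out[state]:
                hits.append((i - len(pat) + 1, pat))
        return hits

ac = Automaton(["npm install", "npm"])
print(ac.find_all("run npm install"))  # [(4, 'npm'), (4, 'npm install')]
```

The query loop never revisits a byte of input, which is why per-step cost stays in the nanosecond range once the automaton is resident in memory.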

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Three numbers carry the argument. Each is reproducible on a laptop.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1.4 million patterns matched in under one millisecond, with under 4 GB of RAM.&lt;/strong&gt; That is the working set behind a multi-role knowledge graph — operator, engineer, analyst — held resident in the same process that serves the query.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5–10 nanoseconds per knowledge-graph inference step.&lt;/strong&gt; Not microseconds. Nanoseconds. Once the automaton is built, traversal is a tight loop over byte slices and graph edges, and modern CPUs are extremely good at that.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;20 milliseconds to rebuild the embeddings for a role from scratch.&lt;/strong&gt; Rename a synonym, add a new term, drop an obsolete one — the whole role's graph is reconstituted before your editor has rendered the next frame.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For comparison, a typical vector-DB nearest-neighbour query lands in the 5–50 ms range &lt;em&gt;after&lt;/em&gt; you have paid the embedding API call (50–500 ms) and the network round-trip. We are not in the same regime.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Consequences
&lt;/h2&gt;

&lt;p&gt;The numbers are interesting on their own. The reason they matter is what they let you build.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Full Explainability
&lt;/h3&gt;

&lt;p&gt;Every match in Terraphim traces back to a specific edge in the knowledge graph and a specific synonym in a specific role. There is no "the model said so." When a search returns a document, you can show the user exactly which terms matched, which role's graph supplied the synonym, and which edges connected them. That is not a debugging nicety — it is a regulatory requirement in any domain where you have to defend a decision after the fact. Healthcare, legal, finance, government. Vector search by construction cannot do this.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. No Training, No Retraining, No Fine-Tuning
&lt;/h3&gt;

&lt;p&gt;Adding a new concept is a text edit. You write the synonym down, you point Terraphim at the file, the graph rebuilds in 20 ms. There is no training run, no GPU bill, no "we need to schedule a retrain on the new corpus." This collapses the loop between &lt;em&gt;noticing a gap&lt;/em&gt; and &lt;em&gt;fixing the gap&lt;/em&gt; from days or weeks to seconds. For an AI coding agent that needs to learn a project's vocabulary as you onboard, this is the difference between a working tool and a stalled rollout.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Language-Agnostic Without Language Detection
&lt;/h3&gt;

&lt;p&gt;Because matching is done on normalised terms — synonyms you supply explicitly — the same node in the graph can carry English, French, Russian, and Mandarin labels at no extra cost. There is no language-detection step, no per-language embedding model, no separate index. The query "consensus" and the query "консенсус" both reach the same node if you have told the graph they are synonyms. Stop-word lists become irrelevant: if a word is not in the graph, it does not match, full stop.&lt;/p&gt;
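As a sketch of that idea (the node id and labels here are invented for illustration, not taken from any real Terraphim role):

```python
# One graph node, labels in several languages: matching is explicit lookup of
# supplied synonyms -- no language detection, no per-language embedding model.
thesaurus = {
    "consensus": "consensus",            # canonical node id (illustrative)
    "консенсус": "consensus",            # Russian label -> same node
    "consensus distribué": "consensus",  # French phrase -> same node
}

def node_for(term):
    # Absent from the graph means no match: stop-word lists are unnecessary.
    return thesaurus.get(term.lower())

print(node_for("Консенсус"))   # -> consensus
print(node_for("blockchain"))  # -> None
```

Adding a new language is one more key per node, and the 20 ms rebuild makes that edit effectively free.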

&lt;h2&gt;
  
  
  What This Lets You Do
&lt;/h2&gt;

&lt;p&gt;The pieces above are infrastructure. The story arc continues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Build hooks that transform AI coding agent output deterministically.&lt;/strong&gt; When Claude Code suggests &lt;code&gt;npm install&lt;/code&gt;, intercept it via a graph-embeddings match and replace it with &lt;code&gt;bun install&lt;/code&gt;. We wrote this up at &lt;a href="https://terraphim.ai/posts/teaching-ai-agents-with-knowledge-graphs/" rel="noopener noreferrer"&gt;Teaching AI Coding Agents with Knowledge Graph Hooks&lt;/a&gt; — that post is the &lt;em&gt;demo&lt;/em&gt; of what this engine enables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capture and reuse mistakes.&lt;/strong&gt; When an agent gets corrected, store the correction as a new synonym and the next session never repeats it. See &lt;a href="https://terraphim.ai/posts/teaching-ai-agents-to-learn-from-mistakes/" rel="noopener noreferrer"&gt;Teaching AI Agents to Learn from Their Mistakes&lt;/a&gt; and &lt;a href="https://terraphim.ai/posts/learning-via-negativa/" rel="noopener noreferrer"&gt;Learning via Negativa&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run the whole thing in a 4 GB process on your laptop with no network calls.&lt;/strong&gt; The compactness is not an accident — it is the engineering brief from the &lt;a href="https://terraphim.ai/capabilities/origin-story/" rel="noopener noreferrer"&gt;Origin Story&lt;/a&gt;, which explains where the design came from and why it has stayed this small.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Apply
&lt;/h2&gt;

&lt;p&gt;If you want to wire this into your own project, the &lt;a href="https://terraphim.ai/how-tos/command-rewriting-howto/" rel="noopener noreferrer"&gt;Command Rewriting How-to&lt;/a&gt; walks through the moving parts: where to put your synonyms, how the role graph is built, how hooks call the matcher.&lt;/p&gt;

&lt;p&gt;The mechanism — automata, ranking formula, ASCII walk-through — is on the &lt;a href="https://terraphim.ai/docs/graph-embeddings/" rel="noopener noreferrer"&gt;Graph Embeddings reference&lt;/a&gt; page. Read that next if you want the data structures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Bother Saying This Out Loud
&lt;/h2&gt;

&lt;p&gt;The current default in the AI tooling ecosystem is to reach for a vector database the moment anyone mentions "semantic search." It is the path of least resistance because the tools are well-marketed and the API surface is familiar. But for a large class of problems — explainability-first systems, on-device agents, anywhere you need a hard latency budget or a hard explainability guarantee — graph embeddings are the better-engineered answer. Not the only answer; the better one for that class.&lt;/p&gt;

&lt;p&gt;The promotion campaign over the next few weeks goes deeper: a &lt;a href="https://terraphim.ai/posts/sub-millisecond-context-knowledge-graphs/" rel="noopener noreferrer"&gt;sub-millisecond context article&lt;/a&gt; walks through the FST/Aho-Corasick implementation, and the &lt;em&gt;Context Engineering with Knowledge Graphs&lt;/em&gt; book (launching in May) puts it in the wider context of moving from RAG to context graphs.&lt;/p&gt;

&lt;p&gt;Until then: read the reference, try the how-to, and let us know in &lt;a href="https://terraphim.discourse.group" rel="noopener noreferrer"&gt;Discourse&lt;/a&gt; what you build with it.&lt;/p&gt;

</description>
      <category>terraphim</category>
      <category>graphembeddings</category>
      <category>knowledgegraph</category>
      <category>ahocorasick</category>
    </item>
    <item>
      <title>Disciplined Engineering: How We Build AI Systems That Actually Work</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Thu, 16 Apr 2026 15:00:55 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/disciplined-engineering-ai-systems-19di</link>
      <guid>https://dev.to/alexmikhalev/disciplined-engineering-ai-systems-19di</guid>
      <description>&lt;p&gt;AI coding agents are making us worse engineers, unless we add discipline back. Here is what we do instead of vibe coding, and how you can do it too in 30 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vibe Coding Problem
&lt;/h2&gt;

&lt;p&gt;Every AI-generated pull request we review has the same pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scope creep&lt;/strong&gt; beyond the original task. You ask for a bug fix, you get a refactored module.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No traceability&lt;/strong&gt; from requirements to tests. The agent shipped code, but nobody verified it does what was actually asked.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge lost between sessions.&lt;/strong&gt; Each conversation starts from scratch. Yesterday's design decisions evaporate overnight.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent shipped code. It even passed the tests. But the tests were written by the same agent that wrote the code, optimising for the metric rather than understanding the problem.&lt;/p&gt;

&lt;p&gt;The missing piece is not better models. It is engineering discipline. AI agents need the same rigour humans use: understand the problem before coding, verify against the design, validate against requirements. We encoded this as executable skills that any AI coding agent can follow.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The research evidence behind this framework, including language-specific scaling laws and the 30% adoption gap between code intelligence research and production harnesses, is the subject of our next article.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The V-Model: Adding Discipline Back
&lt;/h2&gt;

&lt;p&gt;We built a V-model for AI agents. The left side asks "what should we build?" The right side asks "did we build it correctly?"&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq4u9pmf9ojwl8mad4xf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq4u9pmf9ojwl8mad4xf.png" alt="The V-Model for AI Agents: Research, Design, Specification, Implementation, Verification, Validation with quality gates at each transition" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Research
&lt;/h3&gt;

&lt;p&gt;Before writing any code, the agent must understand the problem space:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search existing knowledge graphs for relevant patterns&lt;/li&gt;
&lt;li&gt;Identify language-specific constraints&lt;/li&gt;
&lt;li&gt;Determine optimal context window for target language&lt;/li&gt;
&lt;li&gt;Find similar implementations in the codebase&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 2: Design
&lt;/h3&gt;

&lt;p&gt;Create a specification before implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define interfaces and data structures&lt;/li&gt;
&lt;li&gt;Identify cross-language considerations&lt;/li&gt;
&lt;li&gt;Plan for compiler feedback integration&lt;/li&gt;
&lt;li&gt;Document the "why" not just the "what"&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 3: Implementation
&lt;/h3&gt;

&lt;p&gt;Write code with tests from the start:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Language-appropriate context management&lt;/li&gt;
&lt;li&gt;Compiler feedback integration points&lt;/li&gt;
&lt;li&gt;Type-safe by default (for typed languages)&lt;/li&gt;
&lt;li&gt;Self-documenting through clear structure&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 4: Verification
&lt;/h3&gt;

&lt;p&gt;Verify against the design, not just tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type-checking passes&lt;/li&gt;
&lt;li&gt;Linting passes&lt;/li&gt;
&lt;li&gt;Compiler warnings addressed&lt;/li&gt;
&lt;li&gt;Design intent preserved&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 5: Validation
&lt;/h3&gt;

&lt;p&gt;Validate against the original requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does it solve the actual problem?&lt;/li&gt;
&lt;li&gt;Are there simpler alternatives?&lt;/li&gt;
&lt;li&gt;What's the maintenance burden?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  terraphim-skills: 32+ Executable Disciplines
&lt;/h2&gt;

&lt;p&gt;We packaged the V-model as executable skills you can add to any AI agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add terraphim/terraphim-skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This installs skills that enforce:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;disciplined-research&lt;/strong&gt;: Understand before building&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;disciplined-design&lt;/strong&gt;: Plan before coding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;disciplined-implementation&lt;/strong&gt;: Build with tests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;disciplined-verification&lt;/strong&gt;: Verify against design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;disciplined-validation&lt;/strong&gt;: Validate against requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each skill is a self-contained prompt that guides the agent through the phase's inputs, outputs, and quality gates.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quality Gates: The Judge System
&lt;/h2&gt;

&lt;p&gt;Our judge system (Kimi K2.5) catches what humans miss:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;90% verdict agreement with human reviewers&lt;/li&gt;
&lt;li&gt;62.5% NO-GO detection rate on genuinely flawed code&lt;/li&gt;
&lt;li&gt;~9s average review latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Automated quality gates, zero manual overhead. Every PR reviewed before it merges.&lt;/p&gt;

&lt;h2&gt;
  
  
  Guard Rails in Practice
&lt;/h2&gt;

&lt;p&gt;AI agents do not type commands into a terminal. They invoke tools programmatically, and they do not always get it right. "Cleaning up build artefacts" becomes &lt;code&gt;rm -rf ./src&lt;/code&gt; (one mistyped character in the path). "Resetting to last commit" becomes &lt;code&gt;git reset --hard&lt;/code&gt; (uncommitted work gone). You need a safety net that operates between the agent and your shell.&lt;/p&gt;

&lt;p&gt;We use two layers of guard rails:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: git-safety-guard&lt;/strong&gt; (a terraphim-skill that runs as a PreToolUse hook):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blocks &lt;code&gt;git reset --hard&lt;/code&gt;, &lt;code&gt;git push --force&lt;/code&gt;, &lt;code&gt;rm -rf&lt;/code&gt; and similar destructive commands before they execute&lt;/li&gt;
&lt;li&gt;Checks for secrets in diffs before commits&lt;/li&gt;
&lt;li&gt;Validates commit message format&lt;/li&gt;
&lt;li&gt;Zero configuration: install the skill, protection is immediate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: &lt;a href="https://github.com/Dicklesworthstone/destructive_command_guard" rel="noopener noreferrer"&gt;Destructive Command Guard (DCG)&lt;/a&gt;&lt;/strong&gt; by Jeff Emanuel, integrated via tool hooks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Rust binary using SIMD-accelerated pattern matching&lt;/li&gt;
&lt;li&gt;Intercepts every shell command the agent attempts to run&lt;/li&gt;
&lt;li&gt;Returns allow/block verdicts in under 1ms&lt;/li&gt;
&lt;li&gt;Works with Claude Code, OpenCode, and any agent that exposes a pre-execution hook&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture is simple: the agent calls a bash tool, the hook pipes the command to DCG as JSON, DCG pattern-matches against known destructive commands, and blocks execution before damage occurs. The agent receives an error explaining why, and can adjust its approach.&lt;/p&gt;
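
&lt;p&gt;As an illustrative sketch only (not the real DCG interface: the function name and tiny pattern list here are hypothetical, whereas the real binary is SIMD-accelerated Rust with a far larger ruleset), the verdict step reduces to matching the proposed command against known destructive patterns:&lt;/p&gt;

```shell
#!/bin/bash
# Toy allow/block verdict in the spirit of DCG. Hypothetical names and
# patterns; the real guard covers many more destructive command shapes.

guard_verdict() {
    local cmd="$1"
    case "$cmd" in
        *"rm -rf"* | *"git reset --hard"* | *"git push --force"*)
            echo "block" ;;   # destructive: refuse before execution
        *)
            echo "allow" ;;   # everything else passes through
    esac
}

guard_verdict "rm -rf ./src"    # prints "block"
guard_verdict "git status"      # prints "allow"
```

&lt;p&gt;A hook wired to a function like this would exit non-zero on a "block" verdict, which is what surfaces the explanatory error back to the agent.&lt;/p&gt;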

&lt;h2&gt;
  
  
  Real Example: AI Dark Factory
&lt;/h2&gt;

&lt;p&gt;We run 12+ AI agents overnight on a single machine, coordinated by a Rust orchestrator. Each agent follows the V-model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Safety agents&lt;/strong&gt; run continuously with automatic restart and cooldown. They handle monitoring, log analysis, and drift detection. If one crashes, the orchestrator waits 15 minutes before restarting (up to 3 times) to prevent crash loops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core agents&lt;/strong&gt; are scheduled via cron. They pick the highest-priority unblocked issue from the Gitea board (ranked by PageRank across the dependency graph), claim it, branch, implement with tests, and open a pull request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Growth agents&lt;/strong&gt; run on demand for research, code review, and content generation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every agent's output passes through the judge system before merge. The morning routine is reviewing verdicts, not debugging overnight chaos. When the judge returns a NO-GO verdict, the PR is flagged with the specific issues: missing test coverage, undocumented API changes, or security concerns.&lt;/p&gt;

&lt;p&gt;This is disciplined engineering at scale: not process overhead, but automated quality gates that catch problems before they compound.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The gap between what AI agents can do and what they should do is real. It is not a technology gap: it is a discipline gap. The V-model and 32+ executable skills we built are available today:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add terraphim/terraphim-skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add discipline back. Your future self will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Deeper dive: The V-model and quality gates we use are detailed in Chapters 3-4 of "Context Engineering with Knowledge Graphs". Coming soon.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related posts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://terraphim.ai/posts/teaching-ai-agents-with-knowledge-graphs/" rel="noopener noreferrer"&gt;Teaching AI Coding Agents with Knowledge Graph Hooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://terraphim.ai/posts/teaching-ai-agents-to-learn-from-mistakes/" rel="noopener noreferrer"&gt;Teaching AI Agents to Learn from Their Mistakes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>engineering</category>
      <category>devtools</category>
      <category>coding</category>
    </item>
    <item>
      <title>From Learning Capture to Self-Evolving Rules: Adding Verification Sweeps to terraphim-agent</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:21:44 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/from-learning-capture-to-self-evolving-rules-adding-verification-sweeps-to-terraphim-agent-4l43</link>
      <guid>https://dev.to/alexmikhalev/from-learning-capture-to-self-evolving-rules-adding-verification-sweeps-to-terraphim-agent-4l43</guid>
      <description>&lt;h1&gt;
  
  
  From Learning Capture to Self-Evolving Rules: Adding Verification Sweeps to terraphim-agent
&lt;/h1&gt;

&lt;p&gt;A self-evolving AI coding agent sounds like science fiction. It is not. It is a shell script, a markdown file with grep patterns, and a weekly review discipline.&lt;/p&gt;

&lt;p&gt;We have been running &lt;a href="https://github.com/terraphim/terraphim-ai" rel="noopener noreferrer"&gt;terraphim-agent&lt;/a&gt; in production for months. It captures every failed bash command from Claude Code and OpenCode, stores them in a persistent learning database, and lets agents query past mistakes before repeating them. The capture loop works. The query system works. The correction mechanism works.&lt;/p&gt;

&lt;p&gt;What was missing was &lt;strong&gt;verification&lt;/strong&gt;. We could capture mistakes and add corrections, but we had no way to prove the corrections were being followed. No machine-checkable enforcement. No audit trail. No quantitative measure of whether the system was actually improving.&lt;/p&gt;

&lt;p&gt;Then &lt;a href="https://x.com/meta_alchemist/status/2038222105654022325" rel="noopener noreferrer"&gt;Meta Alchemist published a viral guide&lt;/a&gt; on transforming Claude Code into a self-evolving system, and two ideas jumped out: &lt;strong&gt;verification patterns on every rule&lt;/strong&gt; and &lt;strong&gt;session scorecards&lt;/strong&gt;. We already had the foundation. The article showed us what to build on top.&lt;/p&gt;

&lt;p&gt;This post covers what we added, what we deliberately did not copy, and why the combination of a Rust CLI with a thin shell verification layer is more robust than an all-in-JSONL approach.&lt;/p&gt;

&lt;p&gt;If you have not read the &lt;a href="https://zestic.ai/blog/terraphim-agent-learning-hooks" rel="noopener noreferrer"&gt;foundation post on configuring terraphim-agent for Claude Code and OpenCode&lt;/a&gt;, start there. This post assumes you have the capture system running.&lt;/p&gt;




&lt;h2&gt;
  
  
  What we already had
&lt;/h2&gt;

&lt;p&gt;Before reading the Meta Alchemist article, our learning infrastructure had three layers:&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Learning capture (PostToolUse hook)
&lt;/h3&gt;

&lt;p&gt;Every failed bash command in Claude Code triggers our &lt;code&gt;post_tool_use.sh&lt;/code&gt; hook. The hook extracts the command, exit code, and error output, then pipes them to &lt;code&gt;terraphim-agent learn hook --format claude&lt;/code&gt;. The learning is stored as a structured file in &lt;code&gt;~/.local/share/terraphim/learnings/&lt;/code&gt; (global) or &lt;code&gt;.terraphim/learnings/&lt;/code&gt; (project-scoped).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What the hook does on every failed command:&lt;/span&gt;
terraphim-agent learn capture &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMMAND&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--error&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ERROR_OUTPUT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--exit-code&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXIT_CODE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The design is fail-open: if terraphim-agent is missing or crashes, the hook passes through silently. An observability tool must never break the tool it observes.&lt;/p&gt;
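
&lt;p&gt;A minimal sketch of that fail-open shape (plain bash; &lt;code&gt;terraphim-agent&lt;/code&gt; stands in for any capture binary, and the flags mirror the command above):&lt;/p&gt;

```shell
#!/bin/bash
# Minimal fail-open wrapper: if the capture binary is missing or errors,
# the hook still exits 0 so the observed command proceeds untouched.

capture_learning() {
    # Missing binary: pass through silently rather than failing the hook.
    if ! command -v terraphim-agent >/dev/null; then
        return 0
    fi
    # Swallow capture errors too; observability must never break the tool.
    terraphim-agent learn capture "$1" --error "$2" --exit-code "$3" || true
}

capture_learning "docker compse up" "command not found" 127
echo "hook exit: $?"    # prints "hook exit: 0" even with no terraphim-agent installed
```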

&lt;h3&gt;
  
  
  Layer 2: Safety guard (PreToolUse hook)
&lt;/h3&gt;

&lt;p&gt;Before any bash command executes, our &lt;code&gt;pre_tool_use.sh&lt;/code&gt; hook runs two checks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;terraphim-agent guard --json&lt;/code&gt; blocks destructive commands (rm -rf, git push --force, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;terraphim-agent replace --role "Terraphim Engineer"&lt;/code&gt; performs knowledge graph text replacement (npm -&amp;gt; bun, pip -&amp;gt; uv, etc.)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The guard blocks. The replacement corrects. Neither depends on the LLM remembering instructions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3: Query and correct
&lt;/h3&gt;

&lt;p&gt;Humans and agents query the learning database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List recent learnings&lt;/span&gt;
terraphim-agent learn list

&lt;span class="c"&gt;# Search by pattern (with synonym expansion via thesaurus)&lt;/span&gt;
terraphim-agent learn query &lt;span class="s2"&gt;"docker"&lt;/span&gt;

&lt;span class="c"&gt;# Add a correction to a captured learning&lt;/span&gt;
terraphim-agent learn correct 3 &lt;span class="nt"&gt;--correction&lt;/span&gt; &lt;span class="s2"&gt;"Use 'docker compose' (v2 plugin)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thesaurus has 20 semantic categories and 160+ synonym mappings. Search for "error" and you find "failure", "bug", "issue". Search for "setup" and you find "configuration", "install", "init". This is not keyword matching. It is structured retrieval.&lt;/p&gt;

&lt;h3&gt;
  
  
  What was missing
&lt;/h3&gt;

&lt;p&gt;The capture-query-correct loop is a journal. It records mistakes and lets you look them up. What it does not do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enforce rules mechanically.&lt;/strong&gt; A rule saying "never use pip" exists only in CLAUDE.md text that the LLM might or might not follow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify compliance at session start.&lt;/strong&gt; No sweep checks whether graduated rules are actually being obeyed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track improvement quantitatively.&lt;/strong&gt; No session scorecards. No trend data. No way to prove the system is getting better.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide an audit trail for rule changes.&lt;/strong&gt; Rules appear and disappear without record.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These gaps are exactly what the Meta Alchemist article addressed.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Meta Alchemist proposed
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://x.com/meta_alchemist/status/2038222105654022325" rel="noopener noreferrer"&gt;full guide&lt;/a&gt; describes a four-layer system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cognitive core&lt;/strong&gt; (CLAUDE.md): A decision framework Claude runs before writing code, plus completion criteria that must pass before any task is done.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specialised agents&lt;/strong&gt;: An architect (plans, read-only) and a reviewer (validates, read-only) that spawn as subagents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Path-scoped rules&lt;/strong&gt;: Security rules that only load when editing auth code. API design rules that only activate in handler directories. Keeps context lean.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evolution engine&lt;/strong&gt;: A memory system that captures corrections in JSONL files, runs verification sweeps at session start, generates session scorecards, and promotes patterns through a confidence ladder.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The genuinely good ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verify lines on every rule.&lt;/strong&gt; Each learned rule gets a machine-checkable grep pattern. &lt;code&gt;verify: Grep("\.\.\.options", path="src/api/") -&amp;gt; 0 matches&lt;/code&gt;. The sweep runs the grep and reports PASS/FAIL. This is brilliant. It turns instructions into guardrails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session scorecards.&lt;/strong&gt; Quantitative tracking of corrections received, rules checked, rules passed, violations found. Trend detection over time. If corrections are flat or increasing, the rules are not working.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Promotion ladder.&lt;/strong&gt; Corrected once = logged. Corrected twice = auto-promoted to permanent rule. In learned-rules for 10+ sessions = candidate for graduation to CLAUDE.md.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capacity management.&lt;/strong&gt; Max 50 lines in learned-rules.md forces graduation or pruning. Prevents unbounded growth.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key quote: "A rule without a verification check is a wish. A rule with a verification check is a guardrail. Only guardrails survive."&lt;/p&gt;

&lt;p&gt;We agree with the principle. We disagree with the implementation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where our approaches diverge
&lt;/h2&gt;

&lt;p&gt;The Meta Alchemist article builds everything from scratch using JSONL files parsed by the LLM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;corrections.jsonl&lt;/code&gt; -- user corrections as JSON objects&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;observations.jsonl&lt;/code&gt; -- verified discoveries&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;violations.jsonl&lt;/code&gt; -- rule violations caught by sweep&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sessions.jsonl&lt;/code&gt; -- session scorecards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a reasonable approach if you have no existing infrastructure. We do. terraphim-agent already provides structured file storage with frontmatter, synonym-expanded querying, project/global scoping, and correction chaining. Adding parallel JSONL files would create a split-brain problem: two sources of truth for the same data.&lt;/p&gt;

&lt;p&gt;The article also proposes auto-promotion: when the same correction appears twice, it automatically becomes a permanent rule. This is risky. A correction might be context-dependent (correct for one project, wrong for another). It might be a preference rather than a constraint. Auto-promotion without a quality gate means the system accumulates rules without human judgement about which ones deserve to be permanent.&lt;/p&gt;

&lt;p&gt;Our approach: &lt;strong&gt;capture in terraphim-agent, verify with shell scripts, promote with CTO approval.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The verification layer we added
&lt;/h2&gt;

&lt;p&gt;Three new components, all configuration. No Rust code changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  learned-rules.md: graduated rules with verify patterns
&lt;/h3&gt;

&lt;p&gt;The file lives at &lt;code&gt;.claude/memory/learned-rules.md&lt;/code&gt;. Each rule has three parts: the constraint text, a machine-checkable verify pattern, and a source annotation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Learned Rules&lt;/span&gt;

Rules graduated from terraphim-agent corrections and CLAUDE.md conventions.
Each rule has a &lt;span class="sb"&gt;`verify:`&lt;/span&gt; pattern checked by the /boot verification sweep.
&lt;span class="p"&gt;
---

-&lt;/span&gt; Never use pip, pip3, or pipx; always use uv instead.
  verify: Grep("pip install|pip3 install|pipx install", path="automation/") -&amp;gt; 0 matches
  [source: CLAUDE.md convention, terraphim-agent learning #4, 2026-03-30]
&lt;span class="p"&gt;
-&lt;/span&gt; Never use npm, yarn, or pnpm; always use bun instead.
  verify: Grep("npm install|yarn add|pnpm add", path="automation/") -&amp;gt; 0 matches
  [source: CLAUDE.md convention, terraphim KG hook replacement, 2026-03-30]
&lt;span class="p"&gt;
-&lt;/span&gt; Never use double dashes in document titles or markdown headings.
  verify: Grep("^#.&lt;span class="err"&gt;*&lt;/span&gt;--", path="knowledge/") -&amp;gt; 0 matches
  [source: corrected 2x, terraphim-agent learning #5, 2026-03-30]
&lt;span class="p"&gt;
-&lt;/span&gt; Never hardcode API keys as default values in bash scripts.
  verify: Grep("API_KEY=.[a-zA-Z0-9]", path="automation/") -&amp;gt; 0 matches
  [source: MEMORY.md security lesson, 2026-03-30]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The format is deliberately simple. No JSON. No YAML frontmatter. Just markdown that a human can read and a shell script can parse. The verify line follows a consistent pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;verify: Grep&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"regex_pattern"&lt;/span&gt;, &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"scope/"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; -&amp;gt; N matches
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;-&amp;gt; 0 matches&lt;/code&gt; means the pattern should NOT appear (absence check) and &lt;code&gt;-&amp;gt; 1+ matches&lt;/code&gt; means the pattern MUST appear (presence check).&lt;/p&gt;

&lt;p&gt;Rules without a verify line are flagged as technical debt during evolution review.&lt;/p&gt;

&lt;h3&gt;
  
  
  verify-sweep.sh: the verification engine
&lt;/h3&gt;

&lt;p&gt;The core script parses &lt;code&gt;learned-rules.md&lt;/code&gt;, extracts each verify line, runs the check, and reports PASS/FAIL. It uses &lt;code&gt;rg&lt;/code&gt; (ripgrep) when available for native output limiting (no SIGPIPE issues from pipe chains).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Verification Sweep: parse learned-rules.md, run verify: checks, report PASS/FAIL&lt;/span&gt;
&lt;span class="c"&gt;# Always exits 0 (advisory tool, never blocks).&lt;/span&gt;

&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-uo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;RULES_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="p"&gt;.claude/memory/learned-rules.md&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;PROJECT_ROOT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;git rev-parse &lt;span class="nt"&gt;--show-toplevel&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;RG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;which rg 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="nv"&gt;TOTAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;PASSED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;FAILED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nv"&gt;MANUAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="nv"&gt;current_rule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;

&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; line&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="c"&gt;# Capture rule text (lines starting with "- ")&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | rg &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s1"&gt;'^\s*- .+'&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nv"&gt;current_rule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^\s*- //'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;

    &lt;span class="c"&gt;# Process verify: lines&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | rg &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="s1"&gt;'^\s*verify:'&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nv"&gt;TOTAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;TOTAL &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;

        &lt;span class="c"&gt;# Skip manual checks&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | rg &lt;span class="nt"&gt;-qi&lt;/span&gt; &lt;span class="s1"&gt;'manual'&lt;/span&gt; 2&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
            &lt;/span&gt;&lt;span class="nv"&gt;MANUAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;MANUAL &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;
            &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SKIP: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt; (manual check)"&lt;/span&gt;
            &lt;span class="k"&gt;continue
        fi&lt;/span&gt;

        &lt;span class="c"&gt;# Extract pattern, path, and expected count&lt;/span&gt;
        &lt;span class="nv"&gt;pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s1"&gt;'s/.*Grep("\([^"]*\)".*/\1/p'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
        &lt;span class="nv"&gt;path_scope&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s1"&gt;'s/.*path="\([^"]*\)".*/\1/p'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
        &lt;span class="nv"&gt;expected&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s1"&gt;'s/.*-&amp;gt; \([0-9]*\).*/\1/p'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

        &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$pattern&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;MANUAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;MANUAL &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SKIP: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt; (unparseable)"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;search_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;path_scope&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROJECT_ROOT&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;search_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROJECT_ROOT&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

        &lt;span class="c"&gt;# Count matches&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
            &lt;/span&gt;&lt;span class="nv"&gt;match_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$pattern&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
                | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;: &lt;span class="s1"&gt;'{s+=$NF} END {print s+0}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
        &lt;/span&gt;&lt;span class="k"&gt;else
            &lt;/span&gt;&lt;span class="nv"&gt;match_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rEc&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$pattern&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
                | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;: &lt;span class="s1"&gt;'{s+=$NF} END {print s+0}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
        &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;

        &lt;span class="c"&gt;# Compare against expectation&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$expected&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
            if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$match_count&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
                &lt;/span&gt;&lt;span class="nv"&gt;PASSED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;PASSED &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"PASS: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;else
                &lt;/span&gt;&lt;span class="nv"&gt;FAILED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;FAILED &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;
                &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"FAIL: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt; (found &lt;/span&gt;&lt;span class="nv"&gt;$match_count&lt;/span&gt;&lt;span class="s2"&gt; matches, expected 0)"&lt;/span&gt;
                &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;--max-count&lt;/span&gt; 1 &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$pattern&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$search_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
                    | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-3&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^/  &amp;gt;&amp;gt; /'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
            &lt;/span&gt;&lt;span class="k"&gt;fi
        else
            if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$match_count&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-gt&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
                &lt;/span&gt;&lt;span class="nv"&gt;PASSED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;PASSED &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"PASS: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;else
                &lt;/span&gt;&lt;span class="nv"&gt;FAILED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;FAILED &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;
                &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"FAIL: &lt;/span&gt;&lt;span class="nv"&gt;$current_rule&lt;/span&gt;&lt;span class="s2"&gt; (found 0 matches, expected 1+)"&lt;/span&gt;
            &lt;span class="k"&gt;fi
        fi
    fi
done&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RULES_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"--- Verification Summary ---"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Rules checked: &lt;/span&gt;&lt;span class="nv"&gt;$TOTAL&lt;/span&gt;&lt;span class="s2"&gt; | Passed: &lt;/span&gt;&lt;span class="nv"&gt;$PASSED&lt;/span&gt;&lt;span class="s2"&gt; | Failed: &lt;/span&gt;&lt;span class="nv"&gt;$FAILED&lt;/span&gt;&lt;span class="s2"&gt; | Skipped: &lt;/span&gt;&lt;span class="nv"&gt;$MANUAL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Real output from our production environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PASS: Never use pip, pip3, or pipx; always use uv instead.
PASS: Never use npm, yarn, or pnpm; always use bun instead.
FAIL: Never use double dashes in document titles or markdown headings. (found 252 matches, expected 0)
  &amp;gt;&amp;gt; knowledge/mem-layer-graph-memory.md:1:# Mem-Layer -- Graph-Based AI Memory System
  &amp;gt;&amp;gt; knowledge/claude-1m-context-ga.md:7:# Claude 1M Context Window -- General Availability
  &amp;gt;&amp;gt; knowledge/ukri-funding-opportunities-2026.md:7:# UKRI Funding Opportunities -- 2026 Active/Recent
FAIL: Use British English spelling in all generated content. (found 316 matches, expected 0)
  &amp;gt;&amp;gt; knowledge/topics/context-engineering.md:41:- locality-of-behavior-dev-community.md
  &amp;gt;&amp;gt; knowledge/topics/conway-vs-strongdm-identity.md:49:- Govern AI agent behavior at runtime
SKIP: Always run date before date-sensitive operations. (manual check)
PASS: Never hardcode API keys as default values in bash scripts.
PASS: Never use git commit --amend in pre-push hooks.

--- Verification Summary ---
Rules checked: 7 | Passed: 4 | Failed: 2 | Skipped: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two failures are from imported external articles that use American English spelling and double dashes. These are expected: imported content is not generated content. This kind of nuance is exactly why we do not auto-fix violations. The sweep surfaces them; the human decides what to do.&lt;/p&gt;

&lt;h3&gt;
  
  
  /boot skill: session-start verification
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;/boot&lt;/code&gt; skill wraps the verification sweep into a session-start ritual:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;date&lt;/code&gt; to establish the actual current date (never trust stale context)&lt;/li&gt;
&lt;li&gt;Read &lt;code&gt;learned-rules.md&lt;/code&gt; to load all graduated rules&lt;/li&gt;
&lt;li&gt;Execute &lt;code&gt;verify-sweep.sh&lt;/code&gt; to check compliance&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;terraphim-agent learn list&lt;/code&gt; to surface recent learnings&lt;/li&gt;
&lt;li&gt;Report a one-line summary
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Boot complete: 7 rules checked, 4 passed, 2 failed. 10 recent learnings loaded.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Meta Alchemist article proposes a &lt;code&gt;SessionStart&lt;/code&gt; hook to trigger this automatically, but Claude Code's documented hook system has no &lt;code&gt;SessionStart&lt;/code&gt; type, so that hook would never fire. We invoke &lt;code&gt;/boot&lt;/code&gt; manually at the start of each session instead. A manual invocation that runs reliably beats an automatic hook that does not exist.&lt;/p&gt;
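&lt;p&gt;As a rough illustration, the five-step ritual can be sketched as a plain shell script (a hedged sketch: the paths follow this post, and each step degrades gracefully when a component is missing, so it is safe to run anywhere):&lt;/p&gt;

```shell
#!/bin/sh
# Hedged sketch of the /boot ritual as a standalone script.
# Paths follow this post; each step is skipped cleanly if its input is missing.
RULES_FILE=".claude/memory/learned-rules.md"
SWEEP="automation/learning/verify-sweep.sh"

today=$(date +%Y-%m-%d)                      # 1. establish the real current date
echo "Today: $today"

if [ -f "$RULES_FILE" ]; then                # 2. load graduated rules
    rule_count=$(wc -l "$RULES_FILE" | awk '{print $1}')
else
    rule_count=0
fi
echo "Rules file lines: $rule_count"

if [ -f "$SWEEP" ]; then                     # 3. run the verification sweep
    bash "$SWEEP"
else
    echo "verify-sweep.sh not found; skipping sweep"
fi

if agent=$(command -v terraphim-agent); then # 4. surface recent learnings
    "$agent" learn list
else
    echo "terraphim-agent not installed; skipping learnings"
fi

echo "Boot check finished for $today"        # 5. one-line summary
```

&lt;p&gt;In practice the skill runs these steps through Claude Code rather than as a standalone script; the sketch only makes the sequence concrete.&lt;/p&gt;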




&lt;h2&gt;
  
  
  The evolution engine
&lt;/h2&gt;

&lt;p&gt;Verification tells you what is wrong. Evolution fixes it over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  /evolve skill: weekly review with approval gate
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;/evolve&lt;/code&gt; skill is the mechanism by which the system improves. It runs weekly (or on demand) and does the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gathers corrections&lt;/strong&gt; from &lt;code&gt;terraphim-agent learn list&lt;/code&gt; (recent captures and corrections)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reads current rules&lt;/strong&gt; from &lt;code&gt;learned-rules.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reads the evolution log&lt;/strong&gt; from &lt;code&gt;evolution-log.md&lt;/code&gt; (to avoid re-proposing rejected rules)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groups failures by pattern&lt;/strong&gt; and identifies repeat corrections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proposes changes&lt;/strong&gt; using a structured format:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;PROPOSE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PROMOTE&lt;/span&gt;
&lt;span class="na"&gt;Rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Never use timeout command on macOS (does not exist)&lt;/span&gt;
&lt;span class="na"&gt;Source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraphim-agent learning&lt;/span&gt; &lt;span class="c1"&gt;#8, #12&lt;/span&gt;
&lt;span class="na"&gt;Evidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Corrected twice across different sessions&lt;/span&gt;
&lt;span class="na"&gt;Verify&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Grep("timeout ", path="automation/") -&amp;gt; 0 matches&lt;/span&gt;
&lt;span class="na"&gt;Destination&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;learned-rules.md&lt;/span&gt;
&lt;span class="na"&gt;Risk&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Low. The command genuinely does not exist on macOS.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Waits for CTO approval.&lt;/strong&gt; No changes are applied until each proposal is individually approved, rejected, or modified.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logs everything&lt;/strong&gt; to &lt;code&gt;evolution-log.md&lt;/code&gt;: approved changes, rejected proposals, and the reasoning.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is where we differ most from the Meta Alchemist approach. Their system auto-promotes on the second correction. Ours proposes and waits. The CTO reviews and approves.&lt;/p&gt;

&lt;p&gt;Why? Because a correction is context. "Don't use pip" is correct for our projects. It is not correct for a project that deliberately uses pip. Auto-promotion assumes all corrections are universally true. They are not.&lt;/p&gt;

&lt;p&gt;Our principle: &lt;strong&gt;A rule without CTO approval is an assumption.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Promotion ladder
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Destination&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Failed command captured&lt;/td&gt;
&lt;td&gt;terraphim-agent learn database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human adds correction&lt;/td&gt;
&lt;td&gt;Same learning, enriched&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same correction appears twice&lt;/td&gt;
&lt;td&gt;Flagged for &lt;code&gt;/evolve&lt;/code&gt; review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approved during &lt;code&gt;/evolve&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;learned-rules.md&lt;/code&gt; with verify: pattern&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Passing for 5+ sessions&lt;/td&gt;
&lt;td&gt;Candidate for graduation to CLAUDE.md&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rejected during &lt;code&gt;/evolve&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;evolution-log.md&lt;/code&gt; (never re-proposed)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The ladder is one-way unless the CTO explicitly overrides. Rejected rules do not come back. Graduated rules do not regress. The evolution log is the audit trail that makes this provable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Session scorecards
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;session-scorecard.sh&lt;/code&gt; script generates a quantitative summary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=== Session Scorecard: 2026-03-30 ===

--- Recent Learnings ---
Total learnings in database: 10
With corrections: 2

--- Verification Sweep ---
Rules checked: 7 | Passed: 4 | Failed: 2 | Skipped: 1

--- Relevant Past Learnings ---
No learnings matching 'cto-executive-system'.

=== End Scorecard ===
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Over time, the trend data answers a fundamental question: &lt;strong&gt;is the system getting better?&lt;/strong&gt; If corrections decrease and pass rates increase, the evolution loop is working. If they are flat, the rules are too vague or too disconnected from actual work.&lt;/p&gt;
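&lt;p&gt;The trend check itself is mechanical. A minimal sketch (the summary-line format is taken from the sweep output above; the two sample lines are illustrative):&lt;/p&gt;

```shell
#!/bin/sh
# Hedged sketch: extract the Passed count from two scorecard summary lines
# (format taken from the sweep output shown above) and report the trend.
earlier="Rules checked: 7 | Passed: 3 | Failed: 3 | Skipped: 1"
latest="Rules checked: 7 | Passed: 4 | Failed: 2 | Skipped: 1"

passed_then=$(echo "$earlier" | sed 's/.*Passed: \([0-9]*\).*/\1/')
passed_now=$(echo "$latest" | sed 's/.*Passed: \([0-9]*\).*/\1/')

if [ "$passed_now" -gt "$passed_then" ]; then
    echo "Trend: improving ($passed_then then $passed_now rules passing)"
elif [ "$passed_now" -lt "$passed_then" ]; then
    echo "Trend: regressing"
else
    echo "Trend: flat; rules may be too vague or too disconnected from the work"
fi
```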




&lt;h2&gt;
  
  
  What we deliberately did not build
&lt;/h2&gt;

&lt;p&gt;Engineering is as much about what you leave out as what you put in. Here is what the Meta Alchemist article proposes that we skipped, and why.&lt;/p&gt;

&lt;h3&gt;
  
  
  JSONL files for corrections and observations
&lt;/h3&gt;

&lt;p&gt;The article creates &lt;code&gt;corrections.jsonl&lt;/code&gt;, &lt;code&gt;observations.jsonl&lt;/code&gt;, &lt;code&gt;violations.jsonl&lt;/code&gt;, and &lt;code&gt;sessions.jsonl&lt;/code&gt;. Each is an append-only log of JSON objects that Claude parses at session start.&lt;/p&gt;

&lt;p&gt;We already have &lt;code&gt;terraphim-agent learn&lt;/code&gt; which provides structured file storage, thesaurus-expanded querying, project/global scoping, and correction chaining. Adding JSONL files would create two sources of truth for the same data. The agent CLI is the single source.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auto-promotion on second correction
&lt;/h3&gt;

&lt;p&gt;The article promotes automatically when the same correction appears twice. We flag for review. The difference matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-promotion: fast, no human bottleneck, but accumulates rules without judgement&lt;/li&gt;
&lt;li&gt;Reviewed promotion: slower, requires CTO time, but every rule is intentional&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We chose reviewed promotion because our project spans multiple contexts (CTO executive system, Terraphim AI, client projects). A correction that is right in one context might be wrong in another. The human knows the difference.&lt;/p&gt;

&lt;h3&gt;
  
  
  SessionStart and Stop hooks
&lt;/h3&gt;

&lt;p&gt;The article configures &lt;code&gt;SessionStart&lt;/code&gt; and &lt;code&gt;Stop&lt;/code&gt; hooks in Claude Code's &lt;code&gt;settings.json&lt;/code&gt;. These hook types do not exist in Claude Code's documented hook system. The available types are &lt;code&gt;PreToolUse&lt;/code&gt;, &lt;code&gt;PostToolUse&lt;/code&gt;, and (in some versions) &lt;code&gt;SubagentStart&lt;/code&gt;. The article either assumes a future feature or describes a different version.&lt;/p&gt;

&lt;p&gt;We replaced the SessionStart hook with a &lt;code&gt;/boot&lt;/code&gt; skill. We replaced the Stop hook with a manual &lt;code&gt;session-scorecard.sh&lt;/code&gt; invocation. Both work reliably because they use mechanisms that actually exist.&lt;/p&gt;
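&lt;p&gt;For completeness, a &lt;code&gt;PostToolUse&lt;/code&gt; hook of the kind we rely on is configured in &lt;code&gt;.claude/settings.json&lt;/code&gt; roughly like this (a sketch: the &lt;code&gt;capture-failure.sh&lt;/code&gt; script name is hypothetical, and the exact schema may vary by Claude Code version):&lt;/p&gt;

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash automation/learning/capture-failure.sh"
          }
        ]
      }
    ]
  }
}
```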

&lt;h3&gt;
  
  
  Path-scoped rules
&lt;/h3&gt;

&lt;p&gt;The article loads different rule files based on which file Claude is editing: security rules for auth code, API design rules for handlers, performance rules everywhere. This is a genuine Claude Code feature (&lt;code&gt;.claude/rules/&lt;/code&gt; with &lt;code&gt;paths:&lt;/code&gt; frontmatter).&lt;/p&gt;

&lt;p&gt;We skipped it because our project is not a single codebase. The CTO executive system contains knowledge articles, automation scripts, domain models, plans, and publishing workflows. Path-scoped rules make sense for a web application with clear directory boundaries. They are premature for a heterogeneous knowledge system.&lt;/p&gt;
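&lt;p&gt;For teams that do want them, a path-scoped rule file might look like this (a sketch: the &lt;code&gt;paths:&lt;/code&gt; frontmatter key is as described above, but the directory glob and rule text are hypothetical):&lt;/p&gt;

```markdown
---
paths:
  - "src/auth/**"
---

# Security rules for auth code

- Never log raw credentials; redact before writing to any sink.
- Use constant-time comparison for token checks.
```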

&lt;h3&gt;
  
  
  Hard capacity cap
&lt;/h3&gt;

&lt;p&gt;The article enforces a maximum of 50 lines in &lt;code&gt;learned-rules.md&lt;/code&gt;. If you hit the cap, you must graduate or prune before adding more.&lt;/p&gt;

&lt;p&gt;This is a useful forcing function for projects that might accumulate hundreds of rules. We started with 7. When we approach a natural limit, &lt;code&gt;/evolve&lt;/code&gt; will recommend pruning. We do not need an artificial constraint to force a behaviour that good engineering practice already demands.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Meta Alchemist&lt;/th&gt;
&lt;th&gt;terraphim-agent + verify layer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JSONL files parsed by LLM&lt;/td&gt;
&lt;td&gt;Rust CLI with structured file storage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Capture trigger&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom evolution SKILL.md (auto-triggered)&lt;/td&gt;
&lt;td&gt;PostToolUse bash hook (fail-open)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Query mechanism&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLM reads and interprets JSONL&lt;/td&gt;
&lt;td&gt;CLI with thesaurus-expanded search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Verification&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Grep patterns in learned-rules.md&lt;/td&gt;
&lt;td&gt;Same (we adopted this idea)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Promotion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Auto on 2nd correction&lt;/td&gt;
&lt;td&gt;Manual via /evolve with CTO approval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audit trail&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;evolution-log.md&lt;/td&gt;
&lt;td&gt;Same (we adopted this idea)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Session scoring&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;sessions.jsonl (auto-written)&lt;/td&gt;
&lt;td&gt;session-scorecard.sh (manual)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-tool support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Code only&lt;/td&gt;
&lt;td&gt;Claude Code + OpenCode + any CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Safety guard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;settings.json deny list&lt;/td&gt;
&lt;td&gt;terraphim-agent guard (pattern matching)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Text replacement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not included&lt;/td&gt;
&lt;td&gt;terraphim-agent replace (KG-based)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The fundamental difference: Meta Alchemist builds a complete system inside Claude Code's configuration. We build a thin verification layer on top of an existing CLI. The CLI handles storage, querying, and correction chaining. The verification layer handles enforcement and evolution. Each does what it is good at.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where this fits in the broader landscape
&lt;/h3&gt;

&lt;p&gt;The idea of self-improving AI coding agents is not new. Several approaches exist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Devin's knowledge suggestions&lt;/strong&gt;: captures corrections as project-specific "knowledge" entries that load into future sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenClaw's metacognitive loops&lt;/strong&gt;: three-phase review cycle that captures Phase 2 findings as learnings for future Phase 1s&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ouroboros pattern&lt;/strong&gt;: self-modifying agents with constitutional guardrails, event sourcing, and multi-model review chains&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compound agency learning architecture&lt;/strong&gt;: six nested learning loops from failure-to-guardrail up to loop-evolution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our approach sits between Devin (simple capture) and Ouroboros (full self-modification). We capture automatically, verify mechanically, but promote deliberately. The human stays in the loop for rule changes. The machine handles enforcement.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;If you already have terraphim-agent configured with the PostToolUse hook (see the &lt;a href="https://zestic.ai/blog/terraphim-agent-learning-hooks" rel="noopener noreferrer"&gt;foundation post&lt;/a&gt;), adding the verification layer takes five steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create the directory structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; automation/learning .claude/skills/boot .claude/skills/evolve .claude/memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Seed learned-rules.md
&lt;/h3&gt;

&lt;p&gt;Start with 3 to 5 rules from your existing CLAUDE.md or project conventions. Each rule needs a verify pattern. If you cannot write a grep check for a rule, the rule is too vague.&lt;/p&gt;
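&lt;p&gt;A seeded entry might look like this (a sketch: the on-disk layout is inferred from what &lt;code&gt;verify-sweep.sh&lt;/code&gt; parses, so adjust it to whatever format your sweep script expects):&lt;/p&gt;

```markdown
## Rule: Never use pip, pip3, or pipx; always use uv instead.
verify: Grep("pip3? install", path="automation/") -> 0

## Rule: Never use npm, yarn, or pnpm; always use bun instead.
verify: Grep("(npm|yarn|pnpm) install", path="automation/") -> 0
```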

&lt;h3&gt;
  
  
  3. Write verify-sweep.sh
&lt;/h3&gt;

&lt;p&gt;Copy the script from above. Make it executable. Test it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x automation/learning/verify-sweep.sh
bash automation/learning/verify-sweep.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see PASS/FAIL for each rule. If a rule fails, either fix the violation or refine the verify pattern.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Create the /boot and /evolve skills
&lt;/h3&gt;

&lt;p&gt;These are SKILL.md files in &lt;code&gt;.claude/skills/boot/&lt;/code&gt; and &lt;code&gt;.claude/skills/evolve/&lt;/code&gt;. The boot skill runs the sweep and surfaces learnings. The evolve skill reviews corrections and proposes rule changes. Full skill definitions are in our &lt;a href="https://github.com/terraphim/terraphim-ai" rel="noopener noreferrer"&gt;repository&lt;/a&gt;.&lt;/p&gt;
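&lt;p&gt;The shape of the boot skill, in outline (a sketch: the frontmatter fields are our assumption about the SKILL.md format; the repository has the full definitions):&lt;/p&gt;

```markdown
---
name: boot
description: Verify learned rules and surface recent learnings at session start
---

1. Run `date` and state the actual current date.
2. Read `.claude/memory/learned-rules.md`.
3. Execute `bash automation/learning/verify-sweep.sh`.
4. Run `terraphim-agent learn list`.
5. Report a one-line summary of passes, failures, and learnings.
```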

&lt;h3&gt;
  
  
  5. Add to CLAUDE.md
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### Learning Evolution System&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Run /boot at session start to verify learned rules and surface past learnings
&lt;span class="p"&gt;-&lt;/span&gt; Run /evolve weekly to review corrections and propose rule promotions
&lt;span class="p"&gt;-&lt;/span&gt; Graduated rules with verify: patterns: .claude/memory/learned-rules.md
&lt;span class="p"&gt;-&lt;/span&gt; Evolution audit trail: .claude/memory/evolution-log.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Meta Alchemist article gave us the idea we were missing: &lt;strong&gt;machine-checkable verification patterns on every rule&lt;/strong&gt;. That single concept transforms a learning journal into an immune system. We credit the article for the insight.&lt;/p&gt;

&lt;p&gt;What we brought to the table: a Rust CLI that already handles capture, storage, querying, and correction chaining. The verification layer is 100 lines of bash on top of a structured backend, not 500 lines of JSONL parsing instructions for an LLM.&lt;/p&gt;

&lt;p&gt;The combination works. Corrections are captured automatically by the PostToolUse hook. Rules are verified mechanically by the sweep script. Promotions are approved deliberately by a human. The system gets better every week, and we can prove it with session scorecards.&lt;/p&gt;

&lt;p&gt;Two principles emerged from building this:&lt;/p&gt;

&lt;p&gt;From Meta Alchemist: &lt;strong&gt;A rule without a verification check is a wish.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;From us: &lt;strong&gt;A rule without CTO approval is an assumption.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Only verified, approved guardrails survive.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The terraphim-agent learning system is open source at &lt;a href="https://github.com/terraphim/terraphim-ai" rel="noopener noreferrer"&gt;github.com/terraphim/terraphim-ai&lt;/a&gt;. The verification layer described in this post is configuration, not code: shell scripts and markdown files on top of the existing CLI.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the third post in a series: &lt;a href="https://zestic.ai/blog/terraphim-agent-learning-hooks" rel="noopener noreferrer"&gt;Part 1: Configuring terraphim-agent for Claude Code and OpenCode&lt;/a&gt; | &lt;a href="https://zestic.ai/blog/terraphim-agent-verification-checklist" rel="noopener noreferrer"&gt;Part 2: Verification Checklist&lt;/a&gt; | Part 3: Self-Evolving Rules (this post)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>OpenClaw + Terraphim LLM Proxy: OpenAI, Z.ai GLM-5, and MiniMax M2.5</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Fri, 13 Feb 2026 20:17:04 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/openclaw-terraphim-llm-proxy-openai-zai-glm-5-and-minimax-m25-59m7</link>
      <guid>https://dev.to/alexmikhalev/openclaw-terraphim-llm-proxy-openai-zai-glm-5-and-minimax-m25-59m7</guid>
<description>&lt;p&gt;If you want OpenClaw to use multiple providers through a single endpoint, the &lt;a href="https://github.com/terraphim/terraphim-llm-proxy" rel="noopener noreferrer"&gt;Terraphim AI intelligent LLM proxy&lt;/a&gt; gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI Codex (&lt;code&gt;gpt-5.2&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Z.ai (&lt;code&gt;glm-5&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;MiniMax (&lt;code&gt;MiniMax-M2.5&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;intelligent keyword routing&lt;/li&gt;
&lt;li&gt;automatic fallback when a provider goes down&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide reflects a real build-in-public rollout on &lt;code&gt;terraphim-llm-proxy&lt;/code&gt;, including production debugging, fallback drills, and routing verification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this setup
&lt;/h2&gt;

&lt;p&gt;Most agent stacks fail at provider outages and model sprawl. A single proxy with explicit route chains keeps clients stable while you switch providers underneath.&lt;/p&gt;

&lt;h2&gt;
  
  
  Proxy config pattern
&lt;/h2&gt;

&lt;p&gt;Use route chains in &lt;code&gt;/etc/terraphim-llm-proxy/config.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[router]&lt;/span&gt;
&lt;span class="py"&gt;default&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"openai-codex,gpt-5.2-codex|zai,glm-5"&lt;/span&gt;
&lt;span class="py"&gt;think&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"openai-codex,gpt-5.2|minimax,MiniMax-M2.5|zai,glm-5"&lt;/span&gt;
&lt;span class="py"&gt;long_context&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"openai-codex,gpt-5.2|zai,glm-5"&lt;/span&gt;
&lt;span class="py"&gt;web_search&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"openai-codex,gpt-5.2|zai,glm-5"&lt;/span&gt;
&lt;span class="py"&gt;strategy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"fill_first"&lt;/span&gt;

&lt;span class="nn"&gt;[[providers]]&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"openai-codex"&lt;/span&gt;
&lt;span class="py"&gt;api_base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.openai.com/v1"&lt;/span&gt;
&lt;span class="py"&gt;api_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"oauth-token-managed-internally"&lt;/span&gt;
&lt;span class="py"&gt;models&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"gpt-5.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"gpt-5.2-codex"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"gpt-5.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;transformers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nn"&gt;[[providers]]&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"zai"&lt;/span&gt;
&lt;span class="py"&gt;api_base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.z.ai/api/paas/v4"&lt;/span&gt;
&lt;span class="py"&gt;api_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"$ZAI_API_KEY"&lt;/span&gt;
&lt;span class="py"&gt;models&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"glm-5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"glm-4.7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"glm-4.6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"glm-4.5"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;transformers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nn"&gt;[[providers]]&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"minimax"&lt;/span&gt;
&lt;span class="py"&gt;api_base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.minimax.io/anthropic"&lt;/span&gt;
&lt;span class="py"&gt;api_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"$MINIMAX_API_KEY"&lt;/span&gt;
&lt;span class="py"&gt;models&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"MiniMax-M2.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"MiniMax-M2.1"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;transformers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keep secrets in env, never inline.&lt;/p&gt;
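
&lt;p&gt;Assuming the proxy expands &lt;code&gt;$VAR&lt;/code&gt;-style values from the process environment (as the &lt;code&gt;"$ZAI_API_KEY"&lt;/code&gt; entries above suggest), the resolution logic amounts to something like this sketch; the helper name is hypothetical:&lt;/p&gt;

```python
import os
import re

def resolve_secret(value: str) -> str:
    # Hypothetical helper: expand a "$VAR"-style config value from the environment.
    match = re.fullmatch(r"\$(\w+)", value)
    if match:
        return os.environ.get(match.group(1), "")
    return value

os.environ["ZAI_API_KEY"] = "sk-demo"  # for illustration only
assert resolve_secret("$ZAI_API_KEY") == "sk-demo"
assert resolve_secret("inline-literal") == "inline-literal"
```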

&lt;h2&gt;
  
  
  OpenClaw config pattern
&lt;/h2&gt;

&lt;p&gt;In both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/home/alex/.openclaw/openclaw.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/home/alex/.openclaw/clawdbot.json&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;set the Terraphim provider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;baseUrl&lt;/code&gt;: &lt;code&gt;http://127.0.0.1:3456/v1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;api&lt;/code&gt;: &lt;code&gt;openai-completions&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;model ids include:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;openai-codex,gpt-5.2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;zai,glm-5&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;minimax,MiniMax-M2.5&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
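
&lt;p&gt;Put together, the provider entry could look roughly like the sketch below. The field names come straight from the bullets above; the surrounding structure is an assumption, since the exact OpenClaw schema is not reproduced here.&lt;/p&gt;

```json
{
  "providers": {
    "terraphim-proxy": {
      "baseUrl": "http://127.0.0.1:3456/v1",
      "api": "openai-completions",
      "models": [
        "openai-codex,gpt-5.2",
        "zai,glm-5",
        "minimax,MiniMax-M2.5"
      ]
    }
  }
}
```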

&lt;h2&gt;
  
  
  Intelligent routing example
&lt;/h2&gt;

&lt;p&gt;Add taxonomy file:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/etc/terraphim-llm-proxy/taxonomy/routing_scenarios/minimax_keyword_routing.md&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;route:: minimax, MiniMax-M2.5
priority:: 100
synonyms:: minimax, m2.5, minimax keyword, minimax route
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now a normal request containing a MiniMax keyword can route to MiniMax even when the requested model is generic.&lt;/p&gt;
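
&lt;p&gt;Conceptually, the taxonomy entry above gives the router a synonym list and a priority. A minimal sketch of that matching logic (illustrative only, not the proxy's actual implementation):&lt;/p&gt;

```python
# Hypothetical, simplified keyword router: each taxonomy entry maps
# synonyms to a (provider, model) route with a priority.
RULES = [
    {"route": ("minimax", "MiniMax-M2.5"), "priority": 100,
     "synonyms": {"minimax", "m2.5", "minimax keyword", "minimax route"}},
]

def route_for(prompt: str, default: tuple) -> tuple:
    """Pick the highest-priority rule whose synonym appears in the prompt."""
    text = prompt.lower()
    hits = [r for r in RULES if any(s in text for s in r["synonyms"])]
    if hits:
        return max(hits, key=lambda r: r["priority"])["route"]
    return default

assert route_for("use minimax for this", ("zai", "glm-5")) == ("minimax", "MiniMax-M2.5")
assert route_for("plain request", ("zai", "glm-5")) == ("zai", "glm-5")
```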

&lt;h2&gt;
  
  
  Validation commands
&lt;/h2&gt;

&lt;p&gt;Direct provider checks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3456/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'x-api-key: &amp;lt;PROXY_API_KEY&amp;gt;'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model":"openai-codex,gpt-5.2","messages":[{"role":"user","content":"Reply exactly: openai-ok"}],"stream":false}'&lt;/span&gt;

curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3456/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'x-api-key: &amp;lt;PROXY_API_KEY&amp;gt;'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model":"zai,glm-5","messages":[{"role":"user","content":"Reply exactly: zai-ok"}],"stream":false}'&lt;/span&gt;

curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3456/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'x-api-key: &amp;lt;PROXY_API_KEY&amp;gt;'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model":"minimax,MiniMax-M2.5","messages":[{"role":"user","content":"Reply exactly: minimax-ok"}],"stream":false}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fallback proof (simulate Codex outage):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /etc/hosts /tmp/hosts.bak
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'127.0.0.1 chatgpt.com'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/hosts &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null

curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://127.0.0.1:3456/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'x-api-key: &amp;lt;PROXY_API_KEY&amp;gt;'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model":"gpt-5.2","messages":[{"role":"user","content":"Reply exactly: fallback-ok"}],"stream":false}'&lt;/span&gt;

&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /tmp/hosts.bak /etc/hosts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for fallback logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; terraphim-llm-proxy &lt;span class="nt"&gt;-n&lt;/span&gt; 120 &lt;span class="nt"&gt;--no-pager&lt;/span&gt; | rg &lt;span class="s1"&gt;'Primary target failed, attempting fallback target|next_provider='&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key lesson
&lt;/h2&gt;

&lt;p&gt;Reliable multi-model routing is mostly configuration discipline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;explicit route chains&lt;/li&gt;
&lt;li&gt;provider-specific endpoint handling where needed&lt;/li&gt;
&lt;li&gt;deterministic fallback order&lt;/li&gt;
&lt;li&gt;logs that prove routing decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps OpenClaw simple and makes provider outages routine instead of incidents. Donate 3 USD to unlock the open-source proxy on &lt;a href="https://github.com/terraphim/terraphim-llm-proxy" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>minimax</category>
      <category>glm</category>
      <category>openai</category>
    </item>
    <item>
      <title>Deploy BERT Large Question Answering models with a tenth of the second's inference on CPU in Python</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Thu, 21 Jul 2022 12:52:47 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/deploy-bert-large-question-answering-models-with-a-tenth-of-the-seconds-inference-on-cpu-in-python-1nfp</link>
      <guid>https://dev.to/alexmikhalev/deploy-bert-large-question-answering-models-with-a-tenth-of-the-seconds-inference-on-cpu-in-python-1nfp</guid>
      <description>&lt;h3&gt;
  
  
  Deploy BERT Large Question Answering models with a tenth of a second's inference on CPU in Python
&lt;/h3&gt;

&lt;p&gt;How to deploy and benchmark a large uncased BERT model for a Question Answering API with ~0.088387-second inference&lt;/p&gt;

&lt;h4&gt;
  
  
  Summary of the article
&lt;/h4&gt;

&lt;p&gt;This article explores the challenges and opportunities of deploying a large BERT Question Answering Transformer model (bert-large-uncased-whole-word-masking-finetuned-squad) from Hugging Face, where &lt;a href="https://developer.redis.com/howtos/redisgears?utm_campaign=write_for_redis"&gt;RedisGears&lt;/a&gt; and &lt;a href="https://developer.redis.com/howtos/redisai/getting-started?utm_campaign=write_for_redis"&gt;RedisAI&lt;/a&gt; perform the heavy lifting while leveraging the in-memory datastore Redis. The end result is a Question Answering API with ~0.088387-second inference on the first run and nanosecond-scale responses on subsequent cached calls.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why do we need RedisAI?
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;In a data-science workload, you want to push high-performance hardware as close to 100% utilization as possible.&lt;/li&gt;
&lt;li&gt;In a user-facing workload, you want to distribute the load evenly so it never reaches 100%, leaving client-facing servers free to perform additional functions.&lt;/li&gt;
&lt;li&gt;In data science, you prefer to re-calculate results.&lt;/li&gt;
&lt;li&gt;In a client-facing application, you prefer to cache the results of calculations and fetch data from the cache as fast as possible, to drive a seamless customer experience.&lt;/li&gt;
&lt;/ul&gt;
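
&lt;p&gt;The last two bullets describe the classic cache-aside pattern. A minimal sketch in Python, with an in-process dict standing in for Redis and a sleep standing in for model inference:&lt;/p&gt;

```python
import time

cache = {}  # stands in for Redis

def slow_inference(question: str) -> str:
    time.sleep(0.01)  # pretend this is a multi-second BERT forward pass
    return f"answer to: {question}"

def answer(question: str) -> str:
    # Cache-aside: serve from cache when possible, compute and store otherwise
    if question not in cache:
        cache[question] = slow_inference(question)
    return cache[question]

assert answer("who?") == "answer to: who?"   # computed on first call
assert answer("who?") == "answer to: who?"   # served from cache afterwards
```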

&lt;p&gt;Some numbers for inspiration and why to read this article:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 transformers_plain_bert_qa.py 
airborne transmission of respiratory infections is the lack of established methods for the detection of airborne respiratory microorganisms
10.351818372 seconds

time curl -i -H "Content-Type: application/json" -X POST -d '{"search":"Who performs viral transmission among adults"}' http://localhost:8080/qasearch

real    0m0.747s
user    0m0.004s
sys 0m0.000s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Background
&lt;/h3&gt;

&lt;p&gt;In BERT Question Answering inference, the model selects an answer from a given text. In other words, BERT QA “thinks” through the following: “What is the answer from the text, assuming the answer to the question exists within the selected paragraph?”&lt;/p&gt;

&lt;p&gt;So it’s important to select text potentially containing an answer. A typical pattern is to use Wikipedia data to build &lt;a href="https://lilianweng.github.io/posts/2020-10-29-odqa/"&gt;Open Domain Question Answering&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our QA system is a medical domain-specific question/answering pipeline. Hence we need a first pipeline that turns data into a knowledge graph. This NLP pipeline is available at Redis LaunchPad, is fully &lt;a href="https://github.com/applied-knowledge-systems/the-pattern"&gt;open source&lt;/a&gt;, and is described in &lt;a href="https://dev.to/howtos/nlp"&gt;a previous article&lt;/a&gt;. Here is a 5-minute &lt;a href="https://www.youtube.com/watch?v=VgJ8DTX5Mt4"&gt;video&lt;/a&gt; describing it, and below, you will find an architectural overview:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hEbCHOhM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/960/1%2Ark_ifZ1FOKihhYeATnzcRw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hEbCHOhM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/960/1%2Ark_ifZ1FOKihhYeATnzcRw.png" alt="" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  BERT Question Answering pipeline and API
&lt;/h3&gt;

&lt;p&gt;In the BERT QA pipeline (or in any other modern NLP inference task), there are two steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tokenize text — turn text into numbers&lt;/li&gt;
&lt;li&gt;Run the inference — large matrix multiplication&lt;/li&gt;
&lt;/ol&gt;
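
&lt;p&gt;As a toy illustration of step 1, tokenization simply maps text to numeric ids; the vocabulary below is an arbitrary stand-in, whereas the real pipeline uses the Hugging Face BERT WordPiece tokenizer:&lt;/p&gt;

```python
# Toy illustration of "text -> numbers"; the ids are arbitrary stand-ins.
VOCAB = {"[CLS]": 101, "[SEP]": 102, "who": 2040, "performs": 10280, "viral": 13434}

def toy_tokenize(text: str) -> list:
    # Keep only known words, then add the special start/end tokens
    ids = [VOCAB[w] for w in text.lower().split() if w in VOCAB]
    return [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]

assert toy_tokenize("Who performs viral") == [101, 2040, 10280, 13434, 102]
```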

&lt;p&gt;With Redis, we have the opportunity to pre-compute everything and store it in memory, but how do we do it? Unlike a summarization task, the question is not known in advance, so we can’t pre-compute all possible answers. However, we can pre-tokenize all potential answers (i.e. all paragraphs in the dataset) using RedisGears:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def parse_sentence(record):
    import redisAI
    import numpy as np
    global tokenizer
    if not tokenizer:
        tokenizer=loadTokeniser()
    hash_tag="{%s}" % hashtag()

    for idx, value in sorted(record['value'].items(), key=lambda item: int(item[0])):
        tokens = tokenizer.encode(value, add_special_tokens=False, max_length=511, truncation=True, return_tensors="np")
        tokens = np.append(tokens,tokenizer.sep_token_id).astype(np.int64)
        tensor=redisAI.createTensorFromBlob('INT64', tokens.shape, tokens.tobytes())

        key_prefix='sentence:'
        sentence_key=remove_prefix(record['key'],key_prefix)
        token_key = f"tokenized:bert:qa:{sentence_key}:{idx}"
        redisAI.setTensorInKey(token_key, tensor)
        execute('SADD',f'processed_docs_stage3_tokenized{hash_tag}', token_key)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See the &lt;a href="https://github.com/applied-knowledge-systems/the-pattern-api/blob/156633b9934f1243775671ce6c18ff2bf471c0ce/qasearch/tokeniser_gears_redisai.py#L17"&gt;full code on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Then for each Redis Cluster shard, we pre-load the BERT QA model by downloading, exporting it into torchscript, then loading it into each shard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def load_bert():
    model_file = 'traced_bert_qa.pt'

    with open(model_file, 'rb') as f:
        model = f.read()
    startup_nodes = [{"host": "127.0.0.1", "port": "30001"}, {"host": "127.0.0.1", "port":"30002"}, {"host":"127.0.0.1", "port":"30003"}]
    cc = ClusterClient(startup_nodes = startup_nodes)
    hash_tags = cc.execute_command("RG.PYEXECUTE", "gb = GB('ShardsIDReader').map(lambda x:hashtag()).run()")[0]
    print(hash_tags)
    for hash_tag in hash_tags:
        print("Loading model bert-qa{%s}" %hash_tag.decode('utf-8'))
        cc.modelset('bert-qa{%s}' %hash_tag.decode('utf-8'), 'TORCH', 'CPU', model)
        print(cc.infoget('bert-qa{%s}' %hash_tag.decode('utf-8')))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://github.com/applied-knowledge-systems/the-pattern-api/blob/156633b9934f1243775671ce6c18ff2bf471c0ce/qasearch/export_load_bert.py"&gt;full code is available on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And when a question comes from the user, we tokenize and append the question to the list of potential answers before running the RedisAI model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;token_key = f"tokenized:bert:qa:{sentence_key}"
    # encode question
    input_ids_question = tokenizer.encode(question, add_special_tokens=True, truncation=True, return_tensors="np")
    t=redisAI.getTensorFromKey(token_key)
    input_ids_context=to_np(t,np.int64)
    # merge (append) with potential answer, context - is pre-tokenized paragraph
    input_ids = np.append(input_ids_question,input_ids_context)
    attention_mask = np.array([[1]*len(input_ids)])
    input_idss=np.array([input_ids])
    num_seg_a=input_ids_question.shape[1]
    num_seg_b=input_ids_context.shape[0]
    token_type_ids = np.array([0]*num_seg_a + [1]*num_seg_b)
    # create actual model runner for RedisAI
    modelRunner = redisAI.createModelRunner(f'bert-qa{hash_tag}')
    # make sure all types are correct
    input_idss_ts=redisAI.createTensorFromBlob('INT64', input_idss.shape, input_idss.tobytes())
    attention_mask_ts=redisAI.createTensorFromBlob('INT64', attention_mask.shape, attention_mask.tobytes())
    token_type_ids_ts=redisAI.createTensorFromBlob('INT64', token_type_ids.shape, token_type_ids.tobytes())
    redisAI.modelRunnerAddInput(modelRunner, 'input_ids', input_idss_ts)
    redisAI.modelRunnerAddInput(modelRunner, 'attention_mask', attention_mask_ts)
    redisAI.modelRunnerAddInput(modelRunner, 'token_type_ids', token_type_ids_ts)
    redisAI.modelRunnerAddOutput(modelRunner, 'answer_start_scores')
    redisAI.modelRunnerAddOutput(modelRunner, 'answer_end_scores')
    # run RedisAI model runner
    res = await redisAI.modelRunnerRunAsync(modelRunner)
    answer_start_scores=to_np(res[0],np.float32)
    answer_end_scores = to_np(res[1],np.float32)
    answer_start = np.argmax(answer_start_scores)
    answer_end = np.argmax(answer_end_scores) + 1
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end],skip_special_tokens = True))
    log("Answer "+str(answer))
    return answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out the &lt;a href="https://github.com/applied-knowledge-systems/the-pattern-api/blob/156633b9934f1243775671ce6c18ff2bf471c0ce/qasearch/qa_redisai_keymiss_no_cache_np.py#L34"&gt;full code, available on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The process for making a BERT QA API call looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R4Zxbm_0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2AXVlFo_JNqihOfqU4K9uicg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R4Zxbm_0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2AXVlFo_JNqihOfqU4K9uicg.png" alt="" width="880" height="495"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Architecture Diagram for BERT QA RedisGears and RedisAI&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here I use two remarkable features of RedisGears: capturing events on key miss, and using async/await to run RedisAI on each shard without locking the primary thread, so that the Redis Cluster can continue to serve other customers. For the benchmarks, caching of responses from RedisAI is &lt;a href="https://github.com/applied-knowledge-systems/the-pattern-api/blob/156633b9934f1243775671ce6c18ff2bf471c0ce/qasearch/qa_redisai_keymiss_no_cache_np.py#L29"&gt;disabled&lt;/a&gt;. If you are getting response times in nanoseconds rather than milliseconds on the second call, check that the line linked above is commented out.&lt;/p&gt;
&lt;h3&gt;
  
  
  Running the Benchmark
&lt;/h3&gt;

&lt;p&gt;Pre-requisites for running the benchmark:&lt;/p&gt;

&lt;p&gt;Assuming you are running Debian or Ubuntu and have Docker and docker-compose installed (or can create a virtual environment via conda), run the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone --recurse-submodules https://github.com/applied-knowledge-systems/the-pattern.git
cd the-pattern
./bootstrap_benchmark.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above commands should end with a curl call to the qasearch API, since Redis caching is disabled for the benchmark.&lt;/p&gt;

&lt;p&gt;Next, invoke curl like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;time curl -i -H "Content-Type: application/json" -X POST -d '{"search":"Who performs viral transmission among adults"}' [http://localhost:8080/qasearch](http://localhost:8080/qasearch)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expect the following output, or something similar based on your runtime environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Sun, 29 May 2022 12:05:39 GMT
Content-Type: application/json
Content-Length: 2120
Connection: keep-alive

{"links":[{"created_at":"2002","rank":13,"source":"C0001486","target":"C0152083"}],"results":[{"answer":"adenovirus","sentence":"The medium of 40 T150 flasks of adenovirus transducer dec CAR CHO cells yielded 0 5 1 my of purified msCEACAM1a 1 4 protein","sentencekey":"sentence:PMC125375.xml:{mG}:202","title":"Crystal structure of murine sCEACAM1a[1,4]: a coronavirus receptor in the CEA family"}] OUTPUT_REDUCTED}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I modified the output of the API for the benchmark to return results from all shards — even if the answer is empty. In the run above five shards return answers. The overall API call response takes less than one second with all additional hops to search in RedisGraph!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JIMKUwE---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2Ar0QGATTPREX4Tm1ZiEtalg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JIMKUwE---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2Ar0QGATTPREX4Tm1ZiEtalg.png" alt="" width="880" height="411"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Architecture Diagram for BERT QA API call&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Deep Dive into the Benchmark
&lt;/h3&gt;

&lt;p&gt;Let’s dig deeper into what’s happening under the hood:&lt;/p&gt;

&lt;p&gt;You should have a sentence key with a shard id, which you can get by looking at the “Cache key” entry in &lt;code&gt;docker logs -f rgcluster&lt;/code&gt;. In my setup the cache key is "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults". If it looks like a function call, that’s because it is one: it is triggered whenever the key is not present in the Redis Cluster, which for the benchmark is every time, since we disabled caching of the output.&lt;/p&gt;

&lt;p&gt;One more thing to figure out from the logs is the port of the shard corresponding to the hashtag, also known as the shard id. It is the text between the curly brackets (it looks like {6fd} above). The same value appears in the output of the export_load script. In my case the cache key was found in "30012.log", so my port is 30012.&lt;/p&gt;

&lt;p&gt;Next I run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;redis-cli -c -p 300012 -h 127.0.0.1 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then run the benchmark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;redis-benchmark -p 30012 -h 127.0.0.1 -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"
====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======
  10 requests completed in 0.04 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

10.00% &amp;lt;= 41 milliseconds
100.00% &amp;lt;= 41 milliseconds
238.10 requests per second
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are wondering, &lt;code&gt;-n&lt;/code&gt; is the number of requests; in this case the benchmark issues 10. You can also add:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--csv&lt;/code&gt; if you want the output in CSV format&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--precision 3&lt;/code&gt; if you want more decimal places in the millisecond figures&lt;/p&gt;

&lt;p&gt;More information about the benchmarking tool can be found on the &lt;a href="https://redis.io/topics/benchmarks"&gt;redis.io Benchmarks page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you don’t have redis-tools installed locally, you can use Docker as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker exec -it rgcluster /bin/bash
redis-benchmark -p 30012 -h 127.0.0.1 -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"
====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======
  10 requests completed in 1.75 seconds
  50 parallel clients
  99 bytes payload
  keep alive: 1
  host configuration "save":
  host configuration "appendonly": no
  multi-thread: no

Latency by percentile distribution:
0.000% &amp;lt;= 243.711 milliseconds (cumulative count 1)
50.000% &amp;lt;= 987.135 milliseconds (cumulative count 5)
75.000% &amp;lt;= 1577.983 milliseconds (cumulative count 8)
87.500% &amp;lt;= 1662.975 milliseconds (cumulative count 9)
93.750% &amp;lt;= 1744.895 milliseconds (cumulative count 10)
100.000% &amp;lt;= 1744.895 milliseconds (cumulative count 10)

Cumulative distribution of latencies:
0.000% &amp;lt;= 0.103 milliseconds (cumulative count 0)
10.000% &amp;lt;= 244.223 milliseconds (cumulative count 1)
20.000% &amp;lt;= 409.343 milliseconds (cumulative count 2)
30.000% &amp;lt;= 575.487 milliseconds (cumulative count 3)
40.000% &amp;lt;= 821.247 milliseconds (cumulative count 4)
50.000% &amp;lt;= 987.135 milliseconds (cumulative count 5)
60.000% &amp;lt;= 1157.119 milliseconds (cumulative count 6)
70.000% &amp;lt;= 1497.087 milliseconds (cumulative count 7)
80.000% &amp;lt;= 1577.983 milliseconds (cumulative count 8)
90.000% &amp;lt;= 1662.975 milliseconds (cumulative count 9)
100.000% &amp;lt;= 1744.895 milliseconds (cumulative count 10)

Summary:
  throughput summary: 5.73 requests per second
  latency summary (msec):
          avg min p50 p95 p99 max
     1067.296 243.584 987.135 1744.895 1744.895 1744.895
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The platform only has 20 articles and 8 Redis nodes (4 masters + 4 replicas), so relevance ranking will be skewed, but the deployment doesn’t need a lot of memory.&lt;/p&gt;

&lt;h4&gt;
  
  
  AI.INFO
&lt;/h4&gt;

&lt;p&gt;Now let’s check how long our RedisAI model runs on the {6fd} shard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;127.0.0.1:30012&amp;gt; AI.INFO bert-qa{6fd}
 1) "key"
 2) "bert-qa{6fd}"
 3) "type"
 4) "MODEL"
 5) "backend"
 6) "TORCH"
 7) "device"
 8) "CPU"
 9) "tag"
10) ""
11) "duration"
12) (integer) 8928136
13) "samples"
14) (integer) 58
15) "calls"
16) (integer) 58
17) "errors"
18) (integer) 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;bert-qa{6fd} is the key under which the actual (very large) model is saved. The AI.INFO command gives us a cumulative duration of 8928136 microseconds across 58 calls, which is approximately 154 milliseconds per call.&lt;/p&gt;

&lt;p&gt;Let’s double-check that by resetting the stats and then re-running the benchmark.&lt;/p&gt;

&lt;p&gt;First, reset the stats:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;127.0.0.1:30012&amp;gt; AI.INFO bert-qa{6fd} RESETSTAT
OK
127.0.0.1:30012&amp;gt; AI.INFO bert-qa{6fd}
 1) "key"
 2) "bert-qa{6fd}"
 3) "type"
 4) "MODEL"
 5) "backend"
 6) "TORCH"
 7) "device"
 8) "CPU"
 9) "tag"
10) ""
11) "duration"
12) (integer) 0
13) "samples"
14) (integer) 0
15) "calls"
16) (integer) 0
17) "errors"
18) (integer) 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, re-run the benchmark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;redis-benchmark -p 30012 -h 127.0.0.1 -n 10 get "bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults"
====== get bertqa{6fd}_PMC169038.xml:{6fd}:33_Who performs viral transmission among adults ======
  10 requests completed in 1.78 seconds
  50 parallel clients
  99 bytes payload
  keep alive: 1
  host configuration "save":
  host configuration "appendonly": no
  multi-thread: no

Latency by percentile distribution:
0.000% &amp;lt;= 188.927 milliseconds (cumulative count 1)
50.000% &amp;lt;= 995.839 milliseconds (cumulative count 5)
75.000% &amp;lt;= 1606.655 milliseconds (cumulative count 8)
87.500% &amp;lt;= 1692.671 milliseconds (cumulative count 9)
93.750% &amp;lt;= 1779.711 milliseconds (cumulative count 10)
100.000% &amp;lt;= 1779.711 milliseconds (cumulative count 10)

Cumulative distribution of latencies:
0.000% &amp;lt;= 0.103 milliseconds (cumulative count 0)
10.000% &amp;lt;= 189.183 milliseconds (cumulative count 1)
20.000% &amp;lt;= 392.191 milliseconds (cumulative count 2)
30.000% &amp;lt;= 540.159 milliseconds (cumulative count 3)
40.000% &amp;lt;= 896.511 milliseconds (cumulative count 4)
50.000% &amp;lt;= 996.351 milliseconds (cumulative count 5)
60.000% &amp;lt;= 1260.543 milliseconds (cumulative count 6)
70.000% &amp;lt;= 1456.127 milliseconds (cumulative count 7)
80.000% &amp;lt;= 1606.655 milliseconds (cumulative count 8)
90.000% &amp;lt;= 1692.671 milliseconds (cumulative count 9)
100.000% &amp;lt;= 1779.711 milliseconds (cumulative count 10)

Summary:
  throughput summary: 5.62 requests per second
  latency summary (msec):
          avg min p50 p95 p99 max
     1080.454 188.800 995.839 1779.711 1779.711 1779.711
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
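&lt;p&gt;As a quick sanity check (a sketch of my own, not part of the benchmark tooling), the throughput summary follows directly from the request count and elapsed time reported above:&lt;/p&gt;

```python
# Sanity-check the redis-benchmark summary: 10 requests completed in 1.78 seconds.
requests = 10
elapsed_s = 1.78
throughput = requests / elapsed_s
print(f"{throughput:.2f} requests per second")  # 5.62 requests per second
```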



&lt;p&gt;Now check the stats again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AI.INFO bert-qa{6fd}
 1) "key"
 2) "bert-qa{6fd}"
 3) "type"
 4) "MODEL"
 5) "backend"
 6) "TORCH"
 7) "device"
 8) "CPU"
 9) "tag"
10) ""
11) "duration"
12) (integer) 1767749
13) "samples"
14) (integer) 20
15) "calls"
16) (integer) 20
17) "errors"
18) (integer) 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The counters now show 20 calls with a cumulative duration of 1767749 microseconds, i.e. 88387.45 microseconds (~0.088 seconds) per call, which is pretty fast! Considering we started at 10 seconds per call, the benefits of using RedisAI in combination with RedisGears are obvious. The trade-off, however, is high memory usage.&lt;/p&gt;
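&lt;p&gt;The per-call figure comes straight from the AI.INFO counters: the duration field is cumulative inference time in microseconds across all calls (a quick back-of-the-envelope sketch):&lt;/p&gt;

```python
# AI.INFO reports cumulative inference time; divide by the call count for the average.
duration_us = 1_767_749   # "duration" field, microseconds
calls = 20                # "calls" field
per_call_us = duration_us / calls
print(f"{per_call_us:.2f} microseconds per call (~{per_call_us / 1e6:.6f} s)")
# 88387.45 microseconds per call (~0.088387 s)
```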

&lt;p&gt;There are many ways to optimize this deployment further. For example, you can add FP16 quantization and the ONNX runtime. If you want to try that, &lt;a href="https://github.com/applied-knowledge-systems/the-pattern-api/blob/7bcf021e537dc8d453036730f0a993dd52e1781f/qasearch/export_load_bert.py"&gt;this script&lt;/a&gt; is a good starting point.&lt;/p&gt;
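&lt;p&gt;To see why FP16 helps, here is a minimal NumPy sketch (illustrative only; the linked script does the real model export). Halving the width of every weight tensor halves the model's memory footprint, at the cost of reduced numeric precision:&lt;/p&gt;

```python
import numpy as np

# FP16 quantization: casting FP32 weights to FP16 halves their memory,
# at the cost of reduced numeric precision.
w32 = np.random.randn(1024, 1024).astype(np.float32)
w16 = w32.astype(np.float16)
print(w32.nbytes, "bytes vs", w16.nbytes, "bytes")  # 4194304 bytes vs 2097152 bytes
```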

&lt;h3&gt;
  
  
  Using Grafana to monitor RedisGears throughput, CPU, and Memory usage
&lt;/h3&gt;

&lt;p&gt;Thanks to the contribution of &lt;a href="https://volkovlabs.com/from-a-basic-redistimeseries-data-source-to-2-million-downloads-in-grafana-marketplace-9921ed9ac5a"&gt;Mikhail Volkov&lt;/a&gt;, we can now observe RedisGears and RedisGraph throughput and memory consumption using Grafana. When you cloned the repository, it also started a Grafana Docker container with pre-built dashboards for monitoring the Redis cluster (including RedisGears and RedisAI) and the graph node (Redis with RedisGraph). The “The Pattern” dashboard provides an overview with all the key benchmark metrics you care about:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_TTCkmBY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2Acb0U5fY0rbBGlbqvAtkQLg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_TTCkmBY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2Acb0U5fY0rbBGlbqvAtkQLg.png" alt="" width="880" height="356"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Grafana for RedisGraph&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--v00__1sx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2AaUvP_YGGFg44kMkaASy8lg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--v00__1sx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/1%2AaUvP_YGGFg44kMkaASy8lg.png" alt="" width="880" height="448"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Grafana for RedisCluster&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This post is in collaboration with Redis.&lt;/p&gt;

</description>
      <category>python</category>
      <category>redis</category>
      <category>nlp</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Announcing Reference Architecture for AI: Build on Redis with Python</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Thu, 16 Jun 2022 16:10:28 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/announcing-reference-architecture-for-ai-build-on-redis-with-python-445g</link>
      <guid>https://dev.to/alexmikhalev/announcing-reference-architecture-for-ai-build-on-redis-with-python-445g</guid>
      <description>&lt;p&gt;We launch in two full-featured articles — &lt;a href="https://reference-architecture.ai/docs/nlp/"&gt;NLP ML pipeline for&lt;/a&gt; turning unstructured JSON text into a knowledge graph and fresh off the press &lt;a href="https://reference-architecture.ai/docs/bert-qa-benchmarking/"&gt;Benchmarks for BERT Large Question Answering inference for RedisAI and RedisGears&lt;/a&gt; with Graphana Dashboards by &lt;a href="https://volkovlabs.com/from-a-basic-redistimeseries-data-source-to-2-million-downloads-in-grafana-marketplace-9921ed9ac5a"&gt;Mikhail Volkov&lt;/a&gt;. Further announcement below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://reference-architecture.ai/posts/post-0/"&gt;Announcing Reference Architecture for AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>reference</category>
      <category>ai</category>
    </item>
    <item>
      <title>Ethics of creativity: is it the same for scientists, engineers or other creative professions?</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Sat, 19 Feb 2022 00:17:11 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/ethics-of-creativity-is-it-the-same-for-scientists-engineers-or-other-creative-professions-12gm</link>
      <guid>https://dev.to/alexmikhalev/ethics-of-creativity-is-it-the-same-for-scientists-engineers-or-other-creative-professions-12gm</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--of243uUB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/0%2Ah_BZbya_w-1Plh-Y" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--of243uUB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1024/0%2Ah_BZbya_w-1Plh-Y" alt="" width="880" height="586"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Photo by Clark Tibbs on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There are common assumptions about ethics for scientists:&lt;/p&gt;

&lt;p&gt;That bioscientists working on even a harmless virus will put effort into containing it, and will think about the consequences of releasing such a virus into the public, even one as benign as forcing people to sneeze at 8 a.m.: all the drivers on a motorway sneezing together would have dire consequences.&lt;/p&gt;

&lt;p&gt;That physicists or radiologists working on new radioactive materials or tracing methods will refrain from spreading radioactive material in front of their labs just to see how much the passers-by glow.&lt;/p&gt;

&lt;p&gt;In the software industry, when you create a computer virus you can be called a hero, a hacker or a criminal, depending on whether your virus damages or repairs infrastructure (say, a virus patching vulnerable DNS servers), demands ransom, or destroys people’s valuables.&lt;/p&gt;

&lt;p&gt;But the ethics rules seem to work differently for content creators of numerous social networks.&lt;/p&gt;

&lt;p&gt;When you create a movie or cartoon that affects the human brain and damages children’s behaviour, what should that creative person be called?&lt;/p&gt;

&lt;p&gt;Through cognitive psychology research, we know how to trick humans into binge-watching, or into the slot-machine pull-to-refresh of social media, and we apply that knowledge. Even adults can’t resist it, yet even children’s cartoon series have a “hook” at the end of each episode, and the hook may target a PG6 or PG10 audience: children whose strategic brain functions will only develop over the next ten years or more.&lt;/p&gt;

&lt;p&gt;The variety of the content affecting our children is very substantial:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Noise: content like “kids playing with toys”. As a parent, I want my children to play with their toys, not watch other people playing with them.&lt;/li&gt;
&lt;li&gt;Behaviour damage: modern high-quality cartoons like Booba (and Masha &amp;amp; the Bear, Tom &amp;amp; Jerry, Grizzy &amp;amp; the Lemmings, and a large amount of video material tagged “kids” on YouTube). I can trace behaviour changes in my 5-year-old son depending on which cartoons he watches; other parents confirm this observation.&lt;/li&gt;
&lt;li&gt;Outdated archetypes: if you are the parent of a daughter, you may not agree with the archetype introduced by Cinderella, where the role of the good daughter is to do chores and to sit and wait for the fairy godmother and Prince Charming.&lt;/li&gt;
&lt;li&gt;Purely toxic and harmful content, which introduces self-harm and suicidal behaviour.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And there are creators and re-posters of such content. Some do it for money, some for fame and some out of ignorance.&lt;/p&gt;

&lt;p&gt;But it doesn’t have to be that way:&lt;/p&gt;

&lt;p&gt;Ethics is a personal choice of every person, and we can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make ethical choices personally, building on existing philosophical and religious ethical frameworks, whether Stoic philosophy or the Ten Commandments.&lt;/li&gt;
&lt;li&gt;Think about the consequences of our actions, long term and across the lifecycle of our community, our family and ourselves.&lt;/li&gt;
&lt;li&gt;Build not only profitable and lawful but also ethical and valuable products and services.&lt;/li&gt;
&lt;li&gt;Expand and build ethical AI to remind humans about their moral choices and biases.&lt;/li&gt;
&lt;li&gt;Build intelligent AI filters for ourselves and share them with others.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, why are the best and brightest minds of our data science and engineering community focused on building advanced noise generators like GPT-3?&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>creativity</category>
      <category>ethics</category>
      <category>leadership</category>
    </item>
    <item>
      <title>Soft skills for the modern world</title>
      <dc:creator>AlexMikhalev</dc:creator>
      <pubDate>Wed, 16 Feb 2022 09:28:27 +0000</pubDate>
      <link>https://dev.to/alexmikhalev/soft-skills-for-the-modern-world-f7c</link>
      <guid>https://dev.to/alexmikhalev/soft-skills-for-the-modern-world-f7c</guid>
      <description>&lt;p&gt;For data scientists and engineers, soft skills for the modern world are not the old &lt;a href="https://www.oxfordreference.com/view/10.1093/oi/authority.20110803105813319"&gt;Trivium&lt;/a&gt; — rhetoric, logic, and grammar. My view on the set of skills that allow you to navigate complexity in the modern world, common for engineers, architects and data scientists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Systems Engineering (Thinking): the ability to identify and select your own system of interest, its operational environment, emergent properties, processes and lifecycle. See INCOSE material and ISO 42010.&lt;/li&gt;
&lt;li&gt;The ability to identify domains, stakeholders, concerns and their needs/requirements. See INCOSE material, IEEE 15288:2015 and the NIST standard for Cyber-Physical Systems.&lt;/li&gt;
&lt;li&gt;The ability to formulate a hypothesis with supporting measurements (see “Feynman on Scientific Method”).&lt;/li&gt;
&lt;li&gt;Stakeholder management: working through conflicting goals and requirements using the Theory of Constraints (Thinking Tools); see the extensive material from the TOC community by E. Goldratt, E. Schragenheim and many others. For a short introduction, see Clark Ching’s “Bottleneck Rules”.&lt;/li&gt;
&lt;li&gt;Technical conflict resolution (e.g. “the metal shall be both soft and hard”): see TRIZ and its variations, TRIZ+ and BioTRIZ.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The above requires the base ability to work productively within a given time (GTD, Superfocus, Pomodoro) and to work with long texts (Zettelkasten), which leads to identifying and building a taxonomy and ontology of your own concepts.&lt;/p&gt;

&lt;p&gt;When you focus on the old Trivium alone, you can’t build complex systems or organisations (systems of systems), and you can’t make good decisions based on well-sounding arguments. To quote a conversation with one of my friends:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Friend: “Agile/DevOps is a method of delivering things faster/better/cheaper. Would you like it?”&lt;/li&gt;
&lt;li&gt;Me: Of course, yes.&lt;/li&gt;
&lt;li&gt;Friend: And I didn’t say anything meaningful until now.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building complex and fruitful systems requires precise communication and shared goals. Let’s focus on building complex, valuable, ethical systems.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>systemsthinking</category>
      <category>datascience</category>
      <category>softskills</category>
    </item>
  </channel>
</rss>
