<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AJAY SABLE</title>
    <description>The latest articles on DEV Community by AJAY SABLE (@ajay_sable_981452ee970263).</description>
    <link>https://dev.to/ajay_sable_981452ee970263</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2264179%2F1c551b07-e40e-462a-9877-dc2e2eb1fad8.jpeg</url>
      <title>DEV Community: AJAY SABLE</title>
      <link>https://dev.to/ajay_sable_981452ee970263</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ajay_sable_981452ee970263"/>
    <language>en</language>
    <item>
      <title>Same model. Different results. — AgentKit Benchmark + OpenCode Integration</title>
      <dc:creator>AJAY SABLE</dc:creator>
      <pubDate>Thu, 09 Apr 2026 09:59:51 +0000</pubDate>
      <link>https://dev.to/ajay_sable_981452ee970263/same-model-different-results-agentkit-benchmark-opencode-integration-3772</link>
      <guid>https://dev.to/ajay_sable_981452ee970263/same-model-different-results-agentkit-benchmark-opencode-integration-3772</guid>
      <description>&lt;p&gt;&lt;strong&gt;We open-sourced AgentKit&lt;/strong&gt; two weeks ago with zero guarantees anyone would care.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;400+ clones later&lt;/strong&gt; — we're shipping the biggest update yet. And we have benchmark data to back it up.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Quick note:&lt;/strong&gt; AgentKit Preview is our closed, in-development intelligence layer. The fully open-source AgentKit is live and ready to use today at &lt;a href="https://github.com/Ajaysable123/AgentKit" rel="noopener noreferrer"&gt;github.com/Ajaysable123/AgentKit&lt;/a&gt; — &lt;code&gt;npx agentkit-ai@latest init&lt;/code&gt; gets you running in seconds.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Live Benchmark — Gemma 4 31b · Same Model · Same Task
&lt;/h2&gt;

&lt;p&gt;Both runs used &lt;strong&gt;Gemma 4 31b&lt;/strong&gt; via OpenCode. The only difference was AgentKit Preview, which adds workflow enforcement, skill injection, and plan gates.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Vanilla OpenCode&lt;/th&gt;
&lt;th&gt;+ AgentKit Preview&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Structured planning before coding&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Plan approved before first edit&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅ Yes (40.6s review)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Task interruptions&lt;/td&gt;
&lt;td&gt;1x&lt;/td&gt;
&lt;td&gt;0x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Task completion&lt;/td&gt;
&lt;td&gt;20% (scaffolding only)&lt;/td&gt;
&lt;td&gt;80% (DER parser implemented)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard problem solved&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Without AgentKit:&lt;/strong&gt; Gemma 4 31b gave up on the hard part and shipped placeholder strings (&lt;code&gt;[ASN.1 Decoding Required]&lt;/code&gt;). It wrote no plan, ran no verification, and was interrupted once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With AgentKit:&lt;/strong&gt; the same Gemma 4 31b implemented a real custom ASN.1 DER parser, handled both UTCTime and GeneralizedTime, and built the expiration logic. It completed the task properly.&lt;/p&gt;
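
&lt;p&gt;For readers who have not met this problem before, here is a minimal Python sketch of the two time encodings and the expiry check. It is not the code Gemma produced in the benchmark: the names &lt;code&gt;parse_der_time&lt;/code&gt; and &lt;code&gt;is_expired&lt;/code&gt; are made up for illustration, and the sketch assumes short-form lengths and the &lt;code&gt;Z&lt;/code&gt;-terminated forms with seconds that X.509 validity fields use.&lt;/p&gt;

```python
from datetime import datetime, timezone

UTC_TIME_TAG = 0x17          # ASN.1 UTCTime: YYMMDDHHMMSSZ
GENERALIZED_TIME_TAG = 0x18  # ASN.1 GeneralizedTime: YYYYMMDDHHMMSSZ


def parse_der_time(data: bytes) -> datetime:
    """Decode one DER time TLV (tag, short-form length, ASCII value)."""
    tag, length = data[0], data[1]
    if length >= 0x80:
        raise ValueError("long-form lengths are out of scope for this sketch")
    text = data[2:2 + length].decode("ascii")
    if len(text) != length or not text.endswith("Z"):
        raise ValueError("expected a complete, Z-terminated time value")
    if tag == UTC_TIME_TAG:
        parsed = datetime.strptime(text, "%y%m%d%H%M%SZ")
        # RFC 5280: two-digit years 50-99 mean 19xx, 00-49 mean 20xx.
        year = parsed.year % 100
        parsed = parsed.replace(year=1900 + year if year >= 50 else 2000 + year)
    elif tag == GENERALIZED_TIME_TAG:
        parsed = datetime.strptime(text, "%Y%m%d%H%M%SZ")
    else:
        raise ValueError(f"unsupported tag 0x{tag:02x}")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(not_after: datetime, now: datetime) -> bool:
    """A certificate is expired once `now` is past its notAfter time."""
    return now > not_after
```

&lt;p&gt;The two tags differ only in how the year is written, which is why a placeholder string is tempting and why handling both counts as the hard part of the task.&lt;/p&gt;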

&lt;p&gt;The model didn't get smarter. AgentKit's workflow gates changed its behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plan gate&lt;/strong&gt; forced it to think through the DER parsing approach before writing code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval step&lt;/strong&gt; made it commit to solving the hard problem instead of sidestepping it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State machine&lt;/strong&gt; kept it accountable through RESEARCH → PLAN → EXECUTE → REVIEW&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What else just landed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🔌 Native OpenCode Integration
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fufri9nslyx3ohacw2n1a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fufri9nslyx3ohacw2n1a.png" alt="OpenCode Integration" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AgentKit now ships a &lt;strong&gt;native TUI plugin&lt;/strong&gt; for OpenCode that lives inside the terminal UI — not just in the system prompt.&lt;/p&gt;

&lt;p&gt;Select the &lt;code&gt;agentkit&lt;/code&gt; agent from the agent switcher and you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-loaded skills injected automatically&lt;/li&gt;
&lt;li&gt;Workflow gates (RESEARCH → PLAN → EXECUTE → REVIEW → SHIP)&lt;/li&gt;
&lt;li&gt;Mandatory approval dialogs before any code edit&lt;/li&gt;
&lt;li&gt;Memory context from previous sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🤖 Works With Any Model
&lt;/h3&gt;

&lt;p&gt;The skill router, workflow engine, and marketplace run entirely via CLI — no Claude API required. Tested on Gemma 4 31b, MiniMax M2.5, and Claude.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Works with any model in OpenCode&lt;/span&gt;
agentkit workflow transition RESEARCH
agentkit workflow approve
agentkit workflow transition EXECUTE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Open-source AgentKit (free — stable &amp;amp; ready to use):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx agentkit-ai@latest init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;👉 &lt;a href="https://github.com/Ajaysable123/AgentKit" rel="noopener noreferrer"&gt;github.com/Ajaysable123/AgentKit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AgentKit Preview (closed beta — in active development)&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;To everyone who cloned, starred, or tried AgentKit — thank you. This is just getting started. 🚀&lt;/p&gt;




</description>
      <category>opensource</category>
      <category>ai</category>
      <category>gemma</category>
      <category>devtools</category>
    </item>
    <item>
      <title>I studied 5 repos with 200K+ combined stars and built the tool they were all missing</title>
      <dc:creator>AJAY SABLE</dc:creator>
      <pubDate>Tue, 31 Mar 2026 07:02:07 +0000</pubDate>
      <link>https://dev.to/ajay_sable_981452ee970263/i-studied-5-repos-with-200k-combined-stars-and-built-the-tool-they-were-all-missing-4cko</link>
      <guid>https://dev.to/ajay_sable_981452ee970263/i-studied-5-repos-with-200k-combined-stars-and-built-the-tool-they-were-all-missing-4cko</guid>
      <description>&lt;p&gt;I build agentic AI systems for a living — multi-agent compliance pipelines, document orchestration, RAG-powered assistants. Claude Code is my daily driver.&lt;/p&gt;

&lt;p&gt;Last month, my Claude Code bill hit $213.&lt;/p&gt;

&lt;p&gt;Not because I was doing anything unusual. Standard development work. But I was burning tokens on skills that weren't relevant to my current task, re-explaining my project architecture every new session, and running Opus for tasks that Haiku could handle fine.&lt;/p&gt;

&lt;p&gt;So I spent a few weeks studying the most popular tools trying to solve pieces of this problem:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Repo&lt;/th&gt;
&lt;th&gt;⭐ Stars&lt;/th&gt;
&lt;th&gt;What it solves&lt;/th&gt;
&lt;th&gt;What it doesn't&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;Superpowers&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;108K&lt;/td&gt;
&lt;td&gt;Workflow methodology, TDD, subagent development&lt;/td&gt;
&lt;td&gt;No memory, no token optimization, no cost tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/thedotmack/claude-mem" rel="noopener noreferrer"&gt;claude-mem&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;39.9K&lt;/td&gt;
&lt;td&gt;Session memory persistence&lt;/td&gt;
&lt;td&gt;No skills, no workflow, no model routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/hesreallyhim/awesome-claude-code" rel="noopener noreferrer"&gt;awesome-claude-code&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;30.9K&lt;/td&gt;
&lt;td&gt;Curates 1,234+ skills&lt;/td&gt;
&lt;td&gt;It's a directory — no intelligence, no routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/ruvnet/ruflo" rel="noopener noreferrer"&gt;ruflo&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;24.8K&lt;/td&gt;
&lt;td&gt;Multi-agent swarm orchestration&lt;/td&gt;
&lt;td&gt;Complex, heavy, uses 7x tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ui-ux-pro-max-skill&lt;/td&gt;
&lt;td&gt;~500&lt;/td&gt;
&lt;td&gt;Design-specific SKILL.md&lt;/td&gt;
&lt;td&gt;Single domain only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The pattern was obvious: &lt;strong&gt;everyone built one layer. Nobody built the intelligence layer that ties them together while cutting your costs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built that layer. It's called &lt;a href="https://github.com/Ajaysable123/AgentKit" rel="noopener noreferrer"&gt;AgentKit&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx agentkit-ai@latest init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command. Detects your platform. Installs everything. Starts saving tokens immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  What AgentKit actually does
&lt;/h2&gt;

&lt;p&gt;Five layers, each solving a specific problem:&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Intelligent Skill Router
&lt;/h3&gt;

&lt;p&gt;This is the single biggest token saver.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; You install 50 skills or dump everything into CLAUDE.md. All of it loads into context on every prompt — even when you're debugging Python and your React, Docker, and GraphQL skills are just sitting there burning tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; A 3-tier classifier that runs on every prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Tier 1: Keyword regex (instant, free)
# Tier 2: Heuristic scoring (instant, free)  
# Tier 3: Haiku fallback for ambiguous prompts (~$0.0003)
&lt;/span&gt;
&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Python AttributeError...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;     &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="nf"&gt;debugging    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write jest tests...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="nf"&gt;testing      &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Add JWT auth to REST API...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nf"&gt;work     &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
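
&lt;p&gt;To make the first two tiers concrete, here is a rough Python sketch. The keyword patterns, hint weights, and the &lt;code&gt;classify&lt;/code&gt; function are all hypothetical (AgentKit's actual rules are not shown in this post), and the third tier is only signalled, not called. The weights were picked so that the JWT prompt lands at 0.50, as in the listing above.&lt;/p&gt;

```python
import re

# Hypothetical keyword patterns; AgentKit's real rule set is not shown in this post.
TIER1_PATTERNS = {
    "debugging": re.compile(r"\b(traceback|stack trace|\w+Error)\b", re.IGNORECASE),
    "testing": re.compile(r"\b(jest|pytest|unit tests?)\b", re.IGNORECASE),
}

# Hypothetical weighted hint words for the heuristic tier.
TIER2_HINTS = {
    "api-work": {"api": 0.25, "rest": 0.25, "endpoint": 0.25, "graphql": 0.25},
    "frontend": {"react": 0.5, "css": 0.25, "component": 0.25},
}


def classify(prompt: str, threshold: float = 0.5):
    """Return (category, confidence), or (None, 0.0) when a model fallback is needed."""
    # Tier 1: a regex hit is treated as a certain match.
    for category, pattern in TIER1_PATTERNS.items():
        if pattern.search(prompt):
            return category, 1.0
    # Tier 2: sum the weights of the hint words present in the prompt.
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    best_category, best_score = None, 0.0
    for category, hints in TIER2_HINTS.items():
        score = sum(weight for word, weight in hints.items() if word in words)
        if score > best_score:
            best_category, best_score = category, score
    if best_score >= threshold:
        return best_category, min(best_score, 1.0)
    # Tier 3 would ask a small model here; this sketch only signals the need.
    return None, 0.0
```

&lt;p&gt;Ordering the tiers from cheapest to most expensive means the paid fallback only runs for prompts that neither free tier can place.&lt;/p&gt;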



&lt;p&gt;It loads &lt;strong&gt;2-5 relevant skills&lt;/strong&gt; instead of all of them. And it uses progressive disclosure — skills load at 3 detail levels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Level 1:  ~50 tokens   (trigger description — always loaded)
Level 2:  ~500 tokens  (core instructions — loaded when task confirmed)
Level 3:  ~2,000 tokens (full references — loaded for complex work)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: 45,000 tokens/session → ~5,000 tokens/session. 89% reduction.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Plus a forced-eval hook that bumps skill activation from 20% to 84%:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# hooks/forced_eval.sh — PreToolUse hook&lt;/span&gt;
&lt;span class="nv"&gt;LOADED_SKILLS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENTKIT_LOADED_SKILLS&lt;/span&gt;&lt;span class="k"&gt;:-}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOADED_SKILLS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi
&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SKILL_EVAL: Before proceeding, check if any active skill applies: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;LOADED_SKILLS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one hook is probably worth the entire install.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 2: Project Memory Graph
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; Claude forgets everything between sessions. Every morning you re-explain your architecture, re-discover API patterns, re-establish conventions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; A SQLite knowledge graph that automatically captures entities from your coding session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- memory/schema.sql&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;entities&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- file, function, api_route, package, command&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="nb"&gt;REAL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;decisions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rationale&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- FTS5 for fast full-text search&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;VIRTUAL&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;entities_fts&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;fts5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
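
&lt;p&gt;Here is a small Python sketch of how such a schema can be written to and searched. It uses a trimmed copy of the tables above; the &lt;code&gt;remember&lt;/code&gt; and &lt;code&gt;recall&lt;/code&gt; helpers and the sample rows are hypothetical, and it assumes your Python's SQLite build includes FTS5. Because the FTS table is declared with &lt;code&gt;content=entities&lt;/code&gt;, the index has to be kept in sync explicitly.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    context TEXT,
    session_id TEXT
);
CREATE VIRTUAL TABLE entities_fts USING fts5(name, context, content=entities);
""")


def remember(name: str, type_: str, context: str, session_id: str) -> int:
    """Store an entity, then index it (external-content FTS5 needs explicit syncing)."""
    cur = conn.execute(
        "INSERT INTO entities (name, type, context, session_id) VALUES (?, ?, ?, ?)",
        (name, type_, context, session_id),
    )
    conn.execute(
        "INSERT INTO entities_fts (rowid, name, context) VALUES (?, ?, ?)",
        (cur.lastrowid, name, context),
    )
    return cur.lastrowid


def recall(query: str, limit: int = 5) -> list:
    """Return names of the entities that match the query, most relevant first."""
    rows = conn.execute(
        "SELECT e.name FROM entities_fts JOIN entities AS e ON e.id = entities_fts.rowid "
        "WHERE entities_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


remember("src/auth/jwt.py", "file", "issues and verifies JWT access tokens", "s1")
remember("POST /orders", "api_route", "creates an order after stock check", "s1")
print(recall("jwt"))
```

&lt;p&gt;Querying by the current task's keywords is what keeps the injected memory small: only the matching rows come back, not the whole graph.&lt;/p&gt;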



&lt;p&gt;At session end, it generates a Haiku-compressed handoff (~$0.0015). Next session, it injects only the memory nodes relevant to your current task — not everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result: 10,000 tokens of context → ~2,000 tokens. 80% reduction.&lt;/strong&gt; And your agent actually knows what happened yesterday.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 3: Token Budget Intelligence
&lt;/h3&gt;

&lt;p&gt;Three automatic optimizations that stack:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto model routing:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;is_subagent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_subagent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-haiku-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="c1"&gt;# Always cheapest for subagents
&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;has_complex_signals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;         &lt;span class="c1"&gt;# "architect", "security audit", etc.
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;           &lt;span class="c1"&gt;# $15/M tokens — only when needed
&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;has_simple_signals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;          &lt;span class="c1"&gt;# "find", "list", "rename", etc.
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-haiku-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="c1"&gt;# $0.25/M tokens
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;             &lt;span class="c1"&gt;# $3/M — the 80% default
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
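
&lt;p&gt;The two signal helpers are not shown above. One plausible way to write them, as a sketch only: the regex lists below are hypothetical and are seeded with just the words quoted in the comments, and &lt;code&gt;route_tier&lt;/code&gt; returns a tier name rather than a model ID so the example does not depend on any provider.&lt;/p&gt;

```python
import re

# Hypothetical signal lists, seeded with the words quoted in the comments above.
COMPLEX_SIGNALS = re.compile(r"\b(architect\w*|security audit)\b", re.IGNORECASE)
SIMPLE_SIGNALS = re.compile(r"\b(find|list|rename)\b", re.IGNORECASE)


def has_complex_signals(prompt: str) -> bool:
    return COMPLEX_SIGNALS.search(prompt) is not None


def has_simple_signals(prompt: str) -> bool:
    return SIMPLE_SIGNALS.search(prompt) is not None


def route_tier(prompt: str, is_subagent: bool = False) -> str:
    """Apply the same order of checks as route_model, returning a tier name."""
    if is_subagent:
        return "haiku"
    if has_complex_signals(prompt):
        return "opus"
    if has_simple_signals(prompt):
        return "haiku"
    return "sonnet"
```

&lt;p&gt;The order of the checks matters: a prompt such as "find the architecture doc" contains both kinds of signal, and checking the complex signals first sends it to the larger model.&lt;/p&gt;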



&lt;p&gt;&lt;strong&gt;Thinking budget tuning:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Trivial tasks (file search, formatting):  0 thinking tokens     → saves $0.48/request
Moderate tasks (bug fixes, features):     8,192 thinking tokens → saves $0.36/request
Complex tasks (architecture, security):   32,000 thinking tokens → full power
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
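
&lt;p&gt;The per-request savings above are consistent with pricing thinking tokens at $15 per million and comparing each tier against the full 32,000-token budget. That rate is inferred from the figures, not stated in this post, so treat the following as a back-of-the-envelope check with hypothetical names.&lt;/p&gt;

```python
PRICE_PER_MILLION_USD = 15.0  # assumed rate, inferred from the figures above
FULL_BUDGET = 32_000          # thinking tokens spent on complex tasks

BUDGETS = {"trivial": 0, "moderate": 8_192, "complex": 32_000}


def savings_per_request(tier: str) -> float:
    """Dollar saving versus always spending the full thinking budget."""
    unused_tokens = FULL_BUDGET - BUDGETS[tier]
    return round(unused_tokens * PRICE_PER_MILLION_USD / 1_000_000, 2)


for tier in BUDGETS:
    print(tier, savings_per_request(tier))
```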



&lt;p&gt;&lt;strong&gt;Real-time cost dashboard:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;💰 $0.034 | 🧠 Sonnet | ⚡ 12,450 tok | 📈 saved 32% vs baseline
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Combined result: ~$200/mo → ~$60/mo. 70% cost reduction.&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 4: Workflow Engine
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; AI agents jump straight to coding. No research. No plan. Then you spend 3 hours debugging something that a 5-minute plan would have prevented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; An enforced state machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IDLE → RESEARCH → PLAN → EXECUTE → REVIEW → SHIP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The plan gate hook literally blocks Edit/Write operations until a plan exists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# hooks/plan_gate.sh — PreToolUse hook&lt;/span&gt;
&lt;span class="c"&gt;# Blocks Edit/Write tools if workflow state is not PLAN or EXECUTE&lt;/span&gt;
&lt;span class="nv"&gt;TOOL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;STATE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;python3 &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;AGENTKIT_HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/workflow/enforcer.py"&lt;/span&gt; &lt;span class="nt"&gt;--action&lt;/span&gt; check&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TOOL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^&lt;span class="o"&gt;(&lt;/span&gt;Edit|Write&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$STATE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"PLAN"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$STATE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"EXECUTE"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"BLOCK: Cannot edit files without an approved plan. Run research first."&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
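
&lt;p&gt;The same rule can be sketched in Python as a tiny state machine. The &lt;code&gt;Workflow&lt;/code&gt; class is a hypothetical in-memory stand-in, not AgentKit's enforcer, and it assumes transitions only move one step forward along the diagram above.&lt;/p&gt;

```python
# Allowed forward transitions, taken from the state diagram above.
TRANSITIONS = {
    "IDLE": "RESEARCH",
    "RESEARCH": "PLAN",
    "PLAN": "EXECUTE",
    "EXECUTE": "REVIEW",
    "REVIEW": "SHIP",
}
EDIT_TOOLS = {"Edit", "Write"}
EDIT_STATES = {"PLAN", "EXECUTE"}


class Workflow:
    """Hypothetical stand-in for the enforcer; how AgentKit stores state is not shown."""

    def __init__(self) -> None:
        self.state = "IDLE"

    def transition(self, target: str) -> None:
        """Move one step forward, rejecting any skipped or backward step."""
        if TRANSITIONS.get(self.state) != target:
            raise ValueError(f"cannot go from {self.state} to {target}")
        self.state = target

    def allows(self, tool: str) -> bool:
        """Same rule as plan_gate.sh: edit tools are blocked outside PLAN and EXECUTE."""
        return tool not in EDIT_TOOLS or self.state in EDIT_STATES
```

&lt;p&gt;Starting from &lt;code&gt;IDLE&lt;/code&gt;, a call to &lt;code&gt;allows("Edit")&lt;/code&gt; returns false until the workflow has been moved through &lt;code&gt;RESEARCH&lt;/code&gt; into &lt;code&gt;PLAN&lt;/code&gt;, which is the behavior the hook enforces.&lt;/p&gt;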



&lt;p&gt;Quality gates run after every edit — syntax, lint, type checks, tests. Five languages supported: Python, TypeScript, JavaScript, Go, Rust.&lt;/p&gt;




&lt;h3&gt;
  
  
  Layer 5: Universal Platform Layer
&lt;/h3&gt;

&lt;p&gt;One SKILL.md file → 10 platforms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;Claude Code   → Native SKILL.md + full hooks&lt;/span&gt;
&lt;span class="s"&gt;Cursor        → .mdc rules in .cursor/rules/&lt;/span&gt;
&lt;span class="s"&gt;Codex CLI     → AGENTS.md sections&lt;/span&gt;
&lt;span class="s"&gt;Gemini CLI    → .gemini/GEMINI.md&lt;/span&gt;
&lt;span class="s"&gt;Antigravity   → Plugin YAML&lt;/span&gt;
&lt;span class="s"&gt;OpenCode      → Config JSON system prompt&lt;/span&gt;
&lt;span class="s"&gt;Windsurf      → Cascade rules&lt;/span&gt;
&lt;span class="s"&gt;Aider         → .aider.conf.yml&lt;/span&gt;
&lt;span class="s"&gt;Kilo Code     → Plugin YAML&lt;/span&gt;
&lt;span class="s"&gt;Augment       → Context file&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;npx agentkit-ai@latest init&lt;/code&gt; detects which platforms you have installed and configures the right format for each. Zero manual conversion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;Everything above has been smoke-tested with real prompts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Before AgentKit&lt;/th&gt;
&lt;th&gt;After AgentKit&lt;/th&gt;
&lt;th&gt;Change&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tokens per session (skills)&lt;/td&gt;
&lt;td&gt;~45,000&lt;/td&gt;
&lt;td&gt;~5,000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;89% ↓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory context tokens&lt;/td&gt;
&lt;td&gt;~10,000&lt;/td&gt;
&lt;td&gt;~2,000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;80% ↓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly cost&lt;/td&gt;
&lt;td&gt;~$200&lt;/td&gt;
&lt;td&gt;~$60&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;70% ↓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skill activation rate&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;td&gt;84%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.2x ↑&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Platforms supported&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Can skip planning&lt;/td&gt;
&lt;td&gt;Always&lt;/td&gt;
&lt;td&gt;Never&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Enforced&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  How it works with existing tools
&lt;/h2&gt;

&lt;p&gt;AgentKit doesn't replace Superpowers or claude-mem — it complements them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;With Superpowers:&lt;/strong&gt; AgentKit adds the memory, token optimization, and model routing that Superpowers doesn't have. Use Superpowers for methodology + AgentKit for intelligence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With claude-mem:&lt;/strong&gt; AgentKit's memory graph is more structured (entities + relationships + decisions vs flat text), but they solve the same core problem. Use whichever fits your workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With Ruflo swarms:&lt;/strong&gt; AgentKit can optimize Ruflo swarm costs by routing worker agents to Haiku and loading only relevant skills per agent. (Phase 3 roadmap.)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One command install&lt;/span&gt;
npx agentkit-ai@latest init

&lt;span class="c"&gt;# Check what's running&lt;/span&gt;
npx agentkit-ai@latest status

&lt;span class="c"&gt;# See your savings&lt;/span&gt;
npx agentkit-ai@latest costs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Ajaysable123/AgentKit" rel="noopener noreferrer"&gt;github.com/Ajaysable123/AgentKit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;npm:&lt;/strong&gt; &lt;code&gt;npm install -g agentkit-ai&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;MIT licensed. There are 16 open issues tagged "good first issue" if you want to contribute. Our first external contributor submitted 4 new skills via PR within 48 hours of launch.&lt;/p&gt;

&lt;p&gt;If it saves you money, star it ⭐. If something breaks, &lt;a href="https://github.com/Ajaysable123/AgentKit/issues" rel="noopener noreferrer"&gt;open an issue&lt;/a&gt;. PRs welcome — especially skills for languages and frameworks I haven't covered yet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Ajay — a Senior Gen AI Developer building agentic systems in production for FinTech and Logistics clients. I built AgentKit because I was tired of paying $200/month for Claude Code when 70% of those tokens were wasted. Follow me on &lt;a href="https://github.com/Ajaysable123" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/ajay-sable/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for updates on AgentKit and agentic AI development.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
