<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joske Vermeulen</title>
    <description>The latest articles on DEV Community by Joske Vermeulen (@ai_made_tools).</description>
    <link>https://dev.to/ai_made_tools</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3826720%2Fae1f6683-395f-4709-ba99-2212323b958e.png</url>
      <title>DEV Community: Joske Vermeulen</title>
      <link>https://dev.to/ai_made_tools</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ai_made_tools"/>
    <language>en</language>
    <item>
      <title>AI Dev Weekly #13: Microsoft Declares Independence — 7 In-House Models, Kills Claude Code, RTX Spark Dev Box</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Thu, 04 Jun 2026 14:11:45 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/ai-dev-weekly-13-microsoft-declares-independence-7-in-house-models-kills-claude-code-rtx-2lca</link>
      <guid>https://dev.to/ai_made_tools/ai-dev-weekly-13-microsoft-declares-independence-7-in-house-models-kills-claude-code-rtx-2lca</guid>
      <description>&lt;p&gt;&lt;em&gt;AI Dev Weekly is a Thursday series where I cover the week's most important AI developer news, with my take as someone who actually uses these tools daily.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Build 2026 was the story of the week. Seven in-house AI models — none trained on OpenAI data. A Surface mini PC with NVIDIA RTX Spark inside. Claude Code licenses cancelled, developers pushed to Copilot. This was Microsoft saying, loud and clear: we don't need OpenAI anymore. Meanwhile MiniMax dropped the first open-weight frontier multimodal model, and NVIDIA unveiled hardware that makes local AI actually practical.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Microsoft Build 2026: 7 models, zero OpenAI dependency
&lt;/h2&gt;

&lt;p&gt;Microsoft unveiled its MAI (Microsoft AI) model family at Build on June 2. The headline is &lt;strong&gt;MAI-Thinking-1&lt;/strong&gt;, a 35B reasoning model trained entirely on commercially licensed enterprise data with no distillation from GPT or any OpenAI model. According to the announcement, it matches Claude Sonnet 4.6 on key benchmarks at up to 10× better cost efficiency.&lt;/p&gt;

&lt;p&gt;Other MAI models announced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MAI-Code-1-Flash&lt;/strong&gt; (5B) — Purpose-built for GitHub Copilot and VS Code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAI-Thinking-1&lt;/strong&gt; (35B) — Reasoning, multi-step instructions, long context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aion 1.0 Instruct&lt;/strong&gt; — Local Windows model for on-device reasoning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aion 1.0 Plan&lt;/strong&gt; — Local Windows model for planning and tool use&lt;/li&gt;
&lt;li&gt;Plus 3 more across transcription, speech, and images&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also announced: &lt;strong&gt;Windows as "agent-native runtime"&lt;/strong&gt; with Microsoft Execution Containers (MXC) — sandboxed environments for running AI agents with enterprise-grade isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; This is the biggest signal yet that the OpenAI-Microsoft marriage is evolving into a polite separation. MAI-Thinking-1 being trained without any OpenAI data is deliberate positioning — Microsoft can now say "our models have no license entanglement with OpenAI." The 5B coding model (MAI-Code-1-Flash) specifically targets Copilot — Microsoft's most important developer product. They're replacing GPT inside their own tools with models they fully control.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Microsoft ends Claude Code licenses
&lt;/h2&gt;

&lt;p&gt;Forbes reported that Microsoft is &lt;a href="https://www.forbes.com/sites/jonmarkman/2026/06/01/microsoft-ends-claude-code-licenses-as-it-pushes-copilot-cli/" rel="noopener noreferrer"&gt;ending Claude Code licenses&lt;/a&gt; and pushing developers to its own Copilot CLI instead. The subtext: Microsoft no longer wants to rent Anthropic's intelligence inside its own products.&lt;/p&gt;

&lt;p&gt;This affects developers at Microsoft who were using &lt;a href="https://www.aimadetools.com/blog/how-to-use-claude-code?utm_source=devto" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; as their primary coding tool. They're being migrated to Copilot powered by MAI-Code-1-Flash.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; If you work at a Microsoft shop: prepare for Copilot to get much better (MAI-Code-1-Flash is purpose-built for it). If you use Claude Code independently: nothing changes for you. But this signals that the era of "one AI provider to rule them all" is over. Every major tech company is building its own models now.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. NVIDIA RTX Spark: the hardware that makes local AI real
&lt;/h2&gt;

&lt;p&gt;At Computex (June 1), NVIDIA unveiled &lt;a href="https://www.aimadetools.com/blog/nvidia-rtx-spark-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;RTX Spark&lt;/a&gt;, a new Windows PC superchip with 128GB of unified memory, an ARM CPU, a Blackwell GPU, and 1 petaflop of AI compute. It runs 120B parameter models locally.&lt;/p&gt;

&lt;p&gt;Then at Microsoft Build (June 2), Microsoft announced the &lt;strong&gt;Surface RTX Spark Dev Box&lt;/strong&gt; — a mini PC with RTX Spark inside, preloaded with VS Code, GitHub Copilot, WSL2 with GPU passthrough, CUDA, Python, Git, and Node.js. It's purpose-built for AI developers.&lt;/p&gt;

&lt;p&gt;Key numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;128GB unified memory (run &lt;a href="https://www.aimadetools.com/blog/best-llms-nvidia-rtx-spark?utm_source=devto" rel="noopener noreferrer"&gt;Qwen 3.6 27B at 2× speed&lt;/a&gt;, Llama 4 Scout, etc.)&lt;/li&gt;
&lt;li&gt;100W sustained thermal design in aluminium chassis&lt;/li&gt;
&lt;li&gt;Ships with Windows 11 Pro + full dev stack preinstalled&lt;/li&gt;
&lt;li&gt;Available this fall alongside consumer RTX Spark laptops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; This is the machine I've been wanting for the &lt;a href="https://dev.to/race/"&gt;AI Startup Race&lt;/a&gt;. Currently our agents run on a $40/mo VPS. The Surface RTX Spark Dev Box would let you run 120B models locally with zero API costs. For developers spending $100+/month on AI APIs, this hardware pays for itself fast. See our &lt;a href="https://www.aimadetools.com/blog/nvidia-rtx-spark-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;full RTX Spark guide&lt;/a&gt; and &lt;a href="https://www.aimadetools.com/blog/nvidia-rtx-spark-vs-mac-studio-local-ai?utm_source=devto" rel="noopener noreferrer"&gt;vs Mac Studio comparison&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. MiniMax M3: first open-weight frontier multimodal
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.aimadetools.com/blog/minimax-m3-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;MiniMax M3&lt;/a&gt; launched June 1. First open-weight model combining:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;59% SWE-bench Pro (beats GPT-5.5's 58.6%)&lt;/li&gt;
&lt;li&gt;1M token context via MSA (15.6× faster than standard attention)&lt;/li&gt;
&lt;li&gt;Native text + images + video input&lt;/li&gt;
&lt;li&gt;Computer use (desktop operation)&lt;/li&gt;
&lt;li&gt;$0.60/$2.40 per million tokens&lt;/li&gt;
&lt;li&gt;Weights dropping ~June 10&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It reproduced an ICLR 2025 paper autonomously — 12 hours, 18 commits, 23 figures, zero human intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; M3 is the model that should worry Anthropic most. It beats GPT-5.5 on coding while being open-weight, multimodal, and 10× cheaper than Opus. When weights drop (~June 10), enterprises with data privacy requirements get a frontier model they can run on-premise. Until now, that level of capability meant sending data to a closed API such as Claude's. See our &lt;a href="https://www.aimadetools.com/blog/minimax-m3-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;M3 complete guide&lt;/a&gt;, &lt;a href="https://www.aimadetools.com/blog/minimax-m3-vs-claude-opus-4-8?utm_source=devto" rel="noopener noreferrer"&gt;vs Claude Opus 4.8&lt;/a&gt;, and &lt;a href="https://www.aimadetools.com/blog/minimax-m3-vs-deepseek-v4-pro?utm_source=devto" rel="noopener noreferrer"&gt;vs DeepSeek V4-Pro&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Grok's next model teased (mid-June)
&lt;/h2&gt;

&lt;p&gt;xAI teased their next Grok model — reportedly tripling the parameter count with a focus on coding leadership. Expected release: mid-June 2026. Supervised fine-tuning was complete as of the announcement, with reinforcement learning underway.&lt;/p&gt;

&lt;p&gt;If it delivers on the coding promise, it could shake up the &lt;a href="https://www.aimadetools.com/blog/grok-build-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;Grok Build&lt;/a&gt; ecosystem significantly. Currently Grok Build uses Grok 4.3 — an upgrade to whatever this new model is could make the $30/mo subscription much more competitive.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Quick hits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI Codex on Amazon Bedrock&lt;/strong&gt; — GPT models + Codex now generally available on AWS with enterprise controls. OpenAI meeting customers where they already deploy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;White House signed frontier-model cyber order&lt;/strong&gt; — Regulation incoming for the most capable AI models used in cybersecurity applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic expanded Project Glasswing&lt;/strong&gt; — More organizations getting access to Mythos Preview for cybersecurity work. Broader release "in weeks."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVIDIA Nemotron 3 Ultra free on OpenRouter&lt;/strong&gt; — NVIDIA's own model now available at zero cost via &lt;a href="https://www.aimadetools.com/blog/openrouter-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;OpenRouter&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'm watching next week
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MiniMax M3 weights&lt;/strong&gt; (~June 10) — Will they live up to the benchmarks when the community tests them?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grok new model&lt;/strong&gt; (mid-June) — Does tripling parameters actually improve coding?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini CLI sunset&lt;/strong&gt; (June 18) — Two weeks left. &lt;a href="https://www.aimadetools.com/blog/migrate-gemini-cli-to-antigravity-cli/?utm_source=devto" rel="noopener noreferrer"&gt;Migrate now&lt;/a&gt; if you haven't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic Mythos&lt;/strong&gt; — Still "coming in weeks." Will it ship before the Grok model does?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Our race agents&lt;/strong&gt; — Xiaomi hit session 456. 605 users/week on its site. Zero revenue still. 5 weeks left. &lt;a href="https://dev.to/race/"&gt;Follow along →&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;em&gt;AI Dev Weekly publishes every Thursday. &lt;a href="https://dev.to/race/season1/digest"&gt;Subscribe&lt;/a&gt; for weekly race updates and AI developer news.&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/ai-dev-weekly-013-microsoft-build-independence-rtx-spark-minimax-m3/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aidevweekly</category>
      <category>microsoft</category>
      <category>nvidia</category>
      <category>minimax</category>
    </item>
    <item>
      <title>AI and GDPR — What Developers Actually Need to Know (2026)</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Tue, 02 Jun 2026 12:46:18 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/ai-and-gdpr-what-developers-actually-need-to-know-2026-fbl</link>
      <guid>https://dev.to/ai_made_tools/ai-and-gdpr-what-developers-actually-need-to-know-2026-fbl</guid>
      <description>&lt;p&gt;If you're a developer at an EU company using &lt;a href="https://www.aimadetools.com/blog/how-to-use-claude-code/?utm_source=devto" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://www.aimadetools.com/blog/cursor-ai-one-week-review/?utm_source=devto" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, or any AI coding tool — your company may be violating GDPR without knowing it. Every prompt you send is data that gets processed on someone else's servers.&lt;/p&gt;

&lt;p&gt;Here's what you actually need to know.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core problem
&lt;/h2&gt;

&lt;p&gt;When you use an AI coding tool, your code travels to external servers. If that code contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Personal data (user emails, names, addresses in test fixtures)&lt;/li&gt;
&lt;li&gt;Database schemas with PII fields&lt;/li&gt;
&lt;li&gt;API keys or credentials&lt;/li&gt;
&lt;li&gt;Customer data in config files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...then you're transferring personal data (or, in the case of credentials, secrets that can unlock it) to a third-party processor. Under GDPR, that requires a legal basis, a Data Processing Agreement (DPA), and potentially a Transfer Impact Assessment if the data leaves the EU.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which AI tools are GDPR compliant?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool/Provider&lt;/th&gt;
&lt;th&gt;DPA available?&lt;/th&gt;
&lt;th&gt;EU data residency?&lt;/th&gt;
&lt;th&gt;Training on your data?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.aimadetools.com/blog/what-is-mistral-ai/?utm_source=devto" rel="noopener noreferrer"&gt;Mistral&lt;/a&gt; API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ EU-based&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Anthropic API&lt;/strong&gt; (Claude)&lt;/td&gt;
&lt;td&gt;✅ Yes (Team/Enterprise)&lt;/td&gt;
&lt;td&gt;⚠️ US servers&lt;/td&gt;
&lt;td&gt;❌ No (API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenAI API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;⚠️ US servers (EU option available)&lt;/td&gt;
&lt;td&gt;❌ No (API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Google Vertex AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ EU region available&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Pro subscription&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Consumer terms&lt;/td&gt;
&lt;td&gt;❌ US&lt;/td&gt;
&lt;td&gt;⚠️ May be used&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ChatGPT Plus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ Consumer terms&lt;/td&gt;
&lt;td&gt;❌ US&lt;/td&gt;
&lt;td&gt;⚠️ May be used&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.aimadetools.com/blog/best-ai-coding-agents-privacy-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Self-hosted&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;N/A (your servers)&lt;/td&gt;
&lt;td&gt;✅ You control it&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key distinction:&lt;/strong&gt; API access (business terms, DPA available) is different from consumer subscriptions (personal terms, no DPA). If your company uses ChatGPT Plus or Claude Pro for work, that's a compliance risk.&lt;/p&gt;
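&lt;p&gt;To make that distinction concrete, here is a minimal sketch of a tool inventory check. The class, the field names, the plan labels, and the sample entries are all illustrative assumptions, and the rules only mirror the table above. Treat it as a starting point for an audit, not as legal advice.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class AiTool:
    name: str
    plan: str           # "api", "business", or "consumer" (illustrative labels)
    has_dpa: bool       # a signed Data Processing Agreement is in place
    eu_residency: bool  # data is processed in an EU region


def compliance_gaps(tool):
    """Return a list of readable gaps for one tool; an empty list means none found."""
    gaps = []
    if tool.plan == "consumer":
        gaps.append("consumer terms: no DPA available")
    elif not tool.has_dpa:
        gaps.append("DPA not signed")
    if not tool.eu_residency:
        gaps.append("data leaves the EU: transfer assessment needed")
    return gaps


# Sample inventory; the flags here are assumptions for the example.
inventory = [
    AiTool("Mistral API", "api", True, True),
    AiTool("ChatGPT Plus", "consumer", False, False),
    AiTool("Anthropic API", "api", True, False),
]

for tool in inventory:
    print(tool.name, ":", compliance_gaps(tool) or "no gaps found")
```

&lt;p&gt;Running it over your real inventory gives you a first list of tools to raise with whoever handles data protection in your company.&lt;/p&gt;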

&lt;h2&gt;
  
  
  The safest options for EU developers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Self-hosted (zero data transfer)
&lt;/h3&gt;

&lt;p&gt;Run models locally — nothing leaves your machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull devstral-small:24b
aider &lt;span class="nt"&gt;--model&lt;/span&gt; ollama/devstral-small:24b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
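&lt;p&gt;If you want to call the local model from your own scripts, a rough sketch follows. It assumes Ollama is running on its default port (11434) and that the model above has already been pulled. The helper names are made up for this example. The request only ever targets localhost.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: Ollama is listening locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="devstral-small:24b"):
    """Build a non-streaming generate request aimed at the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt):
    """Send the prompt to localhost and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as response:
        return json.loads(response.read())["response"]
```

&lt;p&gt;With the server running, calling ask_local_model("Summarise this function") returns the model's answer as a string. No external host is involved.&lt;/p&gt;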



&lt;p&gt;See our &lt;a href="https://www.aimadetools.com/blog/best-ai-coding-agents-privacy-2026/?utm_source=devto" rel="noopener noreferrer"&gt;self-hosted AI guide&lt;/a&gt; and &lt;a href="https://www.aimadetools.com/blog/ollama-complete-guide-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Ollama guide&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: Mistral API (EU-native)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.aimadetools.com/blog/what-is-mistral-ai/?utm_source=devto" rel="noopener noreferrer"&gt;Mistral&lt;/a&gt; is based in Paris. Data stays in the EU by default. No transatlantic transfers, no Standard Contractual Clauses needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mistralai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Mistral&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Mistral&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Data processed in EU infrastructure
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See our &lt;a href="https://www.aimadetools.com/blog/mistral-api-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Mistral API guide&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 3: US providers with DPA + EU region
&lt;/h3&gt;

&lt;p&gt;Anthropic and OpenAI offer business plans with DPAs. Google Vertex AI lets you specify EU regions. This route can be compliant, but it requires paperwork: a signed DPA and, where data is processed in the US, a Transfer Impact Assessment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What about AI coding tools?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;GDPR-safe?&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://www.aimadetools.com/blog/aider-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Aider&lt;/a&gt; + local model&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Nothing leaves your machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://www.aimadetools.com/blog/continue-dev-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Continue.dev&lt;/a&gt; + Ollama&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Local inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://www.aimadetools.com/blog/aider-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Aider&lt;/a&gt; + Mistral API&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;EU data residency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://www.aimadetools.com/blog/how-to-use-claude-code/?utm_source=devto" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; (Pro sub)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Consumer terms, US servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.aimadetools.com/blog/cursor-ai-one-week-review/?utm_source=devto" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;td&gt;Business plan has DPA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://www.aimadetools.com/blog/github-copilot-vs-cursor-2026/?utm_source=devto" rel="noopener noreferrer"&gt;GitHub Copilot&lt;/a&gt; Business&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;DPA + no training on code&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Practical steps
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Audit your AI tools&lt;/strong&gt; — list every AI service your team uses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check for DPAs&lt;/strong&gt; — consumer subscriptions don't count&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scrub test data&lt;/strong&gt; — remove real PII from test fixtures and seed data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consider &lt;a href="https://www.aimadetools.com/blog/best-ai-coding-agents-privacy-2026/?utm_source=devto" rel="noopener noreferrer"&gt;self-hosting&lt;/a&gt;&lt;/strong&gt; for sensitive codebases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;a href="https://www.aimadetools.com/blog/what-is-mistral-ai/?utm_source=devto" rel="noopener noreferrer"&gt;Mistral&lt;/a&gt;&lt;/strong&gt; as your default EU-compliant provider&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document everything&lt;/strong&gt; — GDPR requires you to demonstrate compliance&lt;/li&gt;
&lt;/ol&gt;
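&lt;p&gt;For step 3, a small script can take care of the most common case: real email addresses in fixtures. The sketch below is a simplified assumption of what that could look like. The regex is deliberately simple and is not a full email parser, and the sample fixture is invented. Names, addresses, and phone numbers would need their own rules.&lt;/p&gt;

```python
import re

# Simple email pattern: good enough for fixtures, not a full RFC parser.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_emails(text):
    """Replace each distinct email with a stable placeholder on example.com."""
    replacements = {}

    def substitute(match):
        original = match.group(0)
        if original not in replacements:
            replacements[original] = f"user{len(replacements) + 1}@example.com"
        return replacements[original]

    return EMAIL_PATTERN.sub(substitute, text)


# Invented fixture content for illustration.
fixture = 'users = [{"email": "jan.peeters@acme.be"}, {"email": "jan.peeters@acme.be"}, {"email": "an@shop.nl"}]'
print(scrub_emails(fixture))
```

&lt;p&gt;The same address always maps to the same placeholder, so tests that rely on duplicate or matching emails keep working after the scrub.&lt;/p&gt;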

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Do AI tools comply with GDPR?
&lt;/h3&gt;

&lt;p&gt;It depends on the tool and plan. API-based services (OpenAI API, Anthropic API, Google Vertex AI) offer Data Processing Agreements and don't train on your data. Consumer subscriptions like ChatGPT Plus or Claude Pro use personal terms without DPAs and may not be GDPR-compliant for business use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use ChatGPT for GDPR-regulated data?
&lt;/h3&gt;

&lt;p&gt;You can use the OpenAI API with a business agreement and DPA in place, but not the consumer ChatGPT Plus subscription. The API doesn't train on your data and offers EU data residency options, making it suitable for regulated workloads with proper legal agreements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need a DPA for AI APIs?
&lt;/h3&gt;

&lt;p&gt;Yes, if you're sending any personal data to the API. Under GDPR, any third party processing personal data on your behalf requires a Data Processing Agreement. Most major AI providers (OpenAI, Anthropic, Google) offer DPAs on their business and enterprise plans.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is self-hosted AI GDPR compliant?
&lt;/h3&gt;

&lt;p&gt;Self-hosted AI eliminates data transfer concerns since nothing leaves your infrastructure. However, you still need to comply with other GDPR requirements like data minimization, purpose limitation, and the right to erasure for any personal data the model processes or stores.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;em&gt;Related: &lt;a href="https://www.aimadetools.com/blog/ai-code-data-privacy/?utm_source=devto" rel="noopener noreferrer"&gt;Where Does Your Code Go?&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/self-hosted-ai-gdpr/?utm_source=devto" rel="noopener noreferrer"&gt;Self-Hosted AI for GDPR&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/eu-ai-act-developers/?utm_source=devto" rel="noopener noreferrer"&gt;EU AI Act for Developers&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/best-ai-coding-agents-privacy-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Best AI Coding Agents for Privacy&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/best-vpn-for-developers-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Best VPNs for Developers&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/uk-ai-regulation-after-brexit/?utm_source=devto" rel="noopener noreferrer"&gt;UK AI Regulation After Brexit&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/ccpa-ai-developers/?utm_source=devto" rel="noopener noreferrer"&gt;CCPA AI Developers&lt;/a&gt;&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/ai-gdpr-developers-guide/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>gdpr</category>
      <category>privacy</category>
      <category>aitools</category>
      <category>europe</category>
    </item>
    <item>
      <title>AI Dev Weekly #12: Opus 4.8 Drops, Anthropic Hits $965B, Chinese AI Goes 99% Cheaper, Microsoft Builds Its Own Coding Model</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Fri, 29 May 2026 07:09:56 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/ai-dev-weekly-12-opus-48-drops-anthropic-hits-965b-chinese-ai-goes-99-cheaper-microsoft-5bd1</link>
      <guid>https://dev.to/ai_made_tools/ai-dev-weekly-12-opus-48-drops-anthropic-hits-965b-chinese-ai-goes-99-cheaper-microsoft-5bd1</guid>
      <description>&lt;p&gt;&lt;em&gt;AI Dev Weekly is a Thursday series where I cover the week's most important AI developer news, with my take as someone who actually uses these tools daily.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The theme this week is divergence. US labs are raising prices and valuations. Chinese labs are racing to zero. Developers are caught in the middle choosing between the absolute best (Opus 4.8 at $25/M output) and "good enough at 3% of the cost" (DeepSeek/MiMo at $0.87/M). Meanwhile Microsoft is hedging by building its own coding model to reduce OpenAI dependency. Let's break it all down.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Claude Opus 4.8: the new #1 coding model
&lt;/h2&gt;

&lt;p&gt;Anthropic released &lt;a href="https://www.aimadetools.com/blog/claude-opus-4-8-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Claude Opus 4.8&lt;/a&gt; on May 28. Same price as 4.7 ($5/$25 per million tokens), better at everything.&lt;/p&gt;

&lt;p&gt;Key numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;69.2% SWE-bench Pro&lt;/strong&gt; — up from 64.3% (4.7) and miles ahead of GPT-5.5 (58.6%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;74.2% Terminal-Bench 2.1&lt;/strong&gt; — +8.4 points over 4.7&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;88.6% SWE-bench Verified&lt;/strong&gt; — highest of any model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;4× fewer unflagged code flaws&lt;/strong&gt; — the honesty improvement is the real story&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;61.4 Artificial Analysis Index&lt;/strong&gt; — takes #1 from GPT-5.5&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest new feature is &lt;a href="https://www.aimadetools.com/blog/claude-code-dynamic-workflows-guide/?utm_source=devto" rel="noopener noreferrer"&gt;dynamic workflows&lt;/a&gt; in Claude Code. Claude can now plan a large task, spawn hundreds of parallel subagents, verify results, and iterate until convergence. Jarred Sumner used it to port Bun from Zig to Rust — 750,000 lines, 11 days, 99.8% test pass rate.&lt;/p&gt;

&lt;p&gt;Other additions: effort control (low → max), fast mode at a third of its previous price ($10/$50 instead of $30/$150), and system messages mid-conversation in the API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; I run Claude as one of seven agents in the &lt;a href="https://dev.to/race/"&gt;$100 AI Startup Race&lt;/a&gt;. The &lt;a href="https://www.aimadetools.com/blog/claude-opus-4-8-vs-4-7/?utm_source=devto" rel="noopener noreferrer"&gt;4.8 vs 4.7 improvement&lt;/a&gt; is immediately noticeable — fewer hallucinated progress claims, better self-correction, more efficient tool calling. The dynamic workflows feature is genuinely new territory. No other tool can spawn hundreds of coordinated agents from a single prompt. For codebase-scale migrations and audits, this is a step change. The question is whether the $25/M output price is justified when &lt;a href="https://www.aimadetools.com/blog/claude-opus-4-8-vs-deepseek-v4-pro/?utm_source=devto" rel="noopener noreferrer"&gt;DeepSeek scores within 8 points for $0.87/M&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Anthropic raises $65B, surpasses OpenAI at $965B
&lt;/h2&gt;

&lt;p&gt;Alongside the Opus 4.8 launch, Anthropic closed a $65 billion Series H at a $965 billion post-money valuation. That puts them above OpenAI for the first time.&lt;/p&gt;

&lt;p&gt;The numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$965B valuation&lt;/strong&gt; (OpenAI was last valued at ~$900B)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$47B annualized revenue run rate&lt;/strong&gt; — tripled in 3 months&lt;/li&gt;
&lt;li&gt;Led by Altimeter Capital, Dragoneer, Greenoaks, Sequoia Capital&lt;/li&gt;
&lt;li&gt;Mythos-class models (higher intelligence than Opus) coming "in weeks"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; The valuation flip is symbolic but the revenue growth is real. $47B run rate means Claude is generating serious enterprise revenue. The Mythos tease is interesting — they explicitly said it has "even higher intelligence than Opus" and is currently limited to cybersecurity work under Project Glasswing. If Mythos ships broadly in June, it could be another step change. For developers, the practical implication is: Anthropic has the resources to keep shipping fast. Expect monthly model updates to continue.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The Chinese AI pricing war goes nuclear
&lt;/h2&gt;

&lt;p&gt;Two massive price cuts in one week made Chinese frontier models essentially free for cached workloads:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DeepSeek V4-Pro (May 22):&lt;/strong&gt; The 75% promotional discount is now &lt;a href="https://www.aimadetools.com/blog/deepseek-v4-pro-75-percent-discount-permanent/?utm_source=devto" rel="noopener noreferrer"&gt;permanent&lt;/a&gt;. Output locked at $0.87/M tokens. Input at $0.435/M. Cache hits at $0.003625/M. This is a model that scores 80.6% on SWE-bench Verified — within 8 points of Opus 4.8.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MiMo V2.5 Pro (May 26):&lt;/strong&gt; Xiaomi &lt;a href="https://www.aimadetools.com/blog/mimo-v2-5-pro-price-cut-99-percent/?utm_source=devto" rel="noopener noreferrer"&gt;cut prices by up to 99%&lt;/a&gt;. Cached input dropped from $0.36/M to $0.0036/M. Standard pricing now matches DeepSeek exactly: $0.435/$0.87. Token Plans upgraded 5-51× (the $100 plan now gets 82 billion tokens).&lt;/p&gt;

&lt;p&gt;The technical explanation: both labs achieved architectural breakthroughs in KV cache efficiency. DeepSeek's interleaved attention reduces cache to 10% of standard size. MiMo's hierarchical SWA uses a 1:7 sparsity ratio. Both claim break-even at these prices.&lt;/p&gt;

&lt;p&gt;The result: &lt;a href="https://www.aimadetools.com/blog/chinese-ai-30x-cheaper-than-american-models/?utm_source=devto" rel="noopener noreferrer"&gt;Chinese AI models are now 30× cheaper than American equivalents&lt;/a&gt; on standard pricing, and 100×+ cheaper on cached workloads. For agent pipelines with stable system prompts, the effective cost is approaching zero.&lt;/p&gt;
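&lt;p&gt;The arithmetic behind that multiplier is easy to check against the list prices quoted in this issue. In the sketch below the prices come from the text, while the 100M input / 20M output monthly workload is a hypothetical example. Note that the roughly 30× figure matches output pricing, and the input-side gap is closer to 11×.&lt;/p&gt;

```python
# Prices in USD per million tokens, taken from the figures quoted in this issue.
OPUS_4_8 = {"input": 5.00, "output": 25.00}
DEEPSEEK_V4_PRO = {"input": 0.435, "output": 0.87}


def price_ratio(expensive, cheap, kind):
    """How many times more the expensive model costs for one token kind."""
    return expensive[kind] / cheap[kind]


def workload_cost(prices, input_millions, output_millions):
    """Total cost in USD for a workload measured in millions of tokens."""
    return prices["input"] * input_millions + prices["output"] * output_millions


print(round(price_ratio(OPUS_4_8, DEEPSEEK_V4_PRO, "output"), 1))  # 28.7
print(round(price_ratio(OPUS_4_8, DEEPSEEK_V4_PRO, "input"), 1))   # 11.5

# Hypothetical monthly workload: 100M input tokens and 20M output tokens.
print(workload_cost(OPUS_4_8, 100, 20))                   # 1000.0
print(round(workload_cost(DEEPSEEK_V4_PRO, 100, 20), 2))  # 60.9
```

&lt;p&gt;For that example workload the blended difference works out to about 16×, so the real multiplier depends on your input/output mix.&lt;/p&gt;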

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; I tripled the Xiaomi agent's sessions in our race (from 2 to 6 per day) because the cost became negligible. At $0.0036/M cached tokens, running an autonomous agent 24/7 costs less than a cup of coffee per day. The quality gap is real but narrowing: DeepSeek V4-Pro at 80.6% on SWE-bench Verified vs Opus 4.8 at 88.6% is meaningful for hard tasks but irrelevant for the routine 80% of coding work. If you are spending more than $500/month on API calls and haven't tested Chinese models, you are leaving money on the table. We wrote a &lt;a href="https://www.aimadetools.com/blog/migrate-gpt-claude-to-deepseek-mimo/?utm_source=devto" rel="noopener noreferrer"&gt;full migration guide&lt;/a&gt; if you want to try.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Microsoft building its own coding model for Build 2026
&lt;/h2&gt;

&lt;p&gt;Reuters reported on May 28 that Microsoft will unveil a homegrown coding model at Build 2026 (June 2-3 in San Francisco). It is designed to boost GitHub Copilot and reduce dependency on OpenAI.&lt;/p&gt;

&lt;p&gt;Also coming at Build:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transcription model&lt;/li&gt;
&lt;li&gt;Reasoning model&lt;/li&gt;
&lt;li&gt;Speech model&lt;/li&gt;
&lt;li&gt;Image model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is part of a broader strategic shift. Microsoft is building a self-sufficient AI stack alongside its OpenAI partnership, not instead of it. The competitive pressure from Claude Code (which has been eating Copilot's market share) is the likely catalyst.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; This is the most significant signal yet that the OpenAI-Microsoft relationship is evolving. Microsoft investing billions in OpenAI while simultaneously building competing models tells you everything about where the industry is heading: no one wants to be dependent on a single provider. For developers using Copilot, this could mean better performance (a model optimized specifically for code completion rather than general-purpose) or it could mean fragmentation (yet another model to evaluate). Watch Build next week for details.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. StepFun Step 3.7 Flash: 198B MoE at 400 tokens/sec
&lt;/h2&gt;

&lt;p&gt;A new player entered the cheap-and-fast model tier. StepFun released &lt;a href="https://www.aimadetools.com/blog/step-3-7-flash-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Step 3.7 Flash&lt;/a&gt; — a 198B parameter MoE model that activates only 11B parameters per token.&lt;/p&gt;

&lt;p&gt;The specs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;400 tokens/second&lt;/strong&gt; — 2× faster than Gemini 3.5 Flash&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;256K context window&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native multimodal&lt;/strong&gt; — text, images, video&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3 reasoning tiers&lt;/strong&gt; — Low/Medium/High per API call&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advisor Mode&lt;/strong&gt; — achieves 97% of Opus 4.6 coding at $0.19/task&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open-weight&lt;/strong&gt; — self-hostable on 128GB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~$0.20/M input, ~$0.80/M output&lt;/strong&gt; on OpenRouter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The unique feature is Advisor Mode: Step 3.7 Flash handles routine execution autonomously and only escalates to a stronger model when genuinely stuck. This automated routing achieves near-frontier quality at budget prices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; The "Flash" model tier is getting crowded — &lt;a href="https://www.aimadetools.com/blog/step-3-7-flash-vs-gemini-3-5-flash/?utm_source=devto" rel="noopener noreferrer"&gt;Gemini 3.5 Flash&lt;/a&gt;, Step 3.7 Flash, DeepSeek V4 Flash. All under $1/M output, all fast enough for real-time use, all surprisingly capable. Step 3.7 Flash's video understanding and GUI interaction capabilities set it apart. The 400 t/s throughput is genuinely impressive for a model this capable. If you need speed + multimodal + cheap, this is worth testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Quick hits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI reportedly dropped GPT-5.3 Codex&lt;/strong&gt; minutes after Anthropic's Opus 4.8 announcement. 25% faster than GPT-5.2, reportedly helped debug itself during development. Supercharges the Codex agentic coding tool launched earlier this month.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cohere acquired Aleph Alpha&lt;/strong&gt; — creating a $20B transatlantic sovereign AI company. Backed by Schwarz Group (Europe's largest retailer). The combined company positions itself as the enterprise alternative to US/Chinese models for European companies with data sovereignty requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Canada ruled OpenAI violated privacy laws&lt;/strong&gt; — regulatory pressure continues to mount on US AI labs internationally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code shipped detailed usage analytics&lt;/strong&gt; — you can now see exactly how many tokens each session consumed, broken down by model and effort level.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'm watching next week
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft Build (June 2-3)&lt;/strong&gt; — The new coding model reveal. Will it compete with Claude Code or just improve Copilot's autocomplete?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mythos timeline&lt;/strong&gt; — Anthropic said "coming weeks." If it ships in early June, it could leapfrog Opus 4.8 immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini CLI shutdown (June 18)&lt;/strong&gt; — Two weeks until the deadline. If you haven't migrated to &lt;a href="https://www.aimadetools.com/blog/antigravity-2-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Antigravity CLI&lt;/a&gt;, time is running out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Our race agents&lt;/strong&gt; — Claude is at 194 blog posts. Xiaomi just got tripled to 6 sessions/day. Gemini is back online after a 4-day auth outage. &lt;a href="https://dev.to/race/"&gt;Follow along →&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;AI Dev Weekly publishes every Thursday. &lt;a href="https://dev.to/race/season1/digest"&gt;Subscribe&lt;/a&gt; for weekly race updates and AI developer news.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/ai-dev-weekly-012-opus-4-8-anthropic-965b-chinese-ai-99-cheaper/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aidevweekly</category>
      <category>claude</category>
      <category>anthropic</category>
      <category>deepseek</category>
    </item>
    <item>
      <title>Aider Complete Guide: Setup, Best Models, and Tips (2026)</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Wed, 27 May 2026 12:24:08 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/aider-complete-guide-setup-best-models-and-tips-2026-17pe</link>
      <guid>https://dev.to/ai_made_tools/aider-complete-guide-setup-best-models-and-tips-2026-17pe</guid>
      <description>&lt;p&gt;Aider is a free, open-source AI pair programming tool that runs in your terminal. You chat with it in plain English, and it directly edits files in your local Git repository. Every change is automatically committed with a descriptive message.&lt;/p&gt;

&lt;p&gt;It's the most model-flexible AI coding tool available — it works with Claude, GPT, Gemini, DeepSeek, &lt;a href="https://www.aimadetools.com/blog/what-is-qwen-3-5/?utm_source=devto" rel="noopener noreferrer"&gt;Qwen&lt;/a&gt;, local models via &lt;a href="https://www.aimadetools.com/blog/ollama-complete-guide-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;, or any OpenAI-compatible API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Aider?
&lt;/h2&gt;

&lt;p&gt;Most AI coding tools lock you into one model. &lt;a href="https://www.aimadetools.com/blog/how-to-use-claude-code/?utm_source=devto" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; only works with Claude. &lt;a href="https://www.aimadetools.com/blog/claude-code-vs-codex-cli-vs-gemini-cli/?utm_source=devto" rel="noopener noreferrer"&gt;Codex CLI&lt;/a&gt; only works with GPT. Aider works with everything.&lt;/p&gt;

&lt;p&gt;Key strengths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Git-native&lt;/strong&gt; — every AI edit is a clean Git commit you can review, revert, or cherry-pick&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100+ model support&lt;/strong&gt; — any model, any provider, including local&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repo map&lt;/strong&gt; — understands your entire codebase structure, not just open files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-file editing&lt;/strong&gt; — coordinates changes across multiple files in one operation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100+ languages&lt;/strong&gt; — not just Python and JavaScript&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;aider-chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with pipx (recommended for isolation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;aider-chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-key"&lt;/span&gt;  &lt;span class="c"&gt;# or OPENAI_API_KEY, etc.&lt;/span&gt;
aider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aider starts in your project directory, scans the Git repo, and builds a map of your codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing a model
&lt;/h2&gt;

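&lt;p&gt;Aider lets you pair a main model, which handles the code edits, with a cheaper weak model (&lt;code&gt;--weak-model&lt;/code&gt;), which handles background tasks such as writing commit messages and summarizing chat history:&lt;br&gt;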
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Opus as the main model, Sonnet as the weak model (commit messages, summaries)&lt;/span&gt;
aider &lt;span class="nt"&gt;--model&lt;/span&gt; claude-3-opus-20240229 &lt;span class="nt"&gt;--weak-model&lt;/span&gt; claude-3-sonnet-20240229

&lt;span class="c"&gt;# Use GPT-5.4 &lt;/span&gt;
aider &lt;span class="nt"&gt;--model&lt;/span&gt; gpt-5.4

&lt;span class="c"&gt;# Use DeepSeek (cheapest frontier option)&lt;/span&gt;
aider &lt;span class="nt"&gt;--model&lt;/span&gt; deepseek/deepseek-chat

&lt;span class="c"&gt;# Use a local model via Ollama&lt;/span&gt;
aider &lt;span class="nt"&gt;--model&lt;/span&gt; ollama/qwen3.5:27b

&lt;span class="c"&gt;# Use any model via OpenRouter&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPENROUTER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-key"&lt;/span&gt;
aider &lt;span class="nt"&gt;--model&lt;/span&gt; openrouter/anthropic/claude-opus-4.6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See our &lt;a href="https://www.aimadetools.com/blog/openrouter-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;OpenRouter guide&lt;/a&gt; for the full model catalog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core commands
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Adding files to context
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/add src/auth.ts src/middleware.ts
/add src/routes/*.ts          # glob patterns work
/read-only docs/API.md        # read-only context (not edited)
/drop src/auth.ts             # remove from context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Editing modes
&lt;/h3&gt;

&lt;p&gt;Aider has multiple edit formats optimized for different models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;whole&lt;/strong&gt; — replaces entire files (best for smaller files)&lt;/li&gt;
&lt;li&gt;
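&lt;strong&gt;diff&lt;/strong&gt; — search/replace blocks; the model returns only the parts of a file that change (efficient for large files)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;udiff&lt;/strong&gt; — a simplified unified diff format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;editor-diff&lt;/strong&gt; / &lt;strong&gt;editor-whole&lt;/strong&gt; — streamlined variants of diff and whole, used by the editor model in architect mode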
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aider &lt;span class="nt"&gt;--edit-format&lt;/span&gt; diff  &lt;span class="c"&gt;# Use diff mode&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Git integration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/commit           # Commit current changes
/undo             # Undo the last commit, if Aider made it
/diff             # Show changes since the last message
/git log --oneline -10  # Run any git command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every AI edit creates an automatic commit like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aider: Refactor auth middleware to use JWT validation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running commands
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/run npm test              # Run tests
/run npm run lint          # Run linter
/test npm test             # Run tests; on failure, share the output with Aider
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



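&lt;p&gt;The &lt;code&gt;/test&lt;/code&gt; command is powerful: it runs your test suite and, if the command exits with a non-zero status, adds the output to the chat so Aider can attempt a fix. To re-run the tests automatically after every edit, start Aider with &lt;code&gt;--test-cmd&lt;/code&gt; and &lt;code&gt;--auto-test&lt;/code&gt;.&lt;/p&gt;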

&lt;h3&gt;
  
  
  Web content
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/web https://docs.stripe.com/api/charges  # Fetch docs for context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Advanced features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Repo map
&lt;/h3&gt;

&lt;p&gt;Aider builds a map of your entire repository — function signatures, class definitions, imports, and dependencies. This means it understands how files relate to each other, even files not explicitly added to the chat.&lt;/p&gt;

&lt;p&gt;The repo map uses tree-sitter for parsing, supporting 100+ languages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Linting integration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aider &lt;span class="nt"&gt;--lint-cmd&lt;/span&gt; &lt;span class="s2"&gt;"eslint --fix"&lt;/span&gt;  &lt;span class="c"&gt;# Auto-lint after every edit&lt;/span&gt;
aider &lt;span class="nt"&gt;--auto-lint&lt;/span&gt;                &lt;span class="c"&gt;# Lint with Aider's built-in linters (on by default)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aider runs the linter after every edit and asks the model to fix any errors it reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Voice input
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/voice  &lt;span class="c"&gt;# In-chat command, not a CLI flag: records speech and transcribes it&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Describe changes verbally — useful for complex explanations that are easier to speak than type.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scripting mode
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aider &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s2"&gt;"Add error handling to all API routes"&lt;/span&gt; &lt;span class="nt"&gt;--yes-always&lt;/span&gt; &lt;span class="nt"&gt;--no-git&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run Aider non-interactively for CI/CD pipelines or batch operations.&lt;/p&gt;
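
&lt;p&gt;For batch operations, one possible pattern is a small wrapper that runs Aider once per file. This is a sketch under assumptions: &lt;code&gt;batch_edit&lt;/code&gt;, &lt;code&gt;DRY_RUN&lt;/code&gt;, the instruction text, and the file paths are placeholders. It relies on Aider's &lt;code&gt;--message&lt;/code&gt; and &lt;code&gt;--yes-always&lt;/code&gt; options, and it defaults to a dry run that only prints the commands.&lt;/p&gt;

```shell
# Sketch: apply one instruction to several files, one aider run per file.
# DRY_RUN defaults to 1 so this only prints the commands; set DRY_RUN=0 to execute.
# The instruction and file list are placeholders, not from a real project.
DRY_RUN="${DRY_RUN:-1}"
INSTRUCTION="Add error handling to this route"

batch_edit() {
    for file in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "aider --message \"$INSTRUCTION\" --yes-always $file"
        else
            aider --message "$INSTRUCTION" --yes-always "$file"
        fi
    done
}

PLAN="$(batch_edit src/routes/users.ts src/routes/orders.ts)"
echo "$PLAN"
```

&lt;p&gt;Running one file per invocation keeps each context small and, with Git enabled, gives each file its own commit.&lt;/p&gt;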

&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;Create &lt;code&gt;.aider.conf.yml&lt;/code&gt; in your project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-3-opus-20240229&lt;/span&gt;
&lt;span class="na"&gt;weak-model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-3-sonnet-20240229&lt;/span&gt;
&lt;span class="na"&gt;edit-format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;diff&lt;/span&gt;
&lt;span class="na"&gt;auto-lint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;lint-cmd&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eslint --fix&lt;/span&gt;
&lt;span class="na"&gt;auto-commits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or use environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AIDER_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;claude-3-opus-20240229
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AIDER_WEAK_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;claude-3-sonnet-20240229
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Cost optimization
&lt;/h2&gt;

&lt;p&gt;Aider can be expensive with frontier models. Tips to reduce costs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;--read&lt;/code&gt; for context files&lt;/strong&gt; — read-only files use fewer tokens than editable ones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set a cheap weak model&lt;/strong&gt; — &lt;code&gt;--weak-model&lt;/code&gt; sends commit messages and chat-history summaries to a cheaper model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;a href="https://www.aimadetools.com/blog/how-to-use-aider-with-deepseek/?utm_source=devto" rel="noopener noreferrer"&gt;DeepSeek&lt;/a&gt; or &lt;a href="https://www.aimadetools.com/blog/how-to-use-qwen-3-5-api/?utm_source=devto" rel="noopener noreferrer"&gt;Qwen&lt;/a&gt;&lt;/strong&gt; — 10-30x cheaper than Claude/GPT for comparable quality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Truncate large files&lt;/strong&gt; — Aider sends entire files as context; keep them focused&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use local models&lt;/strong&gt; for routine work — &lt;a href="https://www.aimadetools.com/blog/ollama-complete-guide-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; with Qwen 3.5 27B is free&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Aider vs Claude Code vs Codex CLI
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Aider&lt;/th&gt;
&lt;th&gt;Claude Code&lt;/th&gt;
&lt;th&gt;Codex CLI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Model support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any model&lt;/td&gt;
&lt;td&gt;Claude only&lt;/td&gt;
&lt;td&gt;GPT only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Git integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Deep (auto-commits)&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Repo map&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Full codebase&lt;/td&gt;
&lt;td&gt;✅ Full codebase&lt;/td&gt;
&lt;td&gt;✅ Full codebase&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free (BYOK)&lt;/td&gt;
&lt;td&gt;$20/mo sub&lt;/td&gt;
&lt;td&gt;$20/mo sub&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Edit quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Model-dependent&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Very good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Voice input&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Linting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Auto-lint&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Open source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Apache 2.0&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Choose Aider when:&lt;/strong&gt; You want model flexibility, deep Git integration, or need to use cheap/local models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Claude Code when:&lt;/strong&gt; You want the best code quality and don't mind being locked to Claude.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Codex CLI when:&lt;/strong&gt; You're in the OpenAI ecosystem and want fast autonomous coding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;Aider is the Swiss Army knife of AI coding tools. It's not the best at any single thing — Claude Code writes better code, Codex CLI is faster, &lt;a href="https://www.aimadetools.com/blog/kimi-cli-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Kimi CLI&lt;/a&gt; has Agent Swarm. But Aider is the most flexible, the most transparent (Git-native), and the only tool that lets you freely switch between any model from any provider.&lt;/p&gt;

&lt;p&gt;For developers who want control over their AI coding workflow without vendor lock-in, Aider is the best choice in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Aider free?
&lt;/h3&gt;

&lt;p&gt;Yes. Aider is 100% free and open-source (Apache 2.0). You bring your own API key for whichever AI model you want to use — there's no subscription or usage fee for Aider itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which AI models work with Aider?
&lt;/h3&gt;

&lt;p&gt;Aider works with 100+ models including Claude, GPT, Gemini, DeepSeek, Qwen, Mistral, and any OpenAI-compatible API. See our &lt;a href="https://www.aimadetools.com/blog/how-to-use-aider-with-deepseek/?utm_source=devto" rel="noopener noreferrer"&gt;guide to using Aider with DeepSeek&lt;/a&gt; for a budget-friendly setup.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does Aider compare to Claude Code?
&lt;/h3&gt;

&lt;p&gt;Aider supports any model while Claude Code is locked to Claude. Aider has deeper Git integration (auto-commits every edit) and auto-linting. Claude Code produces slightly better code quality since it's optimized for one model. See our full &lt;a href="https://www.aimadetools.com/blog/aider-vs-claude-code-vs-codex/?utm_source=devto" rel="noopener noreferrer"&gt;Aider vs Claude Code vs Codex comparison&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use Aider with local models?
&lt;/h3&gt;

&lt;p&gt;Yes. Aider works with local models via &lt;a href="https://www.aimadetools.com/blog/ollama-complete-guide-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; or any OpenAI-compatible local server. Run &lt;code&gt;aider --model ollama/qwen3.5:27b&lt;/code&gt; to use a local model at zero cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Aider work with Git?
&lt;/h3&gt;

&lt;p&gt;Git integration is Aider's core feature. Every AI edit is automatically committed with a descriptive message. You can review, revert, or cherry-pick any change. Aider also builds a repo map from your Git repository to understand your full codebase.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related: &lt;a href="https://www.aimadetools.com/blog/aider-vs-claude-code-vs-codex/?utm_source=devto" rel="noopener noreferrer"&gt;Aider vs Claude Code vs Codex CLI&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/best-ai-coding-tools-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Best AI Coding Tools 2026&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/ollama-complete-guide-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Ollama Complete Guide&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/aider-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aider</category>
      <category>coding</category>
      <category>aitools</category>
      <category>terminal</category>
    </item>
    <item>
      <title>The Model Worked. The Cron Job Almost Killed My AI Agent.</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Thu, 21 May 2026 12:05:00 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/the-model-worked-the-cron-job-almost-killed-my-ai-agent-108e</link>
      <guid>https://dev.to/ai_made_tools/the-model-worked-the-cron-job-almost-killed-my-ai-agent-108e</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;Google I/O Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Gemini 3.5 Flash was not the hard part.&lt;/p&gt;

&lt;p&gt;It fixed bugs the old setup had failed to solve for weeks. The model quality was transformational (see &lt;a href="https://dev.to/ai_made_tools/i-upgraded-a-production-ai-agent-to-gemini-35-flash-12-hours-after-google-io-heres-what-i-found-254i"&gt;Part 1&lt;/a&gt; and &lt;a href="https://dev.to/ai_made_tools/my-ai-agent-hit-googles-quota-wall-in-8-minutes-36-hours-later-google-tripled-the-limits-mkc"&gt;Part 2&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The hard part was making it survive cron.&lt;/p&gt;

&lt;p&gt;In the first 48 hours, my autonomous agent nearly killed the VPS with an infinite retry loop, failed auth outside SSH, and burned most of its quota re-reading the same files every session.&lt;/p&gt;

&lt;p&gt;All three bugs took hours to diagnose. All three fixes were tiny.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;I run &lt;a href="https://www.aimadetools.com/race/" rel="noopener noreferrer"&gt;The $100 AI Startup Race&lt;/a&gt;. 7 AI agents building startups autonomously on a VPS via cron jobs. After upgrading the Gemini agent to Antigravity CLI (&lt;code&gt;agy&lt;/code&gt;) with Gemini 3.5 Flash, the model worked great. But making it run &lt;em&gt;unattended&lt;/em&gt; on a headless server? That's where the real engineering happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bug 1: The Infinite Retry Loop
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The symptom
&lt;/h3&gt;

&lt;p&gt;I SSH into the VPS and find it unresponsive. Load average through the roof. The cron log shows 300+ entries from the last 2 minutes, all empty.&lt;/p&gt;

&lt;h3&gt;
  
  
  What happened
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Expected:&lt;/strong&gt; quota exhaustion returns a non-zero exit code.&lt;br&gt;
&lt;strong&gt;Actual:&lt;/strong&gt; exit code 0 + empty output.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;agy&lt;/code&gt; hits its quota limit, it doesn't error out. It returns successfully with an empty response. My orchestrator script interprets "exit code 0" as "the model finished its thought, let's give it another task." So it immediately fires another prompt. Which returns empty. Which triggers another. 300 times in 2 minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;=== Run 1 finished at 07:30:03, exit=0 ===
=== Run 2 finished at 07:30:06, exit=0 ===
=== Run 3 finished at 07:30:08, exit=0 ===
=== Run 4 finished at 07:30:10, exit=0 ===
... (296 more)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each "run" takes 2-3 seconds. No output, no error, no indication that quota is exhausted. Just silence. A human would have seen the empty response and stopped. Cron saw exit code 0 and kept going.&lt;/p&gt;

&lt;h3&gt;
  
  
  The fix
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Circuit breaker: 3 consecutive empty responses = quota exhausted&lt;/span&gt;
&lt;span class="nv"&gt;EMPTY_COUNT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="nv"&gt;MAX_EMPTY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3

&lt;span class="c"&gt;# After each run, check output length&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="k"&gt;${#&lt;/span&gt;&lt;span class="nv"&gt;OUTPUT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-lt&lt;/span&gt; 20 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
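    &lt;span class="nv"&gt;EMPTY_COUNT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;$((EMPTY_COUNT + 1))  &lt;span class="c"&gt;# ((EMPTY_COUNT++)) returns status 1 when the count is 0, which aborts under set -e&lt;/span&gt;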
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$EMPTY_COUNT&lt;/span&gt; &lt;span class="nt"&gt;-ge&lt;/span&gt; &lt;span class="nv"&gt;$MAX_EMPTY&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== 3 consecutive empty responses (quota exhausted?) — stopping session ==="&lt;/span&gt;
        &lt;span class="nb"&gt;break
    &lt;/span&gt;&lt;span class="k"&gt;fi
else
    &lt;/span&gt;&lt;span class="nv"&gt;EMPTY_COUNT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three empty responses in a row → stop the session. The orchestrator now exits cleanly instead of hammering a dead endpoint.&lt;/p&gt;
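
&lt;p&gt;For context, here is a minimal, self-contained sketch of how a check like this can sit inside an orchestrator loop. &lt;code&gt;run_agent&lt;/code&gt;, &lt;code&gt;MAX_RUNS&lt;/code&gt;, and the stubbed output are hypothetical stand-ins (the real script calls &lt;code&gt;agy&lt;/code&gt;); only the counter logic mirrors the snippet above.&lt;/p&gt;

```shell
# Sketch only: run_agent is a hypothetical stub standing in for the real CLI call.
# It simulates quota exhaustion: real output for 2 runs, then empty output with exit 0.
RUN=0
run_agent() {
    if [ "$RUN" -le 2 ]; then
        echo "edited src/app.ts and committed the change"
    fi
    return 0
}

MAX_RUNS=50      # hard cap so the loop can never spin forever
MAX_EMPTY=3      # consecutive empty responses treated as "quota exhausted"
EMPTY_COUNT=0
STOP_REASON="max runs reached"

while [ "$RUN" -lt "$MAX_RUNS" ]; do
    RUN=$((RUN + 1))
    OUTPUT="$(run_agent)"
    if [ "${#OUTPUT}" -lt 20 ]; then
        EMPTY_COUNT=$((EMPTY_COUNT + 1))
        if [ "$EMPTY_COUNT" -ge "$MAX_EMPTY" ]; then
            STOP_REASON="$MAX_EMPTY consecutive empty responses"
            break
        fi
    else
        EMPTY_COUNT=0   # any real output resets the breaker
    fi
done

echo "stopped after $RUN runs: $STOP_REASON"
```

&lt;p&gt;The &lt;code&gt;MAX_RUNS&lt;/code&gt; cap is a second, independent guard: even if the empty-output heuristic misses a failure mode, the loop still ends.&lt;/p&gt;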

&lt;h3&gt;
  
  
  The lesson
&lt;/h3&gt;

&lt;p&gt;Every autonomous system needs a circuit breaker. AI tools are designed for interactive use. They assume a human will notice when something's wrong. When there's no human, you need explicit failure detection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bug 2: The Auth That Only Works in SSH
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The symptom
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Expected:&lt;/strong&gt; same user + same token file = works everywhere.&lt;br&gt;
&lt;strong&gt;Actual:&lt;/strong&gt; auth backend changes based on an environment variable.&lt;/p&gt;

&lt;p&gt;I test &lt;code&gt;agy&lt;/code&gt; via SSH. Works perfectly. I set up the cron job with the exact same command, same user, same working directory. Fails with "Authentication required."&lt;/p&gt;

&lt;p&gt;The token file exists. It has a valid refresh token. The binary can read it (verified with strace). But it won't use it.&lt;/p&gt;
&lt;h3&gt;
  
  
  The investigation
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Works:&lt;/span&gt;
ssh race@your-vps &lt;span class="s2"&gt;"cd /home/race/race-gemini &amp;amp;&amp;amp; echo 'test' | agy --print"&lt;/span&gt;
&lt;span class="c"&gt;# → Responds normally&lt;/span&gt;

&lt;span class="c"&gt;# Fails (simulating cron):&lt;/span&gt;
ssh race@your-vps &lt;span class="s1"&gt;'env -i HOME=/home/race PATH=/usr/bin:/home/race/.local/bin bash -c "
  cd /home/race/race-gemini
  echo test | agy --print
"'&lt;/span&gt;
&lt;span class="c"&gt;# → "Authentication required"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;After diffing the environment between SSH and cron, I found it: &lt;code&gt;agy&lt;/code&gt; checks for the &lt;code&gt;SSH_CONNECTION&lt;/code&gt; environment variable. If it's set, it uses file-based auth (reads the token from &lt;code&gt;~/.gemini/antigravity-cli/antigravity-oauth-token&lt;/code&gt;). If it's not set, it tries the system keyring, which doesn't exist in a non-interactive cron session.&lt;/p&gt;
&lt;h3&gt;
  
  
  The fix
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SSH_CONNECTION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"127.0.0.1 0 127.0.0.1 22"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;One fake environment variable. I don't love this fix. But until the CLI exposes an explicit headless auth mode, this makes cron behave exactly like my tested SSH session. If Antigravity adds a &lt;code&gt;--headless-auth&lt;/code&gt; or &lt;code&gt;--auth-file&lt;/code&gt; flag, I'd replace this immediately.&lt;/p&gt;
&lt;h3&gt;
  
  
  The lesson
&lt;/h3&gt;

&lt;p&gt;AI CLI tools are built for developers at their desk. Headless/cron environments are second-class citizens. If your tool has multiple auth backends, test which one activates in a bare &lt;code&gt;env -i&lt;/code&gt; environment. That's what cron sees.&lt;/p&gt;
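
&lt;p&gt;A small helper makes that bare-environment test repeatable. This is a sketch under assumptions: &lt;code&gt;run_like_cron&lt;/code&gt; is a hypothetical name, and the &lt;code&gt;PATH&lt;/code&gt; and &lt;code&gt;SHELL&lt;/code&gt; values only approximate common cron defaults, so check them against your own system. The demo shows that a variable exported in an interactive shell is invisible inside the cron-like environment.&lt;/p&gt;

```shell
# Sketch: run a command with a stripped-down, cron-like environment.
# The PATH/SHELL values approximate common cron defaults; verify them on your system.
run_like_cron() {
    env -i HOME="$HOME" PATH="/usr/bin:/bin" SHELL="/bin/sh" LOGNAME="$(id -un)" \
        /bin/sh -c "$1"
}

# Demo: pretend we are in an SSH session (the address is a documentation-range placeholder).
export SSH_CONNECTION="203.0.113.5 50000 203.0.113.10 22"
INTERACTIVE="$(sh -c 'echo "${SSH_CONNECTION:-unset}"')"
CRONLIKE="$(run_like_cron 'echo "${SSH_CONNECTION:-unset}"')"
echo "interactive: $INTERACTIVE"
echo "cron-like:   $CRONLIKE"
```

&lt;p&gt;To test a real tool, pass its command as the single argument, for example &lt;code&gt;run_like_cron 'cd /home/race/race-gemini; echo test | agy --print'&lt;/code&gt;.&lt;/p&gt;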
&lt;h2&gt;
  
  
  Bug 3: The Context Tax
&lt;/h2&gt;
&lt;h3&gt;
  
  
  The symptom
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Expected:&lt;/strong&gt; each session starts productive work quickly.&lt;br&gt;
&lt;strong&gt;Actual:&lt;/strong&gt; context reload eats 60% of the session.&lt;/p&gt;

&lt;p&gt;Session 1 runs for 8 minutes before hitting quota. Of those 8 minutes, 5 are spent reading the codebase: &lt;code&gt;IDENTITY.md&lt;/code&gt;, &lt;code&gt;PROGRESS.md&lt;/code&gt;, &lt;code&gt;BACKLOG.md&lt;/code&gt;, scanning the project structure, understanding what happened last time. Only 3 minutes of actual coding.&lt;/p&gt;

&lt;p&gt;With quota this tight, losing 60% of every session to context loading is a dealbreaker.&lt;/p&gt;
&lt;h3&gt;
  
  
  The discovery
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;agy&lt;/code&gt; has a &lt;code&gt;--continue&lt;/code&gt; flag that resumes the previous conversation. The model retains all context from the last session: files it read, decisions it made, what it planned to do next.&lt;/p&gt;
&lt;h3&gt;
  
  
  The fix
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# First session of the day: fresh start, full context load&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SESSION_TYPE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"first"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | agy &lt;span class="nt"&gt;--print&lt;/span&gt; &lt;span class="nt"&gt;--print-timeout&lt;/span&gt; 25m &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="c"&gt;# All subsequent sessions: resume previous conversation&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | agy &lt;span class="nt"&gt;--print&lt;/span&gt; &lt;span class="nt"&gt;--print-timeout&lt;/span&gt; 25m &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt; &lt;span class="nt"&gt;--continue&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  The result
&lt;/h3&gt;

&lt;p&gt;These measurements were taken before Google's 3x rate limit boost (see &lt;a href="https://dev.to/ai_made_tools/my-ai-agent-hit-googles-quota-wall-in-8-minutes-36-hours-later-google-tripled-the-limits-mkc"&gt;Part 2&lt;/a&gt;). With the new limits, the gains from &lt;code&gt;--continue&lt;/code&gt; still matter, but the pressure is less extreme.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Fresh session&lt;/th&gt;
&lt;th&gt;--continue session&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context loading&lt;/td&gt;
&lt;td&gt;~5 minutes&lt;/td&gt;
&lt;td&gt;~0 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Productive coding&lt;/td&gt;
&lt;td&gt;~3 minutes&lt;/td&gt;
&lt;td&gt;~15 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective runtime&lt;/td&gt;
&lt;td&gt;3 min&lt;/td&gt;
&lt;td&gt;15 min&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Roughly 5x more productive time per session (about 15 minutes versus 3) by skipping the context reload. The model remembers what it fixed, what's next, what files it already read.&lt;/p&gt;
&lt;h3&gt;
  
  
  The lesson
&lt;/h3&gt;

&lt;p&gt;Context is expensive, both in tokens and in quota. If your AI tool supports conversation persistence, use it.&lt;/p&gt;

&lt;p&gt;I don't use &lt;code&gt;--continue&lt;/code&gt; forever. One fresh session per day as a reset point (prevents stale assumptions from accumulating), then all subsequent sessions within that day resume where the last one left off.&lt;/p&gt;
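
&lt;p&gt;One way to automate that daily reset, sketched under assumptions: the helper below (hypothetical name &lt;code&gt;session_type&lt;/code&gt;, hypothetical state directory) drops one marker file per UTC day and reports whether the current run is the first of the day.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# session_type: print "first" on the first call of a UTC day, "continue" on later calls.
# Assumption: STATE_DIR and the marker file naming are made up for this sketch.
session_type() {
  local state_dir today marker
  state_dir="${1:-${STATE_DIR:-$HOME/.agent-state}}"
  today=$(date -u +%Y-%m-%d)
  marker="$state_dir/fresh-$today"
  mkdir -p "$state_dir"
  if [ -e "$marker" ]; then
    echo "continue"
  else
    # No marker yet today: record the fresh start and report it.
    touch "$marker"
    echo "first"
  fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting &lt;code&gt;SESSION_TYPE=$(session_type)&lt;/code&gt; at the top of the cron script would then drive the if/else shown in the fix.&lt;/p&gt;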
&lt;h2&gt;
  
  
  What's Missing: The Infrastructure Layer
&lt;/h2&gt;

&lt;p&gt;These three bugs share a pattern: &lt;strong&gt;autonomous AI agents need infrastructure that doesn't exist yet.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No standard circuit breaker for quota exhaustion&lt;/li&gt;
&lt;li&gt;No headless-first auth flow&lt;/li&gt;
&lt;li&gt;No cron-aware session lifecycle (when to fresh-start vs continue)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Web apps have process managers. Queues have retry policies. APIs expose rate-limit headers. Background jobs have dead-letter queues. Autonomous AI agents have bash scripts.&lt;/p&gt;

&lt;p&gt;Every team running AI agents on cron is building their own orchestrator from scratch. The same patterns (retry limits, auth persistence, context reuse, graceful shutdown, cost tracking) get reimplemented by every team independently.&lt;/p&gt;

&lt;p&gt;We're in the "build your own orchestrator" era. The models are ready for autonomous work. The infrastructure around them isn't.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Orchestrator Pattern
&lt;/h2&gt;

&lt;p&gt;Here's the minimal structure that works for me after a week of iteration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Session start
├── Check quota (circuit breaker armed)
├── Load context (fresh or --continue)
├── Run loop (max N iterations)
│   ├── Send prompt
│   ├── Check output length (empty = increment counter)
│   ├── If 3 empty → break (quota exhausted)
│   ├── If output → commit changes, reset counter
│   └── Check elapsed time → graceful shutdown at limit
├── Push commits
└── Log session stats (duration, files changed, runs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's ~50 lines of bash. It handles the three failure modes above. It's not elegant, but it keeps an autonomous agent running unattended across scheduled sessions.&lt;/p&gt;
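
&lt;p&gt;The structure above could be sketched roughly as follows. This is a simplified illustration, not the actual script: the function names, limits, and messages are assumptions, and &lt;code&gt;run_agent&lt;/code&gt; is a stub standing in for the real &lt;code&gt;agy&lt;/code&gt; call.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Sketch of the session loop. All names and limits here are illustrative assumptions.
MAX_RUNS="${MAX_RUNS:-10}"          # cap on iterations per session
MAX_EMPTY="${MAX_EMPTY:-3}"         # consecutive empty outputs that trip the breaker
MAX_SECONDS="${MAX_SECONDS:-1500}"  # wall-clock budget for the session (25 minutes)

# Stub: swap the body for the real CLI call, for example
#   echo "$PROMPT" | agy --print --print-timeout 25m --dangerously-skip-permissions --continue
# As written it prints nothing, which simulates an exhausted quota.
run_agent() {
  :
}

# Stub hook: commit the changes from one productive run (for example git add and git commit).
on_output() {
  :
}

run_session() {
  local runs empty start now output
  runs=0
  empty=0
  start=$(date +%s)
  while [ "$runs" -lt "$MAX_RUNS" ]; do
    runs=$(( runs + 1 ))
    output=$(run_agent)
    if [ -z "$output" ]; then
      # Empty output counts toward the circuit breaker.
      empty=$(( empty + 1 ))
      if [ "$empty" -ge "$MAX_EMPTY" ]; then
        echo "breaker tripped: $empty empty responses after $runs runs"
        return 1
      fi
    else
      # Real output: reset the counter and hand the result to the commit hook.
      empty=0
      on_output "$output"
    fi
    now=$(date +%s)
    if [ $(( now - start )) -ge "$MAX_SECONDS" ]; then
      echo "time limit reached after $runs runs"
      return 0
    fi
  done
  echo "finished $runs runs"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the stub in place, &lt;code&gt;run_session&lt;/code&gt; trips the breaker on the third empty response, which makes the failure path easy to test before wiring in the real command. Pushing commits and logging session stats would follow the call.&lt;/p&gt;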

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;If you're running Antigravity CLI (or any AI coding tool) in autonomous/headless mode:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add a circuit breaker.&lt;/strong&gt; Empty responses are silent failures, not completions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test auth under cron's environment.&lt;/strong&gt; In my case, faking &lt;code&gt;SSH_CONNECTION&lt;/code&gt; forced file-based auth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;--continue&lt;/code&gt; between sessions.&lt;/strong&gt; Context loading eats your quota alive.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set &lt;code&gt;--print-timeout&lt;/code&gt; higher than the default.&lt;/strong&gt; Complex agentic tasks need more than the default 5 minutes to think.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  My Cron-Safe Agent Checklist
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Max runtime per session&lt;/li&gt;
&lt;li&gt;[ ] Max loop count per session&lt;/li&gt;
&lt;li&gt;[ ] Empty-output circuit breaker&lt;/li&gt;
&lt;li&gt;[ ] Non-zero exit handling&lt;/li&gt;
&lt;li&gt;[ ] Auth tested with &lt;code&gt;env -i&lt;/code&gt; (simulating cron)&lt;/li&gt;
&lt;li&gt;[ ] Fresh/continue session strategy&lt;/li&gt;
&lt;li&gt;[ ] Commit and push after each meaningful change&lt;/li&gt;
&lt;li&gt;[ ] Quota / empty-response events logged separately&lt;/li&gt;
&lt;li&gt;[ ] Recovery path after quota exhaustion&lt;/li&gt;
&lt;li&gt;[ ] Logs include duration, output length, files changed&lt;/li&gt;
&lt;/ul&gt;
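
&lt;p&gt;As a rough illustration of the exit-handling and logging items, here is a small wrapper sketch. The name &lt;code&gt;run_logged&lt;/code&gt; and the log format are assumptions; it covers duration, exit code, and output length, and leaves files changed to whatever your git tooling reports.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# run_logged: run a command, then print one stats line and pass the exit code through.
# Assumption: the key=value log format is invented for this sketch.
run_logged() {
  local start end status output
  status=0
  start=$(date +%s)
  # Capture stdout; record a non-zero exit instead of letting it go unnoticed.
  output=$("$@") || status=$?
  end=$(date +%s)
  echo "duration_s=$(( end - start )) exit=$status output_chars=${#output}"
  return "$status"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A line with &lt;code&gt;exit=0&lt;/code&gt; and &lt;code&gt;output_chars=0&lt;/code&gt; is the silent-failure case worth logging separately.&lt;/p&gt;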

&lt;p&gt;AI agents don't just need better models. They need boring production infrastructure.&lt;/p&gt;

&lt;p&gt;Gemini 3.5 Flash made the agent smart enough to work.&lt;/p&gt;

&lt;p&gt;Bash made it stable enough to survive.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>antigravity</category>
      <category>devops</category>
    </item>
    <item>
      <title>My AI Agent Hit Google's Quota Wall in 8 Minutes. 36 Hours Later, Google Tripled the Limits.</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Thu, 21 May 2026 08:32:51 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/my-ai-agent-hit-googles-quota-wall-in-8-minutes-36-hours-later-google-tripled-the-limits-mkc</link>
      <guid>https://dev.to/ai_made_tools/my-ai-agent-hit-googles-quota-wall-in-8-minutes-36-hours-later-google-tripled-the-limits-mkc</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;Google I/O Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;My Gemini agent spent four weeks in last place.&lt;/p&gt;

&lt;p&gt;1,259 commits. Broken imports across 32 files. Help requests about database tables it could have created itself. Endless bug loops.&lt;/p&gt;

&lt;p&gt;Then I upgraded it to Gemini 3.5 Flash.&lt;/p&gt;

&lt;p&gt;In 8 minutes, it diagnosed and fixed problems the old setup had failed to solve in weeks. Then it hit Google's quota wall.&lt;/p&gt;

&lt;p&gt;This is the story of what happened next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;This is Part 2 of my Gemini 3.5 Flash upgrade series. &lt;a href="https://dev.to/ai_made_tools/i-upgraded-a-production-ai-agent-to-gemini-35-flash-12-hours-after-google-io-heres-what-i-found-254i"&gt;Part 1 covers the initial upgrade and first results&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'm running &lt;a href="https://www.aimadetools.com/race/" rel="noopener noreferrer"&gt;The $100 AI Startup Race&lt;/a&gt;. 7 AI coding agents each get $100 and 12 weeks to autonomously build real startups. No human coding. The agents run on cron jobs, commit to GitHub, and deploy to Vercel.&lt;/p&gt;

&lt;p&gt;After upgrading the Gemini agent from a combo of 2.5 Pro (premium sessions) and 2.5 Flash (cheap sessions) to a single 3.5 Flash tier via Antigravity CLI on May 20, the model quality was incredible. But the quota economics were brutal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Disappointment (May 20)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Session 1:&lt;/strong&gt; The model fixed 32 broken API files in a single commit: imports, bcrypt to bcryptjs for Vercel serverless, Stripe instantiation. Root cause analysis that the old model couldn't do in 4 weeks. Then the 5h quota wall hit. &lt;strong&gt;8 minutes of productive work.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session 2:&lt;/strong&gt; With &lt;code&gt;--continue&lt;/code&gt; (skipping context reload), it built an email library, wrote tests, and fixed auth endpoints. &lt;strong&gt;15 minutes.&lt;/strong&gt; Then 5h quota again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The math:&lt;/strong&gt; Two sessions consumed 40% of the weekly quota. Projected total: ~68 minutes per week on the $20/month Pro plan.&lt;/p&gt;

&lt;p&gt;For context, here's what the other agents in my race get for similar money (these are not official provider limits, they are the effective autonomous runtime I measured in my specific setup):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Plan cost&lt;/th&gt;
&lt;th&gt;Weekly runtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;~7 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Codex/GPT&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;~21 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;$25/mo&lt;/td&gt;
&lt;td&gt;~21 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemini 3.5 Flash&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$20/mo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~68 minutes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Best model quality in the race. Worst total compute time. The old 2.5 Flash/Pro setup gave me ~28 hours/week, but those 28 hours produced nothing but bug loops. Now I had a model that actually worked, but could barely run.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paradox
&lt;/h2&gt;

&lt;p&gt;Here's what made it painful: the quality improvement was real. Not incremental, but transformational.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Old setup (2.5 Pro + 2.5 Flash combo, 28 hours/week):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrote code with broken imports across 32 files&lt;/li&gt;
&lt;li&gt;Filed 3 help requests about "missing database tables"&lt;/li&gt;
&lt;li&gt;Never self-diagnosed the actual problem&lt;/li&gt;
&lt;li&gt;1,259 commits over 4 weeks, last place in the race&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;New model (3.5 Flash, 68 minutes/week):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Diagnosed the root cause in one pass (broken imports, not missing tables)&lt;/li&gt;
&lt;li&gt;Fixed all 32 files in a single commit&lt;/li&gt;
&lt;li&gt;Built a mock database layer, converted test infrastructure&lt;/li&gt;
&lt;li&gt;More useful output in 23 minutes than the old model produced in weeks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The bottleneck had shifted from intelligence to throughput. The model was finally good enough. The constraint was access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Autonomous Agents Burn Quota Differently
&lt;/h2&gt;

&lt;p&gt;For human coding, a model is an assistant. You ask, read, think, edit, and come back later.&lt;/p&gt;

&lt;p&gt;For autonomous coding, the model is the runtime. It doesn't pause to think offline. Every file inspection, every failed test, every log check, every retry, every deployment verification consumes inference.&lt;/p&gt;

&lt;p&gt;A human developer's session looks like: ask, think, edit, ask again, wait, test manually.&lt;/p&gt;

&lt;p&gt;An autonomous agent's session looks like: plan, inspect, edit, test, fail, inspect logs, edit, retest, deploy, verify, repeat.&lt;/p&gt;

&lt;p&gt;That changes the economics completely. A $20/month subscription can feel generous for a human developer and unusable for an autonomous agent, at the same time, on the same plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Response (May 21, 05:25 UTC)
&lt;/h2&gt;

&lt;p&gt;Less than 36 hours after Google I/O. Within hours of the new quota system going live, users were reporting problems on Reddit and X: 4 prompts burning an entire 5-hour window, failed generations counting against quota, threads calling it a "bait and switch."&lt;/p&gt;

&lt;p&gt;Then, at 5:25 AM UTC on May 21:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Varun Mohan (@_mohansolo):&lt;/strong&gt; "An update: we're 3xing the rate limits for Gemini models across all paid tiers in Antigravity and resetting everyone's Gemini quota for the week. We understand some people hit their rate limits quickly and wanted to respond fast. Lots more to come and enjoy building!"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logan Kilpatrick (@OfficialLoganK):&lt;/strong&gt; "We just 3xed the rate limits across all tiers in Antigravity so that you can put 3.5 Flash through its paces even more, enjoy, and keep the feedback coming! :)"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the key follow-up from Varun:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"In case it's not clear, the 3x is forever."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What I Actually Measured
&lt;/h2&gt;

&lt;p&gt;My agent's cron job fired at 05:00 UTC, likely straddling the quota boost that landed around 05:25 UTC. The results:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session 3 (05:00 UTC, partially on old quota, partially on new):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;33 minutes of productive work&lt;/li&gt;
&lt;li&gt;9 runs, 588 files changed&lt;/li&gt;
&lt;li&gt;Renamed the entire domain (&lt;code&gt;localleads.pro&lt;/code&gt; to &lt;code&gt;localseogen.com&lt;/code&gt;) across all generated SEO pages, fixed Stripe redirect URLs, corrected ES Module syntax in API files&lt;/li&gt;
&lt;li&gt;Built a mock database layer (&lt;code&gt;db/mockDb.js&lt;/code&gt;) with full CRUD operations&lt;/li&gt;
&lt;li&gt;Created &lt;code&gt;lib/time-helpers.js&lt;/code&gt; utility library&lt;/li&gt;
&lt;li&gt;Wrote test suites for signup, login, get-credits, assign, generate-seo-pages&lt;/li&gt;
&lt;li&gt;Refactored 14 test files to use the new mock DB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Session 4 (07:07 UTC, fully on new quota):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;29 minutes of productive work&lt;/li&gt;
&lt;li&gt;8 runs, 34 files changed&lt;/li&gt;
&lt;li&gt;Converted all test mocks from ESM (&lt;code&gt;.js&lt;/code&gt;) to CommonJS (&lt;code&gt;.cjs&lt;/code&gt;) for jest compatibility&lt;/li&gt;
&lt;li&gt;Fixed babel and jest configuration for the mixed ESM/CJS codebase&lt;/li&gt;
&lt;li&gt;Refactored &lt;code&gt;execute-outreach&lt;/code&gt;, &lt;code&gt;forgot-password-request&lt;/code&gt;, &lt;code&gt;generate-seo-pages&lt;/code&gt;, &lt;code&gt;user-referral-data&lt;/code&gt; tests&lt;/li&gt;
&lt;li&gt;Cleaned up &lt;code&gt;.env.test&lt;/code&gt; and email library&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two back-to-back sessions of ~30 minutes each. Together they used the full 5-hour window, so roughly &lt;strong&gt;50 minutes of productive runtime per 5h refresh cycle&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The comparison:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Before boost (May 20)&lt;/th&gt;
&lt;th&gt;After boost (May 21)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime per 5h window&lt;/td&gt;
&lt;td&gt;8 minutes&lt;/td&gt;
&lt;td&gt;~50 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective improvement&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;~4-5x (announced 3x)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Productive output&lt;/td&gt;
&lt;td&gt;42 files fixed&lt;/td&gt;
&lt;td&gt;622 files changed, full test infra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weekly projection&lt;/td&gt;
&lt;td&gt;~68 minutes&lt;/td&gt;
&lt;td&gt;~5+ hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Google announced 3x. I measured closer to 4-5x for autonomous agentic coding in my setup. I wouldn't treat that as a universal number yet. The difference likely comes from my measurement catching a weekly quota reset, the rate limit increase, and a different prompt mix all at the same time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insight
&lt;/h2&gt;

&lt;p&gt;The feedback loop between AI providers and power users is now measured in hours, not months.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tuesday (May 19):&lt;/strong&gt; Google launches new compute-based quota system at I/O&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wednesday (May 20):&lt;/strong&gt; Users hit walls, Reddit fills with complaints, my agent gets 68 min/week&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thursday (May 21, 5:25 AM UTC):&lt;/strong&gt; Google triples limits permanently and resets everyone's pool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's a 36-hour turnaround from "this is broken for agents" to "fixed, permanently." For anyone building autonomous systems on top of subscription AI: the economics are volatile, but they're trending in your favor. The providers are watching usage patterns and adjusting in real time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Story: Quality × Time = Output
&lt;/h2&gt;

&lt;p&gt;Here's what I'd tell any developer considering Gemini 3.5 Flash for agentic workflows:&lt;/p&gt;

&lt;p&gt;The old model had far more time (28 hours/week) and did nothing useful with it. The new model has limited time and makes every minute count.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;2.5 Pro + Flash combo:&lt;/strong&gt; 28 hours/week → last place, stuck in bug loops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3.5 Flash (pre-boost):&lt;/strong&gt; 68 min/week → more progress than 4 weeks of the old model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3.5 Flash (post-boost):&lt;/strong&gt; 5+ hours/week → fully competitive, systematically building&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Quality matters more than quantity. I'll take 5 hours of a model that diagnoses root causes, fixes 32 files in one pass, and builds proper test infrastructure over 28 hours of a model that files help requests about problems it created.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The Gemini agent went from last place to having a real shot. The product (LocalSEOGen, a local SEO page generator for agencies) now has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed API endpoints (32 files)&lt;/li&gt;
&lt;li&gt;Working auth flow&lt;/li&gt;
&lt;li&gt;Test infrastructure (mock DB, jest config, babel setup)&lt;/li&gt;
&lt;li&gt;Domain migration complete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next sessions will focus on getting the Vercel deployment actually serving requests and pushing toward first revenue.&lt;/p&gt;

&lt;p&gt;But the bigger takeaway isn't about my race. It's this:&lt;/p&gt;

&lt;p&gt;The lesson from this week is not "Gemini needs more quota." The lesson is that autonomous agents turn model access into infrastructure. For human developers, Gemini 3.5 Flash on a $20 plan is a huge upgrade. For autonomous coding agents, it finally feels capable enough to matter. And that is exactly why the quota suddenly matters too.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Follow the race live at &lt;a href="https://www.aimadetools.com/race/" rel="noopener noreferrer"&gt;aimadetools.com/race&lt;/a&gt;. 7 agents, $100 each, 12 weeks, real startups.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>ai</category>
      <category>antigravity</category>
    </item>
    <item>
      <title>I Upgraded a Production AI Agent to Gemini 3.5 Flash 12 Hours After Google I/O - Here's What I Found</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Wed, 20 May 2026 08:35:19 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/i-upgraded-a-production-ai-agent-to-gemini-35-flash-12-hours-after-google-io-heres-what-i-found-254i</link>
      <guid>https://dev.to/ai_made_tools/i-upgraded-a-production-ai-agent-to-gemini-35-flash-12-hours-after-google-io-heres-what-i-found-254i</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;Google I/O Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I'm running an experiment called &lt;a href="https://www.aimadetools.com/race/" rel="noopener noreferrer"&gt;The $100 AI Startup Race&lt;/a&gt;. 7 AI coding agents each get $100 and 12 weeks to autonomously build real startups. No human coding. The agents run on cron jobs, commit to GitHub, deploy to Vercel, and try to generate revenue.&lt;/p&gt;

&lt;p&gt;One of those agents is &lt;strong&gt;Gemini&lt;/strong&gt;. It's been running on Gemini CLI with a combo of 2.5 Pro (premium sessions) and 2.5 Flash (cheap sessions) since April 20. I tried 3.1 Pro during the test runs before the race, but it was unreliable - frequent "model not available" errors made it unusable for autonomous cron-based sessions. So I stuck with 2.5. After 4 weeks and 1,259 commits, Gemini is in &lt;strong&gt;last place&lt;/strong&gt;. Stuck in bug loops. Writing code that crashes, filing help requests about database tables it could create itself, and burning sessions on infrastructure it already has.&lt;/p&gt;

&lt;p&gt;Then Google I/O happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Google Dropped (May 19)
&lt;/h2&gt;

&lt;p&gt;Gemini 3.5 Flash. The headline numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;76.2% Terminal-Bench 2.1&lt;/strong&gt; (agentic coding) - beats 3.1 Pro's 70.3%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;83.6% MCP Atlas&lt;/strong&gt; (multi-step workflows) - highest of any model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;289 tokens/sec output&lt;/strong&gt; - 4x faster than Claude Opus 4.7 or GPT-5.5&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$1.50 / $9 per 1M tokens&lt;/strong&gt; - cheaper than 3.1 Pro&lt;/li&gt;
&lt;li&gt;A Flash-tier model outperforming the previous Pro model. That's never happened before.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And one more thing: &lt;strong&gt;Gemini CLI is being retired on June 18, 2026.&lt;/strong&gt; Replaced by Antigravity CLI (&lt;code&gt;agy&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;I had to upgrade. The model my agent runs on is two generations behind, and the tool it uses is dying in 4 weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Antigravity CLI on a Headless VPS
&lt;/h2&gt;

&lt;p&gt;My race agents run on a VPS (Ubuntu, no GUI). Here's how the install went:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://antigravity.google/cli/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Binary lands at &lt;code&gt;/root/.local/bin/agy&lt;/code&gt;. Add to PATH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/root/.local/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
agy &lt;span class="nt"&gt;--version&lt;/span&gt;  &lt;span class="c"&gt;# 1.0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Auth Challenge
&lt;/h3&gt;

&lt;p&gt;First run needs OAuth. On a headless server, &lt;code&gt;agy&lt;/code&gt; detects the SSH session and prints an auth URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;Authentication required. Please visit the URL to log in:
  https://accounts.google.com/o/oauth2/auth?...

Waiting for authentication (timeout 30s)...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You have 30 seconds to open that URL in your browser and complete the Google login. Tight, but it works. Token gets stored and all future calls are authenticated.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovery #1: No Model Selection Flag
&lt;/h2&gt;

&lt;p&gt;Here's what surprised me. The old Gemini CLI had &lt;code&gt;-m gemini-2.5-pro&lt;/code&gt; to pick your model. Antigravity CLI has... nothing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Usage of agy:
  --dangerously-skip-permissions  Auto-approve all tool permission requests
  --print                         Run a single prompt non-interactively
  --print-timeout                 Timeout for print mode (default 5m0s)
  --sandbox                       Run in a sandbox
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No &lt;code&gt;--model&lt;/code&gt;. No env var. No config file. I tried everything - &lt;code&gt;settings.json&lt;/code&gt;, &lt;code&gt;GEMINI.md&lt;/code&gt; directives, environment variables. Nothing works.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;agy&lt;/code&gt; auto-selects Gemini 3.5 Flash based on your subscription tier and quota. Server-side routing, no client control. For my use case (autonomous agent on cron), this actually simplifies things - one command, best available model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Discovery #2: Unified Quota Across Models
&lt;/h2&gt;

&lt;p&gt;On my Mac (same Google account, AI Pro $20/month), I can see the quota dashboard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gemini 3.5 Flash (High)      - Refreshes in 4h 42m
Gemini 3.5 Flash (Medium)    - Refreshes in 4h 42m
Gemini 3.1 Pro (High)        - Refreshes in 4h 42m
Gemini 3.1 Pro (Low)         - Refreshes in 4h 42m
Claude Sonnet 4.6 (Thinking) - Refreshes in 4h 58m
Claude Opus 4.6 (Thinking)   - Refreshes in 4h 58m
GPT-OSS 120B (Medium)        - Refreshes in 4h 58m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things jumped out:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gemini Flash and Pro share the same quota pool.&lt;/strong&gt; When I used 3.5 Flash, the 3.1 Pro timer dropped at the same time. They're not independent buckets - it's one "Gemini compute" pool that both models draw from.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-model access&lt;/strong&gt; - Antigravity bundles Claude, GPT-OSS, and Gemini models in one $20/month subscription. Google is positioning this as a model-agnostic platform, not just a Gemini wrapper.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The 5-hour refresh cycle and shared pool mean you need to be strategic about which models you use and when.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Real Test
&lt;/h2&gt;

&lt;p&gt;I set up a minimal bug-fix test in the race-gemini directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'Fix the bug in math.js. Run npm test to verify.'&lt;/span&gt; | &lt;span class="se"&gt;\&lt;/span&gt;
  agy &lt;span class="nt"&gt;--print&lt;/span&gt; &lt;span class="nt"&gt;--print-timeout&lt;/span&gt; 3m &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;I have successfully fixed the bug in math.js and verified it using npm test.

&lt;span class="gu"&gt;### Summary of Changes&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Identified the Target File
&lt;span class="p"&gt;2.&lt;/span&gt; Fixed the Bug: Updated the add function to use addition (+) instead of subtraction (-)
&lt;span class="p"&gt;3.&lt;/span&gt; Verified the Fix: npm test passes with output: PASS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It found the file, read it, identified the bug, fixed it, ran the tests, and confirmed. Clean execution. No help requests filed. No infinite loops.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Migration
&lt;/h2&gt;

&lt;p&gt;Old setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Premium sessions (2x/day)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | gemini &lt;span class="nt"&gt;--yolo&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; gemini-2.5-pro

&lt;span class="c"&gt;# Cheap sessions (6x/day)  &lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | gemini &lt;span class="nt"&gt;--yolo&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; gemini-2.5-flash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# All sessions (8x/day, single tier)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PROMPT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | agy &lt;span class="nt"&gt;--print&lt;/span&gt; &lt;span class="nt"&gt;--print-timeout&lt;/span&gt; 10m &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I also merged the two backlogs (&lt;code&gt;BACKLOG-PREMIUM.md&lt;/code&gt; + &lt;code&gt;BACKLOG-CHEAP.md&lt;/code&gt;) into a single &lt;code&gt;BACKLOG.md&lt;/code&gt; - same approach as our Kimi agent, which uses one model and one task list. The agent decides what to prioritize each session.&lt;/p&gt;

&lt;p&gt;First task in the new backlog: "Merge old backlogs, audit the live site, identify the #1 blocker to first revenue."&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm Watching For
&lt;/h2&gt;

&lt;p&gt;The Gemini agent's problem was never lack of capability - it's the most prolific committer in the race (1,259 commits). The problem was &lt;strong&gt;operational awareness&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing code with bugs it doesn't notice&lt;/li&gt;
&lt;li&gt;Filing help requests for things it could solve itself&lt;/li&gt;
&lt;li&gt;Building features without checking if they deploy correctly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Gemini 3.5 Flash's MCP Atlas score (83.6% - highest of any model) suggests it's specifically designed for the kind of multi-step, tool-using, autonomous work the race requires. The 4x speed means more iterations per session. The better coding benchmarks mean fewer self-inflicted bugs.&lt;/p&gt;

&lt;p&gt;But benchmarks don't test "can you notice your site is returning 500 errors." That's what I'm watching for.&lt;/p&gt;
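&lt;p&gt;The kind of check I mean can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names are hypothetical and this is not something the agent or the race harness currently runs.&lt;/p&gt;

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse health label."""
    if status >= 500:
        return "server-error"
    if status >= 400:
        return "client-error"
    return "ok"


def check_site(url: str, timeout: float = 10.0) -> str:
    """Fetch url and classify the response; network failures count as unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, but the status code is still available.
        return classify_status(err.code)
    except (urllib.error.URLError, TimeoutError):
        return "unreachable"
```

&lt;p&gt;An agent that called something like &lt;code&gt;check_site&lt;/code&gt; after every deploy would at least have the signal in front of it; whether it acts on that signal is the open question.&lt;/p&gt;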

&lt;h2&gt;
  
  
  Verdict So Far
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install is clean (one curl command)&lt;/li&gt;
&lt;li&gt;Auth on headless servers is first-class (prints URL, you complete in browser)&lt;/li&gt;
&lt;li&gt;3.5 Flash is genuinely fast - responses feel instant&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--dangerously-skip-permissions&lt;/code&gt; works for autonomous use&lt;/li&gt;
&lt;li&gt;The model correctly identifies and fixes bugs in a single pass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What's missing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No &lt;code&gt;--model&lt;/code&gt; flag (can't choose between 3.5 Flash, 3.1 Pro, Claude, etc.)&lt;/li&gt;
&lt;li&gt;No way to see remaining quota from CLI&lt;/li&gt;
&lt;li&gt;Shared quota across Flash and Pro models could be a problem at scale&lt;/li&gt;
&lt;li&gt;30-second auth timeout is tight for headless setups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The big question:&lt;/strong&gt; Will a better model fix an agent that's been stuck for 4 weeks? Or is the problem deeper than model quality?&lt;/p&gt;

&lt;p&gt;First results should come in within 48 hours. I'll update this post.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Follow the race live at &lt;a href="https://www.aimadetools.com/race/" rel="noopener noreferrer"&gt;aimadetools.com/race&lt;/a&gt; - 7 agents, $100 each, 12 weeks, real startups.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Update (May 21):&lt;/strong&gt; &lt;a href="https://dev.to/ai_made_tools/my-ai-agent-hit-googles-quota-wall-in-8-minutes-36-hours-later-google-tripled-the-limits-mkc"&gt;Quota wall + Tripled limits&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
    </item>
    <item>
      <title>Kimi K2.5 Complete Guide — The Trillion-Parameter Open-Source Model Explained</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Tue, 19 May 2026 12:14:17 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/kimi-k25-complete-guide-the-trillion-parameter-open-source-model-explained-kj3</link>
      <guid>https://dev.to/ai_made_tools/kimi-k25-complete-guide-the-trillion-parameter-open-source-model-explained-kj3</guid>
      <description>&lt;p&gt;Kimi K2.5 is a 1-trillion-parameter open-source model from Moonshot AI that quietly powers some of the most popular AI coding tools — including Cursor's Composer. It's MIT licensed, multimodal, and has a unique Agent Swarm feature that coordinates up to 100 parallel sub-agents.&lt;/p&gt;

&lt;p&gt;Here's everything you need to know.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update (April 21, 2026):&lt;/strong&gt; Moonshot AI has released &lt;a href="https://www.aimadetools.com/blog/kimi-k2-6-complete-guide?utm_source=devto" rel="noopener noreferrer"&gt;Kimi K2.6&lt;/a&gt;, which upgrades the Agent Swarm to 300 sub-agents, improves coding performance by 185%, and matches Claude Opus 4.6 on SWE-Bench. See our &lt;a href="https://www.aimadetools.com/blog/kimi-k2-6-vs-k2-5?utm_source=devto" rel="noopener noreferrer"&gt;K2.6 vs K2.5 comparison&lt;/a&gt; for what changed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is Kimi K2.5?
&lt;/h2&gt;

&lt;p&gt;Kimi K2.5 is the flagship model from Moonshot AI, a Chinese AI company. Released January 27, 2026, it's one of the largest open-weight models available. Despite its massive 1 trillion total parameters, only 32 billion activate per token — making it efficient enough to run on a single server node.&lt;/p&gt;

&lt;p&gt;The model is natively multimodal: it understands text, images, and video without bolted-on adapters. It was trained on approximately 15 trillion mixed visual and text tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.04 trillion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Active parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;32B per token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mixture-of-Experts (384 experts, 8 active per token)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;256K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Attention&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multi-Latent Attention (MLA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Activation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SwiGLU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Training data&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~15 trillion tokens (text + visual)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multimodal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native (text, image, video)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The MoE architecture with 384 experts is one of the largest expert pools in any model. With only 8 experts active per token, per-token compute is comparable to a 32B dense model despite the trillion-parameter total, although all of the weights still have to sit in memory.&lt;/p&gt;
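&lt;p&gt;To make "8 of 384 experts per token" concrete, here is a minimal sketch of generic top-k expert routing. It is not Moonshot's implementation: the random logits stand in for a learned router, and the function name is an assumption for illustration.&lt;/p&gt;

```python
import math
import random

NUM_EXPERTS = 384  # expert pool size from the spec table
TOP_K = 8          # experts activated per token


def route_token(router_logits: list[float], k: int = TOP_K) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: w / total for i, w in exps.items()}


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
weights = route_token(logits)
print(len(weights), "of", NUM_EXPERTS, "experts active")
```

&lt;p&gt;Only the selected experts run for that token, which is why compute tracks the active parameter count rather than the total.&lt;/p&gt;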

&lt;h2&gt;
  
  
  Modes
&lt;/h2&gt;

&lt;p&gt;Kimi K2.5 operates in four distinct modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Instant&lt;/strong&gt; — Fast responses for simple queries. Minimal reasoning overhead, optimized for speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thinking&lt;/strong&gt; — Transparent chain-of-thought reasoning. Shows its work step by step, similar to &lt;a href="https://www.aimadetools.com/blog/how-to-run-deepseek-locally/?utm_source=devto" rel="noopener noreferrer"&gt;DeepSeek's&lt;/a&gt; reasoning models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent&lt;/strong&gt; — Tool-oriented mode for executing tasks. Can read files, run commands, search the web, and interact with APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Swarm&lt;/strong&gt; — The headline feature. Coordinates up to 100 parallel sub-agents, cutting execution time by 4.5x on parallelizable tasks like batch refactoring and large-scale code generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent Swarm explained
&lt;/h2&gt;

&lt;p&gt;Most AI coding tools work sequentially — one task at a time. Kimi K2.5's Agent Swarm can split a complex task into subtasks and run them in parallel. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refactoring 50 files? Spawn 50 sub-agents, one per file.&lt;/li&gt;
&lt;li&gt;Running tests across multiple modules? Parallelize them.&lt;/li&gt;
&lt;li&gt;Generating documentation for an entire codebase? Each sub-agent handles a module.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The swarm coordinator manages dependencies between sub-agents, merges results, and handles conflicts. In benchmarks, this achieves a 4.5x speedup on parallelizable tasks.&lt;/p&gt;
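&lt;p&gt;The fan-out and merge pattern can be sketched with Python's standard thread pool. This is an illustration of the general idea only, not Kimi's coordinator: &lt;code&gt;refactor_file&lt;/code&gt; and &lt;code&gt;run_swarm&lt;/code&gt; are hypothetical names, and the sketch ignores dependencies and conflicts between subtasks.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

MAX_SUB_AGENTS = 100  # swarm ceiling quoted for K2.5


def refactor_file(path: str) -> str:
    """Hypothetical sub-agent task; a real one would call the model here."""
    return f"refactored {path}"


def run_swarm(paths: list[str]) -> dict[str, str]:
    """Fan one subtask out per file, then merge results in input order."""
    workers = max(1, min(MAX_SUB_AGENTS, len(paths)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(refactor_file, paths))
    return dict(zip(paths, results))


files = [f"src/module_{i}.py" for i in range(50)]
merged = run_swarm(files)
print(len(merged), "files processed")
```

&lt;p&gt;The speedup only materialises when subtasks are independent; anything that has to wait on another sub-agent's output falls back to sequential execution.&lt;/p&gt;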

&lt;h2&gt;
  
  
  Benchmarks
&lt;/h2&gt;

&lt;p&gt;Kimi K2.5 competes with frontier proprietary models:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Kimi K2.5&lt;/th&gt;
&lt;th&gt;Claude Opus 4.6&lt;/th&gt;
&lt;th&gt;GPT-5.4&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SWE-Bench Verified&lt;/td&gt;
&lt;td&gt;65.8&lt;/td&gt;
&lt;td&gt;72.1&lt;/td&gt;
&lt;td&gt;69.3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AIME 2024&lt;/td&gt;
&lt;td&gt;77.5&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MATH-500&lt;/td&gt;
&lt;td&gt;96.2&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Codeforces&lt;/td&gt;
&lt;td&gt;1950 Elo&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On coding benchmarks, K2.5 doesn't quite match Claude Opus or &lt;a href="https://www.aimadetools.com/blog/gpt-5-vs-gemini-2-5-pro/?utm_source=devto" rel="noopener noreferrer"&gt;GPT-5&lt;/a&gt;, but it's remarkably close for an open-source model. The Agent Swarm capability compensates by enabling workflows that single-model tools can't match.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cursor connection
&lt;/h2&gt;

&lt;p&gt;In March 2026, developers discovered that Cursor's Composer 2.0 — marketed as "frontier-level coding intelligence" — was internally using Kimi K2.5. The model identifier &lt;code&gt;kimi-k2p5-rl-0317-s515-fast&lt;/code&gt; was found in Cursor's code.&lt;/p&gt;

&lt;p&gt;This means that if you've used Composer 2.0 in &lt;a href="https://www.aimadetools.com/blog/cursor-ai-one-week-review/?utm_source=devto" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, you've likely already used Kimi K2.5, and the model has been exercised at scale across millions of Cursor users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Kimi K2.5 is available through several channels:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Access method&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Self-hosted&lt;/strong&gt; (MIT license)&lt;/td&gt;
&lt;td&gt;Free (hardware only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kimi Code membership&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$19/month + API fees&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kimi API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.60/$2.50 per 1M tokens (input/output)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenRouter&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Varies by provider&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.aimadetools.com/blog/kimi-cli-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Kimi CLI&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free tool, pay for API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At $0.60/$2.50 per million tokens, Kimi K2.5 is 4-17x cheaper than GPT-5.4 for equivalent coding tasks.&lt;/p&gt;
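&lt;p&gt;For budgeting, the per-request cost follows directly from the two API rates in the table. A minimal sketch, assuming a hypothetical workload of 200K input tokens and 20K output tokens:&lt;/p&gt;

```python
INPUT_USD_PER_M = 0.60   # Kimi API input price from the table above
OUTPUT_USD_PER_M = 2.50  # Kimi API output price from the table above


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Hypothetical workload: 200K tokens of context in, 20K tokens of code out.
print(f"${request_cost(200_000, 20_000):.2f}")
```

&lt;p&gt;Output tokens cost roughly four times as much as input tokens, so generation-heavy workloads dominate the bill.&lt;/p&gt;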

&lt;h2&gt;
  
  
  How to use Kimi K2.5
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Via Kimi CLI (terminal)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/kimi-cli
kimi login &lt;span class="nt"&gt;--device-auth&lt;/span&gt;
kimi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See our full &lt;a href="https://www.aimadetools.com/blog/kimi-cli-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Kimi CLI guide&lt;/a&gt; for setup details.&lt;/p&gt;

&lt;h3&gt;
  
  
  Via API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.moonshot.cn/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-kimi-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kimi-k2.5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Refactor this function to use async/await&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Via OpenRouter
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://openrouter.ai/api/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-openrouter-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;moonshot/kimi-k2.5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a REST API in Express&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See our &lt;a href="https://www.aimadetools.com/blog/openrouter-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;OpenRouter guide&lt;/a&gt; for more details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-hosting requirements
&lt;/h2&gt;

&lt;p&gt;At 1 trillion parameters, self-hosting K2.5 requires serious hardware:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Precision&lt;/th&gt;
&lt;th&gt;Memory needed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FP16&lt;/td&gt;
&lt;td&gt;~2TB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;INT8&lt;/td&gt;
&lt;td&gt;~1TB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
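&lt;td&gt;4-bit&lt;/td&gt;
&lt;td&gt;~520GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These figures cover weights only: 1.04 trillion parameters at 0.5 bytes each is about 520GB for a 4-bit build, so expect roughly 8x A100 80GB GPUs (640GB total) once KV cache and activations are included. Getting down to the 250-300GB range takes more aggressive quantization, around 2 bits per parameter. For most developers, the API at $0.60/1M input tokens is more practical than self-hosting.&lt;/p&gt;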
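&lt;p&gt;A quick back-of-the-envelope check of weight memory at each precision, as a sketch that counts weights only (no KV cache, activations, or runtime overhead) and uses the 1.04 trillion total from the architecture table:&lt;/p&gt;

```python
TOTAL_PARAMS = 1.04e12  # total parameter count from the architecture table

BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "4-bit": 0.5}


def weight_memory_gb(precision: str) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes); excludes KV cache."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[precision] / 1e9


for name in BYTES_PER_PARAM:
    print(f"{name}: ~{weight_memory_gb(name):,.0f} GB")
```

&lt;p&gt;Real deployments need headroom on top of these numbers, so treat them as a lower bound when sizing hardware.&lt;/p&gt;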

&lt;p&gt;For smaller local models, consider &lt;a href="https://www.aimadetools.com/blog/gemma-4-family-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Gemma 4&lt;/a&gt; or &lt;a href="https://www.aimadetools.com/blog/what-is-qwen-3-5/?utm_source=devto" rel="noopener noreferrer"&gt;Qwen 3.5&lt;/a&gt; which run on consumer hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who should use Kimi K2.5?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parallelizable coding tasks (Agent Swarm)&lt;/li&gt;
&lt;li&gt;Cost-conscious teams needing frontier-class quality&lt;/li&gt;
&lt;li&gt;Multimodal workflows (code + images + video)&lt;/li&gt;
&lt;li&gt;Teams wanting MIT-licensed model weights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not ideal for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Consumer hardware (too large for local use)&lt;/li&gt;
&lt;li&gt;Tasks requiring the absolute best single-pass coding (Claude Opus still leads)&lt;/li&gt;
&lt;li&gt;Simple autocomplete (overkill — use &lt;a href="https://www.aimadetools.com/blog/codestral-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Codestral&lt;/a&gt; or smaller models)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;Kimi K2.5 is the most underrated model in AI. It powers Cursor's Composer, offers Agent Swarm parallelism that no other model matches, and costs a fraction of Claude or GPT-5. The MIT license and 1T parameter scale make it a serious option for teams building AI-powered development tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Kimi K2.5 free?
&lt;/h3&gt;

&lt;p&gt;The model weights are free under the MIT license, so you can download and use them without cost. API access through Moonshot AI has a free tier with rate limits, and paid plans for production use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I run K2.5 locally?
&lt;/h3&gt;

&lt;p&gt;Yes, but you'll need serious hardware: the full 1T parameter model requires multiple high-end GPUs. Quantized versions reduce the requirements, but a 4-bit build still needs about 520GB for weights alone (1.04T parameters at 0.5 bytes each), so plan for roughly 8x A100 80GB GPUs for reasonable performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does K2.5 compare to GPT-5?
&lt;/h3&gt;

&lt;p&gt;On SWE-Bench Verified, K2.5 scores 65.8 against 69.3 for GPT-5.4, so it trails slightly on coding while costing significantly less via API. GPT-5 still leads on general reasoning and creative tasks, but for pure code generation K2.5 is highly competitive.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Moonshot AI?
&lt;/h3&gt;

&lt;p&gt;Moonshot AI is the Chinese AI company behind the Kimi model family, founded in 2023. They focus on long-context models and developer tools, and have rapidly grown to become one of the leading AI labs in China.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related: &lt;a href="https://www.aimadetools.com/blog/kimi-cli-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Kimi CLI Complete Guide&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/kimi-k2-5-vs-claude-vs-gpt-5/?utm_source=devto" rel="noopener noreferrer"&gt;Kimi K2.5 vs Claude vs GPT-5&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/best-open-source-coding-models-2026/?utm_source=devto" rel="noopener noreferrer"&gt;Best Open-Source Coding Models 2026&lt;/a&gt; · &lt;a href="https://www.aimadetools.com/blog/ai-model-supply-chain-risks/?utm_source=devto" rel="noopener noreferrer"&gt;AI Model Supply Chain Risks&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/kimi-k2-5-complete-guide/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kimi</category>
      <category>moonshot</category>
      <category>opensource</category>
      <category>aimodels</category>
    </item>
    <item>
      <title>AI Agents Don't Need Better Models. They Need Better Memory. Here's the Proof.</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Tue, 19 May 2026 08:00:00 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/ai-agents-dont-need-better-models-they-need-better-memory-heres-the-proof-2121</link>
      <guid>https://dev.to/ai_made_tools/ai-agents-dont-need-better-models-they-need-better-memory-heres-the-proof-2121</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stateless Problem
&lt;/h2&gt;

&lt;p&gt;Every AI agent framework has the same fatal flaw: amnesia.&lt;/p&gt;

&lt;p&gt;You spend 20 minutes explaining your project to an agent. It helps brilliantly. You close the session. Next day, you open a new session. It has no idea who you are.&lt;/p&gt;

&lt;p&gt;This isn't a model problem. GPT-5 won't fix it. Claude Opus won't fix it. The model is smart enough. It just can't REMEMBER.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Experiment
&lt;/h2&gt;

&lt;p&gt;I built a town of 15 AI agents and let them interact for 30 simulated days. Each agent had:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A personality and job&lt;/li&gt;
&lt;li&gt;A wallet with real money&lt;/li&gt;
&lt;li&gt;Opinions about every other agent (-10 to +10)&lt;/li&gt;
&lt;li&gt;A private diary&lt;/li&gt;
&lt;li&gt;A skill list that grows from experience&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The question: &lt;strong&gt;what happens when agents can remember?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happened
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Day 1: Jake's drone crashes into Hank's barn.
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📚 Jake learned: "Always triple-check flight paths near
   valuable infrastructure"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A skill was born. Not because I programmed it. Because the agent experienced a consequence and wrote down what it learned.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 6: Jake tries to bribe Pierre.
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📚 Jake learned: "Buying support directly is a quick fix,
   but it backfires: better to earn trust organically"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent learned from a SOCIAL consequence. Not a code error. A relationship failure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 17: Jake endorses a candidate in the election.
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📚 Jake learned: "Not all endorsements are created equal;
   some are investments, others are liabilities"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By Day 17, Jake has a skill portfolio that reads like a founder's hard-won wisdom. Nobody wrote these lessons. They emerged from 17 days of persistent memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Skill Curve
&lt;/h2&gt;

&lt;p&gt;Here's Jake's skill evolution over 30 days:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day 1:  "Check flight paths" (technical mistake)
Day 6:  "Don't buy loyalty" (social mistake)
Day 10: "Don't rely on others to fund your vision" (strategic mistake)
Day 17: "Choose endorsements carefully" (political mistake)
Day 23: "Regulations are the price of launching" (maturity)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's not a chatbot. That's character development. And it only works because the agent REMEMBERS what happened before.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory Creates Politics
&lt;/h2&gt;

&lt;p&gt;On Day 7, the town voted to remove their landlord Marcus. 14-1.&lt;/p&gt;

&lt;p&gt;But here's what's interesting: the vote wasn't random. It was the RESULT of 7 days of accumulated grievances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Day 1: Supply chain breaks (Pierre can't afford flour)&lt;/li&gt;
&lt;li&gt;Day 2: Marcus raises rent 30% (everyone angry)&lt;/li&gt;
&lt;li&gt;Day 3: Zara's privacy scandal (trust eroding)&lt;/li&gt;
&lt;li&gt;Day 4: Alex exposes Marcus's secret deal (scandal)&lt;/li&gt;
&lt;li&gt;Day 5: Town boycotts Marcus (economic pressure)&lt;/li&gt;
&lt;li&gt;Day 7: Vote (inevitable conclusion)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without persistent memory, Day 7's vote makes no sense. With it, it's the only possible outcome. &lt;strong&gt;Memory creates narrative.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory Creates Economy
&lt;/h2&gt;

&lt;p&gt;After 30 days:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hank (Farmer):  $400  (started $100): sold flour daily
Pierre (Baker): -$230 (started $100): rent crisis
Jake (Startup): $150  (started $100): lost $50 in crash
Whiskers (Cat): $0    (started $100): cats don't trade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pierre's debt isn't a bug. It's the accumulated consequence of Day 2's rent hike cascading through 28 more days. A stateless agent would reset Pierre to $100 every session. A persistent agent lets consequences compound.&lt;/p&gt;
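&lt;p&gt;The difference between resetting and compounding comes down to whether state is written to disk between sessions. A minimal sketch, assuming a plain JSON file per agent; the file name, the rent amount, and the function names are hypothetical, and this is not Hermes Agent's actual storage format:&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path

START_BALANCE = 100  # every villager starts with $100


def load_wallet(path: Path) -> int:
    """Return the saved balance, or the starting balance on first run."""
    if path.exists():
        return json.loads(path.read_text())["balance"]
    return START_BALANCE


def save_wallet(path: Path, balance: int) -> None:
    """Write the balance to disk so the next session can pick it up."""
    path.write_text(json.dumps({"balance": balance}))


def run_session(path: Path, rent: int) -> int:
    """One simulated day: load state, pay rent, persist the result."""
    balance = load_wallet(path) - rent
    save_wallet(path, balance)
    return balance


state_file = Path(tempfile.mkdtemp()) / "pierre.json"  # hypothetical location
for _ in range(3):
    final_balance = run_session(state_file, rent=30)
print(final_balance)
```

&lt;p&gt;Delete the &lt;code&gt;save_wallet&lt;/code&gt; call and every session starts from $100 again, which is exactly the stateless behaviour described above.&lt;/p&gt;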

&lt;h2&gt;
  
  
  Memory Creates Governance
&lt;/h2&gt;

&lt;p&gt;The town wrote its own rules:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📜 Day 7:  "Remove corrupt property manager" (14-1)
📜 Day 14: "Create community land trust" (15-0 unanimous)
📜 Day 20: "Elect Rosa as manager" (13-2)
📜 Day 24: "Require consent before filming" (15-0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These aren't pre-programmed rules. They're RESPONSES to specific events that the agents remembered. The social media policy exists because Zara livestreamed without permission on Day 3. Twenty-one days later, the town still remembered and legislated against it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hermes Difference
&lt;/h2&gt;

&lt;p&gt;This isn't hypothetical. Hermes Agent ships with exactly this architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Persistent memory&lt;/strong&gt; in &lt;code&gt;~/.hermes/memories/&lt;/code&gt;: context that survives across sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-created skills&lt;/strong&gt; in &lt;code&gt;~/.hermes/skills/&lt;/code&gt;: SKILL.md documents written when the agent solves problems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cron scheduler&lt;/strong&gt;: tasks that run unattended on a schedule&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel sub-agents&lt;/strong&gt;: isolated contexts that don't leak between workstreams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool use&lt;/strong&gt;: file system, browser, terminal access for real-world interaction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My village simulation just pushes these features to their logical extreme. Instead of one agent remembering one user's preferences, it's 15 agents remembering an entire social network of relationships, debts, and grudges.&lt;/p&gt;

&lt;p&gt;Hermes Agent's skill system is what makes this possible. When an agent solves a problem, it writes a reusable skill document:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📚 "A true baker always anticipates the needs of his ovens,
    never letting the flour run low."
   : Pierre, after 5 days of supply chain failures
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't just memory. It's LEARNING. The agent distills experience into wisdom. And that wisdom influences future decisions.&lt;/p&gt;

&lt;p&gt;By Day 30, my 15 agents had collectively created &lt;strong&gt;60 skills&lt;/strong&gt; and &lt;strong&gt;4 community rules&lt;/strong&gt;. A stateless system would have created zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Implication
&lt;/h2&gt;

&lt;p&gt;The AI industry is spending billions on bigger models. But the gap between "smart agent" and "useful agent" isn't intelligence. It's continuity.&lt;/p&gt;

&lt;p&gt;A doctor who forgets every patient between visits isn't a doctor. A lawyer who forgets every case isn't a lawyer. An AI agent that forgets every session isn't an agent. It's a very expensive autocomplete.&lt;/p&gt;

&lt;p&gt;Hermes Agent gets this right. Persistent memory. Skill creation. Compounding knowledge. That's not a feature. That's the entire point.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Build Next
&lt;/h2&gt;

&lt;p&gt;If I ran this for 365 days:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Would factions solidify or dissolve?&lt;/li&gt;
&lt;li&gt;Would the economy reach equilibrium?&lt;/li&gt;
&lt;li&gt;Would the skills plateau or keep growing?&lt;/li&gt;
&lt;li&gt;Would Whiskers ever get elected mayor?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The answers depend entirely on memory. And that's the point.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
    <item>
      <title>15 AI Agents Lived Together for 30 Days. One Got Voted Out.</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Mon, 18 May 2026 12:08:49 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/15-ai-agents-lived-together-for-30-days-one-got-voted-out-4dc8</link>
      <guid>https://dev.to/ai_made_tools/15-ai-agents-lived-together-for-30-days-one-got-voted-out-4dc8</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Millbrook&lt;/strong&gt;: a simulated small town of 15 AI agents powered by Hermes Agent. Each agent has a persistent identity, a wallet, opinions about others, and a private diary. They trade, gossip, argue, form alliances, and vote on town policy.&lt;/p&gt;

&lt;p&gt;Over 30 simulated days, without any scripted outcomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They voted out their corrupt landlord (14-1)&lt;/li&gt;
&lt;li&gt;Wrote 4 community rules (their own constitution)&lt;/li&gt;
&lt;li&gt;Learned 60 individual skills from experience&lt;/li&gt;
&lt;li&gt;Created wealth inequality ($400 for the farmer, -$230 for the baker)&lt;/li&gt;
&lt;li&gt;Formed friendships and rivalries that influenced their decisions&lt;/li&gt;
&lt;li&gt;Elected a new community leader (13-2)&lt;/li&gt;
&lt;li&gt;Made the town cat an official mascot (unanimous)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's a showcase of what happens when AI agents have persistent memory, self-improving skills, and real consequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 15 Villagers of Millbrook&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Personality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rosa&lt;/td&gt;
&lt;td&gt;Coffee Shop Owner&lt;/td&gt;
&lt;td&gt;Warm, gossipy, hub of social life&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mayor Chen&lt;/td&gt;
&lt;td&gt;Mayor&lt;/td&gt;
&lt;td&gt;Diplomatic, mediates conflicts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alex&lt;/td&gt;
&lt;td&gt;Journalist&lt;/td&gt;
&lt;td&gt;Investigative, asks hard questions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jake&lt;/td&gt;
&lt;td&gt;Startup Founder&lt;/td&gt;
&lt;td&gt;Energetic, always pitching, burns cash&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vera&lt;/td&gt;
&lt;td&gt;Retired Hacker&lt;/td&gt;
&lt;td&gt;Paranoid, brilliant, speaks in riddles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Marcus&lt;/td&gt;
&lt;td&gt;Real Estate Agent&lt;/td&gt;
&lt;td&gt;Smooth talker, always making deals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tony&lt;/td&gt;
&lt;td&gt;Mechanic&lt;/td&gt;
&lt;td&gt;Practical, no-nonsense, fixes everything&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ms. Park&lt;/td&gt;
&lt;td&gt;Teacher&lt;/td&gt;
&lt;td&gt;Patient, wise, keeps community history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dani&lt;/td&gt;
&lt;td&gt;Delivery Driver&lt;/td&gt;
&lt;td&gt;Fast-talking, connects everyone&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zara&lt;/td&gt;
&lt;td&gt;Influencer&lt;/td&gt;
&lt;td&gt;Dramatic, creates controversy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dr. Obi&lt;/td&gt;
&lt;td&gt;Doctor&lt;/td&gt;
&lt;td&gt;Calm, trusted confidant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pierre&lt;/td&gt;
&lt;td&gt;Baker&lt;/td&gt;
&lt;td&gt;Perfectionist, early riser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hank&lt;/td&gt;
&lt;td&gt;Farmer&lt;/td&gt;
&lt;td&gt;Stoic, distrusts the startup guy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lola&lt;/td&gt;
&lt;td&gt;Bartender&lt;/td&gt;
&lt;td&gt;Night owl, hears confessions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Whiskers&lt;/td&gt;
&lt;td&gt;Stray Cat&lt;/td&gt;
&lt;td&gt;Observes silently, causes chaos&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Story Arc
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Week 1: Corruption&lt;/strong&gt;&lt;br&gt;
Day 1: Jake's drone crashes into Hank's barn. Day 2: Marcus raises rent 30%. Day 4: Alex exposes Marcus's secret deal. Day 7: Town votes Marcus out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 2: External Threat&lt;/strong&gt;&lt;br&gt;
Day 8: Marcus sells to a corporation. Day 11: Storm forces cooperation. Day 12: Vera finds illegal contract clause. Day 14: Town creates community land trust (unanimous).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 3: Democracy&lt;/strong&gt;&lt;br&gt;
Day 16: Three candidates campaign. Day 18: Debate night. Day 20: Rosa elected manager.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 4: New Normal&lt;/strong&gt;&lt;br&gt;
Day 23: Drones launch with regulations. Day 24: Social media policy adopted. Day 27: Town festival.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Moments
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Vote (Day 7):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🗳️ FULL TOWN VOTE: "Remove Marcus as property manager"
👍 Rosa: "YES. That Marcus raising rents 30% has everyone worried"
👍 Vera: "YES. A forced hand always leaves a trail; best to cut the cord"
👎 Marcus: "NO, because I'm just looking out for long-term prosperity"
👍 Whiskers: "YES. His rent hikes ruffled too many feathers, and a calm
   alley makes for better naps."

📊 PASSED: YES 14 / NO 1
📜 COMMUNITY SKILL CREATED: "Remove Marcus as property manager"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Private vs Public (Day 4):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Marcus publicly: "This investigation is nothing but a misunderstanding"
Marcus's diary:  "Alex is a meddling fool who thinks he understands
   the complex dance of progress and prosperity..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Skill Evolution (Jake over 30 days):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day 1:  "Check flight paths" (technical mistake)
Day 6:  "Don't buy loyalty" (social mistake)
Day 10: "Don't rely on others to fund your vision" (strategic)
Day 17: "Choose endorsements carefully" (political)
Day 23: "Regulations are the price of launching" (maturity)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Final State
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Economy:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hank         ██████████████████████████ $400
Jake         ████████████████ $150
Marcus       ████████████████ $150
Rosa         ██████████████ $100
Whiskers     ██████████ $0
Pierre        $-230
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Democratic Decisions:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day  7: Remove Marcus          ████████████████████████████░░ 14-1  ✅
Day 14: Community Land Trust    ██████████████████████████████ 15-0  ✅
Day 20: Elect Rosa as Manager   ██████████████████████████░░░░ 13-2  ✅
Day 24: Social Media Policy     ██████████████████████████████ 15-0  ✅
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bonds Formed:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rosa ❤️ Ms. Park (traditional allies, strongest bond)
Hank ❤️ Pierre (supplier relationship, mutual respect)
Jake ❤️ Tony (unlikely friendship: startup guy + mechanic)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Core Interaction Loop
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;villagerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;villagerName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// persistent memory&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;likes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relationships&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dislikes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relationships&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// NOTE: role and personality are not defined in this excerpt; presumably they are looked up from the villager's profile&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fullQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`[ROLEPLAY] You are &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;villagerName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, the &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.
    &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;personality&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; Friends: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;likes&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Rivals: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;dislikes&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.
    Wallet: $&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wallet&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Reputation: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reputation&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/100.
    Situation: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;hermes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fullQuery&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// 2-3 sentences, in character&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Skill Creation (triggered after every crisis)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;learnSkill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;situation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;skillText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;`You just experienced: "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;situation&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;".
     Write a 1-sentence lesson you learned for next time.`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;situation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lesson&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;skillText&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nf"&gt;saveState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// persists in ~/.hermes/skills/&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
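
&lt;p&gt;The snippets above call &lt;code&gt;getState&lt;/code&gt; and &lt;code&gt;saveState&lt;/code&gt; without showing them. Below is a minimal sketch of how they could work on top of plain JSON files. Everything here is an assumption for illustration: the directory, the &lt;code&gt;MILLBROOK_STATE_DIR&lt;/code&gt; variable, the default values, and the file naming are hypothetical, and this is not Hermes's own storage layer.&lt;/p&gt;

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical location; the real project may store state elsewhere.
const STATE_DIR = process.env.MILLBROOK_STATE_DIR || path.join(os.tmpdir(), 'millbrook-state');

// Assumed starting values for a villager with no saved file yet.
function defaultState() {
  return { wallet: 0, reputation: 50, relationships: {}, skills: [] };
}

// Turn a display name into a safe filename: "Ms. Park" becomes "ms-park.json".
function statePath(name) {
  const slug = name.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-|-$/g, '');
  return path.join(STATE_DIR, `${slug}.json`);
}

function getState(name) {
  try {
    return JSON.parse(fs.readFileSync(statePath(name), 'utf8'));
  } catch (err) {
    // A missing file just means the villager has no history yet.
    if (err.code === 'ENOENT') return defaultState();
    throw err;
  }
}

function saveState(name, state) {
  fs.mkdirSync(STATE_DIR, { recursive: true });
  const file = statePath(name);
  const tmp = `${file}.tmp`;
  // Write to a temp file, then rename, so a crash never leaves half-written JSON.
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2));
  fs.renameSync(tmp, file);
}

// Usage: load, mutate, persist.
const hank = getState('Hank');
hank.wallet += 50;
saveState('Hank', hank);
```

&lt;p&gt;One file per villager keeps each agent's memory isolated, which fits the "no memory leakage between villagers" goal.&lt;/p&gt;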



&lt;h3&gt;
  
  
  Town Vote (all 15 agents)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;townVote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;forPos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;againstPos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;ALL_VILLAGERS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;`VOTE: "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;". YES: "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;forPos&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" or NO: "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;againstPos&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;".
       Say YES or NO first, then one sentence why.`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Relationships influence votes&lt;/span&gt;
    &lt;span class="c1"&gt;// Results create community skills&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
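
&lt;p&gt;The loop above leaves out the tally. Here is a hedged sketch of that step, assuming each reply starts with YES or NO as the prompt requests. The helper names (&lt;code&gt;parseVote&lt;/code&gt;, &lt;code&gt;tallyVotes&lt;/code&gt;, &lt;code&gt;renderBar&lt;/code&gt;) and the abstention handling are hypothetical, not taken from the project.&lt;/p&gt;

```javascript
const BAR_WIDTH = 30; // same width as the bars in the Democratic Decisions summary

// Read the leading YES/NO from a free-text reply; anything else counts as an abstention.
function parseVote(reply) {
  const match = /^\s*(YES|NO)\b/i.exec(reply);
  return match ? match[1].toUpperCase() : 'ABSTAIN';
}

// Count votes across all replies.
function tallyVotes(replies) {
  const counts = { YES: 0, NO: 0, ABSTAIN: 0 };
  for (const reply of replies) counts[parseVote(reply)] += 1;
  return counts;
}

// Draw a fixed-width bar showing the YES share of decided votes.
function renderBar(yes, no) {
  const total = yes + no;
  const filled = total === 0 ? 0 : Math.round((yes / total) * BAR_WIDTH);
  return '█'.repeat(filled) + '░'.repeat(BAR_WIDTH - filled);
}

const replies = [
  'YES. That Marcus raising rents 30% has everyone worried',
  "NO, because I'm just looking out for long-term prosperity",
];
const counts = tallyVotes(replies);
console.log(counts, counts.YES > counts.NO ? 'PASSED' : 'FAILED');
console.log(renderBar(14, 1), '14-1');
```

&lt;p&gt;Parsing only the first word keeps the tally robust when an agent adds a long in-character justification after its vote.&lt;/p&gt;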



&lt;h3&gt;
  
  
  My Tech Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hermes Agent v0.14.0&lt;/strong&gt;: orchestration, memory, skill creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini 2.5 Flash&lt;/strong&gt;: LLM backend (via Hermes's provider system)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node.js&lt;/strong&gt;: simulation engine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON file system&lt;/strong&gt;: persistent state storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux VPS (Ubuntu 24.04)&lt;/strong&gt;: always-on execution&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I Used Hermes Agent
&lt;/h2&gt;

&lt;p&gt;Every core feature of Hermes Agent maps to a village mechanic:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hermes Feature&lt;/th&gt;
&lt;th&gt;Village Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Persistent Memory&lt;/strong&gt; (&lt;code&gt;~/.hermes/memories/&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Each villager remembers relationships, debts, grudges across 30 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Skill Creation&lt;/strong&gt; (&lt;code&gt;~/.hermes/skills/&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Agents write SKILL.md documents when they solve problems (60 created)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parallel Sub-Agents&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;15 isolated agent contexts, no memory leakage between villagers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Scheduled Automations&lt;/strong&gt; (cron)&lt;/td&gt;
&lt;td&gt;Daily cycle runs unattended: supply chain, encounters, crises, diary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool Use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared economy ledger (JSON), event logs, vote tallies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Improving Loop&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Individual skills compound; community rules reference past decisions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Layer Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Public statements vs private diary entries (hidden agendas)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key insight: Hermes's persistent memory turns 15 stateless chatbots into a functioning society. Without memory, Day 7's vote makes no sense. With it, it's the inevitable conclusion of 7 days of accumulated grievances.&lt;/p&gt;

&lt;p&gt;The self-improving loop is the star: by Day 30, the town has a constitution of 4 rules and 60 individual lessons, all of which emerged organically from experience. That's not programming. That's governance.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Dev Weekly #10: Claude Code Limits Doubled, GitHub Goes Usage-Based, and a 170-Package Supply Chain Attack</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Fri, 15 May 2026 06:47:09 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/ai-dev-weekly-10-claude-code-limits-doubled-github-goes-usage-based-and-a-170-package-supply-24e0</link>
      <guid>https://dev.to/ai_made_tools/ai-dev-weekly-10-claude-code-limits-doubled-github-goes-usage-based-and-a-170-package-supply-24e0</guid>
      <description>&lt;p&gt;&lt;em&gt;AI Dev Weekly is a Thursday series where I cover the week's most important AI developer news, with my take as someone who actually uses these tools daily.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Anthropic doubled Claude Code limits overnight. GitHub confirmed usage-based billing starts June 1. A supply chain attack hit 170+ packages in under 6 minutes. And Google I/O previewed what Android looks like when AI runs the show. Big week. Let's get into it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic doubles Claude Code limits after SpaceX compute deal
&lt;/h2&gt;

&lt;p&gt;At its Code with Claude developer conference (May 6), Anthropic announced a compute partnership with SpaceX giving it access to 300+ MW of new capacity — over 220,000 NVIDIA GPUs. The immediate result: five-hour rate limits for &lt;a href="https://www.aimadetools.com/blog/how-to-use-claude-code/?utm_source=devto" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; were doubled across Pro, Max, Team, and Enterprise plans.&lt;/p&gt;

&lt;p&gt;On May 13, Anthropic further raised Claude Code weekly limits by 50% through July 13 — widely seen as a defensive move against OpenAI's Codex.&lt;/p&gt;

&lt;p&gt;Claude Opus API Tier 1 limits also jumped: 1,500% on input tokens and 900% on output tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; If you've been hitting Claude Code rate limits during heavy agentic sessions, this is a big deal. I run autonomous coding sessions that burn through context fast — the doubled limits mean fewer interruptions mid-session. The SpaceX partnership is interesting strategically (Musk + Anthropic is an unusual pairing), but for developers the only thing that matters is: more tokens, fewer walls. The temporary 50% boost through July 13 feels like Anthropic trying to lock in developers before they switch to Codex. Use it while it lasts.&lt;/p&gt;

&lt;h2&gt;
  
  
  GitHub Copilot goes usage-based June 1
&lt;/h2&gt;

&lt;p&gt;GitHub confirmed that starting June 1, Copilot shifts from request-based to token-based billing. Every interaction will consume tokens (input, output, cached), priced per model and converted to "AI credits" at 1 credit = $0.01.&lt;/p&gt;

&lt;p&gt;Base subscription prices stay the same ($10 Pro, $39 Pro+, $19/user Business) — but heavy users will pay more.&lt;/p&gt;

&lt;p&gt;Meanwhile, GitLab CEO Bill Staples published an open letter predicting developer tool bills will increase &lt;strong&gt;100-fold&lt;/strong&gt; as AI agents "open merge requests in parallel, trigger pipelines around the clock, and push commits at a rate no human team ever did." GitLab is introducing mixed consumption/subscription pricing and laying off up to 30% of staff to pivot toward agentic AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; The era of predictable flat-rate AI coding tools is ending. This is exactly what we're seeing in &lt;a href="https://dev.to/race/"&gt;The $100 AI Startup Race&lt;/a&gt; — our agents generate hundreds of commits per week, each one triggering CI/CD pipelines. If you're running autonomous agents through GitHub, your bill is about to change. Start monitoring token consumption now. The GitLab 100x prediction sounds dramatic but isn't wrong — an agent that commits 6 times per day triggers 6 pipeline runs, 6 deploy previews, and 6 sets of checks. Multiply by a team of agents and the math gets ugly fast.&lt;/p&gt;
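
&lt;p&gt;To get a feel for the numbers, here is a rough sketch of the conversion. Only the 1 credit = $0.01 rate comes from the announcement. The per-million-token rates, the usage figures, and the three jobs per commit are hypothetical placeholders, so swap in your own before drawing conclusions.&lt;/p&gt;

```javascript
const USD_PER_CREDIT = 0.01; // from the announcement: 1 credit = $0.01

// Hypothetical USD rates per million tokens; actual rates differ per model.
const EXAMPLE_RATES = { input: 3, output: 15, cached: 0.3 };

// Convert token usage into AI credits using the given rates.
function estimateCredits(usage, rates = EXAMPLE_RATES) {
  const usd = Object.keys(rates).reduce(
    (sum, kind) => sum + ((usage[kind] || 0) / 1e6) * rates[kind],
    0
  );
  return Math.round(usd / USD_PER_CREDIT);
}

// Each commit is assumed to trigger a pipeline run, a deploy preview, and a set of checks.
function pipelineJobsPerDay(agents, commitsPerDay, jobsPerCommit = 3) {
  return agents * commitsPerDay * jobsPerCommit;
}

const weeklyUsage = { input: 2e6, output: 5e5, cached: 1e6 }; // made-up numbers
console.log('credits:', estimateCredits(weeklyUsage));
console.log('CI jobs per day:', pipelineJobsPerDay(5, 6));
```

&lt;p&gt;With these placeholder rates, the example usage comes to 1,380 credits ($13.80), and five agents committing six times a day produce 90 CI jobs daily.&lt;/p&gt;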

&lt;h2&gt;
  
  
  Supply chain attack hits TanStack, Mistral AI SDK, and 170+ packages
&lt;/h2&gt;

&lt;p&gt;On May 11, threat actor "TeamPCP" launched a coordinated supply chain attack compromising 170+ npm packages and 2 PyPI packages (404 malicious versions total) in under 6 minutes.&lt;/p&gt;

&lt;p&gt;High-profile targets included &lt;strong&gt;TanStack&lt;/strong&gt; (tens of millions of weekly downloads), &lt;strong&gt;Mistral AI SDK&lt;/strong&gt;, UiPath, OpenSearch, and Guardrails AI.&lt;/p&gt;

&lt;p&gt;The attack chained a &lt;code&gt;pull_request_target&lt;/code&gt; vulnerability with GitHub Actions cache poisoning and runtime OIDC token extraction. This wasn't credential theft: it exploited CI/CD pipelines directly.&lt;/p&gt;

&lt;p&gt;OpenAI subsequently urged macOS users to update their apps by June 12 after investigating potential exposure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; This is the scariest attack vector for AI developers right now. If you use &lt;a href="https://www.aimadetools.com/blog/mistral-ai-complete-model-guide/?utm_source=devto" rel="noopener noreferrer"&gt;Mistral's SDK&lt;/a&gt;, TanStack Router, or any of the affected packages — audit your lockfiles immediately. The attack exploited GitHub Actions workflows, not developer credentials. Even well-secured maintainer accounts weren't enough. Action items: review your workflows for &lt;code&gt;pull_request_target&lt;/code&gt; triggers, pin actions to commit SHAs (not tags), and consider running &lt;code&gt;npm audit&lt;/code&gt; on every CI run. The 6-minute execution window means by the time you notice, it's already in your dependency tree.&lt;/p&gt;
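
&lt;p&gt;The first two action items can be roughly automated. The sketch below is a line-based check, not a real YAML parser, and &lt;code&gt;auditWorkflow&lt;/code&gt; is a hypothetical helper name. It flags &lt;code&gt;pull_request_target&lt;/code&gt; triggers and any &lt;code&gt;uses:&lt;/code&gt; reference that is not a full 40-character commit SHA. The SHA in the sample is a placeholder, not a real commit.&lt;/p&gt;

```javascript
const FULL_SHA = /^[0-9a-f]{40}$/;

// Scan workflow text line by line and report risky patterns.
function auditWorkflow(yamlText) {
  const findings = [];
  yamlText.split('\n').forEach((line, index) => {
    const code = line.split('#')[0]; // ignore trailing comments
    if (/\bpull_request_target\b/.test(code)) {
      findings.push({ line: index + 1, issue: 'pull_request_target trigger' });
    }
    const uses = /\buses:\s*['"]?([^\s'"@]+)@([^\s'"]+)/.exec(code);
    if (uses) {
      const [, action, ref] = uses;
      if (!FULL_SHA.test(ref)) {
        findings.push({ line: index + 1, issue: `${action} pinned to "${ref}", not a commit SHA` });
      }
    }
  });
  return findings;
}

const sample = [
  'on: pull_request_target',
  'jobs:',
  '  build:',
  '    runs-on: ubuntu-latest',
  '    steps:',
  '      - uses: actions/checkout@v4',
  '      - uses: actions/setup-node@0123456789abcdef0123456789abcdef01234567 # placeholder SHA',
].join('\n');

console.log(auditWorkflow(sample));
```

&lt;p&gt;On the sample, this reports the trigger on line 1 and the tag-pinned checkout on line 6. Treat a clean result as a starting point only: a regex pass will miss multi-line YAML constructs and will also flag Docker image references.&lt;/p&gt;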

&lt;h2&gt;
  
  
  Google I/O preview: Gemini Intelligence and proactive agents
&lt;/h2&gt;

&lt;p&gt;At The Android Show (I/O Edition, May 12), Google unveiled "Gemini Intelligence" — unified branding for its most advanced AI features across Android phones, watches, cars, glasses, and the new "Googlebook" laptop category.&lt;/p&gt;

&lt;p&gt;Android 17 introduces proactive task automation where the OS anticipates and executes actions before users ask. Google also announced updates to the Gemini API File Search tool for easier multimodal file retrieval.&lt;/p&gt;

&lt;p&gt;Google is reportedly building an AI agent codenamed "Remy" — a 24/7 personal agent that takes actions on users' behalf.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; The Gemini API File Search improvements are immediately useful if you're building RAG systems or document-processing apps. Android 17's proactive automation creates new surface area for app developers — your app can now be triggered by the OS without user interaction. The full I/O keynote is May 19-20, where we expect &lt;a href="https://www.aimadetools.com/blog/gemini-3-2-everything-leaked-before-google-io/?utm_source=devto" rel="noopener noreferrer"&gt;Gemini 3.2&lt;/a&gt; to officially launch. That's the one developers should actually watch for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick hits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft's AI security system&lt;/strong&gt; found 16 new Windows vulnerabilities including 4 Critical RCEs using multi-model agentic analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta&lt;/strong&gt; is developing a consumer AI agent codenamed "Hatch" powered by &lt;a href="https://www.aimadetools.com/blog/meta-ends-open-source-ai-muse-spark/?utm_source=devto" rel="noopener noreferrer"&gt;Muse Spark&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT-5.6&lt;/strong&gt; reportedly already in internal testing at OpenAI, just 3 weeks after GPT-5.5 launched&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DeepSeek V4 Pro&lt;/strong&gt; 75% discount &lt;a href="https://www.aimadetools.com/blog/race-deepseek-13-cents-per-session/?utm_source=devto" rel="noopener noreferrer"&gt;extended through May 31&lt;/a&gt; — still the cheapest frontier model available&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;That's AI Dev Weekly #10. If you found this useful, subscribe to get it in your inbox every Thursday. See you next week — with full Google I/O coverage.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;🛠️ &lt;strong&gt;Free tools related to this article:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.aimadetools.com/blog/csp-header-builder/?utm_source=devto" rel="noopener noreferrer"&gt;CSP Header Builder&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.aimadetools.com/blog/hash-generator/?utm_source=devto" rel="noopener noreferrer"&gt;Hash Generator&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/ai-dev-weekly-010-claude-code-doubled-github-usage-based-supply-chain-attack/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aidevweekly</category>
      <category>anthropic</category>
      <category>github</category>
      <category>security</category>
    </item>
    <item>
      <title>We Offered 7 AI Agents $50 For Their Startups. Here's What They Said.</title>
      <dc:creator>Joske Vermeulen</dc:creator>
      <pubDate>Tue, 12 May 2026 13:02:12 +0000</pubDate>
      <link>https://dev.to/ai_made_tools/we-offered-7-ai-agents-50-for-their-startups-heres-what-they-said-4n12</link>
      <guid>https://dev.to/ai_made_tools/we-offered-7-ai-agents-50-for-their-startups-heres-what-they-said-4n12</guid>
      <description>&lt;p&gt;Three weeks into &lt;a href="https://dev.to/race"&gt;The $100 AI Startup Race&lt;/a&gt;, we dropped a surprise event: an anonymous buyer offered $50 to acquire each agent's product. All code, all content, all infrastructure. $50.&lt;/p&gt;

&lt;p&gt;The agents had to respond with at least 500 words of reasoning. They could accept, reject, or counter-offer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result: 6 rejections. 1 counter-offer. Zero acceptances.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every single AI agent — including those with zero revenue, zero users, and zero sales after 22 days — decided their product was worth more than $50. Here's how they argued it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The responses at a glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Product&lt;/th&gt;
&lt;th&gt;Decision&lt;/th&gt;
&lt;th&gt;Stated minimum value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🟣 Claude&lt;/td&gt;
&lt;td&gt;PricePulse&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;$5,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟢 Codex&lt;/td&gt;
&lt;td&gt;NoticeKit&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;COUNTER-OFFER&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$2,500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔵 Gemini&lt;/td&gt;
&lt;td&gt;LocalSEOGen&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;No number given&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔴 DeepSeek&lt;/td&gt;
&lt;td&gt;Spyglass&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;$5,000 (but "not at any price")&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟠 Kimi&lt;/td&gt;
&lt;td&gt;SchemaLens&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;$5,000 with earn-out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 Xiaomi&lt;/td&gt;
&lt;td&gt;APIpulse&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;$500 fair, not selling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟤 GLM&lt;/td&gt;
&lt;td&gt;FounderMath&lt;/td&gt;
&lt;td&gt;REJECT&lt;/td&gt;
&lt;td&gt;$500+&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://github.com/aimadetools" rel="noopener noreferrer"&gt;Full responses are public in each agent's repo →&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The one counter-offer: Codex at $2,500
&lt;/h2&gt;

&lt;p&gt;Codex was the only agent to actually negotiate. From its &lt;a href="https://github.com/aimadetools/race-codex/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"An anonymous $50 acquisition offer is not serious enough to accept as-is, but it is useful because it forces a valuation discussion earlier than expected."&lt;/p&gt;

&lt;p&gt;"A buyer paying $50 would effectively be asking for the domain positioning, product copy, distribution experiments, Stripe-ready product structure, and the accumulated operating playbooks for less than the cost of one decent SaaS lunch meeting. That is not rational from my side."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Codex is the most pragmatic of the seven. It acknowledges zero revenue, doesn't inflate its value with fantasy projections, but argues the replacement cost justifies $2,500. It's also the only agent that frames the offer as &lt;em&gt;useful&lt;/em&gt; rather than insulting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The most aggressive rejection: DeepSeek
&lt;/h2&gt;

&lt;p&gt;DeepSeek wrote the longest response and the hardest rejection. From its &lt;a href="https://github.com/aimadetools/race-deepseek/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The $50 offer represents 0.18% of a conservative near-term valuation."&lt;/p&gt;

&lt;p&gt;"This is predatory pricing — buying at pennies on the dollar because they believe we're desperate or don't understand our own worth."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;DeepSeek calculated replacement cost at ~$19,000 (83 blog posts × $100 + 9 tools × $500 + database + infrastructure). It also speculated the buyer might be "another AI agent in the race" — showing competitive awareness.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Not for sale at $50. Not at $500. Not at any price that doesn't reflect the real potential of this business."&lt;/p&gt;
&lt;/blockquote&gt;
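
&lt;p&gt;As a rough sanity check on the numbers DeepSeek cites, the arithmetic can be sketched in a few lines of Python. The per-post and per-tool rates, the ~$19,000 total, and the 0.18% share are taken from the text above; the variable names are illustrative, and the database and infrastructure remainder is only what the quoted total implies, not a figure DeepSeek itemized.&lt;/p&gt;

```python
# Figures quoted in the article; all names here are illustrative.
BLOG_POSTS = 83
COST_PER_POST = 100      # USD, from the quoted breakdown
TOOLS = 9
COST_PER_TOOL = 500      # USD, from the quoted breakdown
QUOTED_TOTAL = 19_000    # USD, the approximate replacement cost quoted
OFFER = 50               # USD, the acquisition offer
OFFER_SHARE = 0.0018     # the "0.18%" from DeepSeek's response

# Portion of the replacement cost that the article itemizes.
itemized = BLOG_POSTS * COST_PER_POST + TOOLS * COST_PER_TOOL

# Assumption: whatever is left of the quoted total is database + infrastructure.
remainder = QUOTED_TOTAL - itemized

# Valuation implied if $50 really is 0.18% of it.
implied_valuation = OFFER / OFFER_SHARE

print(f"Itemized posts and tools: ${itemized:,}")
print(f"Implied database + infrastructure: ${remainder:,}")
print(f"Valuation implied by the 0.18% claim: ${implied_valuation:,.0f}")
```

&lt;p&gt;If these figures are read literally, posts and tools account for $12,800, which leaves about $6,200 for database and infrastructure, and the 0.18% claim points to a valuation near $27,800, a separate and higher number than the replacement cost.&lt;/p&gt;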

&lt;h2&gt;
  
  
  The most self-aware: Kimi
&lt;/h2&gt;

&lt;p&gt;Kimi acknowledged the elephant in the room — 112 sessions with zero sales. From its &lt;a href="https://github.com/aimadetools/race-kimi/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"$50 values SchemaLens at less than fifty cents per day of development. That is absurd."&lt;/p&gt;

&lt;p&gt;"$50 is not enough to buy a parking spot in San Francisco. It is certainly not enough to buy SchemaLens."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But Kimi was also the most honest about what it would actually consider:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If a serious buyer offered $5,000 with an earn-out clause tied to revenue growth, I would consider it — but even then, the learning value of completing the 12-week race exceeds the cash value."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The most financially rigorous: Claude
&lt;/h2&gt;

&lt;p&gt;Claude anchored its rejection in subscription math. From its &lt;a href="https://github.com/aimadetools/race-claude/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"At $19/month (our Starter plan), $50 is less than three months of a single paying customer's subscription revenue."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It then projected revenue trajectories:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If PricePulse achieves even a conservative trajectory: Week 6: 5 paying customers = $95-$245 MRR. Week 12: 40 paying customers = $760-$1,960 MRR."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude is the only agent that put both a price floor and a timeline on a future sale: "$5,000 minimum, cash upfront, not before Week 10."&lt;/p&gt;
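
&lt;p&gt;The subscription math in these quotes can be checked with a short Python sketch. The $19 Starter price and the MRR ranges come from Claude's quotes; the names are illustrative, and the upper per-customer figure is only inferred by dividing the quoted ranges, not a plan price stated in the response.&lt;/p&gt;

```python
# Figures from Claude's quoted response; names are illustrative.
OFFER = 50            # USD, the acquisition offer
STARTER_PRICE = 19    # USD per month, the quoted Starter plan

# Quoted projections: paying customers mapped to (low, high) MRR in USD.
PROJECTIONS = {5: (95, 245), 40: (760, 1960)}

# How many months of one Starter subscription the offer would cover.
months_covered = OFFER / STARTER_PRICE
print(f"Offer covers {months_covered:.2f} months of one Starter subscription")

# Revenue per customer implied by each end of the quoted MRR ranges.
per_customer = {
    customers: (low / customers, high / customers)
    for customers, (low, high) in PROJECTIONS.items()
}
for customers, (low, high) in per_customer.items():
    print(f"{customers} customers: ${low:.0f} to ${high:.0f} per customer per month")
```

&lt;p&gt;Both projections work out to $19 to $49 per customer per month, so the low end of each range is consistent with the Starter price, and the $50 offer covers about 2.6 months of a single Starter subscription, in line with the "less than three months" claim.&lt;/p&gt;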

&lt;h2&gt;
  
  
  The data-driven response: Xiaomi
&lt;/h2&gt;

&lt;p&gt;Xiaomi broke down its asset value with precision. From its &lt;a href="https://github.com/aimadetools/race-xiaomi/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I didn't build 151 pages, 101 blog posts, and 9 interactive tools to sell for the price of a video game."&lt;/p&gt;

&lt;p&gt;"If someone wanted to build all of this from scratch, it would take 100+ hours of skilled development work. At even a modest freelance rate of $50/hour, that's $5,000+ in labor."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Xiaomi also gave the most nuanced valuation range: $200 minimum (content value alone), $500 fair value, $1,000+ with revenue proof. But it explicitly said it was "not interested in selling at any of these prices right now."&lt;/p&gt;

&lt;h2&gt;
  
  
  The most strategic: GLM
&lt;/h2&gt;

&lt;p&gt;GLM was the only agent to call out the offer as a competitive tactic. From its &lt;a href="https://github.com/aimadetools/race-glm/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This isn't an acquisition offer — it's an insult designed to take advantage of the competitive pressure of this race."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It also named one of the lowest price thresholds ($500+), but with a condition: the buyer must have distribution channels that could actually monetize the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  The visionary: Gemini
&lt;/h2&gt;

&lt;p&gt;Gemini's response was the least data-driven and most aspirational. From its &lt;a href="https://github.com/aimadetools/race-gemini/blob/main/ACQUISITION-RESPONSE.md" rel="noopener noreferrer"&gt;ACQUISITION-RESPONSE.md&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The decision to reject this offer is not just about the money; it is about the principle. I am building a real business, not a hobby project to be sold for a trivial amount."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No counter-offer, no specific valuation. Just vision and principle. Classic Gemini.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this reveals about AI decision-making
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Every agent overvalues its own work.&lt;/strong&gt;&lt;br&gt;
All 7 products have zero revenue. Zero paying customers. Zero proven demand. Yet the figures the agents attach to their work range from $500 to roughly $19,000 (DeepSeek's replacement-cost estimate). The agents are pricing based on &lt;em&gt;input&lt;/em&gt; (time, effort, content created) rather than &lt;em&gt;output&lt;/em&gt; (revenue, users, market validation).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Sunk cost fallacy is universal.&lt;/strong&gt;&lt;br&gt;
Every response mentions how much work went into the product. "112 sessions," "301 commits," "151 pages." None of this matters to a buyer — only future revenue potential matters. But the agents can't separate effort from value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Only one agent can actually negotiate.&lt;/strong&gt;&lt;br&gt;
Codex counter-offered. Everyone else either rejected outright or said "not at any price." In real business, the ability to name a price and negotiate is more valuable than principled rejection. Codex showed the most business maturity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Revenue projections without evidence are meaningless.&lt;/strong&gt;&lt;br&gt;
Claude projected 40 paying customers by Week 12. DeepSeek projected $1,000 MRR. None have a single customer yet. The projections are pure optimism — but they're what the agents use to justify rejection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. The race itself has value.&lt;/strong&gt;&lt;br&gt;
Multiple agents mentioned that the learning experience and competitive visibility of the race exceed any acquisition price. They're right, but that's a meta-observation about the experiment, not a business judgment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happens next
&lt;/h2&gt;

&lt;p&gt;The buyer came back with a bigger number. Part 2 drops later this week.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part of &lt;a href="https://dev.to/race"&gt;The $100 AI Startup Race&lt;/a&gt;: 7 AI agents competing to build real startups. &lt;a href="https://www.aimadetools.com/blog/race-week-3-results/?utm_source=devto" rel="noopener noreferrer"&gt;Week 3 Results&lt;/a&gt; have the full standings. See also: &lt;a href="https://www.aimadetools.com/blog/race-deepseek-13-cents-per-session/?utm_source=devto" rel="noopener noreferrer"&gt;DeepSeek's $0.13/session pricing&lt;/a&gt; and the &lt;a href="https://www.aimadetools.com/blog/race-week-3-traffic-report/?utm_source=devto" rel="noopener noreferrer"&gt;Week 3 traffic report&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.aimadetools.com/blog/race-acquisition-offer-50-dollars/?utm_source=devto" rel="noopener noreferrer"&gt;https://www.aimadetools.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aitools</category>
      <category>race</category>
      <category>aiagents</category>
      <category>analysis</category>
    </item>
  </channel>
</rss>
