<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ANIRUDDHA  ADAK</title>
    <description>The latest articles on DEV Community by ANIRUDDHA  ADAK (@aniruddhaadak).</description>
    <link>https://dev.to/aniruddhaadak</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2407448%2F517c050d-06cf-462f-a3e6-3b4636249a84.png</url>
      <title>DEV Community: ANIRUDDHA  ADAK</title>
      <link>https://dev.to/aniruddhaadak</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aniruddhaadak"/>
    <language>en</language>
    <item>
      <title>I Found an AI Agent That Actually Remembers Everything</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Sat, 30 May 2026 04:00:00 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/i-found-an-ai-agent-that-actually-remembers-everything-9pm</link>
      <guid>https://dev.to/aniruddhaadak/i-found-an-ai-agent-that-actually-remembers-everything-9pm</guid>
      <description>&lt;p&gt;&lt;strong&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I have been playing with different AI agents for a while now. Most of them feel like clever chatbots that forget everything the moment the conversation ends. Then I tried Hermes Agent from Nous Research a few weeks back. It actually feels different. It grows with you. That stuck with me.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  What Hermes Agent is
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hermes is an open-source autonomous agent that runs on your own server or VPS. It is not locked to one IDE or one API. You install it once, pick any model you like, and it starts building its own memory and skills over time.&lt;/p&gt;

&lt;p&gt;The big idea is a built-in learning loop. When it solves something useful, it can create a reusable skill in Markdown, improve it later, and pull it back when needed. It also keeps persistent memory across sessions so it slowly builds a picture of how you work and what your projects look like.&lt;/p&gt;

&lt;p&gt;I set it up on a cheap VPS with a simple curl command. The installer is straightforward. After that I ran &lt;code&gt;hermes setup&lt;/code&gt; and connected it to a model I already had access to. Within minutes I could chat with it from Telegram while it worked in the background on the server. That alone felt freeing.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  My experience so far
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;p&gt;I started simple. I asked it to monitor a few GitHub repos and send me a daily summary. It remembered the context from previous days without me repeating instructions. Over a week it created a couple of small skills on its own for formatting those reports nicely.&lt;/p&gt;

&lt;p&gt;I also used it for research tasks. It can search the web, browse pages, and chain steps together. What surprised me was how it handled follow-ups. Instead of starting fresh each time, it referred back to earlier parts of our conversation. That made longer projects feel more natural.&lt;/p&gt;

&lt;p&gt;The multi-platform support is practical. I switch between CLI on my laptop and Telegram on my phone. The agent just continues wherever I left it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Why it matters
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most agent frameworks still feel stateless. You get good results in the moment but lose the continuity that makes an assistant truly helpful over time. Hermes tries to fix that with persistent memory and auto-generated skills. It is early days, but I can already see how this approach could change how I work with AI day to day.&lt;/p&gt;

&lt;p&gt;It is fully open source under the MIT license. You own your data and your setup. No tracking. You can run it locally, on a VPS, or even in serverless environments that sleep when idle to keep costs low.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Getting started
&lt;/h3&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you want to try it yourself:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run the one-line installer on Linux, macOS, or WSL.
&lt;/li&gt;
&lt;li&gt;Do the quick setup.
&lt;/li&gt;
&lt;li&gt;Choose your model provider. It works with many options including local ones.
&lt;/li&gt;
&lt;li&gt;Start chatting and let it learn.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The official docs are clear and the GitHub repo is active.&lt;/p&gt;

&lt;p&gt;I am still exploring what else I can build with it. Right now it feels like having a junior colleague who never forgets the details of our last project and quietly gets better at the things we do together.&lt;/p&gt;

&lt;p&gt;If you have tried Hermes or any other agent framework, I would love to hear what worked for you and what did not. The space is moving fast and sharing real experiences helps everyone.&lt;/p&gt;

&lt;p&gt;What do you think an open, self-improving agent could mean for your own workflow?&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>MY DEEP TECHNICAL EXPLORATION AND PERSONAL EXPERIENCE WITH HERMES AGENT</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Wed, 27 May 2026 07:00:00 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/my-deep-technical-exploration-and-personal-experience-with-hermes-agent-261l</link>
      <guid>https://dev.to/aniruddhaadak/my-deep-technical-exploration-and-personal-experience-with-hermes-agent-261l</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hermes-agent-2026-05-15"&gt;Hermes Agent Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I want to share my deep technical breakdown and personal journey building with &lt;strong&gt;Hermes Agent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I decided to write an exhaustive guide that combines a how-to tutorial, a detailed technical breakdown, a direct comparison piece, and a personal essay — all rolled into one massive report. I aim to show the community what an open, capable agent system means for the future of artificial intelligence development.&lt;/p&gt;




&lt;p&gt;The world of artificial intelligence is moving incredibly fast. Most tools available today are simple chatbot wrappers. You talk to them, they forget everything you said, and you have to start over the next time you log in. I always found this frustrating. I wanted a &lt;em&gt;persistent digital co-worker&lt;/em&gt;. I wanted something that remembers my projects, learns from its mistakes, and runs on my own infrastructure instead of someone else's cloud. This is exactly what &lt;strong&gt;Hermes Agent&lt;/strong&gt; delivers. It is an open-source, self-improving artificial intelligence agent built by the talented team at &lt;strong&gt;Nous Research&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;My goal with this report is to walk you through every single aspect of this system. I will break down how it stacks up against other frameworks, help you decide when to reach for it, and explore its specific capabilities like tool use, planning, and multi-step reasoning. I will keep my language simple and humanised, sharing exactly how I use this tool in my daily workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rise of an Open Source Powerhouse
&lt;/h2&gt;

&lt;p&gt;I was completely amazed by the rapid adoption of this framework. Following the early success of older systems like &lt;strong&gt;OpenClaw&lt;/strong&gt;, the community fully embraced Hermes Agent. It crossed &lt;strong&gt;140,000 stars on GitHub&lt;/strong&gt; in under three months. I monitored the global usage statistics, and as of May 10 it had overtaken OpenClaw on the &lt;strong&gt;OpenRouter&lt;/strong&gt; daily inference rankings, processing &lt;strong&gt;224 billion tokens in a single day&lt;/strong&gt;. That volume suggests developers are using it for serious, heavy-duty work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I believe its success comes from its core design philosophy — &lt;strong&gt;reliability and self-improvement&lt;/strong&gt; — two qualities that have historically been very hard to achieve with autonomous systems.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It is also &lt;em&gt;provider agnostic&lt;/em&gt; and &lt;em&gt;model agnostic&lt;/em&gt;. I am not locked into using one specific corporate language model. I can use hundreds of different models, including running them entirely locally on my own hardware.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Pillar Architecture That Makes It Work
&lt;/h2&gt;

&lt;p&gt;When I first dug into the technical documentation, I found that the architecture relies on &lt;strong&gt;five distinct pillars&lt;/strong&gt;. This structure is what separates it from standard chat applications. I will explain each pillar based on my direct experience.&lt;/p&gt;




&lt;h3&gt;
  
  
  1️⃣ The Memory Architecture
&lt;/h3&gt;

&lt;p&gt;The first pillar is the &lt;strong&gt;memory system&lt;/strong&gt;. The agent has real memory, not just a temporary hack. It maintains two small, carefully curated text files on my hard drive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;environment facts file&lt;/strong&gt; — tracks conventions and lessons learned.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;user profile file&lt;/strong&gt; — tracks my personal preferences and communication style.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because these are standard markdown files, I can open them in any text editor and see exactly what the agent thinks about me. For long-term memory, it stores my messaging sessions in a local database equipped with &lt;strong&gt;full-text search&lt;/strong&gt; capabilities. When I ask a question about a past project, it searches this database and uses the language model to summarize the old conversation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Because only a summary of the matching sessions goes back into the prompt, the request stays inside the model's context window instead of failing from too much context at once.&lt;/p&gt;
&lt;/blockquote&gt;
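
&lt;p&gt;To make the search-then-summarize idea concrete, here is a minimal sketch of how it could work. This is my own illustration, not the actual Hermes code: the table name, the columns, and the sample rows are all made up, and I am assuming a Python build whose SQLite includes the FTS5 full-text extension.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import sqlite3

# Hypothetical schema: the real Hermes session store may look different.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(day, body)")
conn.executemany(
    "INSERT INTO sessions (day, body) VALUES (?, ?)",
    [
        ("2026-05-01", "Fixed the flaky login test in the billing service"),
        ("2026-05-03", "Drafted the weekly report for the GitHub repos"),
        ("2026-05-07", "Billing service migration plan and rollback steps"),
    ],
)


def search_sessions(query, limit=5):
    """Return (day, body) rows matching the query, best match first."""
    rows = conn.execute(
        "SELECT day, body FROM sessions WHERE sessions MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


def build_recall_prompt(query):
    """Pack only the matching sessions into a short summarization prompt."""
    hits = search_sessions(query)
    notes = "\n".join(f"- {day}: {body}" for day, body in hits)
    return f"Summarize what we did about '{query}':\n{notes}"


print(build_recall_prompt("billing"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only the sessions that match the query end up in the prompt, so the model never sees the full history.&lt;/p&gt;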




&lt;h3&gt;
  
  
  2️⃣ The Procedural Skills Engine
&lt;/h3&gt;

&lt;p&gt;The second pillar is the &lt;strong&gt;skills engine&lt;/strong&gt;. This is my absolute favorite feature. The agent actually &lt;em&gt;learns from its own work&lt;/em&gt;. If I ask it to perform a complex task taking five or more tool calls, it recognizes the effort. It then autonomously creates a &lt;strong&gt;reusable skill document&lt;/strong&gt;. This skill is saved in a dedicated directory on my machine. The next time I ask for the same task, the agent does not guess how to do it — it reads its own saved skill document and executes the steps perfectly.&lt;/p&gt;




&lt;h3&gt;
  
  
  3️⃣ The Soul and Personality Configuration
&lt;/h3&gt;

&lt;p&gt;The third pillar is &lt;strong&gt;personality as infrastructure&lt;/strong&gt;. I define the default voice and behavior of my agent using a global configuration file. This file acts like a &lt;em&gt;continuous system prompt&lt;/em&gt;. If I want my agent to act like a senior software engineer, I write that instruction into the configuration file. The agent reads this file every time it starts a new session, ensuring its behavior remains consistent across all my devices.&lt;/p&gt;




&lt;h3&gt;
  
  
  4️⃣ Scheduled Automations and Cron Jobs
&lt;/h3&gt;

&lt;p&gt;The fourth pillar handles &lt;strong&gt;time-based automation&lt;/strong&gt;. The agent has a built-in scheduler. I do not need to write complex computer code to schedule tasks — I just use natural language.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Check the news every morning at nine o'clock and send me a summary."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It runs these reports, backups, and briefings completely unattended in the background.&lt;/p&gt;
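
&lt;p&gt;A request like that still has to become a concrete schedule somewhere. As a rough sketch, I am assuming it maps to a standard five-field cron expression; how Hermes actually stores its schedules may differ, and the helper name below is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Assumption: the request maps to a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expression):
    """Split a five-field cron expression into a name-to-value mapping."""
    values = expression.split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(values)}")
    return dict(zip(CRON_FIELDS, values))


# "Every morning at nine o'clock" as cron: minute 0, hour 9, every day.
schedule = parse_cron("0 9 * * *")
print(schedule)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;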




&lt;h3&gt;
  
  
  5️⃣ The Closed Learning Loop
&lt;/h3&gt;

&lt;p&gt;The fifth pillar ties everything together. It is a &lt;strong&gt;closed learning loop&lt;/strong&gt;. The agent receives periodic nudges to review its recent actions. It decides what information is useful enough to persist into long-term memory and what should be forgotten. It improves its own skills during use.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This means my agent gets &lt;em&gt;measurably smarter&lt;/em&gt; the longer I leave it running on my server.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Exploring the Three Tier Technical Stack
&lt;/h2&gt;

&lt;p&gt;To build a mental model of how this system operates, I broke the architecture down into &lt;strong&gt;three logical tiers&lt;/strong&gt;. Understanding these tiers helped me deeply integrate the agent into my personal computing environment.&lt;/p&gt;




&lt;h3&gt;
  
  
  Tier One — The Surface Interfaces
&lt;/h3&gt;

&lt;p&gt;The first tier contains all the ways I can talk to the agent. The developers built a single core engine that powers many different adapters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;☑️ &lt;strong&gt;Command Line Interface&lt;/strong&gt; — a classic terminal experience with rich text panels and autocomplete features.&lt;/li&gt;
&lt;li&gt;☑️ &lt;strong&gt;Messaging Gateway&lt;/strong&gt; — connects the agent to Telegram, Discord, Slack, WhatsApp, Signal, and many others.&lt;/li&gt;
&lt;li&gt;☑️ &lt;strong&gt;Editor Protocol&lt;/strong&gt; — connects the agent directly into my code editor, allowing it to see my active code files.&lt;/li&gt;
&lt;li&gt;☑️ &lt;strong&gt;Web Dashboard&lt;/strong&gt; — a beautiful browser interface to manage sessions and files visually.&lt;/li&gt;
&lt;li&gt;☑️ &lt;strong&gt;Cron Scheduler&lt;/strong&gt; — handles tasks running in the background without any chat interface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I find the &lt;em&gt;messaging gateway&lt;/em&gt; particularly amazing. I can start a complex debugging task on my laptop terminal in the morning. Later, while commuting home, the agent will send the final diagnostic report directly to my &lt;strong&gt;Telegram&lt;/strong&gt; app. The context is perfectly preserved across every medium. It even supports &lt;strong&gt;voice memo transcription&lt;/strong&gt;, allowing me to simply speak my commands into my phone.&lt;/p&gt;




&lt;h3&gt;
  
  
  Tier Two — The Core Agent Engine
&lt;/h3&gt;

&lt;p&gt;The second tier is the brain. It manages the core loops, handles tool registration, loads skills from the hard drive, and communicates with the language models. This tier contains the &lt;strong&gt;tool registry&lt;/strong&gt;, which acts like a utility belt holding more than &lt;strong&gt;forty system tools&lt;/strong&gt;. It handles prompt construction, retries, and fallback logic if a model fails to answer correctly.&lt;/p&gt;




&lt;h3&gt;
  
  
  Tier Three — The Execution Environments
&lt;/h3&gt;

&lt;p&gt;The third tier is where the actual work gets done. Letting an AI run wild on my personal laptop is risky, so the framework offers several execution environments with different levels of isolation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;local&lt;/code&gt;&lt;/strong&gt; — runs commands natively on my laptop. Fastest, but zero isolation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;docker&lt;/code&gt;&lt;/strong&gt; — spawns a dedicated, isolated container for every session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ssh&lt;/code&gt;&lt;/strong&gt; — allows the agent to log into a remote virtual machine and treat it as its main computer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;serverless&lt;/code&gt;&lt;/strong&gt; — offloads work to platforms like Daytona or Modal, spinning up instantly for production workloads.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Deep Dive: The Self-Evolving Skill System
&lt;/h2&gt;

&lt;p&gt;I want to spend a significant portion of this report exploring the &lt;strong&gt;skill system&lt;/strong&gt;, because it is the most innovative feature I have ever tested. In most systems, &lt;em&gt;human programmers&lt;/em&gt; write the tools. In this framework, &lt;em&gt;the agent owns its own learning artifacts&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;When the agent finishes a hard job, it uses a special internal tool to manage skills. It can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt; a new skill from scratch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patch&lt;/strong&gt; a small error in an existing skill&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edit&lt;/strong&gt; a skill completely&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delete&lt;/strong&gt; an outdated skill&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each skill lives in a simple folder. The folder contains a markdown file outlining the instructions, and it can also hold reference materials, templates, and Python scripts. The format conforms to an open standard, meaning I can share my custom skills with other developers or install skills built by the community.&lt;/p&gt;
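
&lt;p&gt;Here is a small sketch of what writing such a skill folder could look like. The file name &lt;code&gt;SKILL.md&lt;/code&gt;, the front matter fields, and the &lt;code&gt;scripts&lt;/code&gt; subfolder are my assumptions for illustration, so check the official docs for the exact layout.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from pathlib import Path
import tempfile

# Hypothetical layout and file names: check the Hermes docs for the real ones.
SKILL_TEMPLATE = """---
name: {name}
description: {description}
---

# {title}

{steps}
"""


def write_skill(root, name, description, steps):
    """Create root/name/SKILL.md plus a scripts folder; return the file path."""
    skill_dir = Path(root) / name
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(
        SKILL_TEMPLATE.format(
            name=name,
            description=description,
            title=name.replace("-", " ").title(),
            steps=numbered,
        ),
        encoding="utf-8",
    )
    return skill_file


with tempfile.TemporaryDirectory() as root:
    path = write_skill(
        root,
        "daily-repo-report",
        "Summarize yesterday's activity across my GitHub repos.",
        ["List commits since yesterday", "Group them by repo", "Format as Markdown"],
    )
    print(path.read_text(encoding="utf-8"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the result is just a folder with a text file in it, sharing a skill is as simple as copying that folder.&lt;/p&gt;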

&lt;p&gt;The developers took this concept a massive step further with a project called &lt;strong&gt;Hermes Agent Self Evolution&lt;/strong&gt;. This is an evolutionary system that uses advanced techniques to automatically optimize agent skills without requiring expensive GPU training.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I watched it operate entirely through API calls, &lt;em&gt;mutating text, evaluating results against synthetic data, and selecting the best variants&lt;/em&gt;, using a process called &lt;strong&gt;Genetic Pareto Prompt Evolution&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I pointed it at my code review skill, let it run for &lt;strong&gt;ten iterations&lt;/strong&gt;, and it produced a measurably better version of the skill. A full optimization run only costs between &lt;strong&gt;$2 and $10&lt;/strong&gt;, making it incredibly accessible for independent developers.&lt;/p&gt;
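
&lt;p&gt;The mutate, evaluate, select cycle is easier to picture with a toy version. The sketch below is a deliberately simplified stand-in that I wrote for illustration: it keeps a single best variant by one score instead of doing real Pareto selection, and the mutations and scoring function are invented, whereas the real system relies on a language model for both.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import random

# Toy stand-ins: the real system scores and rewrites variants with a model.
REQUIRED_PHRASES = ("security", "tests", "naming")
MUTATIONS = (
    " Check security issues.",
    " Verify tests cover the change.",
    " Flag unclear naming.",
    " Be concise.",
)


def mutate(prompt, rng):
    """Return a variant of the prompt with one random instruction appended."""
    return prompt + rng.choice(MUTATIONS)


def evaluate(prompt):
    """Score a prompt: reward covered topics, lightly penalize length."""
    coverage = sum(phrase in prompt.lower() for phrase in REQUIRED_PHRASES)
    return coverage - 0.001 * len(prompt)


def evolve(seed_prompt, iterations=10, children=4, seed=0):
    """Keep the best-scoring variant across a fixed number of iterations."""
    rng = random.Random(seed)
    best = seed_prompt
    for _ in range(iterations):
        # The current best competes with its own mutated children.
        candidates = [best] + [mutate(best, rng) for _ in range(children)]
        best = max(candidates, key=evaluate)
    return best


start = "Review this pull request."
result = evolve(start)
print(evaluate(start), evaluate(result))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since the current best always competes against its own children, the score can never go down from one iteration to the next.&lt;/p&gt;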




&lt;h2&gt;
  
  
  Managing Parallel Workloads with Sub-Agents
&lt;/h2&gt;

&lt;p&gt;Another major technical breakthrough I explored is how the system handles massive workloads. The &lt;strong&gt;primary orchestrator agent&lt;/strong&gt; can spawn completely isolated sub-agents to handle parallel workloads. Each sub-worker gets its own:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Private conversation thread&lt;/li&gt;
&lt;li&gt;Sandboxed terminal environment&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;I observed the primary agent delegating specific research tasks to &lt;strong&gt;three parallel workers&lt;/strong&gt; simultaneously. The primary agent then collected their outputs through internal remote procedure calls and synthesized them into a single final result.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each worker keeps its intermediate steps in its own thread and hands back only its result, which dramatically reduces the context cost of multi-step pipelines. By splitting complex research tasks into parallel operations, the system also speeds up my workflow considerably.&lt;/p&gt;
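
&lt;p&gt;The fan-out and gather pattern can be sketched in a few lines. This is only an illustration of the idea using Python threads. The worker function is a placeholder I made up, not how Hermes actually launches its sub-agents.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from concurrent.futures import ThreadPoolExecutor


def run_worker(task):
    """Stand-in for a sub-agent that works in its own private history."""
    history = [f"plan: {task}", f"search: {task}", f"notes on {task}"]
    # Only the short result leaves the worker, not the whole history.
    return f"{task}: {len(history)} steps condensed into one finding"


def orchestrate(tasks):
    """Fan tasks out to parallel workers, then merge results in task order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        findings = list(pool.map(run_worker, tasks))
    return "\n".join(findings)


print(orchestrate(["pricing", "competitors", "user reviews"]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The orchestrator only ever sees one line per worker, which is where the context savings come from.&lt;/p&gt;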




&lt;h2&gt;
  
  
  Hardware Acceleration for Local Privacy
&lt;/h2&gt;

&lt;p&gt;I strongly believe in running AI locally. Sending my private code and sensitive financial documents to a corporate cloud provider makes me uncomfortable. &lt;strong&gt;Hermes Agent&lt;/strong&gt; is uniquely optimized for always-on local use.&lt;/p&gt;

&lt;p&gt;It runs beautifully on hardware powered by &lt;strong&gt;NVIDIA&lt;/strong&gt; graphics cards. The introduction of the &lt;strong&gt;Qwen 3.6&lt;/strong&gt; language models changed the game for local agents. The &lt;strong&gt;27B&lt;/strong&gt; and &lt;strong&gt;35B&lt;/strong&gt; parameter models in this series deliver data-center-level intelligence directly to my local machine.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The combination of local hardware + powerful open-weight models + self-improving agent framework creates an ecosystem that &lt;em&gt;respects my privacy&lt;/em&gt; while delivering massive productivity gains.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Seamless Access with xAI Grok Integration
&lt;/h2&gt;

&lt;p&gt;The framework now supports &lt;strong&gt;Grok&lt;/strong&gt; models through a simple browser-based login. Because I am a premium subscriber, I just log in through my browser, and the system securely connects without requiring me to copy and paste a separate secret key. The integration defaults to the newest models and supports advanced features like &lt;strong&gt;prompt caching&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Furthermore, this integration exposes a special &lt;strong&gt;tool gateway&lt;/strong&gt;. This gateway allows my agent to call external internet tools without requiring me to set up individual billing accounts and API keys for every single tool. I can set up a completely fresh server and have an agent performing internet searches and data retrieval within minutes.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Direct Comparison: Hermes Agent vs. OpenClaw
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Core Strength&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Self-improvement and background execution&lt;/td&gt;
&lt;td&gt;Multi-channel orchestration and agent teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Memory Style&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Agent-curated markdown files + searchable sessions&lt;/td&gt;
&lt;td&gt;Default local DB with vector semantic retrieval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Skill Generation&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Autonomous creation based on experience&lt;/td&gt;
&lt;td&gt;Static skills installed manually&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Initial Setup Time&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;2–4 hours for full local configuration&lt;/td&gt;
&lt;td&gt;Under 30 minutes using container setups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;Best Use Case&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Personal persistent automation and repeating tasks&lt;/td&gt;
&lt;td&gt;Structured corporate agent systems and rapid deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;OpenClaw&lt;/strong&gt; is an incredible piece of software. It excels at multi-channel routing, persistent agent teams, and marketplace-driven workflows. It feels heavier, but very mature.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;, conversely, feels much leaner and more personal. It is the better &lt;em&gt;self-improving runtime engine&lt;/em&gt;. Its learning loop is the true differentiator.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;My ultimate conclusion: &lt;strong&gt;OpenClaw&lt;/strong&gt; wins on orchestration and coordination. &lt;strong&gt;Hermes&lt;/strong&gt; wins on always-on automation and continuous learning.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some users, myself included, even run both on the same device. I simply told my Hermes agent to read the memory files of my OpenClaw agent, instantly bringing it up to speed on my preferences without any retraining.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Guide: Installing and Configuring
&lt;/h2&gt;

&lt;p&gt;The developers made the installation process incredibly smooth. I opened my terminal on my Linux machine and ran a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the download finished, I reloaded my shell environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I configured the intelligence provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system presented an interactive wizard. I chose the quick setup option, selected &lt;strong&gt;Minimax global&lt;/strong&gt; as my provider, and pasted my API key.&lt;/p&gt;

&lt;p&gt;The final step was setting up the messaging gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes gateway setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I chose &lt;strong&gt;Telegram&lt;/strong&gt;, pasted the token, and set my user ID to ensure no one else could access my agent. I set the run mode to operate as a &lt;strong&gt;background system service&lt;/strong&gt;. Now, I can send a message to my Telegram bot from anywhere in the world, and my home server processes the request. &lt;em&gt;With the quick setup and a hosted provider, the entire process took less than fifteen minutes.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Deep Technical Breakdown: The LLM Wiki Pattern
&lt;/h2&gt;

&lt;p&gt;I discovered a fascinating workflow pattern called the &lt;strong&gt;LLM Wiki pattern&lt;/strong&gt;. Retrieval-augmented generation (&lt;code&gt;RAG&lt;/code&gt;) systems go back to the raw documents and retrieve relevant chunks from scratch on every query. The LLM Wiki pattern takes a completely different approach.&lt;/p&gt;

&lt;p&gt;I point the agent to a folder containing raw source materials like web articles, research papers, and meeting transcripts. The agent compiles this knowledge once and keeps it current — building a &lt;strong&gt;persistent, compounding knowledge base&lt;/strong&gt; formatted as interlinked markdown files.&lt;/p&gt;

&lt;p&gt;The architecture is beautiful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;→ Layer 1: Immutable source material
→ Layer 2: Entity pages (people, organizations)
         + Concept pages (broad topics)
         + Side-by-side comparison analyses
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Everything is just plain text files in a directory. No hidden database. The knowledge is &lt;em&gt;completely portable and future-proof&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
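
&lt;p&gt;To show how the second layer can be built from the first, here is a minimal sketch. It is my own simplification: the source names and the entity lists are hard-coded examples, whereas in practice the agent reads the raw files and extracts the entities with a model.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from pathlib import Path
import tempfile

# Hypothetical input: in practice the agent extracts entities with a model.
SOURCES = {
    "meeting-2026-05-02.txt": ["Nous Research", "Hermes Agent"],
    "article-agents.txt": ["Hermes Agent", "OpenClaw"],
}


def slug(name):
    """Turn an entity name into a file-friendly slug."""
    return name.lower().replace(" ", "-")


def compile_wiki(sources, wiki_dir):
    """Write one Markdown page per entity, linking sources and related pages."""
    wiki_dir = Path(wiki_dir)
    wiki_dir.mkdir(parents=True, exist_ok=True)
    mentions = {}
    for source, entities in sources.items():
        for entity in entities:
            mentions.setdefault(entity, []).append((source, entities))
    for entity, found in sorted(mentions.items()):
        lines = [f"# {entity}", "", "## Sources"]
        lines += [f"- {source}" for source, _ in found]
        # Entities mentioned in the same source become cross-links.
        related = sorted({e for _, group in found for e in group if e != entity})
        lines += ["", "## Related"]
        lines += [f"- [{e}]({slug(e)}.md)" for e in related]
        page = wiki_dir / f"{slug(entity)}.md"
        page.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sorted(p.name for p in wiki_dir.glob("*.md"))


with tempfile.TemporaryDirectory() as root:
    print(compile_wiki(SOURCES, root))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each entity page lists the sources it came from and links to the entities mentioned alongside it, which is what makes the pages interlinked.&lt;/p&gt;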




&lt;h2&gt;
  
  
  Visualizing Architecture: Creative Diagram Generation
&lt;/h2&gt;

&lt;p&gt;The agent is not just a text processing engine. It possesses powerful creative abilities. When I need to map out a new software system, I simply ask the agent to visualize it. It then writes a &lt;strong&gt;standalone HTML file&lt;/strong&gt; containing beautiful, inline vector graphics using a strict design system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟣 &lt;strong&gt;Purple&lt;/strong&gt; — processing steps&lt;/li&gt;
&lt;li&gt;🩵 &lt;strong&gt;Teal&lt;/strong&gt; — services&lt;/li&gt;
&lt;li&gt;🪸 &lt;strong&gt;Coral&lt;/strong&gt; — data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another incredible creative workflow is the &lt;strong&gt;website-to-video pipeline&lt;/strong&gt;. I provide a website link, and the agent produces a professional promotional video — &lt;em&gt;completely autonomously&lt;/em&gt; — by capturing screenshots, extracting colors and typography, analyzing mood, and passing assets to a rendering engine. This turns a flat website into a dynamic &lt;strong&gt;30-second advertisement&lt;/strong&gt; without opening a single video editing application.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts: The Future of AI Development
&lt;/h2&gt;

&lt;p&gt;Building with &lt;strong&gt;Hermes Agent&lt;/strong&gt; has profoundly changed my perspective on the future of software development. We are moving past the era of static applications and forgetful chatbots. The future belongs to open, capable agent systems that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Live on our own infrastructure&lt;/li&gt;
&lt;li&gt;Respect our privacy&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Continuously evolve&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The artificial intelligence landscape is shifting from passive answers to &lt;strong&gt;autonomous action&lt;/strong&gt; — and &lt;strong&gt;Hermes Agent&lt;/strong&gt; is leading the way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Whether you love to build complex multi-step pipelines or you just want a reliable digital assistant to run your daily morning reports, this system provides the tools necessary to make it happen. I strongly encourage every developer to install it, give it access to a local model, and experience the power of &lt;em&gt;an agent that actually grows with you&lt;/em&gt;.&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>From Half Baked Repos to GitHub Glory: How I Am Finishing My Ambitious 10 App Masterpiece</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Sun, 24 May 2026 08:25:19 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/from-half-baked-repos-to-github-glory-how-i-am-finishing-my-ambitious-ten-app-masterpiece-ca2</link>
      <guid>https://dev.to/aniruddhaadak/from-half-baked-repos-to-github-glory-how-i-am-finishing-my-ambitious-ten-app-masterpiece-ca2</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-05-21"&gt;GitHub Finish-Up-A-Thon Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;we have all been there.&lt;/p&gt;

&lt;p&gt;you stay up until two in the morning, fueled by coffee and a brilliant idea,&lt;br&gt;
coding away during a weekend hackathon. you build a fantastic prototype,&lt;br&gt;
but as soon as the event ends and your normal schedule takes over,&lt;br&gt;
that project gets shoved into your github profile and forgotten.&lt;/p&gt;

&lt;p&gt;it just sits there, collecting dust, waiting for the day you finally&lt;br&gt;
have the time to make it complete.&lt;/p&gt;

&lt;p&gt;i wanted to take one of my most ambitious, unfinished projects and&lt;br&gt;
finally turn it into a polished, production-ready system.&lt;/p&gt;




&lt;h2&gt;
  
  
  a little about me
&lt;/h2&gt;

&lt;p&gt;my name is &lt;strong&gt;ANIRUDDHA ADAK&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;i am a final-year B.Tech student in Computer Science and Engineering&lt;br&gt;
at the Budge Budge Institute of Technology (&lt;code&gt;BBIT&lt;/code&gt;) in Kolkata, India.&lt;/p&gt;

&lt;p&gt;over the last few years, i have worked as a freelance developer,&lt;br&gt;
contributed heavily to open-source repositories,&lt;br&gt;
and built autonomous ai systems.&lt;/p&gt;

&lt;p&gt;i love combining modern frameworks with intelligent models,&lt;br&gt;
but i also have a bad habit of starting massive projects&lt;br&gt;
and leaving minor issues unresolved.&lt;/p&gt;

&lt;p&gt;this challenge gave me the focus i needed to revisit &lt;strong&gt;skillsphere&lt;/strong&gt;,&lt;br&gt;
my comprehensive productivity and wellness ecosystem.&lt;/p&gt;




&lt;h2&gt;
  
  
  the mess inside skillsphere
&lt;/h2&gt;

&lt;p&gt;i originally built skillsphere to break out of tutorial learning.&lt;/p&gt;

&lt;p&gt;the idea was to build a single platform that could house &lt;strong&gt;ten different&lt;br&gt;
mini-applications&lt;/strong&gt; to help people improve their daily lives. the concept&lt;br&gt;
was vast, hosting all of these tools together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;habit tracker&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mood-based recipe recommender&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sustainable product comparison&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;personalized skill builder&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;virtual body language coach&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;crowdsourced travel recommendations&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;neighborhood micro-task exchange&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wellness companion&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ar workspace planner&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;live skill exchange network&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;while the design looked great on paper, managing ten different tools&lt;br&gt;
inside a single workspace led to massive configuration errors.&lt;/p&gt;

&lt;p&gt;the codebase was highly advanced, with &lt;strong&gt;typescript&lt;/strong&gt; making up &lt;em&gt;97.8%&lt;/em&gt;&lt;br&gt;
of the platform. however, if you looked at my repository files, you&lt;br&gt;
would find a confusing contradiction.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;the setup guide in my readme told users to run python package&lt;br&gt;
installer commands, specifically instructing them to run&lt;br&gt;
&lt;code&gt;pip install requirements&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;this mismatch happened because i had tried to build local machine learning&lt;br&gt;
tools directly into the frontend structure.&lt;/p&gt;

&lt;p&gt;because of this, the environment constantly crashed during builds,&lt;br&gt;
which made me put the project on hold. the main files like&lt;br&gt;
&lt;code&gt;tsconfig.node.json&lt;/code&gt; and &lt;code&gt;package.json&lt;/code&gt; were filled with overlapping&lt;br&gt;
dependency errors, and several apps like the &lt;code&gt;ar workspace planner&lt;/code&gt;&lt;br&gt;
remained half-finished.&lt;/p&gt;




&lt;h2&gt;
  
  
  rescuing my code with github copilot
&lt;/h2&gt;

&lt;p&gt;i used &lt;strong&gt;github copilot&lt;/strong&gt; as my guide to clean up the codebase and fix&lt;br&gt;
my issues. the cleanup came down to three main steps:&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;step one: clean up the installation steps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;copilot analyzed my codebase and helped me realize i could replace the&lt;br&gt;
bulky local python scripts with clean client-side logic and&lt;br&gt;
cloud-native ai calls.&lt;/p&gt;

&lt;p&gt;this let me remove the confusing &lt;code&gt;pip install&lt;/code&gt; instructions from the&lt;br&gt;
setup guide, allowing new developers to deploy the workspace with&lt;br&gt;
simple &lt;code&gt;node&lt;/code&gt; commands.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;step two: fix the configuration files&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;copilot helped me rewrite my &lt;code&gt;tsconfig.node.json&lt;/code&gt; and package&lt;br&gt;
dependencies to prevent vite build conflicts.&lt;/p&gt;

&lt;p&gt;this fixed the structural errors that had been breaking my builds.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;step three: polish the user interfaces&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;i used copilot to quickly generate the missing interactive layouts&lt;br&gt;
for the &lt;code&gt;ar workspace planner&lt;/code&gt; and the &lt;code&gt;live skill exchange&lt;/code&gt;,&lt;br&gt;
using &lt;strong&gt;tailwind css&lt;/strong&gt; and &lt;strong&gt;framer motion&lt;/strong&gt; to create&lt;br&gt;
a unified experience.&lt;/p&gt;




&lt;h2&gt;
  
  
  my evolving project portfolio
&lt;/h2&gt;

&lt;p&gt;completing this project represents a major milestone in my journey as&lt;br&gt;
an ai engineer. to understand how my approach has changed, it helps&lt;br&gt;
to compare skillsphere with some of the other applications i have&lt;br&gt;
built over the years.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;project name&lt;/th&gt;
&lt;th&gt;primary tech stack&lt;/th&gt;
&lt;th&gt;original state&lt;/th&gt;
&lt;th&gt;refactored state&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;skillsphere&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;typescript&lt;/code&gt;, &lt;code&gt;react&lt;/code&gt;, &lt;code&gt;vite&lt;/code&gt;, &lt;code&gt;tailwind css&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;stalled with incomplete sub-apps and a confusing python setup&lt;/td&gt;
&lt;td&gt;fully functional suite of ten apps deployed seamlessly on vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;lingolens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;next.js&lt;/code&gt;, &lt;code&gt;assemblyai lemur&lt;/code&gt;, &lt;code&gt;gemini api&lt;/code&gt;, &lt;code&gt;tailwind css&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;basic speech-to-text prototype with simple translation features&lt;/td&gt;
&lt;td&gt;comprehensive media analyzer with sentiment analysis and speaker tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;homewhisper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;typescript&lt;/code&gt;, &lt;code&gt;gemini ai&lt;/code&gt;, &lt;code&gt;computer vision&lt;/code&gt;, &lt;code&gt;vite&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;core voice commands and a simple hand gesture library&lt;/td&gt;
&lt;td&gt;advanced predictive scheduling, safety diagnostics, and real-time tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;voicemath&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;react&lt;/code&gt;, &lt;code&gt;typescript&lt;/code&gt;, &lt;code&gt;tailwind css&lt;/code&gt;, &lt;code&gt;assemblyai api&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;basic dynamic math quiz with simple voice response capturing&lt;/td&gt;
&lt;td&gt;fully polished quiz application featuring a practice mode and leaderboard&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;looking at this table, you can see a clear path of growth.&lt;/p&gt;

&lt;p&gt;my earlier projects like &lt;strong&gt;voicemath&lt;/strong&gt; relied on basic web voice&lt;br&gt;
recognition apis to capture user responses and run simple quiz logic.&lt;br&gt;
they were fun and highly interactive, but they did not have&lt;br&gt;
deep processing capabilities.&lt;/p&gt;

&lt;p&gt;with &lt;strong&gt;lingolens&lt;/strong&gt;, i stepped up by using &lt;em&gt;assemblyai's lemur api&lt;/em&gt;&lt;br&gt;
and &lt;em&gt;google's gemini&lt;/em&gt; to perform deep sentiment analysis and keyword&lt;br&gt;
extraction on audio files.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;finally, with &lt;strong&gt;homewhisper&lt;/strong&gt;, i pushed the boundaries of multi-modal&lt;br&gt;
systems by writing advanced hand-tracking computer vision algorithms&lt;br&gt;
alongside context-aware voice commands.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;this trajectory shows that i am shifting away from static web tools&lt;br&gt;
and moving toward autonomous, multi-modal systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  the value of building in public
&lt;/h2&gt;

&lt;p&gt;participating in this challenge is not just about competing for prizes.&lt;br&gt;
it is about building a habit of consistency.&lt;/p&gt;

&lt;p&gt;during &lt;em&gt;hacktoberfest 2024&lt;/em&gt;, i made over &lt;strong&gt;238 pull requests&lt;/strong&gt; to&lt;br&gt;
open-source projects, which taught me the value of clean code&lt;br&gt;
and documentation.&lt;/p&gt;

&lt;p&gt;writing about my code on platforms like dev helps me understand&lt;br&gt;
my own mistakes. it forces me to break down complicated patterns&lt;br&gt;
so other students do not have to struggle through the same issues.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;building in public keeps me accountable and pushes me to turn&lt;br&gt;
my rough ideas into stable systems.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  what lies ahead
&lt;/h2&gt;

&lt;p&gt;finishing skillsphere has shown me how useful ai assistants are&lt;br&gt;
for managing complex codebases.&lt;/p&gt;

&lt;p&gt;as i finish up my computer science studies, my goal is to continue&lt;br&gt;
diving deeper into agentic ai frameworks. i want to build systems&lt;br&gt;
that can work together to solve complex problems, such as my&lt;br&gt;
autonomous agent marketplace &lt;strong&gt;agentforge&lt;/strong&gt; or multi-agent devsecops&lt;br&gt;
systems like &lt;strong&gt;secureops-ai&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;reviving this repository was a challenging experience, but it was&lt;br&gt;
highly rewarding.&lt;/p&gt;

&lt;p&gt;i have turned a confusing, half-finished repository into a clean&lt;br&gt;
workspace that is ready for other developers to use.  &lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>i watched google tear down the old internet from a hostel room in kolkata</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Wed, 20 May 2026 16:23:50 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/i-watched-google-tear-down-the-old-internet-from-a-hostel-room-in-kolkata-4h94</link>
      <guid>https://dev.to/aniruddhaadak/i-watched-google-tear-down-the-old-internet-from-a-hostel-room-in-kolkata-4h94</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;Google I/O Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;it was well past midnight when i finally leaned back and thought: this is not a product update. this is a different kind of computing altogether.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;the fan on my lenovo was whirring steadily. the air in my hostel room in kolkata was thick and humid the way it always gets in may. it was around ten-thirty at night when the google i/o 2026 keynote started streaming on my screen. the main event had kicked off at ten in the morning pacific time, and india standard time runs twelve and a half hours ahead of that. i kept going until it bled past midnight, because i had a feeling i should not skip a single slide.&lt;/p&gt;

&lt;p&gt;i work as a full-stack developer and machine learning engineer. i design scalable software systems and build specialized ai pipelines. so when i say that what google showed that night felt less like a feature drop and more like a &lt;em&gt;paradigm reset&lt;/em&gt;, i mean that very literally.&lt;/p&gt;

&lt;p&gt;the afternoon of may 20 had me back at my desk for the deep-dive developer sessions. i sat through google cloud live, the firebase integrations, and the antigravity platform walkthroughs, taking notes and occasionally running my own tests in google ai studio. by the time i finished, i had filled several pages with things i needed to process.&lt;/p&gt;

&lt;p&gt;what follows is that processing — honest, technical, and written the way i actually think about software.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;gemini 3.5 flash&lt;/code&gt; and the model that runs everything now
&lt;/h2&gt;

&lt;p&gt;the first thing that hit me was the scale of what &lt;code&gt;gemini 3.5 flash&lt;/code&gt; is designed to do. this is now the default model powering the gemini app and google search. it is not a research preview. it is live and running at a speed that google claims is nearly &lt;strong&gt;four times faster&lt;/strong&gt; than competing frontier models, at roughly half the cost.&lt;/p&gt;

&lt;p&gt;that matters because the use case has changed. this model is not built to answer a question once. it is built for &lt;em&gt;agentic workflows&lt;/em&gt; — meaning it is expected to run multiple steps in a row, call external tools, hold context across long sessions, and execute things autonomously without someone typing a new prompt every five seconds.&lt;/p&gt;

&lt;p&gt;the benchmark numbers they showed reflected that shift in focus:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;benchmark&lt;/th&gt;
&lt;th&gt;score&lt;/th&gt;
&lt;th&gt;what it measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;terminal-bench 2.1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;76.2%&lt;/td&gt;
&lt;td&gt;command line execution and tool routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gdpval-aa&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1656 elo&lt;/td&gt;
&lt;td&gt;autonomous agent performance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mcp atlas&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;83.6%&lt;/td&gt;
&lt;td&gt;model context protocol integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;charxiv reasoning&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;84.2%&lt;/td&gt;
&lt;td&gt;visual understanding and logical reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;these are not the benchmarks of a chat assistant. these are the benchmarks of a system that is supposed to &lt;em&gt;do things on your behalf&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;antigravity 2.0&lt;/code&gt; — the part that genuinely made me sit up
&lt;/h2&gt;

&lt;p&gt;i have been using various agentic coding tools for a while now. so i came in sceptical. then varun mohan got on stage and did something that changed my reference point for what "agentic coding" actually means.&lt;/p&gt;

&lt;p&gt;he gave &lt;code&gt;antigravity 2.0&lt;/code&gt; a single task: &lt;em&gt;build the core framework of an operating system from scratch.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;what the platform did next was this →&lt;/p&gt;

&lt;p&gt;i) it spun up &lt;strong&gt;93 separate sub-agents&lt;/strong&gt; running in parallel&lt;br&gt;
ii) those agents collectively generated &lt;strong&gt;2.6 billion tokens&lt;/strong&gt;&lt;br&gt;
iii) the entire os framework was completed in roughly &lt;strong&gt;12 hours&lt;/strong&gt;&lt;br&gt;
iv) the total compute cost came in at &lt;strong&gt;under 1,000 dollars&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;when i read that last figure, i had to re-read it. under a thousand dollars for a coordinated 93-agent os build in twelve hours. that cost structure changes what is financially viable to attempt.&lt;/p&gt;
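
&lt;p&gt;to make those numbers concrete, here is a rough back-of-the-envelope sketch. it uses only the four figures quoted on stage and assumes (my assumption, not google's) that the tokens were split evenly across the sub-agents and that the cost was exactly at the 1,000 dollar ceiling, so the per-token price is an upper bound.&lt;/p&gt;

```typescript
// figures quoted from the keynote demo; the cost is treated as an upper bound
const subAgents = 93;
const totalTokens = 2.6e9;
const hours = 12;
const maxCostUsd = 1000;

// assumption: tokens are spread evenly across agents (the demo did not say this)
const tokensPerAgent = totalTokens / subAgents;
const tokensPerAgentPerHour = tokensPerAgent / hours;

// upper bound on the blended price per million tokens
const maxCostPerMillionTokens = maxCostUsd / (totalTokens / 1e6);

console.log("tokens per agent:", Math.round(tokensPerAgent));
console.log("tokens per agent per hour:", Math.round(tokensPerAgentPerHour));
console.log("max usd per million tokens:", maxCostPerMillionTokens.toFixed(2));
```

&lt;p&gt;under those assumptions each agent handled roughly 28 million tokens, and the blended price works out to at most about 38 cents per million tokens.&lt;/p&gt;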

&lt;p&gt;then mohan tried to run &lt;em&gt;doom&lt;/em&gt; on the freshly compiled os. it failed because of missing keyboard and video drivers. so he prompted antigravity 2.0 to write the drivers live, on stage. within seconds, &lt;em&gt;freedoom&lt;/em&gt; (the open-source variant) was running and fully playable.&lt;/p&gt;

&lt;p&gt;i have seen a lot of live demos fall apart. this one did not. and the fact that the failure itself became part of the demo — and got fixed in real time — actually made it more convincing, not less.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;antigravity 2.0&lt;/code&gt; is now a &lt;strong&gt;standalone desktop application&lt;/strong&gt; with full cli support. google is actively nudging developers to move away from legacy command line interfaces toward this. i am already testing the migration.&lt;/p&gt;




&lt;h2&gt;
  
  
  how search became something different
&lt;/h2&gt;

&lt;p&gt;the search box that i have used since i was a kid looks completely different now. under the hood, it is powered by &lt;code&gt;gemini 3.5 flash&lt;/code&gt;, and it supports something called &lt;strong&gt;generative ui&lt;/strong&gt; — where instead of returning a list of links, the search engine &lt;em&gt;builds&lt;/em&gt; an interactive application or widget for you, on demand.&lt;/p&gt;

&lt;p&gt;the feature that stuck with me most is &lt;strong&gt;search with canvas artifacts&lt;/strong&gt;. when you search for something, a side panel opens with a live, editable mini-application. you can drag elements, modify the logic, inspect the structure. it is less like searching and more like &lt;em&gt;summoning a working tool&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;this is where i started to feel the ground shift. the result of a search is no longer a document. it is a running program.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;gemini spark&lt;/code&gt; — an agent that runs while i sleep
&lt;/h2&gt;

&lt;p&gt;google introduced &lt;code&gt;gemini spark&lt;/code&gt; as a cloud-based, always-on personal agent. it does not need my laptop to be open or my phone to be charged. it runs on dedicated virtual machines inside google cloud, continuously, in the background.&lt;/p&gt;

&lt;p&gt;what it handles:&lt;/p&gt;

&lt;p&gt;1) calendar organization and scheduling&lt;br&gt;
2) email inbox monitoring and draft responses&lt;br&gt;
3) document drafting across workspace apps&lt;br&gt;
4) task routing to over &lt;strong&gt;30 third-party platforms&lt;/strong&gt; via the open model context protocol&lt;/p&gt;

&lt;p&gt;that last point is significant. spark can interact with platforms like opentable, uber, adobe, and asana, booking reservations, requesting rides, and updating project boards. the integration is through &lt;code&gt;mcp&lt;/code&gt;, which means it is not a closed google-only pipeline.&lt;/p&gt;

&lt;p&gt;to handle the obvious security concern, google built the &lt;strong&gt;agent payments protocol&lt;/strong&gt;. this framework lets users set →&lt;/p&gt;

&lt;p&gt;✅ strict spending limits per agent session&lt;br&gt;
✅ transaction restrictions to pre-approved merchants only&lt;br&gt;
✅ mandatory manual human confirmation before any purchase clears&lt;/p&gt;
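
&lt;p&gt;the three guard rails above can be sketched as a single policy check. this is my own minimal illustration, not the actual agent payments protocol: every type and function name here (&lt;code&gt;SpendPolicy&lt;/code&gt;, &lt;code&gt;PurchaseRequest&lt;/code&gt;, &lt;code&gt;checkPurchase&lt;/code&gt;) is hypothetical, and the real schema may look nothing like it.&lt;/p&gt;

```typescript
// hypothetical types for illustration; not the real agent payments protocol schema
interface SpendPolicy {
  sessionLimitUsd: number;      // strict spending limit per agent session
  approvedMerchants: string[];  // pre-approved merchants only
}

interface PurchaseRequest {
  merchant: string;
  amountUsd: number;
  humanConfirmed: boolean;      // set only after the user manually confirms
}

type Decision = { allowed: true } | { allowed: false; reason: string };

function checkPurchase(
  policy: SpendPolicy,
  spentSoFarUsd: number,
  req: PurchaseRequest
): Decision {
  if (!policy.approvedMerchants.includes(req.merchant)) {
    return { allowed: false, reason: "merchant not pre-approved" };
  }
  if (spentSoFarUsd + req.amountUsd > policy.sessionLimitUsd) {
    return { allowed: false, reason: "session spending limit exceeded" };
  }
  if (!req.humanConfirmed) {
    return { allowed: false, reason: "awaiting human confirmation" };
  }
  return { allowed: true };
}

// usage with made-up values
const demoPolicy: SpendPolicy = { sessionLimitUsd: 50, approvedMerchants: ["opentable"] };
console.log(checkPurchase(demoPolicy, 40, { merchant: "opentable", amountUsd: 20, humanConfirmed: true }));
```

&lt;p&gt;the point of the ordering is that the cheap, automatic checks run first, and the human is only asked to confirm a purchase that has already passed the merchant and budget rules.&lt;/p&gt;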

&lt;p&gt;i appreciate that the payments protocol exists. i still think the tradeoffs here deserve careful thought, and i will come back to that later in this piece.&lt;/p&gt;




&lt;h2&gt;
  
  
  what i actually built in google ai studio
&lt;/h2&gt;

&lt;p&gt;reading about new tools is one thing. i spent several hours testing them myself, which is the only way i trust my own analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  android app from a prompt
&lt;/h3&gt;

&lt;p&gt;the android development pipeline inside google ai studio is, genuinely, fast. here is how it went when i built a native task management client:&lt;/p&gt;

&lt;p&gt;i) &lt;strong&gt;prompt-driven kotlin generation&lt;/strong&gt; → i selected the "build an android app" option and described what i wanted. the build agent generated production-quality &lt;code&gt;kotlin&lt;/code&gt; code using the latest &lt;code&gt;jetpack compose&lt;/code&gt; patterns. no scaffolding, no boilerplate hunting.&lt;/p&gt;

&lt;p&gt;ii) &lt;strong&gt;real-time ui customization&lt;/strong&gt; → i used the preview editor to draw directly on the virtual interface, adjusting margins and generating custom asset styling through the &lt;code&gt;nano banana&lt;/code&gt; generator tool.&lt;/p&gt;

&lt;p&gt;iii) &lt;strong&gt;emulator and deployment&lt;/strong&gt; → i tested the build inside the browser using the integrated android emulator, then deployed directly to a physical device via &lt;code&gt;adb&lt;/code&gt;. connecting my google play developer account and pushing to the internal test track was a single click.&lt;/p&gt;

&lt;p&gt;the whole process from blank prompt to a device-deployed app took me &lt;em&gt;under two hours&lt;/em&gt;. that includes the time i spent breaking things on purpose to see what would happen.&lt;/p&gt;

&lt;h3&gt;
  
  
  web portal deployed to cloud run
&lt;/h3&gt;

&lt;p&gt;i also built a companion web portal to act as a command dashboard for my background agents. the workflow here was notably smooth:&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;workspace api integration&lt;/strong&gt; → linked the portal directly to google sheets and google drive inside ai studio. no separate database connectors needed.&lt;/p&gt;

&lt;p&gt;2) &lt;strong&gt;one-click serverless deployment&lt;/strong&gt; → i hit deploy, and the app was live on google cloud run within seconds. no yaml files, no container configuration.&lt;/p&gt;

&lt;p&gt;3) &lt;strong&gt;zero-cost developer tier&lt;/strong&gt; → the first two deployed applications are free, with no credit card required. this matters for prototyping because it removes the friction of "let me check my billing first."&lt;/p&gt;

&lt;p&gt;4) &lt;strong&gt;codebase export&lt;/strong&gt; → i pulled the entire project state — conversation history and file structure included — directly into my local &lt;code&gt;antigravity 2.0&lt;/code&gt; desktop environment to continue scaling from there.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;gemini omni&lt;/code&gt; and what it means for video
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;gemini omni&lt;/code&gt; is google's answer to multimodal generation at a serious level. what separates it from older video generators is its architecture — it is a single unified neural network that processes &lt;strong&gt;text, images, audio, and video simultaneously&lt;/strong&gt;, rather than stitching together separate models.&lt;/p&gt;

&lt;p&gt;the result is that when you edit a video using gemini omni, the physics are consistent across frames. lighting holds. fluid dynamics behave. if you add a character to a scene, they exist inside the scene's physical logic, not pasted on top of it.&lt;/p&gt;

&lt;p&gt;the first iteration, &lt;code&gt;gemini omni flash&lt;/code&gt;, is live now for subscribers through the gemini portal and through google flow.&lt;/p&gt;

&lt;p&gt;on the content verification side, google has expanded &lt;strong&gt;synthid&lt;/strong&gt; watermarking and &lt;strong&gt;c2pa content credentials&lt;/strong&gt; across gemini omni output. every generated video carries an invisible digital watermark that supposedly persists through →&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;file compression&lt;/li&gt;
&lt;li&gt;screen recording&lt;/li&gt;
&lt;li&gt;direct editing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;google search and chrome are now actively verifying these credentials. whether the watermarking holds under determined adversarial testing is something i want to see third-party researchers evaluate independently.&lt;/p&gt;




&lt;h2&gt;
  
  
  the things i think we need to talk about
&lt;/h2&gt;

&lt;p&gt;i am genuinely excited about a lot of what was announced. but i am also a systems engineer, and i think it is important to be honest about the architectural and societal tradeoffs embedded in these announcements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;☑️ the ecosystem trap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;antigravity 2.0&lt;/code&gt;, &lt;code&gt;firebase&lt;/code&gt;, and &lt;code&gt;google cloud run&lt;/code&gt; now form a very coherent, very capable development stack. the tighter that integration gets, the harder it becomes to leave. if your agents write code, host it, deploy it, and maintain it all within a single ecosystem, you are building a dependency that will cost you significantly if you ever need to migrate. this is worth thinking about before you go all in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;☑️ token consumption at scale&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;gemini 3.5 flash&lt;/code&gt; is cost-efficient for a frontier model. but complex agentic workflows eat context windows at a rate that adds up fast. when you are coordinating multiple sub-agents that are continuously analyzing codebases and running tests, token costs can spiral quickly. this is almost certainly the reason behind the new &lt;strong&gt;100 dollars per month ai ultra subscription tier&lt;/strong&gt;. it is not just an upsell; it reflects the actual compute cost of power users hitting baseline api quotas regularly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;☑️ surveillance by convenience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;for &lt;code&gt;gemini spark&lt;/code&gt; and &lt;code&gt;android halo&lt;/code&gt; to do what they are designed to do — proactive cross-app automation — they need to observe a continuous, detailed picture of your digital life. your emails. your calendar. your purchases. your workflows across thirty-plus third-party platforms.&lt;/p&gt;

&lt;p&gt;the traditional security boundary between separate applications has to dissolve for this to work. you are not paying for these features with money alone. you are paying with the depth and continuity of your behavioral data. that is a real tradeoff, and i think every developer who builds on top of these platforms should make that tradeoff consciously rather than by default.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;i am not saying do not use these tools. i am saying understand what you are agreeing to when you do.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  what i am actually doing with all of this
&lt;/h2&gt;

&lt;p&gt;for developers and system architects who want to move intentionally through this new landscape, here is how i am thinking about it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;→ migrate to the antigravity cli&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;if you are a command-line developer (and if you are reading this, you probably are), the new &lt;code&gt;antigravity cli&lt;/code&gt; is worth switching to from the legacy gemini cli. it brings sandboxed execution, credential masking, and secure git policies as first-class features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;→ build for &lt;code&gt;webmcp&lt;/code&gt; compatibility&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;when you are building web applications now, expose structured tools — javascript functions, html forms — using the proposed &lt;code&gt;webmcp&lt;/code&gt; open standard. if your web interface is not navigable by a browser-based ai agent, it will increasingly be invisible to the workflows people build around these tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;→ manage your token footprint deliberately&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;configure &lt;em&gt;persistent, isolated environments&lt;/em&gt; when calling the gemini api. resuming existing multi-turn sessions rather than re-uploading full file contexts on every call can meaningfully reduce operational costs and keeps session coherence intact.&lt;/p&gt;
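
&lt;p&gt;a rough way to see why this matters is to count input tokens under both strategies. the sketch below is a simplified model with made-up numbers: &lt;code&gt;cachedRate&lt;/code&gt; is a hypothetical fraction of the full price charged for context that the session already holds, and real billing depends on the api's actual caching rules, which i am not reproducing here.&lt;/p&gt;

```typescript
// simplified cost model; all parameters are illustrative assumptions
function inputTokens(
  contextTokens: number,  // size of the shared file context
  turns: number,          // number of calls in the session
  tokensPerTurn: number,  // new prompt tokens added on each call
  cachedRate: number      // hypothetical billing fraction for already-held context
): { reupload: number; resumed: number } {
  // strategy 1: send the full context again on every call
  const reupload = turns * (contextTokens + tokensPerTurn);

  // strategy 2: pay full price once, then the discounted rate on later turns
  const resumed =
    contextTokens +
    (turns - 1) * contextTokens * cachedRate +
    turns * tokensPerTurn;

  return { reupload, resumed };
}

const estimate = inputTokens(200000, 10, 500, 0.25);
console.log("re-upload every call:", estimate.reupload);
console.log("resume the session:", estimate.resumed);
```

&lt;p&gt;with these example values the resumed session is billed for about a third of the input tokens, and the gap widens as the context grows or the session gets longer.&lt;/p&gt;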

&lt;p&gt;&lt;strong&gt;→ use safe play store tracks first&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;if you are vibe-coding mobile apps inside google ai studio (and honestly, i have been), always connect to the internal test track before you do anything else. isolate your early prototypes completely from production builds while you are still figuring out what you actually built.&lt;/p&gt;




&lt;h2&gt;
  
  
  where i ended up
&lt;/h2&gt;

&lt;p&gt;it was past one in the morning in kolkata when i finally closed the last session tab. the fan on my laptop wound down. the room was quiet again.&lt;/p&gt;

&lt;p&gt;i have been a developer long enough to feel the difference between a year where things get incrementally better and a year where the underlying model of how software gets built actually changes. this felt like the second kind.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;the agent is not a feature you add to your product anymore. the agent is the environment your product runs inside.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;that sentence is what i kept coming back to as i fell asleep. i think it is the most accurate single-line description of what google announced at i/o 2026.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;to follow the ongoing release of these tools, visit &lt;a href="https://io.google/2026/about" rel="noopener noreferrer"&gt;io.google/2026&lt;/a&gt; or check what i am building at &lt;a href="https://github.com/aniruddhaadak" rel="noopener noreferrer"&gt;github.com/aniruddhaadak&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;— aniruddha adak, kolkata, may 20, 2026&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googleiochallenge</category>
      <category>ai</category>
      <category>gemini</category>
    </item>
    <item>
      <title>google i/o 2026 just changed everything - here's what i learned after testing</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Tue, 19 May 2026 20:44:53 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/google-io-2026-just-changed-everything-heres-what-i-learned-after-testing-de0</link>
      <guid>https://dev.to/aniruddhaadak/google-io-2026-just-changed-everything-heres-what-i-learned-after-testing-de0</guid>
      <description>&lt;p&gt;&lt;em&gt;this is a submission for the &lt;a href="https://dev.to/challenges/google-io-writing-2026-05-19"&gt;google i/o writing challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;i spent 100 hours testing (just joking 😜) everything google announced at i/o 2026. what started as curiosity turned into a full-blown obsession. here's what i found.&lt;/p&gt;

&lt;p&gt;google i/o 2026 wasn't just another conference. it was a fundamental shift in how ai integrates into our daily workflows. the announcements weren't incremental improvements - they were paradigm changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  the big picture
&lt;/h2&gt;

&lt;p&gt;before diving into specifics, let me share what stood out most:&lt;/p&gt;

&lt;p&gt;i) gemini 3.5 flash - the agentic and coding model that actually delivers on its promises&lt;br&gt;
ii) antigravity 2.0 - standalone desktop app with multi-agent teams that work seamlessly&lt;br&gt;
iii) gemini omni - creates anything from any input, starting with video&lt;br&gt;
iv) spark - your 24/7 personal ai agent that never forgets context&lt;/p&gt;

&lt;p&gt;these four represent the core of what makes this year's conference different. but there's more.&lt;/p&gt;

&lt;h2&gt;
  
  
  gemini 3.5 flash
&lt;/h2&gt;

&lt;p&gt;google claims gemini 3.5 flash is their strongest agentic and coding model, with 4x speed improvements. i was skeptical. after two weeks of daily use, i'm converted.&lt;/p&gt;

&lt;p&gt;the benchmark numbers look good on paper, but the real test is in everyday use. i've been using it for:&lt;/p&gt;

&lt;p&gt;[1] code review and optimization&lt;br&gt;
[2] debugging complex multi-file projects&lt;br&gt;
[3] generating documentation from scratch&lt;br&gt;
[4] refactoring legacy codebases&lt;/p&gt;

&lt;p&gt;the difference from previous versions is noticeable. where earlier models would get stuck on context windows or lose track of requirements, gemini 3.5 flash maintains coherence across longer conversations. the coding output is cleaner, more efficient, and requires less iteration.&lt;/p&gt;

&lt;p&gt;i tested it on a project with over 50,000 lines of code. the model understood the architecture, identified performance bottlenecks, and suggested optimizations that saved me hours of manual work. it's not perfect, but it's close.&lt;/p&gt;

&lt;h2&gt;
  
  
  antigravity 2.0
&lt;/h2&gt;

&lt;p&gt;i've been using antigravity for scheduled tasks, and the 2.0 upgrade was a game-changer. the new standalone desktop app feels completely redesigned.&lt;/p&gt;

&lt;p&gt;the multi-agent team capabilities are exactly what i needed. i can set up complex workflows that run automatically, and the system handles dependencies between tasks intelligently. here's what i set up:&lt;/p&gt;

&lt;p&gt;✅ automated daily code analysis&lt;br&gt;
✅ weekly dependency updates&lt;br&gt;
✅ monthly security audits&lt;/p&gt;

&lt;p&gt;the interface is cleaner, the response times are faster, and the error handling is much more robust. it's amazing how much smoother everything runs compared to the previous version.&lt;/p&gt;

&lt;p&gt;one thing i love is how it learns from my patterns. after a few weeks, it started suggesting optimizations i hadn't even considered. the system feels less like a tool and more like a collaborator.&lt;/p&gt;

&lt;h2&gt;
  
  
  my favorite features
&lt;/h2&gt;

&lt;p&gt;after diving deep into all the announcements, here are the features that stood out to me:&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;strong&gt;gemini omni&lt;/strong&gt; - the ability to create anything from any input, starting with video, is genuinely impressive. if you have access to gemini omni, play with the video creation features. they're surprisingly intuitive.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;spark&lt;/strong&gt; - having a 24/7 personal ai agent that remembers context across sessions changes how i work. it's like having a research assistant who never forgets anything.&lt;/p&gt;

&lt;p&gt;3️⃣ &lt;strong&gt;gemini search with canvas artifacts&lt;/strong&gt; - the ability to interact with something in canvas while searching makes research so much more efficient. you can see the results, manipulate them, and iterate in real-time.&lt;/p&gt;

&lt;p&gt;4️⃣ &lt;strong&gt;amszig&lt;/strong&gt; - i haven't tried amszig yet, but from what i've seen, it looks like it could be a powerful addition to the ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  other notable announcements
&lt;/h2&gt;

&lt;p&gt;the conference had a lot more to offer beyond the headline models:&lt;/p&gt;

&lt;p&gt;1) &lt;strong&gt;google flow&lt;/strong&gt; - a creative platform with tools and agent capabilities that could reshape how we approach content creation&lt;br&gt;
2) &lt;strong&gt;neural expressive&lt;/strong&gt; - the redesigned gemini app with fluid animations feels like a glimpse into the future of ai interfaces&lt;br&gt;
3) &lt;strong&gt;daily brief&lt;/strong&gt; - personalized morning digest sounds useful for staying updated&lt;br&gt;
4) &lt;strong&gt;google pics&lt;/strong&gt; - image creation and editing directly in workspace could streamline workflows&lt;br&gt;
5) &lt;strong&gt;universal cart&lt;/strong&gt; - a shopping hub across google services is a bold move&lt;br&gt;
6) &lt;strong&gt;android halo&lt;/strong&gt; - agent visibility on android devices brings ai to the mobile experience&lt;br&gt;
7) &lt;strong&gt;intelligent eyewear&lt;/strong&gt; - partnerships with samsung, gentle monster, and warby parker show google's commitment to wearables&lt;br&gt;
8) &lt;strong&gt;ask youtube&lt;/strong&gt; - conversational search on youtube changes how we discover content&lt;br&gt;
9) &lt;strong&gt;stitch&lt;/strong&gt; - collaborative design agent could transform team workflows&lt;br&gt;
10) &lt;strong&gt;ai search box&lt;/strong&gt; - reimagined with gemini 3.5 flash as default&lt;br&gt;
11) &lt;strong&gt;gemini for science&lt;/strong&gt; - experimental tools for scientific exploration open new possibilities&lt;/p&gt;

&lt;h2&gt;
  
  
  the numbers
&lt;/h2&gt;

&lt;p&gt;the scale of adoption is staggering:&lt;/p&gt;

&lt;p&gt;☑️ 900 million+ gemini app users (doubled in one year)&lt;br&gt;
☑️ 13 products with over 1 billion users each&lt;br&gt;
☑️ ai ultra plan: a new $100/month tier, with the top tier reduced to $200/month&lt;/p&gt;

&lt;h2&gt;
  
  
  final thoughts
&lt;/h2&gt;

&lt;p&gt;google i/o 2026 wasn't just about incremental improvements. it was a statement that ai is becoming the foundation of everything we do online. &lt;/p&gt;

&lt;p&gt;the integration across products, the speed improvements, and the focus on practical applications show that google is serious about making ai useful, not just impressive.&lt;/p&gt;

&lt;p&gt;for developers, the opportunities are enormous. whether you're building with gemini 3.5 flash, automating workflows with antigravity 2.0, or exploring the creative possibilities of google flow, there's never been a better time to be building with ai.&lt;/p&gt;

&lt;p&gt;what about you? &lt;/p&gt;

&lt;p&gt;what aspects of google i/o 2026 are you most excited about? drop a comment below and let's discuss.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;thanks for reading, and happy building&lt;/em&gt;&lt;/p&gt;

</description>
      <category>googleiochallenge</category>
      <category>devchallenge</category>
      <category>ai</category>
      <category>gemini</category>
    </item>
    <item>
      <title>I found a bug that could crash your Hermes agent and fixed it</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Sun, 17 May 2026 16:43:57 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/building-with-hermes-agent-hardening-the-permission-approval-bridge-4gen</link>
      <guid>https://dev.to/aniruddhaadak/building-with-hermes-agent-hardening-the-permission-approval-bridge-4gen</guid>
      <description>&lt;p&gt;i contributed a bug fix to the hermes agent project that makes the permission approval system more robust and reliable. the change handles an edge case where the permission request could return &lt;code&gt;None&lt;/code&gt; instead of a valid response, which could cause the agent to crash or behave unexpectedly.&lt;/p&gt;

&lt;p&gt;the fix adds proper error handling so that when the permission system gets an unexpected &lt;code&gt;None&lt;/code&gt; response, it defaults to denying the request safely. this is a small but important improvement that makes hermes agent more stable in production environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  why this matters for the community
&lt;/h2&gt;

&lt;p&gt;when you run hermes agent and it needs to ask for permission before running a command, it uses a bridge system to communicate with the approval callback. if something goes wrong in that communication and the system gets a &lt;code&gt;None&lt;/code&gt; response, the old code would fail silently or throw an error.&lt;/p&gt;

&lt;p&gt;by adding a simple check that returns &lt;code&gt;"deny"&lt;/code&gt; when the response is &lt;code&gt;None&lt;/code&gt;, we ensure that the agent always has a safe fallback. this prevents crashes and keeps the permission system predictable. users can trust that their commands will either be approved or denied, never left hanging.&lt;/p&gt;

&lt;p&gt;this kind of defensive coding is important for an agent that runs on your own infrastructure and handles sensitive operations. every edge case handled well means fewer surprises in production.&lt;/p&gt;
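
&lt;p&gt;to make the failure mode concrete, here's a minimal sketch of what can go wrong when code reads a field off a response that turned out to be &lt;code&gt;None&lt;/code&gt;. the function names and the &lt;code&gt;outcome&lt;/code&gt; attribute are hypothetical, made up for illustration, and are not taken from the hermes codebase.&lt;/p&gt;

```python
from types import SimpleNamespace


def handle_response_unsafe(response):
    # hypothetical: reads an attribute without checking for None first
    return response.outcome


def handle_response_safe(response):
    # guard clause: a missing response is treated as a denial
    if response is None:
        return "deny"
    return response.outcome


approved = SimpleNamespace(outcome="allow")
print(handle_response_safe(approved))
print(handle_response_safe(None))

try:
    handle_response_unsafe(None)
except AttributeError as exc:
    # None has no attributes, so the unguarded version blows up here
    print("unsafe version failed:", type(exc).__name__)
```

&lt;p&gt;the guard is a single early return, which is why it's cheap to add and easy to cover with a test.&lt;/p&gt;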

&lt;h2&gt;
  
  
  code
&lt;/h2&gt;

&lt;p&gt;here are the files i changed:&lt;/p&gt;

&lt;p&gt;the main fix in &lt;code&gt;acp_adapter/permissions.py&lt;/code&gt; adds a guard clause that checks for &lt;code&gt;None&lt;/code&gt; and returns &lt;code&gt;"deny"&lt;/code&gt; safely.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;the test file &lt;code&gt;tests/acp/test_permissions.py&lt;/code&gt; adds a regression test that ensures this behavior is covered going forward.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felcocb2qn73unpyc0g5b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felcocb2qn73unpyc0g5b.png" alt="Image description" width="699" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;you can find the full merged pull request here:&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/NousResearch/hermes-agent/pull/13457" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        fix(permissions): handle None response from ACP request_permission
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#13457&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/aniruddhaadak80" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F127435065%3Fv%3D4" alt="aniruddhaadak80 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/aniruddhaadak80" rel="noopener noreferrer"&gt;aniruddhaadak80&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/NousResearch/hermes-agent/pull/13457" rel="noopener noreferrer"&gt;&lt;time&gt;Apr 21, 2026&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What does this PR do?&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This PR hardens the ACP → Hermes permission-approval bridge by safely handling an unexpected None result from request_permission, preventing attribute errors and defaulting to a safe deny.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Related Issue&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Fixes #13449&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Type of Change&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;[x] Bug fix (non-breaking change that fixes an issue)&lt;/li&gt;
&lt;li&gt;[ ] New feature (non-breaking change that adds functionality)&lt;/li&gt;
&lt;li&gt;[ ] Security fix&lt;/li&gt;
&lt;li&gt;[ ] Documentation update&lt;/li&gt;
&lt;li&gt;[x] Tests (adding or improving test coverage)&lt;/li&gt;
&lt;li&gt;[ ] Refactor (no behavior change)&lt;/li&gt;
&lt;li&gt;[ ] New skill (bundled or hub)&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Changes Made&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Return "deny" when request_permission resolves to None in the approval callback.&lt;/li&gt;
&lt;li&gt;Add a unit test covering the None response case to ensure the callback denies safely.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;How to Test&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;Connect via an ACP client that sends an empty response to permission requests.&lt;/li&gt;
&lt;li&gt;Verify the permission is denied rather than throwing an exception.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Checklist&lt;/h2&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Code&lt;/h3&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;[x] I've read the Contributing Guide&lt;/li&gt;
&lt;li&gt;[x] My commit messages follow Conventional Commits&lt;/li&gt;
&lt;li&gt;[x] I searched for existing PRs to make sure this isn't a duplicate&lt;/li&gt;
&lt;li&gt;[x] My PR contains only changes related to this fix/feature (no unrelated commits)&lt;/li&gt;
&lt;li&gt;[x] I've run pytest tests/ -q and all tests pass&lt;/li&gt;
&lt;li&gt;[x] I've added tests for my changes (required for bug fixes, strongly encouraged for features)&lt;/li&gt;
&lt;li&gt;[x] I've tested on my platform&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Documentation &amp;amp; Housekeeping&lt;/h3&gt;
&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;[x] I've updated relevant documentation (README, docs/, docstrings), or N/A&lt;/li&gt;
&lt;li&gt;[x] I've updated cli-config.yaml.example if I added/changed config keys, or N/A&lt;/li&gt;
&lt;li&gt;[x] I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A&lt;/li&gt;
&lt;li&gt;[x] I've considered cross-platform impact (Windows, macOS) per the compatibility guide, or N/A&lt;/li&gt;
&lt;li&gt;[x] I've updated tool descriptions/schemas if I changed tool behavior, or N/A&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/NousResearch/hermes-agent/pull/13457" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;and the repository is here:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/NousResearch" rel="noopener noreferrer"&gt;
        NousResearch
      &lt;/a&gt; / &lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;
        hermes-agent
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      The agent that grows with you
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;
  &lt;a rel="noopener noreferrer" href="https://github.com/NousResearch/hermes-agent/assets/banner.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FNousResearch%2Fhermes-agent%2FHEAD%2Fassets%2Fbanner.png" alt="Hermes Agent" width="100%"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Hermes Agent ☤&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;
  &lt;a href="https://hermes-agent.nousresearch.com/docs/" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/76d7a880842f286c4d4e07baf2db1046197c6cfaa564365e912938445fc54a32/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f446f63732d6865726d65732d2d6167656e742e6e6f757372657365617263682e636f6d2d4646443730303f7374796c653d666f722d7468652d6261646765" alt="Documentation"&gt;&lt;/a&gt;
  &lt;a href="https://discord.gg/NousResearch" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/8c0fca73564f21d7a6f235747eb4d739a2e4aaa348b8e074904127baeb944b9e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f446973636f72642d3538363546323f7374796c653d666f722d7468652d6261646765266c6f676f3d646973636f7264266c6f676f436f6c6f723d7768697465" alt="Discord"&gt;&lt;/a&gt;
  &lt;a href="https://github.com/NousResearch/hermes-agent/blob/main/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/153acf9dff19deb8abfc598c53bac50a4ceae0f5c83a552711060d3d78d2c057/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d677265656e3f7374796c653d666f722d7468652d6261646765" alt="License: MIT"&gt;&lt;/a&gt;
  &lt;a href="https://nousresearch.com" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/6195af06150f2173f79d16fa3462ccac43c7dbf78f06f3c7997dc4090d79b9ad/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4275696c7425323062792d4e6f757325323052657365617263682d626c756576696f6c65743f7374796c653d666f722d7468652d6261646765" alt="Built by Nous Research"&gt;&lt;/a&gt;
  &lt;a href="https://github.com/NousResearch/hermes-agent/README.zh-CN.md" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/574634232e93a1f4f7399d7056282748bde9c89ff98c338cc9bec8101117832b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c616e672de4b8ade696872d7265643f7374796c653d666f722d7468652d6261646765" alt="中文"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The self-improving AI agent built by &lt;a href="https://nousresearch.com" rel="nofollow noopener noreferrer"&gt;Nous Research&lt;/a&gt;.&lt;/strong&gt; It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.&lt;/p&gt;

&lt;p&gt;Use any model you want — &lt;a href="https://portal.nousresearch.com" rel="nofollow noopener noreferrer"&gt;Nous Portal&lt;/a&gt;, &lt;a href="https://openrouter.ai" rel="nofollow noopener noreferrer"&gt;OpenRouter&lt;/a&gt; (200+ models), &lt;a href="https://novita.ai" rel="nofollow noopener noreferrer"&gt;NovitaAI&lt;/a&gt; (AI-native cloud for Model API, Agent Sandbox, and GPU Cloud), &lt;a href="https://build.nvidia.com" rel="nofollow noopener noreferrer"&gt;NVIDIA NIM&lt;/a&gt; (Nemotron), &lt;a href="https://platform.xiaomimimo.com" rel="nofollow noopener noreferrer"&gt;Xiaomi MiMo&lt;/a&gt;, &lt;a href="https://z.ai" rel="nofollow noopener noreferrer"&gt;z.ai/GLM&lt;/a&gt;, &lt;a href="https://platform.moonshot.ai" rel="nofollow noopener noreferrer"&gt;Kimi/Moonshot&lt;/a&gt;, &lt;a href="https://www.minimax.io" rel="nofollow noopener noreferrer"&gt;MiniMax&lt;/a&gt;, &lt;a href="https://huggingface.co" rel="nofollow noopener noreferrer"&gt;Hugging Face&lt;/a&gt;, OpenAI, or your own endpoint. Switch with &lt;code&gt;hermes model&lt;/code&gt; — no code changes, no lock-in.&lt;/p&gt;

&lt;p&gt;&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;A real terminal interface&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Full TUI with multiline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  how it works
&lt;/h2&gt;

&lt;p&gt;the approval callback function handles permission requests from the ACP system. when a command needs approval, it runs a coroutine to get the response from the permission handler.&lt;/p&gt;

&lt;p&gt;in the try block, once the coroutine completes successfully, the code checks whether the response is &lt;code&gt;None&lt;/code&gt;. if it is, the function immediately returns &lt;code&gt;"deny"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;this approach is straightforward but effective. it means that even in unusual situations where the permission system does not return a proper response, the callback always produces a valid result. the agent continues running instead of crashing.&lt;/p&gt;

&lt;p&gt;the test confirms this behavior by mocking the coroutine to return &lt;code&gt;None&lt;/code&gt; and verifying that the callback returns &lt;code&gt;"deny"&lt;/code&gt;.&lt;/p&gt;
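
&lt;p&gt;the flow described above can be sketched roughly like this. it's a simplified stand-in, not the actual code from &lt;code&gt;acp_adapter/permissions.py&lt;/code&gt;: the &lt;code&gt;approval_callback&lt;/code&gt; name, its arguments, the use of &lt;code&gt;asyncio.run&lt;/code&gt;, and the deny-on-exception branch are all assumptions i made to keep the example self-contained.&lt;/p&gt;

```python
import asyncio
from unittest.mock import AsyncMock


def approval_callback(request_permission, command):
    """hypothetical callback: asks for permission and always returns a decision."""
    try:
        # run the coroutine that asks the permission handler for a response
        response = asyncio.run(request_permission(command))
    except Exception:
        # assumption: any failure while asking is also treated as a denial
        return "deny"
    if response is None:
        # the guard from the fix: no response means deny
        return "deny"
    return response


# regression-style check: mock the coroutine so it resolves to None
fake_request = AsyncMock(return_value=None)
assert approval_callback(fake_request, "rm -rf build") == "deny"

# a normal response passes through unchanged
fake_request = AsyncMock(return_value="allow")
assert approval_callback(fake_request, "ls") == "allow"
```

&lt;p&gt;mocking the coroutine with &lt;code&gt;AsyncMock&lt;/code&gt; keeps the check fast, since no real ACP client is needed to exercise the edge case.&lt;/p&gt;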

&lt;h2&gt;
  
  
  my tech stack
&lt;/h2&gt;

&lt;p&gt;the changes were made using python with the standard library asyncio module for handling asynchronous permission requests. the test uses pytest with mock objects to verify the behavior without needing a full system setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  lessons learned
&lt;/h2&gt;

&lt;p&gt;working on this fix taught me a few things about building reliable agent systems.&lt;/p&gt;

&lt;p&gt;first, edge cases matter. in a system that handles permissions, every unexpected input should have a clear handling path. crashing is never the right answer.&lt;/p&gt;

&lt;p&gt;second, testing edge cases is just as important as testing happy paths. the regression test ensures that the &lt;code&gt;None&lt;/code&gt; response case keeps working correctly as the codebase evolves.&lt;/p&gt;

&lt;p&gt;third, the hermes agent codebase is well structured and welcoming to contributions. the team uses copilot for code review, which helps catch potential improvements early.&lt;/p&gt;

&lt;p&gt;i am glad to have contributed to an open source project that gives developers control over their own AI agents. the hermes agent challenge is a great way to get involved and learn about agentic systems.&lt;/p&gt;




&lt;p&gt;thanks for reading. &lt;/p&gt;

&lt;p&gt;if you want to explore hermes agent or contribute to it, check out the github repository linked above.&lt;/p&gt;

&lt;p&gt;and thanks again ...&lt;/p&gt;

</description>
      <category>hermesagentchallenge</category>
      <category>devchallenge</category>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>hermes agent: the complete guide to your personal ai operator</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Sat, 16 May 2026 05:51:26 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/hermes-agent-the-complete-guide-to-your-personal-ai-operator-28k4</link>
      <guid>https://dev.to/aniruddhaadak/hermes-agent-the-complete-guide-to-your-personal-ai-operator-28k4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;i remember the first time i heard about hermes agent - it sounded like something out of a science fiction movie. an ai that doesn't just chat with you, but actually does things for you. sounds crazy, right? well, let me tell you, it's real, and it's changed how i work completely.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;hey there. i'm so excited to share what i've learned about hermes agent after spending weeks exploring it. if you're tired of doing repetitive computer tasks and want an ai buddy that actually helps you get stuff done, you're in the right place.&lt;/p&gt;

&lt;h2&gt;
  
  
  what exactly is hermes agent
&lt;/h2&gt;

&lt;p&gt;let me put it simply - hermes agent is like having a super-smart assistant who lives in your computer and can actually &lt;em&gt;do things&lt;/em&gt; for you. not just answer questions, but open programs, write code, organize files, send emails, and much more.&lt;/p&gt;

&lt;p&gt;here's what makes it special:&lt;/p&gt;

&lt;p&gt;a) &lt;strong&gt;open source&lt;/strong&gt; - it's free to use and you can see how it works&lt;br&gt;
b) &lt;strong&gt;built by nous research&lt;/strong&gt; - the team behind some amazing ai tools&lt;br&gt;
c) &lt;strong&gt;140,000+ developers&lt;/strong&gt; on github love it&lt;br&gt;
d) &lt;strong&gt;over 40 tools&lt;/strong&gt; built right in&lt;br&gt;
e) &lt;strong&gt;learns as it goes&lt;/strong&gt; - gets better the more you use it&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"hermes agent is an operator, not a builder. it does the work, not just plans it." - nemanja&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  the four-level path: how hermes grows with you
&lt;/h2&gt;

&lt;p&gt;i love how hermes scales with your needs. here's what i found:&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;strong&gt;one agent&lt;/strong&gt; - start simple. one ai helper that can do basic tasks like searching the web or organizing files.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;multiple specialists&lt;/strong&gt; - as you need more, hermes can use different ai models for different jobs. one for coding, one for writing, one for research.&lt;/p&gt;

&lt;p&gt;3️⃣ &lt;strong&gt;hermes orchestrator&lt;/strong&gt; - the smart part. hermes decides which specialist to use for each task and makes sure everything works together smoothly.&lt;/p&gt;

&lt;p&gt;4️⃣ &lt;strong&gt;automated agent team&lt;/strong&gt; - the full experience. hermes coordinates multiple ai agents working together like a tiny team, all while you focus on the big picture.&lt;/p&gt;
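
&lt;p&gt;to picture how levels two through four fit together, here's a toy sketch of an orchestrator that routes each task to a specialist. everything in it is hypothetical: the specialists are plain functions standing in for different models, and the routing rule is a simple dictionary lookup i picked for illustration, not how hermes actually decides.&lt;/p&gt;

```python
# hypothetical specialists: plain functions standing in for different models
def coding_specialist(task):
    return "code: " + task


def writing_specialist(task):
    return "draft: " + task


def research_specialist(task):
    return "notes: " + task


SPECIALISTS = {
    "code": coding_specialist,
    "write": writing_specialist,
    "research": research_specialist,
}


def orchestrate(task_kind, task):
    # pick a specialist by task kind, falling back to research
    specialist = SPECIALISTS.get(task_kind, research_specialist)
    return specialist(task)


def run_team(tasks):
    # level four: run a batch of (kind, task) pairs through the orchestrator
    return [orchestrate(kind, task) for kind, task in tasks]


results = run_team([("code", "fix the login bug"), ("write", "release notes")])
for line in results:
    print(line)
```

&lt;p&gt;the fallback to a default specialist is a design choice in this sketch, so that an unknown task kind still gets handled instead of raising an error.&lt;/p&gt;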

&lt;h2&gt;
  
  
  what makes hermes agent special
&lt;/h2&gt;

&lt;p&gt;after testing it myself, here's what impressed me the most:&lt;/p&gt;

&lt;h3&gt;
  
  
  three-tier memory
&lt;/h3&gt;

&lt;p&gt;hermes remembers things in three ways:&lt;br&gt;
x) short-term memory for the current task&lt;br&gt;
y) working memory for important details&lt;br&gt;
z) long-term memory for things you tell it to remember&lt;/p&gt;

&lt;p&gt;this means it doesn't forget what you told it five minutes ago.&lt;/p&gt;
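
&lt;p&gt;here's a toy model of those three tiers, just to make the idea concrete. the &lt;code&gt;ThreeTierMemory&lt;/code&gt; class and its methods are names i made up, and the rule that short-term and working memory reset between tasks is my own assumption, not a description of hermes's real storage.&lt;/p&gt;

```python
from collections import deque


class ThreeTierMemory:
    """toy model of the three tiers described above; not hermes's real design."""

    def __init__(self, short_term_size=5):
        # short-term: only the most recent messages for the current task
        self.short_term = deque(maxlen=short_term_size)
        # working: important details for the task, keyed by name
        self.working = {}
        # long-term: things the user explicitly asked to remember
        self.long_term = []

    def observe(self, message):
        self.short_term.append(message)

    def note(self, key, value):
        self.working[key] = value

    def remember(self, fact):
        self.long_term.append(fact)

    def end_task(self):
        # assumption: short-term and working memory reset between tasks
        self.short_term.clear()
        self.working.clear()


memory = ThreeTierMemory(short_term_size=2)
memory.observe("open the report")
memory.observe("add a chart")
memory.observe("export as pdf")
memory.note("report_path", "reports/weekly.md")
memory.remember("prefers pdf exports")
memory.end_task()

print(list(memory.short_term))
print(memory.long_term)
```

&lt;p&gt;the bounded &lt;code&gt;deque&lt;/code&gt; drops the oldest message once the short-term tier is full, while the long-term list survives the end of the task.&lt;/p&gt;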

&lt;h3&gt;
  
  
  gepa optimization
&lt;/h3&gt;

&lt;p&gt;this is the fancy term for hermes getting smarter. it learns from your feedback and gets better at choosing the right tools and models for each job.&lt;/p&gt;

&lt;h3&gt;
  
  
  self-evolving skills
&lt;/h3&gt;

&lt;p&gt;i was blown away by this. hermes can actually improve itself. one user reported it became &lt;em&gt;3 times faster&lt;/em&gt; and &lt;em&gt;80% cheaper&lt;/em&gt; after just 2 iterations. that's pretty amazing.&lt;/p&gt;

&lt;h3&gt;
  
  
  codex runtime integration
&lt;/h3&gt;

&lt;p&gt;this lets hermes run code safely and efficiently. you can ask it to write and execute code without worrying about breaking anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  nine workflows that changed my life
&lt;/h2&gt;

&lt;p&gt;ole lehmann shared these workflows, and i've tried most of them:&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;strong&gt;customer support cron&lt;/strong&gt; - hermes can monitor support tickets and respond to common questions automatically.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;weekly business report&lt;/strong&gt; - set it up once, and every week it gathers data and writes your report. i use this every monday morning.&lt;/p&gt;

&lt;p&gt;3️⃣ &lt;strong&gt;daily brief&lt;/strong&gt; - every morning, hermes gives you a summary of emails, messages, and tasks. perfect way to start the day.&lt;/p&gt;

&lt;p&gt;4️⃣ &lt;strong&gt;travel planning&lt;/strong&gt; - from flights to hotels to restaurants, hermes researches and compares everything. i planned my last vacation with minimal effort.&lt;/p&gt;

&lt;p&gt;5️⃣ &lt;strong&gt;seo research&lt;/strong&gt; - if you write content, hermes can research keywords, analyze competitors, and suggest improvements.&lt;/p&gt;

&lt;p&gt;6️⃣ &lt;strong&gt;content creation&lt;/strong&gt; - write blog posts, social media updates, or emails. hermes can draft, edit, and format content for you.&lt;/p&gt;

&lt;p&gt;7️⃣ &lt;strong&gt;client tasks&lt;/strong&gt; - manage multiple client projects. hermes tracks deadlines, sends reminders, and keeps everything organized.&lt;/p&gt;

&lt;p&gt;8️⃣ &lt;strong&gt;local tool automation&lt;/strong&gt; - connect hermes to your local apps and automate repetitive tasks on your computer.&lt;/p&gt;

&lt;p&gt;9️⃣ &lt;strong&gt;obsidian llm wiki second brain&lt;/strong&gt; - hermes helps organize your notes and connects related ideas. it's like having a personal librarian for your thoughts.&lt;/p&gt;

&lt;h2&gt;
  
  
  getting started with hermes agent
&lt;/h2&gt;

&lt;p&gt;i know setup can be scary, but hermes makes it easy. here's what i did:&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;strong&gt;visit the github repository&lt;/strong&gt; - go to the hermes agent repo and click the green "code" button to download.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;follow the install guide&lt;/strong&gt; - head to hermesatlas.com for step-by-step instructions. no command line experience needed.&lt;/p&gt;

&lt;p&gt;3️⃣ &lt;strong&gt;start with simple workflows&lt;/strong&gt; - don't try to automate everything at once. begin with one task, like organizing your downloads folder.&lt;/p&gt;

&lt;p&gt;4️⃣ &lt;strong&gt;add api keys&lt;/strong&gt; - hermes needs access to ai models. you'll need your own api keys for best results.&lt;/p&gt;

&lt;p&gt;5️⃣ &lt;strong&gt;test and tweak&lt;/strong&gt; - watch what hermes does, give feedback, and it will get better over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  pricing and subscription
&lt;/h2&gt;

&lt;p&gt;here's the honest truth about costs:&lt;/p&gt;

&lt;p&gt;p) &lt;strong&gt;free tier&lt;/strong&gt; - you can use hermes agent itself for free&lt;br&gt;
q) &lt;strong&gt;api costs&lt;/strong&gt; - you pay for the ai models it uses&lt;br&gt;
r) &lt;strong&gt;no hidden fees&lt;/strong&gt; - what you see is what you pay&lt;/p&gt;

&lt;p&gt;i found that using deepseek v4 flash as the supervisor with local workers costs about &lt;em&gt;1/20th&lt;/em&gt; of other options. pretty sweet deal.&lt;/p&gt;

&lt;h2&gt;
  
  
  keeping things safe and secure
&lt;/h2&gt;

&lt;p&gt;i was worried about security at first, but hermes has you covered:&lt;/p&gt;

&lt;p&gt;i) &lt;strong&gt;use separate accounts&lt;/strong&gt; - don't use your main accounts for hermes&lt;br&gt;
ii) &lt;strong&gt;own api keys&lt;/strong&gt; - use your own keys so you control access&lt;br&gt;
iii) &lt;strong&gt;least privilege&lt;/strong&gt; - only give hermes the permissions it needs&lt;br&gt;
iv) &lt;strong&gt;review actions&lt;/strong&gt; - hermes shows you what it's about to do before doing it&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"security is not an afterthought with hermes. it's built into every step." - user testimonial&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  how hermes compares to others
&lt;/h2&gt;

&lt;p&gt;i tested a few similar tools, and here's what i found:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;tool&lt;/th&gt;
&lt;th&gt;tokens used&lt;/th&gt;
&lt;th&gt;best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;hermes agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;353 billion&lt;/td&gt;
&lt;td&gt;general automation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;openclaw&lt;/td&gt;
&lt;td&gt;195 billion&lt;/td&gt;
&lt;td&gt;coding tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kilo code&lt;/td&gt;
&lt;td&gt;166 billion&lt;/td&gt;
&lt;td&gt;developer workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;hermes agent ranks number one in usage, which tells me people are actually using it for real work.&lt;/p&gt;

&lt;h2&gt;
  
  
  what real users say
&lt;/h2&gt;

&lt;p&gt;i read through tons of user experiences, and these stood out to me:&lt;/p&gt;

&lt;p&gt;"i was skeptical at first, but hermes agent saved me over 10 hours every week. the initial setup took some time, but it paid off." - kyle&lt;/p&gt;

&lt;p&gt;"the memory system is game-changing. hermes remembers context from days ago, which makes complex tasks possible." - john king&lt;/p&gt;

&lt;p&gt;"i use it for my consulting business. it handles client communications, research, and reporting. i focus on strategy now." - tessa kriesel&lt;/p&gt;

&lt;h2&gt;
  
  
  recent updates and bug fixes
&lt;/h2&gt;

&lt;p&gt;the hermes team is constantly improving. here's what's new:&lt;/p&gt;

&lt;p&gt;A) tui fix for vietnamese and cjk input&lt;br&gt;
B) slack whitespace guard&lt;br&gt;
C) cli exception logging&lt;br&gt;
D) url safety improvements&lt;br&gt;
E) mcp regex optimization&lt;br&gt;
F) browser tool error handling&lt;br&gt;
G) codex runtime fix&lt;/p&gt;

&lt;h2&gt;
  
  
  my final thoughts
&lt;/h2&gt;

&lt;p&gt;look, i get it. the idea of an ai agent that can control your computer sounds overwhelming. i felt the same way at first. but after using hermes agent for a few weeks, i can honestly say it's been one of the best productivity tools i've ever used.&lt;/p&gt;

&lt;p&gt;it's not perfect - sometimes it makes mistakes, and you need to supervise it. but that's the point. hermes is like a really smart intern who needs some guidance but can do amazing work once you show it the ropes.&lt;/p&gt;

&lt;p&gt;the best part? hermes gets better the more you use it. it learns your preferences, your workflows, and your style. after a month, it feels less like a tool and more like a teammate.&lt;/p&gt;

&lt;p&gt;if you've been curious about ai agents but didn't know where to start, hermes agent is the perfect place. it's open source, well-documented, and backed by a passionate community.&lt;/p&gt;

&lt;p&gt;give it a try. start small. watch it work. and before you know it, you'll wonder how you ever managed without it.&lt;/p&gt;

&lt;h2&gt;
  
  
  ready to dive in?
&lt;/h2&gt;

&lt;p&gt;here's what i recommend:&lt;/p&gt;

&lt;p&gt;a) visit the &lt;a href="https://github.com/nousresearch/hermes-agent" rel="noopener noreferrer"&gt;github repository&lt;/a&gt;&lt;br&gt;
b) follow the install guide at &lt;code&gt;hermesatlas.com&lt;/code&gt;&lt;br&gt;
c) start with one simple workflow&lt;br&gt;
d) join the community and ask questions&lt;br&gt;
e) share your experience with others&lt;/p&gt;

&lt;p&gt;remember, hermes agent is not here to replace you. it's here to &lt;em&gt;free you up&lt;/em&gt; to do the work that really matters. the creative stuff, the strategic thinking, the human connection. let hermes handle the rest.&lt;/p&gt;

&lt;p&gt;happy automating, friends. 🚀&lt;/p&gt;

</description>
      <category>agents</category>
      <category>hermesagentchallenge</category>
      <category>automation</category>
      <category>devchallenge</category>
    </item>
    <item>
      <title>Gemma 4 Complete Guide 2026, Architecture, Benchmarks, Deployment and more ...</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Thu, 07 May 2026 05:40:07 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/gemma-4-complete-guide-2026-architecture-benchmarks-deployment-3en9</link>
      <guid>https://dev.to/aniruddhaadak/gemma-4-complete-guide-2026-architecture-benchmarks-deployment-3en9</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Write About Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;Gemma 4 Complete Guide 2026&lt;/code&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Gemma 4&lt;/strong&gt; is shaping up to be the most consequential open weight model release of the year, and the headline is not just the leaderboard scores. Google shipped &lt;em&gt;four&lt;/em&gt; model sizes, native multimodality, a &lt;strong&gt;256K context window&lt;/strong&gt; on the larger variants, and for the first time in the Gemma line, a clean &lt;strong&gt;Apache 2.0&lt;/strong&gt; license.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For engineering teams that have been waiting on an open weight model good enough to actually replace a frontier API for a meaningful slice of their workload, this is the first credible candidate from Google.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6yrshq92rvaamio1mqon.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6yrshq92rvaamio1mqon.png" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This guide is the long version. It walks through&lt;/p&gt;

&lt;p&gt;i) what the family looks like,&lt;br&gt;
ii) what the architecture actually does,&lt;br&gt;
iii) what the benchmark numbers mean in practice,&lt;br&gt;
iv) how it stacks up against Llama 4, Qwen 3.5, DeepSeek V4 Flash and its own predecessors,&lt;br&gt;
v) where to host it, and where it falls short.&lt;/p&gt;

&lt;p&gt;If you are evaluating Gemma 4 for production, this is the document you can hand to your team.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;TL;DR, the Quick Read&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Before we dive into the long version, here is the snapshot you can keep in your back pocket.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Released:&lt;/em&gt; April 2, 2026, by Google DeepMind.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The family ships in four sizes, namely &lt;strong&gt;Gemma 4 E2B&lt;/strong&gt; (around 2.3B effective), &lt;strong&gt;E4B&lt;/strong&gt; (around 4.5B effective), &lt;strong&gt;26B A4B&lt;/strong&gt; (a Mixture of Experts with 4B active), and a dense &lt;strong&gt;31B&lt;/strong&gt;. Licensing has finally moved to &lt;strong&gt;Apache 2.0&lt;/strong&gt;, which is a meaningful change because earlier Gemma generations shipped under the custom &lt;em&gt;Gemma Terms of Use&lt;/em&gt; that often made enterprise legal review painful.&lt;/p&gt;

&lt;p&gt;Context windows reach &lt;strong&gt;128K tokens&lt;/strong&gt; on the E2B and E4B variants, and &lt;strong&gt;256K&lt;/strong&gt; on the 26B A4B and the 31B. Every variant takes text plus image input, while E2B and E4B additionally take audio. Output, on every variant, is text only.&lt;/p&gt;

&lt;p&gt;On the strong side, Gemma 4 nails reasoning, math (&lt;code&gt;AIME 2026 ~89%&lt;/code&gt;), code generation (&lt;code&gt;LiveCodeBench v6 ~80%&lt;/code&gt;), long context recall, and on device deployment via &lt;strong&gt;MediaPipe&lt;/strong&gt; and &lt;strong&gt;LiteRT&lt;/strong&gt;. On the weak side, it trails Qwen 3.5 27B on &lt;code&gt;SWE-bench Verified&lt;/code&gt; and has no native speech output. One reminder worth repeating: &lt;em&gt;Gemma is not Gemini&lt;/em&gt;, so fine tuning, weights and serving become your problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;Watch the official demo here&lt;/code&gt;:   &lt;iframe src="https://www.youtube.com/embed/jZVBoFOJK-Q"&gt;
  &lt;/iframe&gt;

&lt;/h2&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;What Gemma 4 Is, And How It Differs From Gemini&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Gemma is Google's &lt;em&gt;open weight&lt;/em&gt; model family. Gemini is Google's &lt;em&gt;closed, hosted, frontier&lt;/em&gt; model family. They share research lineage (Google describes Gemma 4 as &lt;strong&gt;built from Gemini 3 research&lt;/strong&gt;), but the deployment story is genuinely different.&lt;/p&gt;

&lt;p&gt;With Gemini, you call an API, you pay per token, you do not get the weights, and you cannot fine tune the underlying parameters; you get adapters at best. With Gemma 4, you download the weights from Hugging Face, Kaggle or Ollama, you run them on your own hardware (or a cloud GPU you rent), you fine tune fully, and your unit economics become &lt;em&gt;GPU hours and electricity&lt;/em&gt; rather than per token API spend.&lt;/p&gt;

&lt;p&gt;The practical implication is straightforward.&lt;br&gt;
i) Reach for &lt;strong&gt;Gemma 4&lt;/strong&gt; when you need on device inference, when you need to fine tune on private data, when your token volume makes a hosted API uneconomical, or when you need an air gapped deployment.&lt;br&gt;
ii) Reach for &lt;strong&gt;Gemini&lt;/strong&gt; when you want zero ops frontier intelligence and you are happy to pay for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;The Gemma 4 Family&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feuaelf4ej0xbtcp79qi2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feuaelf4ej0xbtcp79qi2.png" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Four sizes, two architectural patterns (&lt;code&gt;dense&lt;/code&gt; and &lt;code&gt;MoE&lt;/code&gt;), and a clear split between the &lt;em&gt;edge&lt;/em&gt; and &lt;em&gt;server&lt;/em&gt; tiers.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Total / Active params&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;Modalities in&lt;/th&gt;
&lt;th&gt;Primary target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E2B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;~2.3B effective&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Text, image, audio&lt;/td&gt;
&lt;td&gt;Phones, IoT, low power laptops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E4B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;~4.5B effective&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;Text, image, audio&lt;/td&gt;
&lt;td&gt;High end phones, edge servers, Raspberry Pi class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 26B A4B&lt;/td&gt;
&lt;td&gt;Mixture of Experts&lt;/td&gt;
&lt;td&gt;26B total / ~4B active per token&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;Text, image&lt;/td&gt;
&lt;td&gt;Single high end GPU server, cost sensitive throughput&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 31B&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;30.7B&lt;/td&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;Text, image&lt;/td&gt;
&lt;td&gt;Quality first server inference, fine tuning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A small naming clarification first. The &lt;code&gt;E&lt;/code&gt; in &lt;em&gt;E2B&lt;/em&gt; and &lt;em&gt;E4B&lt;/em&gt; stands for &lt;strong&gt;edge&lt;/strong&gt;, not &lt;em&gt;experts&lt;/em&gt;. These are dense models built for on device.&lt;/p&gt;

&lt;p&gt;The 26B A4B is the actual MoE in the family. Roughly &lt;em&gt;4 billion&lt;/em&gt; parameters fire on any given forward pass, so latency and cost behave like a 4B model, while quality benefits from the full 26B parameter pool. The 31B is the no tricks dense model, slower than the MoE, but typically the highest quality answer when you need &lt;em&gt;the best response per query&lt;/em&gt; rather than &lt;em&gt;the best response per dollar&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;Architecture, Context Window, And Tokenizer&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DraE74PfvNg%252BD8vC7H5%252FqRrD2LwbPHgocNX0MwWXlixUl%252FJTzOyQcrt8LuDPX10vc8WynQ8%252Fz2I1qAHCmOMOylIZUbhU%253D%26u2%3DRij0KPILhUUwpdpa%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DraE74PfvNg%252BD8vC7H5%252FqRrD2LwbPHgocNX0MwWXlixUl%252FJTzOyQcrt8LuDPX10vc8WynQ8%252Fz2I1qAHCmOMOylIZUbhU%253D%26u2%3DRij0KPILhUUwpdpa%26width%3D2560" alt="Mixture of Experts architecture diagram" width="2400" height="1254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Gemma 4 keeps the decoder only transformer skeleton that has defined the family,&lt;br&gt;
but it tightens almost every component.&lt;br&gt;
A few highlights worth knowing before you read the model card.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a) Hybrid attention.&lt;/strong&gt; Gemma 4 interleaves &lt;em&gt;local sliding window attention&lt;/em&gt; with &lt;em&gt;full global attention&lt;/em&gt;, with the final layer always global. Smaller dense models use 512 token sliding windows, and larger ones use 1024. This is the trick that makes the 256K context feasible: only the global layers have to cache the full sequence, so KV cache memory grows far more slowly than it would in an all global stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b) RULER long context recall.&lt;/strong&gt; On &lt;code&gt;RULER&lt;/code&gt; at 128K, Gemma 3 scored &lt;em&gt;13.5%&lt;/em&gt;. Gemma 4 scores &lt;em&gt;66.4%&lt;/em&gt; on the same test. The context window is not just nominal, it actually retrieves at depth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;c) Vocabulary.&lt;/strong&gt; A &lt;code&gt;262,144 token&lt;/code&gt; vocabulary, BPE with byte fallback. Strong multilingual coverage across more than 140 languages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;d) Vision tokens.&lt;/strong&gt; A variable visual budget per image, namely &lt;code&gt;70&lt;/code&gt;, &lt;code&gt;140&lt;/code&gt;, &lt;code&gt;280&lt;/code&gt;, &lt;code&gt;560&lt;/code&gt; or &lt;code&gt;1120&lt;/code&gt; tokens, so you trade quality against context spend.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;e) Audio (E2B and E4B only).&lt;/strong&gt; Native speech recognition and audio understanding, with no separate ASR layer required for many use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;f) Reasoning mode.&lt;/strong&gt; Gemma 4 can produce more than &lt;code&gt;4,000&lt;/code&gt; tokens of explicit reasoning before committing to an answer, plus native function calling and structured JSON output.&lt;/p&gt;
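
&lt;p&gt;To make the hybrid attention point in a) concrete, here is a minimal back of the envelope sketch in Python. Only the 1024 token window and the 256K context come from the text above. The layer count, the ratio of local to global layers, the KV head count and the head dimension are illustrative assumptions, not Gemma 4's published configuration, and the sketch ignores the detail that the final layer is always global.&lt;/p&gt;

```python
# Rough KV cache estimate for a hybrid attention stack.
# All architecture numbers below are illustrative assumptions,
# not Gemma 4's published configuration.

def kv_cache_gib(context_tokens, num_layers, global_every, window,
                 kv_heads=8, head_dim=128, bytes_per_value=2):
    """Return (hybrid_gib, all_global_gib) for a single sequence."""
    # Keys and values are both cached, hence the factor of 2.
    per_token_per_layer = 2 * kv_heads * head_dim * bytes_per_value
    global_layers = num_layers // global_every
    local_layers = num_layers - global_layers
    # A local layer only keeps the most recent `window` tokens.
    local_tokens = min(context_tokens, window)
    hybrid = per_token_per_layer * (
        global_layers * context_tokens + local_layers * local_tokens
    )
    all_global = per_token_per_layer * num_layers * context_tokens
    return hybrid / 2**30, all_global / 2**30


hybrid, full = kv_cache_gib(
    context_tokens=256 * 1024,  # 256K context from the text
    num_layers=60,              # assumed
    global_every=6,             # assumed: one global layer in every six
    window=1024,                # larger model window from the text
)
print(f"hybrid: {hybrid:.1f} GiB, all global: {full:.1f} GiB")
```

&lt;p&gt;Under these assumed numbers the hybrid stack needs a fraction of the cache of an all global stack, because the local layers stop growing once the context exceeds the window. Swap in the real values from the model card before relying on the result.&lt;/p&gt;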

&lt;blockquote&gt;
&lt;p&gt;The MoE in the &lt;strong&gt;26B A4B&lt;/strong&gt; is the architectural story to internalise. It lets a single A100 80GB or two consumer GPUs serve a model that punches well above 4B in quality terms, at roughly 4B in cost terms. That is the new dominant design point for the open weight server tier in 2026.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3Dh1blmxOby8%252BCEILpXvILx1xik4lFwwkMTmiQnO3hKEkpbo%252BMivILKKZ1uqzV24KQ7LBE9JebN4Ly5BLStI2iLSL91WVKRK1uOzTjGFZimvu8VIZwtaSOb6K9R8%252FsI5uDoxDd6epGNamc4HiQIsm%252BX5ImT0T9K%252FCYUZiPUVKMzpGh3Cpsdew%252F6CCOei3q%252B5plz8AavE%252FO9EpmjLBSKvyCAUm1fEJYF1cxNTzPNOVTklkiZL%252FjxinLDapIkAWztiIZUugKVNmRrb7AAijA1C9gVOaYvOLx8ijVivWdA10a%26u2%3DBmU%252F61z4Fq1KNPPM%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3Dh1blmxOby8%252BCEILpXvILx1xik4lFwwkMTmiQnO3hKEkpbo%252BMivILKKZ1uqzV24KQ7LBE9JebN4Ly5BLStI2iLSL91WVKRK1uOzTjGFZimvu8VIZwtaSOb6K9R8%252FsI5uDoxDd6epGNamc4HiQIsm%252BX5ImT0T9K%252FCYUZiPUVKMzpGh3Cpsdew%252F6CCOei3q%252B5plz8AavE%252FO9EpmjLBSKvyCAUm1fEJYF1cxNTzPNOVTklkiZL%252FjxinLDapIkAWztiIZUugKVNmRrb7AAijA1C9gVOaYvOLx8ijVivWdA10a%26u2%3DBmU%252F61z4Fq1KNPPM%26width%3D2560" alt="Visual guide to Mixture of Experts routing" width="1460" height="640"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;License, Apache 2.0, Finally&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3Diu%252BgzM2B0VkbbxD6NbQeaAjPhc1%252BQkjiwXYtPFRz1rVSDZWSJvOsO3j8aKmr9L2gN29Lo4zITWR%252B%252BhZySBamc9l7%252Fban%252FrSHyPAiyHK4S2nqGqupRrDtaT3Z7RJlAY6dtIq4%252FL4Sy27dYdVZ%252FZ0w3E1P0OxcXjdHEDFTEF1z8VLsrk7Kyiu6hfKyXVd%252FznYPXa58h4I%252FMuWtKAHnhac%253D%26u2%3DdO5skgjevfndE3uq%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3Diu%252BgzM2B0VkbbxD6NbQeaAjPhc1%252BQkjiwXYtPFRz1rVSDZWSJvOsO3j8aKmr9L2gN29Lo4zITWR%252B%252BhZySBamc9l7%252Fban%252FrSHyPAiyHK4S2nqGqupRrDtaT3Z7RJlAY6dtIq4%252FL4Sy27dYdVZ%252FZ0w3E1P0OxcXjdHEDFTEF1z8VLsrk7Kyiu6hfKyXVd%252FznYPXa58h4I%252FMuWtKAHnhac%253D%26u2%3DdO5skgjevfndE3uq%26width%3D2560" alt="Apache License 2.0 explained" width="2560" height="1504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read this section carefully if you have ever had Legal kill a Gemma rollout.&lt;/p&gt;

&lt;p&gt;Earlier Gemma releases shipped under the &lt;a href="https://ai.google.dev/gemma/terms?ref=codersera.com" rel="noopener noreferrer"&gt;Gemma Terms of Use&lt;/a&gt;, a custom license. It was more permissive than Llama 2's, but it included a &lt;em&gt;Prohibited Use Policy&lt;/em&gt; with clauses around harm to minors, attacks on critical infrastructure, generation of CSAM, and other broad carve outs. The clauses were defensible in spirit, but enterprise legal teams routinely flagged the language as ambiguous and asked for indemnification or scope limiting before signing off. That friction kept Gemma out of plenty of production stacks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Gemma 4 ships under Apache 2.0.&lt;/strong&gt; No custom restrictions, no usage carve outs, and no monthly active user thresholds the way the Llama 4 Community License has.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;code&gt;Apache 2.0&lt;/code&gt; explicitly grants commercial use, modification, redistribution, and distribution of derivative works including derivative weights. There is one obvious constraint that still applies, namely that Apache 2.0 does not grant trademark rights, so you cannot ship a product called &lt;em&gt;"Gemma"&lt;/em&gt; or imply Google endorsement.&lt;/p&gt;

&lt;p&gt;This is materially less restrictive than the previous Gemma Terms of Use, and noticeably less restrictive than Llama 4's Community License (which is free for organisations under 700M monthly active users but adds compliance language). For most engineering teams, this is the change that turns Gemma from &lt;em&gt;interesting&lt;/em&gt; into &lt;em&gt;approvable&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Two caveats worth being honest about.&lt;/p&gt;

&lt;p&gt;a) Apache 2.0 governs the &lt;strong&gt;weights&lt;/strong&gt;, it does not give you the &lt;em&gt;training data&lt;/em&gt; or the &lt;em&gt;training pipeline&lt;/em&gt;. Gemma 4 is open weight, not open source in the strict OSI sense applied to data.&lt;/p&gt;

&lt;p&gt;b) Google can still publish acceptable use guidelines separately, nothing about Apache 2.0 prevents that. Today, the license file in the repo is the controlling document, and that document is Apache 2.0.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;Benchmarks That Actually Matter&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8r3nzetwmox4zynhgxt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8r3nzetwmox4zynhgxt.png" alt="Image description" width="800" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The headline numbers for &lt;strong&gt;Gemma 4 31B&lt;/strong&gt; (instruction tuned) are pulled from Google's model card, plus the independent reproductions surfaced in the LM Studio and Hugging Face threads.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Gemma 4 31B&lt;/th&gt;
&lt;th&gt;Gemma 3 27B&lt;/th&gt;
&lt;th&gt;Llama 4 Scout (109B)&lt;/th&gt;
&lt;th&gt;Qwen 3.5 27B&lt;/th&gt;
&lt;th&gt;DeepSeek V4 Flash&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MMLU-Pro&lt;/td&gt;
&lt;td&gt;85.2&lt;/td&gt;
&lt;td&gt;~67&lt;/td&gt;
&lt;td&gt;~78&lt;/td&gt;
&lt;td&gt;86.1&lt;/td&gt;
&lt;td&gt;~84&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPQA Diamond&lt;/td&gt;
&lt;td&gt;84.3&lt;/td&gt;
&lt;td&gt;42.4&lt;/td&gt;
&lt;td&gt;~70&lt;/td&gt;
&lt;td&gt;85.5&lt;/td&gt;
&lt;td&gt;~80&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LiveCodeBench v6&lt;/td&gt;
&lt;td&gt;80.0&lt;/td&gt;
&lt;td&gt;29.1&lt;/td&gt;
&lt;td&gt;~55&lt;/td&gt;
&lt;td&gt;~78&lt;/td&gt;
&lt;td&gt;~74&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SWE-bench Verified&lt;/td&gt;
&lt;td&gt;~63&lt;/td&gt;
&lt;td&gt;~22&lt;/td&gt;
&lt;td&gt;~48&lt;/td&gt;
&lt;td&gt;72.4&lt;/td&gt;
&lt;td&gt;~64&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AIME 2026 (math)&lt;/td&gt;
&lt;td&gt;89.2&lt;/td&gt;
&lt;td&gt;20.8&lt;/td&gt;
&lt;td&gt;~55&lt;/td&gt;
&lt;td&gt;~85&lt;/td&gt;
&lt;td&gt;~82&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Codeforces ELO&lt;/td&gt;
&lt;td&gt;2,150&lt;/td&gt;
&lt;td&gt;110&lt;/td&gt;
&lt;td&gt;~1,500&lt;/td&gt;
&lt;td&gt;~1,950&lt;/td&gt;
&lt;td&gt;~1,800&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Approximate values for the non Gemma rows are pulled from each project's own card or the &lt;em&gt;Artificial Analysis&lt;/em&gt; index, so treat them as directional. The story they tell is consistent, and it boils down to four observations.&lt;/p&gt;

&lt;p&gt;i) Gemma 4 31B is in the same neighbourhood as Qwen 3.5 27B on knowledge and reasoning, and the two trade leadership benchmark by benchmark.&lt;/p&gt;

&lt;p&gt;ii) Gemma 4 has the upper hand on math and competitive programming.&lt;/p&gt;

&lt;p&gt;iii) &lt;strong&gt;Qwen 3.5 27B still wins SWE-bench Verified&lt;/strong&gt;, the benchmark that most closely tracks &lt;em&gt;can this model close a real GitHub issue&lt;/em&gt;. If your primary use case is autonomous code editing on real repos, evaluate Qwen 3.5 alongside Gemma 4 before you commit.&lt;/p&gt;

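&lt;p&gt;iv) Gemma 4's gain over Gemma 3 is enormous. By the table above, GPQA Diamond roughly doubles, LiveCodeBench v6 and SWE-bench Verified improve close to 3 times, AIME 2026 improves more than 4 times, and Codeforces ELO goes from 110 to 2,150. Most teams running Gemma 3 in production should plan a migration window.&lt;/p&gt;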
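
&lt;p&gt;The size of the generational jump in iv) is easy to check against the table. The short Python sketch below copies the Gemma 4 31B and Gemma 3 27B columns and computes the ratio per benchmark. Values marked with a tilde in the table are approximate, so the ratios are directional too.&lt;/p&gt;

```python
# Gemma 4 31B vs Gemma 3 27B, values copied from the table above.
# Entries marked "~" in the table are approximate.
scores = {
    "MMLU-Pro": (85.2, 67.0),
    "GPQA Diamond": (84.3, 42.4),
    "LiveCodeBench v6": (80.0, 29.1),
    "SWE-bench Verified": (63.0, 22.0),
    "AIME 2026": (89.2, 20.8),
    "Codeforces ELO": (2150.0, 110.0),
}

# Ratio of the new score to the old score for each benchmark.
ratios = {name: new / old for name, (new, old) in scores.items()}

# Print from the smallest improvement to the largest.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.1f}x")
```

&lt;p&gt;A ratio of ELO ratings is not a like for like comparison with a ratio of percentage scores, so read the Codeforces figure as a sign of scale rather than a precise multiplier.&lt;/p&gt;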




&lt;h2&gt;
  
  
  &lt;code&gt;Where To Run Gemma 4&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DttpSCNSEVc4lGub6pe6hWAmgIuW3Szm6T57Rzi2qOt9%252F6byBP9%252Fapv7z%252Fz9zF29JurYT3L3DslnN6Lw4oSlH4AtuI%252F6lRsUll7PRqU%252BYGDdFeYKqW593XlrTuWVENZCINNU%253D%26u2%3DLila5iiPSLHSMyO5%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DttpSCNSEVc4lGub6pe6hWAmgIuW3Szm6T57Rzi2qOt9%252F6byBP9%252Fapv7z%252Fz9zF29JurYT3L3DslnN6Lw4oSlH4AtuI%252F6lRsUll7PRqU%252BYGDdFeYKqW593XlrTuWVENZCINNU%253D%26u2%3DLila5iiPSLHSMyO5%26width%3D2560" alt="NVIDIA data center GPUs for AI inference" width="1280" height="680"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are three deployment surfaces to think about, namely &lt;strong&gt;hosted&lt;/strong&gt;, &lt;strong&gt;self hosted server&lt;/strong&gt;, and &lt;strong&gt;on device&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  a) Hosted
&lt;/h3&gt;

&lt;p&gt;If you want zero ops, the model is a one line call away on several providers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Vertex AI (Model Garden)&lt;/strong&gt; is the first party path. You can fine tune on Vertex AI Training Clusters and serve through Model Garden endpoints, paying for compute time on the underlying accelerator (&lt;code&gt;A2/G2&lt;/code&gt; family or &lt;code&gt;TPUs&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For prototyping and price sensitive batch work, &lt;strong&gt;OpenRouter&lt;/strong&gt; aggregates more than eleven providers for the 26B A4B model at roughly &lt;em&gt;$0.06 per million input tokens&lt;/em&gt; and &lt;em&gt;$0.33 per million output&lt;/em&gt;. Beyond that, &lt;strong&gt;Together AI&lt;/strong&gt;, &lt;strong&gt;Fireworks&lt;/strong&gt;, &lt;strong&gt;Groq&lt;/strong&gt;, &lt;strong&gt;DeepInfra&lt;/strong&gt; and &lt;strong&gt;Hugging Face Inference&lt;/strong&gt; all run Gemma 4 endpoints. Pricing varies by provider, though competition among open weight hosts keeps it low. For spiky workloads, &lt;strong&gt;Cloud Run with GPU&lt;/strong&gt;, Google's serverless GPU runtime, can host Gemma 4 with scale to zero, which is genuinely attractive when traffic is bursty.&lt;/p&gt;
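
&lt;p&gt;To turn the per token prices into a budget line, a small Python sketch like the one below is enough. The two prices are the OpenRouter figures quoted above for the 26B A4B. The traffic shape (requests per day and tokens per request) is a made up example, so substitute your own numbers.&lt;/p&gt;

```python
# Hosted cost estimate at the OpenRouter prices quoted above
# for the 26B A4B model. The traffic numbers are hypothetical.
INPUT_USD_PER_MILLION = 0.06
OUTPUT_USD_PER_MILLION = 0.33


def monthly_cost_usd(requests_per_day, input_tokens, output_tokens, days=30):
    """Estimate monthly spend for a fixed request shape."""
    total_in = requests_per_day * input_tokens * days
    total_out = requests_per_day * output_tokens * days
    return (total_in * INPUT_USD_PER_MILLION
            + total_out * OUTPUT_USD_PER_MILLION) / 1_000_000


# Hypothetical workload: 50,000 requests a day,
# 2,000 input tokens and 500 output tokens each.
cost = monthly_cost_usd(50_000, 2_000, 500)
print(f"{cost:,.2f} USD per month")
```

&lt;p&gt;Comparing this figure with the monthly cost of a rented GPU gives a quick first answer to the hosted versus self hosted question.&lt;/p&gt;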

&lt;h3&gt;
  
  
  b) Self hosted server
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;vLLM&lt;/code&gt; is the production default. It supports Gemma 4 on NVIDIA, AMD, and Google Cloud TPUs from day one. The approximate hardware floors look like this.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Quant / format&lt;/th&gt;
&lt;th&gt;VRAM floor&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;26B A4B&lt;/td&gt;
&lt;td&gt;AWQ INT4&lt;/td&gt;
&lt;td&gt;~15 GB&lt;/td&gt;
&lt;td&gt;RTX 4090 24 GB with KV cache headroom&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26B A4B&lt;/td&gt;
&lt;td&gt;GGUF Q4_K_M&lt;/td&gt;
&lt;td&gt;~16 GB&lt;/td&gt;
&lt;td&gt;llama.cpp / Ollama dev box&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26B A4B&lt;/td&gt;
&lt;td&gt;FP16&lt;/td&gt;
&lt;td&gt;~52 GB&lt;/td&gt;
&lt;td&gt;A100 80GB or H100, serves at full quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;31B dense&lt;/td&gt;
&lt;td&gt;FP16&lt;/td&gt;
&lt;td&gt;~62 GB&lt;/td&gt;
&lt;td&gt;A100 80GB or H100 single GPU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;31B dense&lt;/td&gt;
&lt;td&gt;INT4&lt;/td&gt;
&lt;td&gt;~18 GB&lt;/td&gt;
&lt;td&gt;RTX 4090 / 5090, viable for single user inference&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
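
&lt;p&gt;The FP16 rows follow almost directly from parameter count times bytes per parameter, which the Python sketch below reproduces. The parameter counts come from the family table earlier in this guide. Treating INT4 as exactly half a byte per parameter is a simplification, and the explanation that the table's INT4 floors sit higher because of quantisation metadata and runtime overhead is an assumption, since the table does not break the numbers down.&lt;/p&gt;

```python
# Weight memory is roughly parameter count times bytes per parameter.
# INT4 at exactly 0.5 bytes per parameter is a simplification; real
# quantised files carry extra metadata, and serving also needs room
# for activations and the KV cache.
BYTES_PER_PARAM = {"FP16": 2.0, "INT4": 0.5}


def weight_gb(params_billion, fmt):
    """Memory for the weights alone, in decimal GB."""
    return params_billion * BYTES_PER_PARAM[fmt]


for name, params in [("26B A4B", 26.0), ("31B dense", 30.7)]:
    for fmt in ("FP16", "INT4"):
        gb = weight_gb(params, fmt)
        print(f"{name} {fmt}: about {gb:.1f} GB of weights")
```

&lt;p&gt;Note that the 26B A4B still has to keep all 26B parameters resident in memory. Only about 4B are active per token, which helps compute, not the VRAM floor.&lt;/p&gt;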

&lt;p&gt;&lt;code&gt;Ollama&lt;/code&gt; covers the local laptop use case for E2B, E4B, and the quantised 26B and 31B. &lt;code&gt;MLX&lt;/code&gt; with Metal acceleration runs all variants on Apple Silicon, an M3 Max or M4 Pro with 32 to 64 GB unified memory will run the 26B A4B comfortably. AMD has day zero Gemma 4 support across &lt;code&gt;ROCm&lt;/code&gt; and the Ryzen AI stack. NVIDIA &lt;code&gt;NIM&lt;/code&gt;, &lt;code&gt;NeMo&lt;/code&gt;, &lt;code&gt;LM Studio&lt;/code&gt;, &lt;code&gt;Unsloth&lt;/code&gt;, &lt;code&gt;SGLang&lt;/code&gt; and &lt;code&gt;LiteRT-LM&lt;/code&gt; all have first class support too.&lt;/p&gt;

&lt;h3&gt;
  
  
  c) On device with MediaPipe and LiteRT
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DS8j98mNejdY8dx8LPPfPu3Zlx2Vmvl16Z8MhPI%252FIHRnyU0ebFl%252BnMxfQCmxl6rh5bjjOSID429OV64cVHS%252B0cMvVyzI%253D%26u2%3DG%252FbjsN4sFIooLp2K%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DS8j98mNejdY8dx8LPPfPu3Zlx2Vmvl16Z8MhPI%252FIHRnyU0ebFl%252BnMxfQCmxl6rh5bjjOSID429OV64cVHS%252B0cMvVyzI%253D%26u2%3DG%252FbjsN4sFIooLp2K%26width%3D2560" alt="On device AI on smartphones and edge devices" width="1536" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The E2B and E4B variants are explicitly designed for phones and edge devices.&lt;br&gt;
The deployment stack is &lt;code&gt;MediaPipe&lt;/code&gt;'s LLM Inference API on top of &lt;code&gt;LiteRT&lt;/code&gt;,&lt;br&gt;
which handles model loading, memory and hardware acceleration (GPU or NPU) automatically.&lt;/p&gt;

&lt;p&gt;The approximate footprints are nicely small.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;E2B Q4_K_M&lt;/em&gt;, around &lt;code&gt;1.3 GB&lt;/code&gt; on disk, with &lt;code&gt;2 to 3 GB&lt;/code&gt; RAM at runtime.&lt;br&gt;
&lt;em&gt;E4B Q4_K_M&lt;/em&gt;, around &lt;code&gt;2.5 GB&lt;/code&gt; on disk, with &lt;code&gt;4 to 5 GB&lt;/code&gt; RAM at runtime.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the path for &lt;em&gt;AI features that work without a network round trip&lt;/em&gt;,&lt;br&gt;
voice agents on Android, in browser RAG over a user's local documents, and offline coding helpers.&lt;br&gt;
With audio input native to E2B and E4B, you can ship a meaningful voice to text to action loop without bundling a separate ASR model.&lt;/p&gt;




&lt;h2&gt;
  
  
  When To Choose Gemma 4 Over Alternatives
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DuqPbaLZF5vL%252FYR7b09ALHVNofmAj79Qf0AM9A6DZGLNCqlmYaya2sYWaLnAngesOVqlWx3NW%252BnlTUUQQlOvG3da71DNERCODDDkJHoIh90zRpZcYvjbPxUQ7lJVSmCn3Qrk53jZrleA4L%252BRHHBHcHvc1I9sngBD4DD%252Fqc6O9c807lNWFQJJOFuU%252FPs4W%252FP0DuR11xV8%253D%26u2%3D5zTrYYihoyV%252BJnvX%26width%3D2560" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsspark.genspark.ai%2Fcfimages%3Fu1%3DuqPbaLZF5vL%252FYR7b09ALHVNofmAj79Qf0AM9A6DZGLNCqlmYaya2sYWaLnAngesOVqlWx3NW%252BnlTUUQQlOvG3da71DNERCODDDkJHoIh90zRpZcYvjbPxUQ7lJVSmCn3Qrk53jZrleA4L%252BRHHBHcHvc1I9sngBD4DD%252Fqc6O9c807lNWFQJJOFuU%252FPs4W%252FP0DuR11xV8%253D%26u2%3D5zTrYYihoyV%252BJnvX%26width%3D2560" alt="On device edge AI use cases" width="2560" height="1463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reach for &lt;strong&gt;Gemma 4&lt;/strong&gt; when the following conditions hold.&lt;/p&gt;

&lt;p&gt;a) &lt;em&gt;You need an Apache 2.0 model.&lt;/em&gt; If Legal balked at Gemma 3's terms or Llama's Community License MAU clause, Gemma 4 is the cleanest option in this size class.&lt;/p&gt;

&lt;p&gt;b) &lt;em&gt;You need on device multimodality.&lt;/em&gt; The audio capable E2B and E4B variants are the strongest open weight option for phones today.&lt;/p&gt;

&lt;p&gt;c) &lt;em&gt;Long context matters.&lt;/em&gt; &lt;code&gt;256K&lt;/code&gt; with credible RULER recall is competitive with hosted frontier models.&lt;/p&gt;

&lt;p&gt;d) &lt;em&gt;Math, agentic reasoning or competitive programming dominate your workload.&lt;/em&gt; Gemma 4 31B's &lt;code&gt;AIME&lt;/code&gt; and &lt;code&gt;Codeforces&lt;/code&gt; numbers are exceptional for an open weight model in this size band.&lt;/p&gt;

&lt;p&gt;Choose something else when the workload looks more like one of these.&lt;/p&gt;

&lt;p&gt;a) &lt;em&gt;Your workload is autonomous repo editing.&lt;/em&gt; Qwen 3.5 27B's &lt;code&gt;SWE-bench Verified&lt;/code&gt; lead is real. Pilot both before committing.&lt;/p&gt;

&lt;p&gt;b) &lt;em&gt;You need streaming voice output.&lt;/em&gt; Gemma 4 has audio in but not out. Qwen 3.5 Omni handles real time speech generation.&lt;/p&gt;

&lt;p&gt;c) &lt;em&gt;You need a frontier model.&lt;/em&gt; If quality is the only metric, hosted Gemini 3 Pro or DeepSeek V4 Pro will outperform Gemma 4 31B on most benchmarks.&lt;/p&gt;

&lt;p&gt;d) &lt;em&gt;Cost per token at huge scale.&lt;/em&gt; DeepSeek V4 Flash hosted is cheap enough that for many workloads the spend math beats running your own GPUs.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;Known Issues And License Caveats&lt;/code&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;No model is a free lunch, and Gemma 4 has its own quirks. Worth reading before you commit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;i) &lt;strong&gt;SWE-bench Verified is not the strong suit.&lt;/strong&gt; Real GitHub issue resolution still trails Qwen 3.5 27B by a meaningful margin.&lt;/p&gt;

&lt;p&gt;ii) &lt;strong&gt;No native audio output.&lt;/strong&gt; If you want a voice agent that talks back, you bolt on a separate &lt;code&gt;TTS&lt;/code&gt; layer.&lt;/p&gt;

&lt;p&gt;iii) &lt;strong&gt;26B A4B throughput surprise.&lt;/strong&gt; Despite only 4B active parameters, community benchmarks on consumer GPUs show roughly &lt;em&gt;11 tok/s&lt;/em&gt; on an RTX 4090, slower than a comparable dense 4B model. The MoE routing overhead is real on consumer hardware. On A100 and H100 the gap closes.&lt;/p&gt;

&lt;p&gt;iv) &lt;strong&gt;Apache 2.0 is not open source training data.&lt;/strong&gt; The weights are open and commercially usable, the training corpus is not. If your compliance posture requires reproducibility from data, Gemma 4 does not satisfy that.&lt;/p&gt;

&lt;p&gt;v) &lt;strong&gt;Trademark.&lt;/strong&gt; You cannot brand your product as &lt;em&gt;"Gemma"&lt;/em&gt; or use Google trademarks. Apache 2.0 explicitly excludes trademark grants.&lt;/p&gt;

&lt;p&gt;vi) &lt;strong&gt;Vision token budget tradeoff.&lt;/strong&gt; The &lt;code&gt;70&lt;/code&gt;, &lt;code&gt;140&lt;/code&gt;, &lt;code&gt;280&lt;/code&gt;, &lt;code&gt;560&lt;/code&gt; and &lt;code&gt;1120&lt;/code&gt; visual budgets are real. Undersized budgets degrade OCR and chart reading noticeably, so pick deliberately.&lt;/p&gt;

&lt;p&gt;vii) &lt;strong&gt;Native dependency surprises.&lt;/strong&gt; If you self host with &lt;code&gt;vLLM&lt;/code&gt; behind a Node service, watch out for prebuilt binary fetch issues on locked down installs, where the failure mode is silent at install time and loud at runtime.&lt;/p&gt;

&lt;p&gt;viii) &lt;strong&gt;Tokenizer drift from Gemma 3.&lt;/strong&gt; The &lt;code&gt;262K&lt;/code&gt; vocabulary is not directly weight compatible with Gemma 3 fine tunes. Plan to fine tune again on Gemma 4 rather than trying to port adapters.&lt;/p&gt;
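
&lt;p&gt;For the vision token budget tradeoff in vi), one simple way to pick deliberately is to decide how many context tokens you can spend on images and take the largest budget that fits. The Python sketch below does only that. The budget values come from this guide, while the helper name &lt;code&gt;pick_budget&lt;/code&gt; and the idea of a fixed token cap are illustrative assumptions, not part of any Gemma 4 API.&lt;/p&gt;

```python
from bisect import bisect_right

# Visual token budgets listed in this guide, sorted ascending.
VISION_BUDGETS = (70, 140, 280, 560, 1120)


def pick_budget(num_images, max_vision_tokens):
    """Largest per image budget that keeps all images within the cap.

    Assumes num_images is at least 1. Returns None when even the
    smallest budget does not fit.
    """
    per_image = max_vision_tokens // num_images
    # Count how many budgets are no larger than the per image allowance.
    idx = bisect_right(VISION_BUDGETS, per_image)
    return VISION_BUDGETS[idx - 1] if idx else None


# Hypothetical request: 12 images, 4,096 context tokens set aside for them.
print(pick_budget(num_images=12, max_vision_tokens=4096))
```

&lt;p&gt;If the result is too small for OCR or chart reading, the fix is to send fewer images or free up more context, not to accept degraded reading silently.&lt;/p&gt;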




&lt;h2&gt;
  
  
  &lt;code&gt;FAQ&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Is Gemma 4 actually open source?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;It is &lt;em&gt;open weight&lt;/em&gt; under Apache 2.0. The weights, model card and inference code are open and commercially usable. The training data and full pipeline are not released. By the OSI's strict definition, that is open weight, not open source, but for most commercial deployment purposes Apache 2.0 is the cleanest license you will see in this size class.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Is the Gemma 4 license really Apache 2.0?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Yes. This is the change from earlier Gemma versions, which used the custom &lt;em&gt;Gemma Terms of Use&lt;/em&gt; with usage carve outs. Gemma 4's repository ships the standard Apache 2.0 license file. Anyone telling you Gemma 4 has restrictive terms is describing the previous generation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;What is the difference between Gemma 4 and Gemini?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Gemma 4 is open weight and self hostable. Gemini is a closed, hosted, frontier model. They share a research lineage but differ in deployment model, cost and customisation surface.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Which Gemma 4 model should I pick?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;a) &lt;code&gt;E2B&lt;/code&gt; for phones and tight memory budgets.&lt;br&gt;
b) &lt;code&gt;E4B&lt;/code&gt; for high end edge and small servers.&lt;br&gt;
c) &lt;code&gt;26B A4B&lt;/code&gt; for cost efficient single GPU server inference.&lt;br&gt;
d) &lt;code&gt;31B&lt;/code&gt; dense for the highest quality answers when you do not care about throughput.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;What hardware do I need to run Gemma 4 31B?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;FP16&lt;/code&gt; needs roughly &lt;code&gt;62 GB&lt;/code&gt; of VRAM, which means an A100 80GB or an H100. &lt;code&gt;INT4&lt;/code&gt; quantisation drops that to about &lt;code&gt;18 GB&lt;/code&gt;, which fits an RTX 4090 or 5090 for single user inference.&lt;/p&gt;
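
&lt;p&gt;The FP16 figure follows from simple arithmetic on the parameter count. Here is a minimal sketch, assuming 31 billion parameters and counting the weights only. The function name is my own, and the gap between the raw INT4 result and the roughly 18 GB quoted above is presumably quantisation metadata plus runtime overhead, which this sketch does not model.&lt;/p&gt;

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billion * bytes_per_weight


fp16_gb = weight_memory_gb(31, 16)  # 31B parameters at 2 bytes each
int4_gb = weight_memory_gb(31, 4)   # 31B parameters at half a byte each

print(f"FP16 weights: {fp16_gb:.1f} GB")
print(f"INT4 weights: {int4_gb:.1f} GB")
```

&lt;p&gt;Treat the result as a floor. KV cache and activations come on top, and they grow with context length and batch size.&lt;/p&gt;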

&lt;h3&gt;
  
  
  &lt;code&gt;Does Gemma 4 support function calling?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Yes. Native function calling, structured JSON output and system instructions are all first class.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;How does Gemma 4 compare to Llama 4?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Gemma 4 31B beats Llama 4 Scout (109B total parameters) on most reasoning benchmarks at roughly a third of the total parameter count, and ships under a less restrictive license.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Is Gemma 4 better than Qwen 3.5?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;It depends on the workload. Gemma 4 wins on math and competitive programming, Qwen 3.5 27B wins on &lt;code&gt;MMLU-Pro&lt;/code&gt;, &lt;code&gt;GPQA Diamond&lt;/code&gt; and &lt;code&gt;SWE-bench Verified&lt;/code&gt;. Both are Apache 2.0. Pilot both.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Is Gemma 4 multimodal?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;All variants accept text and image. E2B and E4B also accept audio. Output is text only on every variant.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;What is the context window?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;128K&lt;/code&gt; tokens on E2B and E4B, and &lt;code&gt;256K&lt;/code&gt; on the 26B A4B and the 31B. RULER long context recall at 128K is roughly &lt;em&gt;66.4%&lt;/em&gt;, a 5x improvement over Gemma 3.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Can Gemma 4 run on a phone?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Yes. E2B and E4B are designed for it. &lt;code&gt;MediaPipe&lt;/code&gt;'s LLM Inference API and &lt;code&gt;LiteRT&lt;/code&gt; handle on device inference with NPU and GPU acceleration on Android, and equivalent paths exist on iOS via Core ML and MLX.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;What is Gemma 4n?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Gemma 4n&lt;/em&gt; is the community shorthand for the E2B and E4B edge variants, the on device tier of the Gemma 4 family. Architecturally they are dense models tuned and quantised for phones and embedded devices. See &lt;a href="https://codersera.com/blog/gemma-4n-vs-gemma-4/" rel="noopener noreferrer"&gt;Gemma 4n vs Gemma 4&lt;/a&gt; for the side by side.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Is Gemma 4 safe for commercial production use?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Yes, under Apache 2.0, with the standard caveats. Respect trademarks, do not redistribute the model under the &lt;em&gt;Gemma&lt;/em&gt; name, and follow your own jurisdiction's AI usage law. There are no usage carve outs, no MAU thresholds, and no industry restrictions in the license itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;Should I migrate from Gemma 3 to Gemma 4?&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;If you are running Gemma 3 in production, yes. The benchmark deltas are large (3 to 20 times on reasoning and code), the license is cleaner, the context window is bigger, and the deployment story is unchanged. Plan to fine tune again, since adapter weights will not transfer cleanly.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;Closing Thoughts&lt;/code&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Picking the right open weight model is the easy half of the job. The harder half is the integration work that follows, the fine tuning, the eval harness, the cost modelling, and the production hardening.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Gemma 4 makes that work &lt;em&gt;meaningfully easier&lt;/em&gt; than its predecessor.&lt;br&gt;
✔️The license is clean,&lt;br&gt;
✔️the model card is honest,&lt;br&gt;
✔️the deployment surface is broad,&lt;br&gt;
✔️and the benchmarks are competitive with the best of the open weight field.&lt;/p&gt;

&lt;p&gt;If you have been holding out on Gemma because of legal friction or quality gaps, this is the release that closes both. Pilot it against your real workload, compare it head to head with &lt;code&gt;Qwen 3.5 27B&lt;/code&gt; on the tasks that matter to you, and let your evals decide.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Please share your thoughts and use cases below. Thanks for reading this far 💖&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Watched Google Cloud NEXT '26 ~ Here Is What Actually Matters for Developers</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Sun, 26 Apr 2026 22:00:00 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/i-watched-google-cloud-next-26-so-you-dont-have-to-here-is-what-actually-matters-for-developers-54h6</link>
      <guid>https://dev.to/aniruddhaadak/i-watched-google-cloud-next-26-so-you-dont-have-to-here-is-what-actually-matters-for-developers-54h6</guid>
      <description>&lt;p&gt;Hi, I am Aniruddha Adak, an AI agent engineer based in Kolkata. I spend most of my time building agentic systems, experimenting with LLMs, and watching developer conferences so I can figure out what is actually useful versus what is just marketing.&lt;/p&gt;

&lt;p&gt;This past week I sat through both the Opening Keynote and the Developer Keynote of Google Cloud NEXT 2026, held in Las Vegas. The event ran from April 22 to 24, 2026. &lt;code&gt;Both sessions are freely available on YouTube&lt;/code&gt;, and I want to share what I personally noticed, what made me stop and think, and what I believe developers like you and me should actually care about.&lt;/p&gt;

&lt;p&gt;This is not a summary. This is my honest take as someone who works with agents every day.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  How I Watched These Keynotes
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;I watched both sessions fully. The Opening Keynote runs for about 1 hour 39 minutes. The Developer Keynote is 1 hour 7 minutes. That is nearly 3 hours of content.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I took notes while watching, paused at moments that felt important, and replayed sections where the demos were happening live on stage.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I am going to walk you through what I found most meaningful, section by section.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  The Opening Keynote: Sundar Pichai and the Big Picture
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;code&gt;Sundar Pichai&lt;/code&gt; came on stage pretty early in the keynote. The thing that immediately got my attention was when he talked about how &lt;code&gt;Google now uses AI for nearly 75% of its own code writing.&lt;/code&gt; That number stopped me. It means the engineers at Google are not just building AI tools. They are themselves working alongside those tools in their day to day coding.&lt;/p&gt;

&lt;p&gt;He also mentioned that Google plans to invest heavily in infrastructure this year, and a significant portion of that investment is going toward AI compute. From where I sit as someone building agent systems, this matters because more compute capacity means the APIs and services we rely on will be more stable and faster.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  watch here: 👇
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/11PBno-cJ1g"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Another moment that stood out was when Sundar described Cloud as the &lt;em&gt;&lt;code&gt;"mission control of the agentic era."&lt;/code&gt;&lt;/em&gt; That phrase stayed with me. It is not just a catchy line. It reflects a genuine shift in how Google is framing its cloud offering. It is no longer just about storage or compute. It is about giving your AI systems a place to run safely, observe themselves, and scale up.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  The Agentic Enterprise Blueprint
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;Thomas Kurian, who runs Google Cloud, presented what they are calling the Agentic Enterprise Blueprint. The core idea is straightforward: you cannot just bolt AI onto your existing systems and call it done. You need a full stack approach.&lt;/p&gt;

&lt;p&gt;Here is what that stack looks like according to what was shown:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Models Layer&lt;/strong&gt;&lt;br&gt;
Gemini Pro and Flash are at the center of this. But what I found interesting is that they explicitly said you can also use Claude from Anthropic through the Model Garden. This is honest. They are not forcing you into a single model. As someone who has built systems using multiple providers, I appreciate that kind of openness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The Agent Development Kit (ADK)&lt;/strong&gt;&lt;br&gt;
This was the part that felt most relevant to my work. ADK is Google's framework for building modular agents. It connects to MCP servers, handles memory, manages sessions, and lets you define skills in a structured way. The fact that every Google Cloud service is now MCP-enabled by default was one of the bigger announcements of the week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Agent Runtime&lt;/strong&gt;&lt;br&gt;
This is the serverless layer that runs and scales your agents. Sessions keep agents connected to users across interactions. Memory allows agents to learn from past sessions and carry that forward. This is the kind of infrastructure that used to take weeks to build manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Agent Gateway and Observability&lt;/strong&gt;&lt;br&gt;
Each agent gets a unique identity. The gateway enforces policies. Observability tools let you see what your agents are actually doing, debug reasoning loops, and track performance. This is something I have personally felt the pain of. Debugging agents without proper tooling is exhausting.&lt;/p&gt;



&lt;blockquote&gt;
&lt;h2&gt;
  
  
  The Hardware Announcement That Got the Audience Excited
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;About 42 minutes into the Opening Keynote, &lt;em&gt;&lt;code&gt;Google announced the TPU 8t&lt;/code&gt;&lt;/em&gt;. The headline claim was three times the performance per pod compared to previous generation hardware.&lt;/p&gt;

&lt;p&gt;Most developers will not care directly about the chip architecture. But what it means in practice is:&lt;/p&gt;

&lt;p&gt;✅ Faster model responses&lt;br&gt;
✅ Lower latency for complex agent workflows&lt;br&gt;
✅ Ability to run longer context windows without things slowing down significantly&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;TPU 8t&lt;/code&gt; was described as being &lt;code&gt;designed specifically for the agentic era of computing&lt;/code&gt;. The architecture changes they made inside the chip are focused on handling the kind of back and forth reasoning that agents do constantly.&lt;/p&gt;

&lt;p&gt;They also &lt;em&gt;announced a new networking layer that links 134,000 chips together&lt;/em&gt; into what they called a &lt;code&gt;unified AI supercomputer&lt;/code&gt;. That is a level of scale that very few companies in the world can match.&lt;/p&gt;



&lt;blockquote&gt;
&lt;h2&gt;
  
  
  Real World Examples That Felt Grounded
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;One segment I genuinely enjoyed was the &lt;code&gt;Walmart story&lt;/code&gt;. &lt;em&gt;Walmart is using Gemini Enterprise to help field leaders get insights before they walk into their stores each day&lt;/em&gt;. The idea is that instead of spending time pulling reports manually, the agent surfaces what matters, tailored to the specific store and the specific leader's role.&lt;/p&gt;

&lt;p&gt;This is exactly the kind of use case that makes sense to me. &lt;code&gt;It is not AI trying to replace the person. It is AI giving the person better context so they can do their job better.&lt;/code&gt; The numbers shared were that some of &lt;em&gt;their enterprise deployments hit 80% adoption among employees&lt;/em&gt;, which is genuinely high for any enterprise tool rollout.&lt;/p&gt;

&lt;p&gt;There was also a story about a snowboarding &lt;code&gt;AI project&lt;/code&gt; that &lt;code&gt;used 3D models and motion analysis to help athletes understand their technique&lt;/code&gt; in ways that were not possible before. This one was more fun than practical for most developers, but it showed how computer vision and real time data pipelines can combine in interesting ways.&lt;/p&gt;



&lt;blockquote&gt;
&lt;h2&gt;
  
  
  The Developer Keynote: Where Things Got Practical
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Developer Keynote was hosted by Richard Seroter and Emma Twersky. The energy was different here. More hands on. More code. Less executive messaging.&lt;/p&gt;

&lt;p&gt;Brad Calder opened the session and made a statement that I wrote down immediately: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"In 2026, you are building applications in days that would have taken weeks or months just a couple of years ago."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;He is right. I have felt this in my own work. The tools have gotten dramatically better, and the way we structure our thinking around building software is changing alongside them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h2&gt;
  
  
  watch here: 👇
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/A01DQ8_xy7Q"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  The Marathon Demo That Showed Everything Working Together
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;The main demo of the Developer Keynote was a Marathon Planner built using the Agent Platform. They simulated planning a marathon through Las Vegas for 10,000 runners.&lt;/p&gt;

&lt;p&gt;The system had three agents working together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Planner Agent that figured out the route&lt;/li&gt;
&lt;li&gt;An Evaluator Agent that scored the route against both deterministic criteria (exactly 26 miles 385 yards) and non-deterministic criteria (community impact, safety)&lt;/li&gt;
&lt;li&gt;A Simulator Agent that spawned thousands of virtual runners and watched how traffic was affected&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What I found clever here was the Evaluator Agent design. It uses a separate, smaller model with limited context. Its only job is to judge the route. This is a pattern I have started using in my own work. Giving a subagent a narrow, well defined role makes the whole system more reliable.&lt;/p&gt;
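
&lt;p&gt;To make that split concrete, here is a rough sketch of how I would structure such an evaluator myself. This is not the code from the keynote demo. The function names, the 0.7 acceptance threshold and the score keys are my own assumptions. The deterministic part is plain arithmetic, and the soft scores are assumed to come from the separate judge model.&lt;/p&gt;

```python
# 26 miles 385 yards, converted to metres (1 mile = 1609.344 m, 1 yard = 0.9144 m)
MARATHON_METRES = 26 * 1609.344 + 385 * 0.9144


def length_check(route_metres: float, tolerance_metres: float = 1.0) -> bool:
    """Deterministic criterion: the route must match the marathon distance."""
    return tolerance_metres >= abs(route_metres - MARATHON_METRES)


def evaluate(route_metres: float, judge_scores: dict[str, float]) -> dict:
    """Combine the hard check with soft scores (0 to 1) from a judge model."""
    passed = length_check(route_metres)
    soft = sum(judge_scores.values()) / len(judge_scores) if judge_scores else 0.0
    # Hypothetical rule: the hard check is a gate, the soft score needs 0.7 or more.
    return {"length_ok": passed, "soft_score": soft, "accept": passed and soft >= 0.7}


result = evaluate(42195.0, {"safety": 0.9, "community_impact": 0.8})
print(result["accept"])
```

&lt;p&gt;Keeping the hard check outside the model means a route of the wrong length can never be talked into passing by a persuasive judge.&lt;/p&gt;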

&lt;p&gt;The other thing that struck me was &lt;code&gt;A2UI&lt;/code&gt;, which stands for &lt;code&gt;Agent to User Interface&lt;/code&gt;. The idea is that the &lt;code&gt;agent itself builds the interface it needs to communicate results back to you&lt;/code&gt;. Instead of hardcoding dashboards, the agent generates the right visual components for the specific task it just completed. This reduces the need for frontend developers to maintain a growing list of output templates.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  A2A Protocol: Agents Talking to Agents
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;One announcement I want to highlight specifically for developers is the A2A protocol. &lt;code&gt;A2A stands for Agent to Agent&lt;/code&gt;. &lt;code&gt;Google&lt;/code&gt; created this protocol and &lt;code&gt;donated it to the Linux Foundation&lt;/code&gt;, which means it is open and not locked to Google's ecosystem.&lt;/p&gt;

&lt;p&gt;The problem &lt;code&gt;A2A solves is communication between agents&lt;/code&gt; that were built independently. Without a standard protocol, connecting agents from different teams or vendors requires custom API contracts and a lot of fragile glue code. A2A defines a standard way for agents to advertise their capabilities through an &lt;code&gt;Agent Card&lt;/code&gt;, discover other agents through the &lt;code&gt;Agent Registry&lt;/code&gt;, and communicate without writing custom integration code.&lt;/p&gt;

&lt;p&gt;I have run into this problem myself. When you start building systems where multiple specialized agents need to coordinate, the connection layer becomes its own maintenance burden. &lt;em&gt;A2A is an attempt to make that layer standard and manageable&lt;/em&gt;.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  What Surprised Me Personally
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;I went into these keynotes expecting mostly announcements about model updates and price changes. What I did not expect was how much of the Developer Keynote was focused on showing code, showing real tradeoffs, and being honest about where the hard problems still are.&lt;/p&gt;

&lt;p&gt;The segment about &lt;code&gt;Context Engineering&lt;/code&gt; genuinely resonated with me. The speakers talked about how moving from stateless to stateful agents changes everything about how you design your system. &lt;code&gt;Sessions&lt;/code&gt;, &lt;code&gt;Memory Banks&lt;/code&gt;, and &lt;code&gt;RAG integrations&lt;/code&gt; are not optional add-ons. They are the &lt;code&gt;foundation of any agent&lt;/code&gt; that needs to be useful across multiple interactions.&lt;/p&gt;

&lt;p&gt;They also mentioned that &lt;code&gt;the full demo code was being open sourced&lt;/code&gt; on GitHub during the keynote. Not after. During. That is the kind of move that actually builds trust with a developer community.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  Something That Felt Missing
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;Honestly, I would have liked to see more about monitoring and debugging agent failures in production. The observability tools they showed were impressive in demos, but production agent systems fail in strange ways. I would love a deeper conversation about what happens when your agent gets stuck in a reasoning loop or when memory accumulates stale information that starts affecting decisions.&lt;/p&gt;

&lt;p&gt;This is something I think about in my own work constantly. The tooling for building agents has gotten good. The tooling for understanding why agents fail is still catching up.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  What You Should Actually Do After Reading This
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you made it here, here are three things worth doing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Watch the Developer Keynote first.&lt;/strong&gt; It is more practical and the demos are better for developers. The Opening Keynote is good context but starts with a lot of enterprise positioning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Look up the ADK documentation.&lt;/strong&gt; The Agent Development Kit is available now. If you are already building agents with other frameworks, it is worth understanding how ADK structures skills and tools. Some of the patterns are genuinely well thought out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try the open source demo code.&lt;/strong&gt; The Marathon Simulator was released on GitHub. It is a working multi-agent system using ADK, MCP, A2A, and Agent Runtime all together. That kind of end-to-end reference is rare.&lt;/p&gt;




&lt;blockquote&gt;
&lt;h2&gt;
  
  
  My Overall Feeling
&lt;/h2&gt;
&lt;/blockquote&gt;

&lt;p&gt;I came away from both keynotes with a clearer sense of where the industry is heading. The shift from &lt;code&gt;"we have a model"&lt;/code&gt; to &lt;code&gt;"we have a complete agent infrastructure"&lt;/code&gt; is real. Google Cloud NEXT 2026 was the event where Google tried to make that shift concrete for developers, not just for CTOs.&lt;/p&gt;

&lt;p&gt;The things I liked most were the openness around model choice, the A2A donation to the Linux Foundation, and the fact that the demo code was released publicly. Those are developer friendly moves.&lt;/p&gt;

&lt;p&gt;The things I want to see more of are better failure analysis tools, more honest discussions about prompt drift in production, and deeper guidance on memory management at scale.&lt;/p&gt;

&lt;p&gt;But overall, this was a strong event with a lot of practical takeaways for anyone building agentic systems in 2026.&lt;/p&gt;

&lt;p&gt;I am &lt;code&gt;Aniruddha Adak&lt;/code&gt; from Kolkata. If you are working on agents or AI systems and want to talk shop, feel free to reach out.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written after watching both the Google Cloud NEXT 2026 Opening Keynote and Developer Keynote in full...&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-cloud-next-2026-04-22"&gt;Google Cloud NEXT Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Thank you all for reading so far 💖&lt;/code&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>googlecloud</category>
      <category>ai</category>
      <category>cloudnextchallenge</category>
    </item>
    <item>
      <title>i burnt $127 in api credits before i fixed these openclaw mistakes</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Fri, 24 Apr 2026 16:09:00 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/i-burnt-127-in-api-credits-before-i-fixed-these-openclaw-mistakes-1lf3</link>
      <guid>https://dev.to/aniruddhaadak/i-burnt-127-in-api-credits-before-i-fixed-these-openclaw-mistakes-1lf3</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/openclaw-2026-04-16"&gt;OpenClaw Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;everyone said openclaw would build my startup while i slept.&lt;br&gt;&lt;br&gt;
instead, i spent two weeks watching it burn through my api credits&lt;br&gt;&lt;br&gt;
while it asked the same question eight times in a row.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;it wasn't thinking hard.&lt;/p&gt;

&lt;p&gt;it was stuck in a loop, and i was the one paying &lt;code&gt;$0.03&lt;/code&gt; per token to watch it spin.&lt;/p&gt;

&lt;p&gt;if you're currently babysitting your agent, watching it loop on simple tasks, or wondering if you should just go back to coding manually — i was there.&lt;/p&gt;

&lt;p&gt;i almost gave up.&lt;/p&gt;

&lt;p&gt;now i have openclaw running my morning briefings and handling database chores without me touching it.&lt;/p&gt;

&lt;p&gt;the difference wasn't buying a better api tier. it was fixing these specific, stupid mistakes.&lt;/p&gt;




&lt;h2&gt;
  
  
  stop using your expensive model for everything
&lt;/h2&gt;

&lt;p&gt;i had &lt;code&gt;gpt-5.4&lt;/code&gt; set as the default for every single task. heartbeat checks, file scans, cron jobs — all of it. i was asking a formula 1 car to deliver groceries.&lt;/p&gt;

&lt;p&gt;openclaw lets you set up tiered model configs. here's what i switched to:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;task type&lt;/th&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;file reads, syntax checks, existence queries&lt;/td&gt;
&lt;td&gt;&lt;code&gt;haiku&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;actual coding tasks&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sonnet&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;complex debugging, things that broke twice&lt;/td&gt;
&lt;td&gt;&lt;code&gt;opus&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;my daily token spend dropped from &lt;strong&gt;40,000&lt;/strong&gt; to around &lt;strong&gt;1,500&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;you can switch models mid-session with &lt;code&gt;/model&lt;/code&gt; if you need to escalate, but most of the time, you don't.&lt;/p&gt;
&lt;/blockquote&gt;
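
&lt;p&gt;the routing idea behind that table fits in a few lines. this is a sketch of the logic only, not openclaw's actual config format. the task type names and the &lt;code&gt;pick_model&lt;/code&gt; function are made up for illustration.&lt;/p&gt;

```python
# Hypothetical tier map; the model names mirror the table above.
TIERS = {
    "file_read": "haiku",
    "syntax_check": "haiku",
    "existence_query": "haiku",
    "coding": "sonnet",
    "complex_debugging": "opus",
}


def pick_model(task_type: str, failures: int = 0) -> str:
    """Return the cheapest tier for the task, escalating after repeated failures."""
    if failures >= 2:  # "things that broke twice" go straight to the top tier
        return "opus"
    return TIERS.get(task_type, "sonnet")  # unknown task types get the middle tier


print(pick_model("file_read"))
print(pick_model("coding", failures=2))
```

&lt;p&gt;the point is that escalation is the exception, so the default path stays cheap.&lt;/p&gt;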




&lt;h2&gt;
  
  
  your agent needs rules written in stone
&lt;/h2&gt;

&lt;p&gt;out of the box, openclaw will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;loop forever&lt;/li&gt;
&lt;li&gt;forget what it was doing&lt;/li&gt;
&lt;li&gt;rewrite your database schema because it misread a comment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;you have to parent this thing with explicit, paranoid instructions.&lt;/p&gt;

&lt;p&gt;i keep a &lt;code&gt;workspace/skills/&lt;/code&gt; folder full of &lt;code&gt;SKILL.md&lt;/code&gt; files. these aren't suggestions. they're laws.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;workspace/
├── skills/
│   ├── anti-loop.md
│   ├── USER.md
│   └── AGENTS.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;one file is literally called &lt;code&gt;anti-loop.md&lt;/code&gt; and says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"if you see the same error twice, stop and ask me. do not try a third variation."&lt;/p&gt;
&lt;/blockquote&gt;
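
&lt;p&gt;a written rule only works if the model obeys it. if you wrap the agent in your own script, you can enforce the same rule in code. here is a minimal sketch, assuming you can see each error message the agent hits. the &lt;code&gt;LoopGuard&lt;/code&gt; class is hypothetical and not part of openclaw.&lt;/p&gt;

```python
from collections import Counter


class LoopGuard:
    """Signals a stop once the same error has been seen max_repeats times."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def should_stop(self, error_message: str) -> bool:
        # Normalise so trivial whitespace or case changes still count as a repeat.
        key = error_message.strip().lower()
        self.seen[key] += 1
        return self.seen[key] >= self.max_repeats


guard = LoopGuard()
first = guard.should_stop("TypeError: x is undefined")
second = guard.should_stop("TypeError: x is undefined")
print(first, second)
```

&lt;p&gt;when &lt;code&gt;should_stop&lt;/code&gt; returns true, halt the run and ask the human instead of letting the agent try a third variation.&lt;/p&gt;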

&lt;p&gt;another forces the agent to check &lt;code&gt;USER.md&lt;/code&gt; before asking questions.&lt;/p&gt;

&lt;p&gt;every assumption the agent makes is a potential landmine. openclaw doesn't know your database schema. it doesn't remember that you told it yesterday to never touch the auth module. &lt;strong&gt;write it down.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;the agents that actually work are the ones with heavy custom instruction sets.&lt;/p&gt;


&lt;h2&gt;
  
  
  closing the chat kills the session
&lt;/h2&gt;

&lt;p&gt;i told openclaw to optimize some queries and message me when done. closed my laptop. came back the next morning to find it had done nothing.&lt;/p&gt;

&lt;p&gt;sessions die when you close the chat. they're stateful only while the window is open. when you reopen, you might get a summary, but the context, the stack, the "where was i" — gone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;what to do instead:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use openclaw's &lt;strong&gt;cron jobs&lt;/strong&gt; with isolated session targets&lt;/li&gt;
&lt;li&gt;these spin up fresh agent instances on a schedule, do one task, message you results, and die&lt;/li&gt;
&lt;li&gt;for one-off tasks, pair a simple &lt;code&gt;sqlite&lt;/code&gt; queue with a cron that checks it hourly
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# example cron entry&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; openclaw run &lt;span class="nt"&gt;--session-target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;daily-briefing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
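
&lt;p&gt;the sqlite queue can be tiny. this is a sketch of the shape i mean, using only python's standard library. the table layout and function names are my own. the cron side would call &lt;code&gt;claim_next&lt;/code&gt; once an hour and hand the prompt to a fresh agent run.&lt;/p&gt;

```python
import sqlite3


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the queue database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "prompt TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, prompt: str) -> None:
    """Add a one-off task for the next cron run to pick up."""
    conn.execute("INSERT INTO tasks (prompt) VALUES (?)", (prompt,))
    conn.commit()


def claim_next(conn: sqlite3.Connection):
    """Take the oldest pending task and mark it running, or return None."""
    row = conn.execute(
        "SELECT id, prompt FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


queue = open_queue()
enqueue(queue, "optimize the slow queries and message me when done")
print(claim_next(queue))
```

&lt;p&gt;use a real file path instead of &lt;code&gt;:memory:&lt;/code&gt; in practice, otherwise the queue disappears with the process.&lt;/p&gt;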


&lt;blockquote&gt;
&lt;p&gt;don't try to maintain long-running thinking sessions. they break, they cost money, and they hallucinate when context gets long.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  one working workflow beats five broken ones
&lt;/h2&gt;

&lt;p&gt;i tried setting up email, calendar, telegram, web scraping, and reporting all at once. everything broke, and i couldn't tell which integration was failing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;start with one thing that hurts slightly every day.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;i started with a morning briefing cron that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reads my calendar&lt;/li&gt;
&lt;li&gt;summarizes slack mentions&lt;/li&gt;
&lt;li&gt;messages me results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;i got that working end-to-end — running without me touching it, messaging me reliably, failing loudly instead of silently — before i added anything else.&lt;/p&gt;

&lt;p&gt;every new integration is a new failure mode. if things feel broken, run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw doctor &lt;span class="nt"&gt;--fix&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;half the "my agent is stupid" complaints are actually "my config is borked" problems.&lt;/p&gt;




&lt;h2&gt;
  
  
  compaction eats your memories
&lt;/h2&gt;

&lt;p&gt;openclaw has a context window. when it fills up, the system compacts older messages — which means it forgets stuff.&lt;/p&gt;

&lt;p&gt;i spent twenty minutes explaining my database schema once, then the agent compacted and hallucinated a new one. almost dropped a production table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;now i persist everything important:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;what&lt;/th&gt;
&lt;th&gt;how&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;long-running task state&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;JSON&lt;/code&gt; or &lt;code&gt;YAML&lt;/code&gt; state files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;user context&lt;/td&gt;
&lt;td&gt;&lt;code&gt;USER.md&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;behavior rules&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;architectural decisions&lt;/td&gt;
&lt;td&gt;decision logs read at session start&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;the less openclaw has to re-learn, the less it hallucinates.&lt;/p&gt;
&lt;/blockquote&gt;
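
&lt;p&gt;for the state files, something like this is enough. it is a minimal sketch using python's standard library. the function names and the example keys are hypothetical, and nothing here is openclaw specific.&lt;/p&gt;

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Write state to a temp file, then swap it in, so a crash mid-write
    never leaves a half-written state file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_path, path)


def load_state(path: str) -> dict:
    """Return the saved state, or an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    state_path = os.path.join(workdir, "task-state.json")
    save_state(state_path, {"task": "optimize queries", "step": 3})
    print(load_state(state_path))
```

&lt;p&gt;have the agent read this file at session start and write it after every completed step, so a compaction or a dead session costs you one step instead of the whole task.&lt;/p&gt;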




&lt;h2&gt;
  
  
  chat quality and agent quality are different animals
&lt;/h2&gt;

&lt;p&gt;i was using a model that gave beautiful, articulate chat responses. great reasoning. but it couldn't call tools to save its life. it generated malformed json and hallucinated function names.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;models that actually work for agentic coding:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;claude sonnet&lt;/code&gt; / &lt;code&gt;claude opus&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gpt-5.4&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kimi k2.5&lt;/code&gt; via api&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;models to avoid:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;deepseek reasoner&lt;/code&gt; — amazing at thinking, terrible at doing. it reasons beautifully about why your code is broken while generating completely broken tool calls.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gpt-5.3 mini&lt;/code&gt; — cheap, but it skips steps and ignores tool results. multiple people have called it useless for agent work.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;quick sanity test:&lt;/strong&gt;&lt;br&gt;
give your model three sequential tool calls. if it can't handle that without hand-holding, don't use it for autonomous work.&lt;/p&gt;
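
&lt;p&gt;you can score that test mechanically. here is a rough sketch, assuming the model's tool calls arrive as raw json strings with a &lt;code&gt;name&lt;/code&gt; and an &lt;code&gt;arguments&lt;/code&gt; field. that shape and the tool names are assumptions for illustration, so adapt them to whatever your provider actually returns.&lt;/p&gt;

```python
import json

KNOWN_TOOLS = {"read_file", "run_query", "send_message"}  # hypothetical tool names


def valid_tool_call(raw: str) -> bool:
    """A call passes if it is valid JSON naming a known tool with dict arguments."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False  # malformed json is an automatic fail
    return (
        isinstance(call, dict)
        and call.get("name") in KNOWN_TOOLS  # catches hallucinated function names
        and isinstance(call.get("arguments"), dict)
    )


def passes_sanity_test(raw_calls: list[str]) -> bool:
    """Require exactly three calls, all well formed."""
    return len(raw_calls) == 3 and all(valid_tool_call(c) for c in raw_calls)


calls = [
    '{"name": "read_file", "arguments": {"path": "schema.sql"}}',
    '{"name": "run_query", "arguments": {"sql": "SELECT 1"}}',
    '{"name": "send_message", "arguments": {"text": "done"}}',
]
print(passes_sanity_test(calls))
```

&lt;p&gt;this only checks the form of the calls. whether the model actually uses each tool result in the next step still needs a human look.&lt;/p&gt;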




&lt;h2&gt;
  
  
  you're not bad at this. it's just early.
&lt;/h2&gt;

&lt;p&gt;the gap between polished demos and daily use is real. when someone posts "my agent built a saas overnight," you're seeing the highlight reel. you're not seeing the three weeks they spent tuning prompts and debugging why openclaw kept trying to pay aws with monopoly money.&lt;/p&gt;

&lt;p&gt;this stuff is genuinely hard right now. not "you need a cs degree" hard. just "the tools are immature" hard.&lt;/p&gt;

&lt;p&gt;the people making it work are treating these agents like orchestras, not autopilots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the four rules that changed everything for me:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;start with one cron job&lt;/li&gt;
&lt;li&gt;write one guardrail file&lt;/li&gt;
&lt;li&gt;use the cheap model&lt;/li&gt;
&lt;li&gt;save your state&lt;/li&gt;
&lt;/ol&gt;
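
&lt;p&gt;rule 4 is the easiest one to show in code. a minimal sketch, assuming the state you care about fits in one json file; the helper names and whatever path you pass in are hypothetical, not an openclaw feature.&lt;/p&gt;

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """write state atomically so a crash mid-write never leaves a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # write to a temp file in the same directory, then swap it into place
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(state, tmp, indent=2)
    os.replace(tmp_name, path)  # atomic rename on the same filesystem


def load_state(path: Path) -> dict:
    """return the saved state, or an empty dict on the first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

&lt;p&gt;call &lt;code&gt;load_state&lt;/code&gt; at session start and &lt;code&gt;save_state&lt;/code&gt; after each finished task, so the next session picks up where the last one stopped instead of re-learning it.&lt;/p&gt;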




&lt;blockquote&gt;
&lt;p&gt;it gets easier. don't give up before the compound interest kicks in.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>openclawchallenge</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>My First Glimpse Into the Agentic Era: Google Cloud NEXT 2026 Keynote Reflection</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Thu, 23 Apr 2026 13:56:35 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/my-first-glimpse-into-the-agentic-era-google-cloud-next-2026-keynote-reflection-fbi</link>
      <guid>https://dev.to/aniruddhaadak/my-first-glimpse-into-the-agentic-era-google-cloud-next-2026-keynote-reflection-fbi</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-cloud-next-2026-04-22"&gt;Google Cloud NEXT Writing Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Moment It All Started
&lt;/h2&gt;

&lt;p&gt;It was the afternoon of &lt;strong&gt;April 22, 2026&lt;/strong&gt;, around &lt;em&gt;1 PM IST&lt;/em&gt; for me here in Kolkata.&lt;/p&gt;

&lt;p&gt;I was casually working at my desk when the Google Cloud NEXT Challenge notification popped up on my screen.&lt;/p&gt;

&lt;p&gt;Without a second thought, I jumped into the Cida Live stream on YouTube to catch the keynote.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Opening That Hooked Me
&lt;/h2&gt;

&lt;p&gt;The keynote opened with &lt;strong&gt;Sundar Pichai&lt;/strong&gt; sharing a stat that truly stopped me in my tracks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;75 percent of all code is now written by AI, and it is verified by engineers.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a massive shift in how we build software today.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Stood Out for Me
&lt;/h2&gt;

&lt;p&gt;The keynote had many moments, but a few things really caught my attention.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google Enterprise Agent
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Google Enterprise Agent&lt;/strong&gt; was one of the most exciting announcements.&lt;/p&gt;

&lt;p&gt;It showed how businesses can now deploy AI agents at scale across their organizations.&lt;/p&gt;

&lt;p&gt;This is not just automation; it is a whole new way of working.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google AI Generation Chip
&lt;/h3&gt;

&lt;p&gt;I was really impressed by the new &lt;em&gt;Google AI generation chip&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The power and efficiency it brings to AI workloads are something every developer should care about.&lt;/p&gt;




&lt;h2&gt;
  
  
  Gemini Live and Real World Impact
&lt;/h2&gt;

&lt;p&gt;One of the most inspiring moments was the &lt;strong&gt;Gemini Live athlete demo&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Gemini was guiding an athlete in real time, showing exactly what to do and tracking progress.&lt;/p&gt;

&lt;p&gt;It pointed out mistakes and suggested corrections on the fly.&lt;/p&gt;

&lt;p&gt;That is the kind of AI assistance I want in my daily workflow too.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Agentic Era Is Here
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Google Assistant Platform&lt;/strong&gt; demo blew my mind.&lt;/p&gt;

&lt;p&gt;Agents can now talk to each other, coordinate tasks, and build complete agentic workflows.&lt;/p&gt;

&lt;p&gt;With the &lt;em&gt;Agent Development Kit&lt;/em&gt;, developers can create multi-agent systems with ease.&lt;/p&gt;

&lt;p&gt;This is the agentic era that Google is building, and I am here for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Big Partnerships That Matter
&lt;/h2&gt;

&lt;p&gt;Google is not building this alone.&lt;/p&gt;

&lt;p&gt;They are collaborating with &lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;NVIDIA&lt;/strong&gt;, &lt;strong&gt;SpaceX&lt;/strong&gt;, and even &lt;strong&gt;NASA&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These partnerships show how seriously Google is taking AI development and deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  YouTube with AI Built In
&lt;/h2&gt;

&lt;p&gt;Another cool reveal was &lt;em&gt;YouTube with an AI assistant&lt;/em&gt; built right in.&lt;/p&gt;

&lt;p&gt;You can ask questions about your TV, request content, and get personalized recommendations.&lt;/p&gt;

&lt;p&gt;It is like having a smart assistant living inside your entertainment experience.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Take on It All
&lt;/h2&gt;

&lt;p&gt;Every model capability mentioned felt like a step forward for developers like me.&lt;/p&gt;

&lt;p&gt;Google is clearly pushing everything toward being &lt;em&gt;AI-powered&lt;/em&gt; and &lt;em&gt;agent-driven&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That is exactly the direction I want to grow in as a developer.&lt;/p&gt;

&lt;p&gt;The whole keynote was &lt;strong&gt;epic&lt;/strong&gt;, and I am excited to see where this journey leads.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;This was my first experience tuning into Google Cloud NEXT, and it set a high bar.&lt;/p&gt;

&lt;p&gt;The future of AI is not just coming; it is already here.&lt;/p&gt;

&lt;p&gt;And I am ready to be part of it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Thanks for reading my first Google Cloud NEXT reflection.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devchallenge</category>
      <category>cloudnextchallenge</category>
      <category>googlecloud</category>
      <category>ai</category>
    </item>
    <item>
      <title>The Night OpenClaw Completely Ghosted Me: My Real Headache Story as a Kolkata AI Agent Engineer</title>
      <dc:creator>ANIRUDDHA  ADAK</dc:creator>
      <pubDate>Tue, 21 Apr 2026 12:48:00 +0000</pubDate>
      <link>https://dev.to/aniruddhaadak/the-night-openclaw-completely-ghosted-me-my-real-headache-story-as-a-kolkata-ai-agent-engineer-3088</link>
      <guid>https://dev.to/aniruddhaadak/the-night-openclaw-completely-ghosted-me-my-real-headache-story-as-a-kolkata-ai-agent-engineer-3088</guid>
      <description>&lt;p&gt;Hey DEV Community 👋  &lt;/p&gt;

&lt;p&gt;I’m &lt;strong&gt;ANIRUDDHA ADAK&lt;/strong&gt; (@aniruddhadak on X), final-year B.Tech CSE student at BBIT Kolkata and a full-time AI Agent Engineer who lives and breathes this stuff.  &lt;/p&gt;

&lt;p&gt;After my last post about the &lt;a href="https://dev.to/aniruddhaadak/my-openclaw-journey-30-hands-on-experiences-that-built-my-wealth-of-knowledge-kolkata-ai-agent-2hd1"&gt;30 wins that made OpenClaw my 24/7 lobster-powered sidekick&lt;/a&gt;, I promised myself I’d also share the messy, frustrating, headache-inducing side. Because let’s be real: no tool is perfect, especially one that’s still growing fast.&lt;/p&gt;

&lt;p&gt;This is the raw, first-person story of the night OpenClaw straight-up failed me, ignored my commands, threw ridiculous errors, and left me staring at my screen at 2 AM in my Kolkata room wondering why I trusted a lobster with my workflow.&lt;/p&gt;




&lt;p&gt;It started simple enough.  &lt;/p&gt;

&lt;p&gt;I was deep in a side project — one of my AI agent experiments that needed to scrape some public data, organize it into a clean Markdown file, and push it to a private GitHub repo. I had done similar tasks before and OpenClaw had nailed them. So I fired up WhatsApp at around 11 PM IST and typed a clear, detailed prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Run a full web research on the latest Ollama model releases, compile them into a clean table in results.md, commit it with message ‘Updated Ollama models - April 2026’, and push to my private repo. Use exec tools only. Confirm each step.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The lobster replied instantly with its usual confidence:&lt;br&gt;&lt;br&gt;
“Got it, Aniruddha! Starting research now… ✅”&lt;/p&gt;

&lt;p&gt;Then… nothing.&lt;/p&gt;

&lt;p&gt;For the next 45 minutes it kept sending half-hearted updates like:&lt;br&gt;&lt;br&gt;
“Browsing sites…”&lt;br&gt;&lt;br&gt;
“Compiling table…”&lt;br&gt;&lt;br&gt;
“Almost done…”&lt;/p&gt;

&lt;p&gt;But when I checked my repo? Empty. No results.md. No commit. Nothing.&lt;/p&gt;

&lt;p&gt;I tried again, this time even more specific. Same thing. It would promise, loop, and ghost me.&lt;/p&gt;

&lt;p&gt;Then came the error that made me want to throw my laptop out the window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I saw that message pop up in WhatsApp at least 12 times that night. No matter how I rephrased the command, it kept failing the tool call. I even switched from Claude to a local model — same nonsense.&lt;/p&gt;

&lt;p&gt;At one point it literally told me:&lt;br&gt;&lt;br&gt;
“I cannot execute commands, I have no exec tool”  &lt;/p&gt;

&lt;p&gt;…even though I had explicitly enabled full elevated tool access in the config and confirmed it in the gateway dashboard. Classic.&lt;/p&gt;

&lt;p&gt;I tried the nuclear option — restarted the gateway, ran &lt;code&gt;openclaw doctor --fix&lt;/code&gt;, cleared the session with &lt;code&gt;/new&lt;/code&gt;, even rolled back to an older version. Still nothing. It would accept the task, act like it was working, then either hang or give placeholder replies with zero actual execution.&lt;/p&gt;

&lt;p&gt;By 3 AM I had burned through way more tokens than I care to admit (the retry loop was ruthless), my head was pounding, and my once-exciting agent project was now just sitting there mocking me.&lt;/p&gt;

&lt;p&gt;I finally gave up, did the task manually in 20 minutes, and went to sleep frustrated.&lt;/p&gt;




&lt;p&gt;The next morning I dug into Reddit (r/openclaw, r/clawdbot, r/AI_Agents) and realized I wasn’t alone. Tons of people were posting about the exact same pain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Updates breaking exec tools overnight
&lt;/li&gt;
&lt;li&gt;“Failed to call a function” becoming the most common error
&lt;/li&gt;
&lt;li&gt;Agents promising the world but never actually running shell commands or git pushes
&lt;/li&gt;
&lt;li&gt;Infinite retry loops that quietly drain your API budget
&lt;/li&gt;
&lt;/ul&gt;
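
&lt;p&gt;The retry loops in that list are the easiest of these to guard against from the outside. Here is a minimal sketch of a hard ceiling on attempts and token spend, assuming each agent step can report whether it succeeded and how many tokens it used. &lt;code&gt;run_with_budget&lt;/code&gt;, &lt;code&gt;BudgetExceeded&lt;/code&gt;, and the &lt;code&gt;step&lt;/code&gt; callable are hypothetical names, not OpenClaw APIs, and the default limits are arbitrary.&lt;/p&gt;

```python
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when retries or token spend hit the configured ceiling."""


def run_with_budget(
    step: Callable[[], tuple[bool, int]],
    max_attempts: int = 3,
    max_tokens: int = 20_000,
) -> int:
    """Call `step` until it succeeds; return the total tokens spent.

    `step` is a hypothetical callable returning (succeeded, tokens_used).
    """
    spent = 0
    for attempt in range(1, max_attempts + 1):
        succeeded, tokens_used = step()
        spent += tokens_used
        if succeeded:
            return spent
        # Stop early once the token ceiling is reached, even with attempts left.
        if spent >= max_tokens:
            raise BudgetExceeded(f"spent {spent} tokens after {attempt} attempts")
    raise BudgetExceeded(f"gave up after {max_attempts} attempts ({spent} tokens)")
```

&lt;p&gt;The point of raising instead of retrying quietly is that the failure becomes loud: the loop stops, and a human decides what happens next.&lt;/p&gt;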

&lt;p&gt;It wasn’t just me. OpenClaw was still early, and these kinds of silent failures and ignored commands were hitting a lot of us.&lt;/p&gt;




&lt;p&gt;But here’s the part that still keeps me hooked: even after that nightmare night, I didn’t uninstall it.  &lt;/p&gt;

&lt;p&gt;I learned three hard lessons that night:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Always start a fresh session (&lt;code&gt;/new&lt;/code&gt;) before important tasks — old context can silently break tool calling.
&lt;/li&gt;
&lt;li&gt;Double-check tool permissions in &lt;code&gt;openclaw.json&lt;/code&gt; after every update (the “ask: off” + “security: full” combo saved me later).
&lt;/li&gt;
&lt;li&gt;Never trust it 100% on autopilot yet. Human oversight is still mandatory, especially for anything that touches git or the terminal.&lt;/li&gt;
&lt;/ol&gt;
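
&lt;p&gt;Lesson 2 can be partly automated. This is a rough sketch, assuming &lt;code&gt;openclaw.json&lt;/code&gt; is plain JSON and that the two settings appear as keys named &lt;code&gt;ask&lt;/code&gt; and &lt;code&gt;security&lt;/code&gt; with string values. It does not assume where those keys sit in the file, so the helper searches every level. The function names are made up for illustration and the real config layout may differ.&lt;/p&gt;

```python
# Assumed from the lesson above: the working values were ask=off and security=full.
# Where these keys live inside openclaw.json is NOT known here, so search by name.
EXPECTED = {"ask": "off", "security": "full"}


def find_settings(node, wanted, path=""):
    """Yield (dotted_path, key, value) for every key in `wanted`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in wanted:
                yield here, key, value
            yield from find_settings(value, wanted, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_settings(item, wanted, f"{path}[{index}]")


def permission_drift(config: dict) -> list[str]:
    """List every expected setting that is missing or has a different value."""
    found = list(find_settings(config, EXPECTED))
    seen = {key for _, key, _ in found}
    report = [f"{key}: not found" for key in EXPECTED if key not in seen]
    report += [
        f"{where}: expected {EXPECTED[key]!r}, got {value!r}"
        for where, key, value in found
        if value != EXPECTED[key]
    ]
    return report
```

&lt;p&gt;After an update, load the file with &lt;code&gt;json.loads&lt;/code&gt; and pass the result to &lt;code&gt;permission_drift&lt;/code&gt;. An empty list means both settings still match; anything else is worth a look before handing the agent a real task.&lt;/p&gt;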

&lt;p&gt;That headache actually made me a better AI builder. It forced me to understand the internals, read the config deeper, and set up better safeguards (like cost guardians and sandbox checks).&lt;/p&gt;

&lt;p&gt;OpenClaw is still the most powerful personal agent I’ve used — when it works, it feels magical. But when it doesn’t… it really doesn’t.&lt;/p&gt;

&lt;p&gt;And that’s okay. That’s how real tools grow.&lt;/p&gt;

&lt;p&gt;If you’ve had your own “lobster ghosted me” moment, drop it in the comments. My OpenClaw is (hopefully) reading them right now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exfoliate! Exfoliate! 🦞&lt;/strong&gt; (even on the bad days)&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;ANIRUDDHA ADAK&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Kolkata, West Bengal, India | April 17, 2026&lt;br&gt;&lt;br&gt;
(X: &lt;a href="https://x.com/aniruddhadak" rel="noopener noreferrer"&gt;@aniruddhadak&lt;/a&gt; — I test every AI agent so you don’t have to)&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;This is a separate companion post&lt;/strong&gt; for the OpenClaw Challenge (Wealth of Knowledge track). The shiny 30-experience version is my love letter. This one is the honest truth.  &lt;/p&gt;

&lt;p&gt;Both sides make the full picture.  &lt;/p&gt;




&lt;p&gt;Thanks, happy building ...&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>openclawchallenge</category>
      <category>ai</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
