Mukunda Rao Katta

Posted on May 19

gemma4-safe-agent: a tool-using research agent on Gemma 4 e2b

#gemma #gemmachallenge #devchallenge #showdev

Gemma 4 Challenge: Build With Gemma 4 Submission

Submission for the Gemma 4 DEV Challenge, Build track. Companion to my Write-track post on the five libs behind it.

What it is

A tool-using research agent that runs locally on Gemma 4 e2b via Ollama, in around 200 lines of Node.

You give it a question. It picks between two tools, reads a Wikipedia page, then returns a structured JSON answer with sources. No API key. No rate limit. Two GB of RAM and an Ollama instance are the whole stack.

ollama pull gemma4:e2b
git clone https://github.com/MukundaKatta/gemma4-safe-agent
cd gemma4-safe-agent && npm install
npm run demo -- "What is RLHF?"

{
  "final": "RLHF is a technique that uses human preferences as a reward signal to fine-tune language models.",
  "sources": ["https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback"],
  "steps": 2
}

Repo: github.com/MukundaKatta/gemma4-safe-agent

Why Gemma 4 e2b specifically

Gemma 4 ships in four sizes: e2b and e4b for edge and mobile, a 26B Mixture-of-Experts model, and a 31B dense model for servers. I picked e2b on purpose.

Reasons:

Runs anywhere. Two GB of RAM, no network needed for inference, no key. The agent works on a CI runner, a Raspberry Pi, an old MacBook. The bigger sizes do not.
Hardest reliability case. A 2B-class model makes more parse mistakes and more arg mistakes than a 26B one. If the scaffolding holds at the 2B level, the bigger ones are a drop-in, for example via GEMMA_MODEL=gemma4:e4b.
Real product surface. Cheap, fast, local agents are where on-device AI is going. e2b is the right target for the kind of agent you'd actually ship in a desktop app, a mobile shell, or a browser extension.

The same agent runs against any of the four Gemma 4 variants with one env var change.
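As a rough illustration of that switch, here is a minimal sketch of how the model tag could be resolved and passed to Ollama. The helper names `resolveModel` and `buildChatRequest` are hypothetical and the repo's own `ollamaChat` may be structured differently; the endpoint and body shape follow Ollama's documented `/api/chat` API on its default local port.

```javascript
// Hypothetical helpers (not taken from the repo): pick the model tag from
// the environment and build a non-streaming Ollama chat request.
const DEFAULT_MODEL = 'gemma4:e2b';

function resolveModel(env = process.env) {
  const tag = (env.GEMMA_MODEL || '').trim();
  return tag === '' ? DEFAULT_MODEL : tag;
}

function buildChatRequest(messages, env = process.env) {
  return {
    url: 'http://localhost:11434/api/chat', // Ollama's default local address
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // stream: false asks Ollama for one JSON response instead of chunks
      body: JSON.stringify({ model: resolveModel(env), messages, stream: false }),
    },
  };
}
```

With these, a call would look like `const { url, init } = buildChatRequest(messages); const res = await fetch(url, init);`, and the reply text sits in `(await res.json()).message.content`.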

How it works

The whole agent is a small loop:

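// One iteration per model turn; MAX_STEPS bounds the loop.
for (let step = 0; step < MAX_STEPS; step++) {
  // Fit the history into a 4096-token budget; the options ask to keep
  // the system prompt and the last 2 messages.
  const fitted = fit(messages, { maxTokens: 4096, preserveSystem: true, preserveLastN: 2 });
  const raw = await ollamaChat(fitted.messages);
  const action = parseAction(raw);

  if (action.kind === 'tool') {
    // Run the chosen tool, then feed its result back as the next user message.
    const result = await TOOLS[action.tool].fn(action.args);
    messages.push({ role: 'assistant', content: raw });
    messages.push({ role: 'user', content: `tool_result: ${result}` });
    continue;
  }

  // Not a tool call: have the model restate its answer as validated JSON.
  return cast({ llm, validate, prompt: 'Restate as JSON: ...' });
}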
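The loop leans on `parseAction` to decide between a tool call and a final answer. The post does not show it, so the following is only a sketch under assumptions: the model is assumed to emit `{"tool": "<name>", "args": {...}}` for a tool call, the extra `toolNames` parameter is my addition, and `wiki_search` in the usage note is a made-up tool name (only `fetch_url` is named in this post).

```javascript
// Sketch only: the wire format {"tool": ..., "args": ...} is an assumption,
// not the repo's confirmed protocol.
function parseAction(raw, toolNames) {
  const asFinal = { kind: 'final', text: raw.trim() };

  // Small models often wrap JSON in prose, so take the outermost braces.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return asFinal;

  let obj;
  try {
    obj = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return asFinal; // unparseable JSON is treated as a plain answer
  }

  // Only accept tools that actually exist; anything else is a final answer.
  if (typeof obj.tool === 'string' && toolNames.includes(obj.tool)) {
    const argsOk = obj.args && typeof obj.args === 'object' && !Array.isArray(obj.args);
    return { kind: 'tool', tool: obj.tool, args: argsOk ? obj.args : {} };
  }
  return asFinal;
}
```

For example, `parseAction('Sure: {"tool":"fetch_url","args":{"url":"https://en.wikipedia.org/wiki/RLHF"}}', ['fetch_url', 'wiki_search'])` yields a `tool` action, while plain prose falls through to `final`.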

The whole run is wrapped in an agentguard.firewall block. Each tool is wrapped with agentvet.vet and agentsnap.traceTool. That gives me:

Context budget management so Gemma 4 e2b never blows its small window
Network egress allowlist so a prompt injection cannot redirect the agent to fetch an attacker URL
Tool-arg validation so a hallucinated fetch_url({ url: 12345 }) never runs
Trace snapshots so swapping models or tweaking prompts shows up as a CI diff, not a production surprise
Final-answer JSON enforcement with a validate-and-retry loop, which is the load-bearing piece for getting clean JSON out of a 2B model
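To make the arg-validation and egress-allowlist items concrete, here is a plain-JavaScript sketch of the kind of check involved. It deliberately does not use the agentvet or agentguard APIs, which are not shown in this post; `checkFetchUrlArgs` and the single-host allowlist are assumptions for illustration.

```javascript
// Illustration only: not the agentvet / agentguard API.
// Assumed allowlist: the agent only needs to read Wikipedia.
const ALLOWED_HOSTS = new Set(['en.wikipedia.org']);

function checkFetchUrlArgs(args) {
  // Type check first: catches a hallucinated fetch_url({ url: 12345 }).
  if (!args || typeof args.url !== 'string') {
    return { ok: false, error: 'url must be a string' };
  }
  let parsed;
  try {
    parsed = new URL(args.url);
  } catch {
    return { ok: false, error: 'url is not a valid URL' };
  }
  if (parsed.protocol !== 'https:') {
    return { ok: false, error: 'only https is allowed' };
  }
  // Compare the parsed hostname, not a substring of the raw string, so
  // "https://en.wikipedia.org@evil.example/" is rejected.
  if (!ALLOWED_HOSTS.has(parsed.hostname)) {
    return { ok: false, error: `host not allowlisted: ${parsed.hostname}` };
  }
  return { ok: true, url: parsed.href };
}
```

The error strings are written so they can be fed back to the model as the tool result, giving it one chance to correct the call.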

I wrote about the scaffolding in detail in the Write-track companion post. Here the focus is the agent and the demo.

What you can run

The repo ships three entry points:

npm run demo -- "...": real run against your local Gemma 4 e2b
npm run demo:mock: same agent, with fetch_url returning canned pages (no internet needed)
AGENT_MOCK=1 node examples/run-stub.js: deterministic stub LLM in place of Gemma 4, so the whole pipeline runs in CI without any model at all

The third one is the one I use for snapshot regression tests. It checks that the agent's tool-use behavior stays stable even with the LLM swapped out.
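For readers who have not used this pattern, a deterministic stub can be as small as a function that replays scripted replies. This is a sketch of the idea, not the contents of examples/run-stub.js; `makeStubLLM` and the reply format in the usage note are hypothetical.

```javascript
// Hypothetical stub: replays scripted replies in order, so every run of
// the agent loop sees exactly the same "model" output.
function makeStubLLM(script) {
  const calls = []; // history length at each call, handy for snapshots
  let next = 0;

  async function chat(messages) {
    if (next >= script.length) {
      // Fail loudly if the agent asks for more turns than were scripted.
      throw new Error('stub LLM script exhausted');
    }
    calls.push(messages.length);
    return script[next++];
  }

  return { chat, calls };
}
```

A test might script one tool-call reply followed by one final answer, pass `stub.chat` where the loop expects its chat function, and compare the recorded tool calls against a stored snapshot.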

What surprised me

Two things.

Gemma 4 e2b picks the right tool more often than I expected. The model is small but the tool-selection task is well-bounded ("you have these two tools, here's the schema, return one JSON"). When the surrounding scaffolding catches arg mistakes and JSON glitches, the model's reasoning is the part that doesn't need help.
The final-answer step is where the model really needs the cast loop. Asking for "JSON only, no prose" still produced "Sure here you go: {...}" often enough that I would not trust the agent without agentcast wrapping that step. With it, the post-condition becomes a guarantee.
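A validate-and-retry loop of this kind can be sketched in a few lines. This is not agentcast's API; `extractJson`, `validateFinal`, and `castFinal` are hypothetical names, and the validated shape (a `final` string plus a `sources` array of strings) is assumed from the example output at the top of the post.

```javascript
// Sketch of a validate-and-retry loop; not the agentcast implementation.

// Pull the outermost {...} out of replies like "Sure here you go: {...}".
function extractJson(text) {
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }
}

// Assumed answer shape: { final: string, sources: string[] }.
function validateFinal(obj) {
  return typeof obj.final === 'string'
    && Array.isArray(obj.sources)
    && obj.sources.every((s) => typeof s === 'string');
}

// llm is any async function from prompt string to reply string.
async function castFinal(llm, prompt, maxRetries = 2) {
  let lastError = 'no attempt made';
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // On a retry, tell the model why its previous output was rejected.
    const hint = attempt === 0 ? '' : `\nPrevious output was rejected: ${lastError}. Return JSON only.`;
    const obj = extractJson(await llm(prompt + hint));
    if (obj === null) { lastError = 'not parseable as JSON'; continue; }
    if (!validateFinal(obj)) { lastError = 'missing "final" string or "sources" array'; continue; }
    return obj;
  }
  throw new Error(`no valid JSON after ${maxRetries + 1} attempts: ${lastError}`);
}
```

In this sketch the caller either gets an object that passed validation or an exception after the retry budget is spent; it never gets unvalidated text.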

Try it

Repo: github.com/MukundaKatta/gemma4-safe-agent (MIT)

Issues and PRs welcome. The five scaffolding libs are all on npm under @mukundakatta/* and are zero-dep, so you can pull them into your own Gemma 4 projects one at a time.

If you build something on top of this, drop me a link.

Have fun with Gemma 4.

Top comments (2)

thehwang • May 19

Nice! I'm researching the same angle (agent + e2b) for the next challenge. Curious how you handled tool-use reliability at this size. In my multimodal cross-reference test e2b had two distinct failure modes depending on prompt structure.

Mukunda Rao Katta • May 29

Tool-use reliability was the hardest part at this size. What worked: a strict JSON schema per tool plus a coerce-and-retry step. The small model got intent right but arg types wrong (a string where it needed an int). The two failure modes I saw were skipping the tool and just narrating, or calling with malformed args. Tight schema plus one self-correction turn fixed most of both. Which two did you hit?