This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
mac-agent-rust is an early-stage local AI agent that controls your Mac through
natural language — running entirely on-device, no cloud, no API keys.
It is under active development. Right now it can:
- navigate Safari tabs
- Draft and compose mails in the Mail app
- Create notes in Apple Notes
- Display system notifications
- Get and set system volume
You type what you want in plain English. The agent writes the AppleScript, runs it
on your machine, reads the result, and responds. Everything stays local.
Demo
Code
github.com/Tejas1Koli/mac-agent-rust
Built with:
- Rust — agent loop, tool execution, history management
- Rig — LLM client and tool calling framework based on rust
- Ollama — local model inference
- osascript — AppleScript execution via macOS native runtime
How I Used Gemma 4
I used Gemma 4 E4B MLX — the 4 billion parameter multimodal variant running via
Ollama's MLX backend, optimized specifically for Apple Silicon. On an M2 MacBook Air
it runs fast with no fan spin, no internet, and no API key — pure on-device inference
using the unified memory architecture.
The agent runs Gemma 4 E4B in a tool-calling loop: receive task → write AppleScript
→ call the tool → read output → respond. The MLX backend gets significantly better
token throughput than llama.cpp on Apple Silicon, which matters for a conversational
agent where latency is noticeable.
E4B was the right fit — big enough to reliably generate valid AppleScript and follow
strict tool-call formatting, small enough to run locally without breaking a sweat on
consumer Mac hardware. macOS automation tasks are narrow and structured, not
open-ended reasoning problems, so you don't need a 31B model to get good results.
Why Rust
Most AI agents are Python. I wanted a single binary, proper error handling around
arbitrary system command execution, and Tokio's async runtime for process timeouts.
Rig made tool-calling feel native — a tool is just a trait, schema mismatches are
caught at compile time. cargo install and it runs, no virtualenv, no runtime.
try
cargo install mac-agent-rust
Limitations
- Small models struggle with tool call formatting — Gemma 4 E4B occasionally returns empty responses or malformed tool calls, requiring retries
- No automatic retry on error — when a script fails the model tends to report the error to the user instead of fixing and retrying the script
- Context degradation — long conversations cause the model to lose track of tool results and contradict correct output it just received
- Tool result confusion — the model sometimes receives correct output but claims the script failed, especially when results are wrapped in JSON instead of plain text
- Spotify and Music app — track info only works when something is actively playing, errors out when paused with no graceful handling
- Safari JavaScript — requires "Allow JavaScript from Apple Events" to be manually enabled in Safari's Developer settings, not obvious to first time users
- History resets every session — no persistence between runs, agent has no memory of previous conversations
Top comments (0)