I Built The UnoQ's Claw: A Tiny Agentic AI Assistant That Lives Inside an Arduino Uno Q

David Laurenvil — Thu, 21 May 2026 05:42:00 +0000

Every "AI on hardware" demo you have ever seen has a LLM behind it. The user talks to a board via a terminal or Telegram, and the board calls an API to have a cloud model do the work. QClaw flips that arrangement. The Arduino Uno Q hosts the language model, runs the agent loop, drives the compile toolchain, and flashes its own microcontroller.

Ask QClaw to scroll "QClaw" across the LED matrix and it does. End to end. On the board. Offline.

QClaw has an eight-tool agentic surface, a fifteen-skill pre-router, and a direct OpenOCD flash route that makes autonomous uploads actually execute. A dual-path runtime lets you pick speed or full hardware control on the same model.

Why the Uno Q Is the Right Board for This

The Arduino Uno Q is a split-silicon device. It looks like a classic Arduino on the outside, but it is two boards in a trench-coat:

The MPU and MCU share the same PCB. The MPU can hold the MCU in reset and reprogram its flash directly through GPIO pins wired to SWD via the linuxgpiod driver. No USB cable between them. No probe. No second machine. That is the genius of QClaw.

The agentic loop orchestrates the full sketch lifecycle across the Arduino Uno Q's dual-silicon topology, with the MPU driving the loop and the MCU executing the resulting firmware. This is how QClaw generates, compiles, flashes, and observes.

The QClaw arduino tool invokes OpenOCD directly at the correct address. The tool compiles with arduino-cli compile --fqbn arduino:zephyr:unoq --export-binaries, picks up the resulting .elf-zsk.bin, and pipes it through OpenOCD over the GPIO SWD bridge. No SSH, no network credentials, no remote OCD tunnel. Just MPU to MCU, on the same board. Sub-second flash once the binary is on disk.

Four gigs of RAM is also more than enough to host a Qwen3.5 0.8B Q4_0 model with an 8K context window, mlocked, with q8_0 KV cache. QClaw lives on the Uno Q at around 1.3 GB. Decode runs at roughly 8 tokens per second. Slow next to a desktop GPU, fast enough that a sketch compiles and flashes before you have finished your coffee.

*How To Use QClaw *

QClaw ships two runtimes on top of the same llama-server backend, the same SOUL.md, and the same 23-rule pre-router. They differ in what wraps the LLM call.

Agentic path (make qclaw-agentic). the qclaw Go gateway sits in front of the model. It runs channel adapters (terminal, SSH, Telegram), the multi-iteration agent loop, the pre-router, and the eight-tool dispatcher. This is the production default. It is the only path that can actually compile and flash a sketch.
Direct path (make qclaw-direct). A thin Python REPL POSTs directly to llama-server after running the same pre-router rules in Python. No loop, no tools, no Telegram. About 33 percent lower latency on pure factual prompts at equivalent correctness, because there is no tool schema in the prompt and no second iteration.

Use the agentic path when you want a sketch flashed or a frame captured. Use the direct path when you just want to ask which pins on the Uno Q do PWM.

Drop in two commands and you have a session:

git clone https://github.com/laurenvil/Uno-QClaw.git ~/ArduinoApps/QClaw     

cd ~/ArduinoApps/QClaw     

git submodule update --init --recursive     

# Download the inference engine     

cd yzma && make download-llama.cpp && cd ..     

# Download the model (~490 MB for Q4_0)     

mkdir -p ~/models     

wget -O ~/models/Qwen_Qwen3.5-0.8B-Q4_0.gguf \     

     'https://huggingface.co/Qwen/Qwen3.5-0.8B-GGUF/resolve/main/Qwen3.5-0.8B-Q4_0.gguf'     

# Build, install arduino-cli, configure (one time)     

make qclaw-install     

# Start a session — pick a path     

make qclaw-agentic    # full agent loop + 8 tools (compile/upload/camera/sysfs_led/network/i2cdetect)     

make qclaw-direct     # pre-router + direct API (fast Q&A, no tools)

make qclaw-install builds the Go binary, copies the system prompt and the fifteen-skill tree into ~/.qclaw/workspace/, installs arduino-cli plus the arduino:zephyr core, and runs an interactive wizard that sets up the optional Telegram gateway.

Once it is running, the agent has eight narrowly-scoped tools available:

read_file, write_file, list_dir for workspace navigation
arduino for compile and flash via OpenOCD
camera for single-frame V4L2 capture through GStreamer
sysfs_led for the MPU-side RGB LEDs at /sys/class/leds/*
network for hostname, interfaces, and the default gateway, all read-only stdlib Go
i2cdetect for listing and scanning Linux I²C buses with -y -r only
No general exec. No general shell. Every tool validates its arguments against an allow-list. The total tool schema is around 3.4K characters, leaving plenty of room for the system prompt at an 8K context window.

The Pre-Router: Skills, Not RAG

The pre-router is the part of QClaw that does the heavy lifting on a 0.8B model. It is not RAG. It is a flat table of 23 keyword regex rules across 15 skills. When you send a message, the pre-router scans it, finds matching rules, and inlines the relevant SKILL.md plus its referenced files directly into the system prompt before the LLM call.

The model never has to call read_file for canonical skill content. The content is already there. At 0.8B scale, a read_file call costs a full LLM iteration, roughly 10 to 20 minutes of cold prefill plus decode. The pre-router amortizes that to zero.

*The skills cover: *

Sketch fundamentals: blink, breathe, button, potentiometer, servo, compile and upload, CAN bus, DAC, OPAMP
The 13x8 LED matrix with the canonical Arduino_LED_Matrix template
Uno Q hardware: pin tables, voltage rules, connectors, power
Dual-chip workflow: Bridge RPC, App Lab, Bricks
Linux-side capabilities: Wi-Fi, Bluetooth, camera, OpenCV, microphone, sysfs LEDs
Plug-and-play Modulino sensors
Each skill is just a directory under workspace/skills/<name>/ with a SKILL.md and optional reference files. Adding a new one is a matter of writing the markdown and adding a regex rule.

*Try It *

Repo:https://github.com/laurenvil/Uno-QClaw

Issues, forks, and pull requests are welcome at https://github.com/laurenvil/Uno-QClaw. If you have an Arduino Uno Q on your desk, you can have a self-flashing AI assistant sitting on it tonight, with the Ethernet cable unplugged.

DEV Community: David Laurenvil

I Built The UnoQ's Claw: A Tiny Agentic AI Assistant That Lives Inside an Arduino Uno Q