Like many of you, I’ve been messing around with OpenClaw (formerly clawdbot) and the whole "vibe coding" concept. It's cool, but finding a decent tool that actually drives the UI on Linux was a pain. Everything seems to be Mac-first right now.
Since I do all my local inference on Linux, I built a dedicated tool for it.
It's called Peepbo: https://github.com/LichAmnesia/peepbo
Basically, it's a lightweight Node/TS wrapper that connects your local VLM (LLaVA, Qwen-VL, etc.) to your Linux desktop environment.
How it works:
- Vision: Wraps `scrot`, `gnome-screenshot`, or `gdbus` so the model can see the screen.
- Control: Uses `xdotool` to handle mouse/keyboard inputs.
- Wayland: Yes, it works on GNOME Wayland, but you'll need to run in unsafe mode (details in the readme).
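To make the vision and control pieces above a bit more concrete, here's a rough sketch of how a Node/TS wrapper around these CLI tools could look. This is not Peepbo's actual code: the `Action` type and the `screenshotArgs`, `xdotoolArgs`, `capture`, and `perform` helpers are hypothetical names made up for illustration, and the 50 ms typing delay is an arbitrary choice. It only assumes `scrot` or `gnome-screenshot` and `xdotool` are installed and on your `PATH`.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Hypothetical action shape a VLM could be prompted to emit.
// This is an assumption for the sketch, not Peepbo's real schema.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "key"; combo: string };

type ScreenshotTool = "scrot" | "gnome-screenshot";

// Build the argument list for the chosen screenshot tool.
// scrot takes the output file as a positional argument;
// gnome-screenshot takes it via -f.
function screenshotArgs(tool: ScreenshotTool, path: string): string[] {
  return tool === "scrot" ? [path] : ["-f", path];
}

// Translate one action into xdotool arguments.
function xdotoolArgs(action: Action): string[] {
  switch (action.kind) {
    case "click":
      // Move the pointer, then left-click (button 1) in a single xdotool call.
      return [
        "mousemove",
        String(Math.round(action.x)),
        String(Math.round(action.y)),
        "click",
        "1",
      ];
    case "type":
      // 50 ms between keystrokes; arbitrary value picked for this sketch.
      return ["type", "--delay", "50", action.text];
    case "key":
      // Key combos use xdotool's own syntax, e.g. "ctrl+l".
      return ["key", action.combo];
  }
}

// Take a screenshot and return the path of the file that was written.
async function capture(tool: ScreenshotTool, path: string): Promise<string> {
  await run(tool, screenshotArgs(tool, path));
  return path;
}

// Execute one action on the desktop through xdotool.
async function perform(action: Action): Promise<void> {
  await run("xdotool", xdotoolArgs(action));
}
```

With something like this, an agent loop would call `capture("scrot", "/tmp/screen.png")`, send the image to the model, then pass whatever action comes back to `perform`, for example `perform({ kind: "click", x: 640, y: 360 })`. Using `execFile` with an argument array rather than a shell string means text generated by the model is never interpreted by a shell.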
It's open source. Give it a shot if you're trying to build agents on Linux and let me know if it breaks anything.