Background
Most AI agents I've worked with share two big problems.
Problem 1: Token waste during waiting
A lot of agents I've seen handle "waiting" poorly. If you want the agent to monitor a webpage until something changes — like a price update, a status change, or a specific piece of text appearing — the usual approach is to:
- keep calling the LLM in a loop ("is it there yet? is it there yet?")
That burns through tokens and API requests even though the agent isn't really doing much of anything.
Problem 2: Fixed toolsets
Many agents ship with a fixed set of tools. If you give them a task that doesn't fit those tools well, they either fail, hallucinate a solution, or you have to go add a new tool yourself.
I wanted an agent that could notice when it's missing a capability and just... build it.
What I built
The project is called GrimmBot. It's an open-source AI agent that runs inside a Docker container with a full Debian-based desktop environment, a browser, and a set of built-in tools.
But the two features I think make it different are:
- Zero-token monitoring mode
- Runtime tool generation
Let me explain how each one works.
Zero-token monitoring
When GrimmBot needs to wait for something — like a webpage to update, a piece of text to appear, or a visual condition to be met — it doesn't keep polling the LLM.
Instead, it hands off the waiting to a local Python watcher loop.
This loop can monitor for:
- specific text or regex patterns in the DOM
- visual conditions using pixel/color bounding boxes
- any other condition you can express in Python
While this loop is running, no LLM calls are made. The model is essentially asleep.
The moment the trigger condition is met, the watcher exits and wakes the agent back up. Only then does it make another API call to continue the task.
So if you ask GrimmBot to "watch this page until this text appears," it will:
- set up a local watcher for that condition using its built-in watcher tools
- suspend LLM usage
- wait locally
- wake up and resume once the condition is true
This makes long monitoring tasks much more efficient.
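To make the idea concrete, here is a minimal sketch of such a watcher loop. The `fetch_dom` helper is an assumption standing in for whatever local browser call the agent actually uses; GrimmBot's real tool names and signatures may differ.

```python
import re
import time

def watch_for_pattern(fetch_dom, pattern, poll_interval=5.0, timeout=3600):
    """Poll a page locally until `pattern` appears in the DOM.

    No LLM calls happen inside this loop; the model is only
    resumed after this function returns.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        html = fetch_dom()          # local browser call, zero tokens
        match = regex.search(html)
        if match:
            return match.group(0)   # wake the agent with the matched text
        time.sleep(poll_interval)
    return None                     # timed out; the agent decides what's next
```

The whole loop is ordinary Python, so the "any other condition you can express in Python" case is just a different predicate in place of the regex check.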
Runtime tool generation
GrimmBot ships with 60+ built-in tools for things like:
- browser control (clicking, navigation, DOM extraction)
- file operations (read, write, patch)
- shell commands
- screenshots and visual grids
- scheduling and memory
But sometimes that's not enough.
If GrimmBot hits a task where none of its existing tools are a good fit, it can write a new one.
So if you ask it to do something weird and specific — like parse a proprietary log format, or interact with some niche API — it can try to build the tool itself in Python instead of just failing.
These custom tools are persistent. If it builds a tool on Monday, it still has access to it on Tuesday.
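One way to get that persistence is to write each generated tool to a directory that survives restarts and import it dynamically on startup. The sketch below assumes that layout plus a `run` entry point per tool; both are my assumptions, not GrimmBot's actual internals.

```python
import importlib.util
from pathlib import Path

# Assumed to sit on a persistent volume inside the container.
TOOLS_DIR = Path("custom_tools")

def save_tool(name, source):
    """Persist generated tool code so it survives restarts."""
    TOOLS_DIR.mkdir(exist_ok=True)
    (TOOLS_DIR / f"{name}.py").write_text(source)

def load_tools():
    """Import every saved tool module and collect its `run` callable."""
    registry = {}
    for path in TOOLS_DIR.glob("*.py"):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        registry[path.stem] = module.run
    return registry
```

A tool saved on Monday is just a file on disk, so Tuesday's `load_tools()` picks it up again with no extra work.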
The environment
GrimmBot runs fully containerized in Docker.
The container includes:
- Debian Bookworm Slim
- a headless X11 display using xvfb
- VNC access via x11vnc (so you can watch what it's doing)
- Chromium browser
- Python with common libraries
- Java 17 and build tools for code tasks
- LiteLLM for model-agnostic API support (works with OpenAI, Anthropic, Gemini, OpenRouter, or local models)
You interact with the agent through an attached terminal, and you can view its desktop over VNC.
There's also a "wormhole" folder — a shared directory between your host machine and the container for passing files in and out.
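Assuming the wormhole is an ordinary Docker bind mount, the setup looks roughly like this. The image name, paths, and port are placeholders, not the project's actual values.

```shell
# Bind-mount a host directory into the container as the "wormhole",
# and expose the VNC port so you can watch the desktop.
# Image name, mount point, and port here are placeholders.
docker run -it \
  -v "$(pwd)/wormhole:/wormhole" \
  -p 5900:5900 \
  grimmbot:latest
```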
Human approval for certain actions
I added an approval system for sensitive actions.
Certain actions — like running arbitrary shell commands, creating a custom tool, or navigating outside a list of allowed domains — will pause and ask for approval in the terminal before proceeding.
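A simple version of that gate is a wrapper that pauses only on flagged action types. The category names below are illustrative, not GrimmBot's actual ones.

```python
# Illustrative action categories that require a human sign-off.
RISKY_ACTIONS = {"shell", "create_tool", "navigate_offlist"}

def with_approval(action_type, description, execute, ask=input):
    """Run `execute`, but pause for terminal approval on risky actions."""
    if action_type in RISKY_ACTIONS:
        answer = ask(f"Approve {action_type}: {description}? [y/N] ")
        if answer.strip().lower() != "y":
            return None  # denied; the agent has to plan around it
    return execute()
```

Everything outside the risky set runs straight through, so routine browsing and file reads stay uninterrupted.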