DEV Community

Grimm67123

I Built Tools for My AI Agent That Let It Watch Webpages for Hours Without Burning API Tokens

One thing has always bothered me about a lot of AI agents: they keep spending money even when nothing is happening.

If you ask an agent to watch a webpage and tell you when something changes, many systems handle that in a surprisingly expensive way. They keep involving the LLM while waiting, repeatedly checking the same page, repeatedly thinking about the same unchanged state, and repeatedly burning tokens just to remain “aware.”

That feels wasteful.

Waiting is not reasoning.

So I built a different approach into my open source project, GrimmBot: when it needs to monitor a webpage or screen for a specific condition, it can enter a zero-token monitoring loop and stay there for as long as necessary. No constant LLM calls. No paying for inactivity. The model only wakes up once the condition is actually met.

But that’s only one part of the project.

GrimmBot also has persistent memory, scheduling, a browser and desktop environment inside a Debian Docker sandbox, and the ability to write its own Python tools when it runs into a task it can’t solve with its built-in capabilities. So the bigger idea is not just “cheap monitoring.” It’s an AI agent that can act, wait, remember, schedule, and expand its own toolkit without treating the LLM like a heart monitor that has to stay on every second.

Repo: https://github.com/grimm67123/grimmbot

Demo videos are in the repo.

The problem

A lot of agent workflows sound simple on paper:

  • “Wait until this dashboard shows a certain status”
  • “Alert me when this page contains a specific phrase”
  • “Keep an eye on this screen region and continue when it changes”
  • “Check back later and continue the task”
  • “Remember what happened last time and pick up from there”

But many AI agents treat these tasks as if the model has to stay mentally present the entire time.

That means the expensive part of the system is being used for the cheapest part of the job: waiting.

And the problem gets worse once the workflow extends beyond a single burst of activity. A useful agent doesn’t just need to click and summarize. It often needs to:

  • wait
  • resume later
  • remember context
  • run scheduled work
  • adapt when its built-in actions aren’t enough

That’s where a lot of current systems start to feel shallow. They can be impressive in short sessions, but awkward in long-running ones.

What I wanted instead

I wanted an agent that could do a few things cleanly:

  • Use the LLM to understand the user’s intent
  • Hand repetitive waiting work off to local deterministic tools
  • Wake the LLM only when something meaningful actually happens
  • Persist useful memory across sessions
  • Schedule tasks to happen later or repeatedly
  • Create new tools when the current ones aren’t enough

That’s the broader design idea behind GrimmBot.

The zero-token monitoring feature is probably the clearest example, but it sits inside a bigger system. I didn’t want an agent that only looks smart while the model is actively thinking. I wanted one that can keep operating sensibly over time.

How GrimmBot approaches monitoring

GrimmBot runs inside a Debian Docker container and uses Chromium as its default browser.

When a monitoring task comes in, the LLM can choose one of GrimmBot’s built-in monitoring tools and pass the necessary arguments. From there, a local Python loop handles the watch process.

Depending on the task, GrimmBot can monitor things like:

  • webpage DOM changes
  • specific text appearing
  • regex matches
  • pixel regions on screen
  • bounded color changes in an area

So instead of repeatedly asking the model “has anything changed yet?”, GrimmBot lets a deterministic loop do the boring work.

The LLM is suspended during that phase and only comes back when the watched condition is satisfied.

That’s why I call it zero-token monitoring: while the loop is running, the system is not continuously spending model calls just to wait.
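The core idea can be sketched as a plain polling loop that runs with zero model calls. This is an illustrative sketch, not GrimmBot's actual API: the function name, the `fetch` callable, and the parameters are assumptions, and the real tools also cover DOM diffs, pixel regions, and color changes.

```python
import re
import time
from typing import Callable, Optional

def watch_for_pattern(fetch: Callable[[], str], pattern: str,
                      interval: float = 30.0,
                      timeout: float = 4 * 3600) -> Optional[str]:
    """Poll `fetch()` until `pattern` matches, spending no tokens.

    `fetch` is any callable returning the current page text (e.g. a
    dump of the Chromium page). Returns the matched text, or None if
    the timeout expires first.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        match = regex.search(fetch())
        if match:
            return match.group(0)   # condition met: wake the LLM here
        time.sleep(interval)        # waiting costs a sleep, not tokens
    return None
```

The LLM's only job is to pick the tool and fill in `pattern` and `interval` up front; everything after that is deterministic local code until the condition fires.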

Why I didn’t want to stop at monitoring

Monitoring was the first pain point that pushed me to build this differently, but I didn’t want to solve it in isolation.

In real use, workflows often look more like this:

  1. open a browser and navigate somewhere
  2. inspect the page
  3. wait for a condition
  4. continue once it changes
  5. save information
  6. remember what happened
  7. schedule a follow-up
  8. hit a weird edge case and need a new capability

That’s why GrimmBot also includes persistent memory, scheduling, and autonomous tool generation.

Persistent memory

Long-running agent workflows are fragile if the system can’t retain useful context.

If an agent monitors something now and needs to continue later, it helps if it can remember prior instructions, saved facts, or task context instead of treating each interaction like a fresh start. I wanted GrimmBot to be able to retain operational context across sessions instead of living entirely in the current prompt window.
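The simplest version of that is a key-value store backed by a file, so saved facts survive a restart. A minimal sketch, assuming a JSON file as the storage format (GrimmBot's actual memory implementation may differ):

```python
import json
from pathlib import Path

class Memory:
    """Minimal persistent key-value memory (illustrative sketch)."""

    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        # Reload whatever a previous session saved, if anything.
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key: str, value) -> None:
        self.data[key] = value
        self.path.write_text(json.dumps(self.data, indent=2))

    def recall(self, key: str, default=None):
        return self.data.get(key, default)
```

With something like this, a new session can start by recalling "what was I doing?" instead of treating every conversation as a blank slate.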

Scheduling

Not every action should happen immediately.

Sometimes the right behavior is:

  • check again later
  • run something every so often
  • queue a background task
  • wait until a specific time before doing the next step

That’s part of why GrimmBot includes scheduling features. Monitoring solves “wait until X happens.” Scheduling solves “do Y later” or “repeat Z over time.” Together they make the agent more useful for workflows that don’t fit into one uninterrupted session.
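A scheduler for "do Y later" and "repeat Z over time" can be sketched as a priority queue of due times. Again, this is a hypothetical sketch of the concept, not GrimmBot's actual scheduler:

```python
import heapq
import time
from typing import Callable, Optional

class Scheduler:
    """Tiny one-shot / recurring task scheduler (illustrative sketch)."""

    def __init__(self):
        self._queue = []  # heap of (run_at, seq, fn, every)
        self._seq = 0     # tie-breaker so heapq never compares functions

    def at(self, delay: float, fn: Callable[[], None],
           every: Optional[float] = None) -> None:
        """Run `fn` after `delay` seconds; repeat every `every` seconds."""
        heapq.heappush(self._queue, (time.monotonic() + delay, self._seq, fn, every))
        self._seq += 1

    def run_pending(self) -> None:
        """Execute every task whose due time has arrived."""
        now = time.monotonic()
        while self._queue and self._queue[0][0] <= now:
            _, _, fn, every = heapq.heappop(self._queue)
            fn()
            if every is not None:      # recurring jobs go back on the heap
                self.at(every, fn, every)
```

Calling `run_pending()` from the agent's main loop keeps deferred work out of the LLM entirely: the model decides *that* something should happen later, and cheap local code decides *when*.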

Autonomous tool generation

This is the part I personally find the most interesting.

Most agents are limited by whatever tools they shipped with. If the built-in actions don’t cover the task, you either bolt on something external or go back and write the code yourself.

GrimmBot has a different path: when it lacks a capability, it can generate a new Python tool for the task and add it to its own toolkit.

So the project isn’t just trying to make an agent that uses tools well. It’s trying to make one that can extend itself when it hits a wall.

That doesn’t mean the LLM should reinvent everything all the time. It means the agent has a way to adapt instead of immediately failing whenever the environment asks for something specific.
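The mechanics of "add a generated tool to your own toolkit" can be sketched as: validate the model's source, execute it in a namespace, and register the resulting function by name. Everything below is an assumption about how such a registry could look; GrimmBot's real implementation runs inside its Docker sandbox and would need much stronger isolation than this sketch shows.

```python
import ast
from typing import Callable

TOOLS: dict[str, Callable] = {}

def register_generated_tool(name: str, source: str) -> bool:
    """Add an LLM-generated Python function to the agent's toolkit.

    Illustrative sketch: reject code that doesn't parse, exec the rest
    in a scratch namespace, and register the named function. Returns
    True on success.
    """
    try:
        ast.parse(source)          # reject syntactically broken output
    except SyntaxError:
        return False
    namespace: dict = {}
    exec(source, namespace)        # NOTE: a real agent must sandbox this
    fn = namespace.get(name)
    if not callable(fn):
        return False
    TOOLS[name] = fn
    return True
```

Once registered, the new tool sits alongside the built-in ones, so the next time the same edge case appears, no generation step is needed at all.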

Why I think this matters

A lot of current AI tooling routes too much through the model.

If something can be handled by:

  • a local loop
  • a deterministic check
  • a scheduled job
  • a remembered piece of context
  • or a generated helper tool

then it probably should be.

LLMs are useful for:

  • understanding intent
  • choosing actions
  • interpreting ambiguous results
  • deciding what to do next

They are much less appropriate for:

  • polling
  • waiting
  • repeatedly confirming that nothing has changed
  • pretending every workflow begins from scratch
  • being permanently boxed in by a static toolset

To me, a better agent is not just one that can do more in a single burst. It’s one that knows when to think, when to wait, when to remember, when to schedule, and when to extend itself.

That’s where a lot of practical power comes from.

The practical difference

The difference sounds small until you imagine real usage.

If you want an agent to watch a page briefly, maybe the waste doesn’t matter much.

But if you want it to:

  • watch a page all afternoon
  • wait for an account approval flow
  • monitor a dashboard during work
  • track a product listing for a long time
  • remember what it was doing yesterday
  • rerun a task on a schedule
  • handle some weird niche requirement its built-in tools don’t support

then the cost of “thinking while idle” and the limitations of static tooling start to matter a lot.

That’s exactly the kind of case I wanted GrimmBot to handle better.

A patient agent should be cheap while waiting.

A useful agent should remember what it’s doing.

A long-running agent should be able to schedule work.

And an adaptable agent should have some way to grow beyond the tools it started with.

That combination is what GrimmBot is really about.
