The problem
I’ve been experimenting with AI agents on a Raspberry Pi 5, and I kept hitting the same issue:
most agent frameworks felt too heavy for small hardware.
They often bring a full stack with multiple services, extra infrastructure, and a lot of moving parts. On a Raspberry Pi, that quickly turns into slow startup, more memory pressure, and too much complexity for simple tasks.
I didn’t want that.
I wanted something that would stay small and still be useful.
So instead of building yet another agent framework, I started building a lightweight runtime with a different approach to routing.
The project became openLight.
The idea
What I wanted was not “LLM for everything”.
For a lot of requests, an LLM is unnecessary.
If a user wants to check CPU, disk, logs, or run a known action, that should go through a predictable path.
So openLight is built around a mixed model:
• deterministic routing where possible
• LLM-based classification where needed
• validation before execution
That keeps the system much more practical on small hardware.
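To make the deterministic half concrete, here is a minimal sketch of command-table dispatch. The skill table and function names are illustrative assumptions, not openLight's actual API:

```python
# Illustrative sketch of deterministic-first routing; the SKILLS table
# and route() are assumptions for this post, not openLight's real API.

SKILLS = {
    "/status": lambda: "cpu: 12%, disk: 41%",
    "/logs": lambda: "last 20 log lines...",
}

def route(message: str):
    """Return a skill result if the message matches a known command,
    otherwise None to signal that the LLM classifier should take over."""
    parts = message.strip().split()
    cmd = parts[0].lower() if parts else ""
    skill = SKILLS.get(cmd)
    return skill() if skill else None

print(route("/status"))               # deterministic hit, no LLM involved
print(route("how is the pi doing?"))  # None -> falls through to the classifier
```

The point is that the common case never touches a model at all.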
How routing works
The flow looks like this:
Telegram message
↓
Auth
↓
Deterministic routing
├─ matched → execute skill
└─ no match → LLM classifier
↓
chat / skill
↓
validate
↓
execute
In practice, that means:
• every Telegram message first goes through auth and persistence
• then the runtime tries deterministic routing
• if there is a direct match, the skill executes immediately
• if not, the system uses the LLM to decide whether the request is just chat or should be mapped to a skill
• skill execution is validated before running
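The steps above can be wired together roughly like this. The classify() stub stands in for the real LLM call, and every name here is an illustrative assumption rather than openLight's actual code:

```python
# Rough sketch of the full pipeline:
# auth -> deterministic route -> LLM classification fallback -> validate -> execute.

ALLOWED_USERS = {12345}
SKILLS = {"status": lambda: "ok: load 0.4"}

def classify(text: str):
    """Stand-in for the LLM classifier: returns ('chat', None) or
    ('skill', skill_name). A real implementation would call a model."""
    return ("skill", "status") if "status" in text.lower() else ("chat", None)

def handle(user_id: int, text: str) -> str:
    if user_id not in ALLOWED_USERS:           # auth first
        return "unauthorized"
    cmd = text.lstrip("/").split()[0].lower()
    if cmd in SKILLS:                          # deterministic match
        return SKILLS[cmd]()
    kind, name = classify(text)                # LLM fallback only on no match
    if kind == "chat":
        return "chat: " + text
    if name not in SKILLS:                     # validate before execution
        return f"refused: unknown skill {name!r}"
    return SKILLS[name]()
```

With this shape, handle(12345, "/status") never invokes the classifier, while "what's the status of the pi?" reaches the skill through the LLM path.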
So the LLM is part of the system, but it is not the whole system.
That was important to me from the start.
Why this works better on Raspberry Pi
On small hardware, every extra layer matters.
If every request goes straight into an LLM-driven loop, the system becomes slower, less predictable, and more expensive to run.
With this design:
• obvious commands stay fast
• known actions remain deterministic
• the LLM is only used where classification is actually useful
• validation reduces the chance of random execution paths
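The validation step can be as simple as an allowlist plus an argument check before anything executes. This is a sketch under assumed names, not openLight's real schema:

```python
# Sketch of pre-execution validation: only allowlisted skills run,
# and their arguments must match a declared signature.

ALLOWED = {
    "status": [],       # no arguments
    "logs": ["lines"],  # one named argument
}

def validate(skill: str, args: dict) -> bool:
    """Reject anything not in the allowlist or carrying unexpected arguments."""
    if skill not in ALLOWED:
        return False
    return set(args) <= set(ALLOWED[skill])

assert validate("status", {})             # known skill, no args
assert validate("logs", {"lines": 20})    # known skill, declared arg
assert not validate("reboot", {})         # not allowlisted -> never executed
assert not validate("logs", {"rm": "/"})  # unexpected argument -> rejected
```

Even if the classifier hallucinates a skill name or argument, it gets stopped here instead of producing a random execution path.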
For Raspberry Pi and homelab use, this feels much more natural than a heavy agent stack.
What openLight is trying to be
I don’t see it as a huge agent framework.
It’s closer to a small runtime for personal infrastructure.
Right now the main interface is Telegram, but the bigger idea is wider than that: a lightweight agent runtime that can combine deterministic skills with LLM-based interpretation without dragging in a huge platform around it.
Why I built it
Mostly because I like small tools that are easy to run and easy to understand.
I wanted something that:
• works well on Raspberry Pi
• stays lightweight
• does not depend on a huge framework
• uses LLMs where they help, not everywhere by default
That’s the direction behind openLight.
If you want to take a look:
Top comments (4)
Really like the "LLM only where classification is actually useful" stance here. On constrained hardware, deterministic routing + validation is a much better systems design than forcing every request through a full agent loop. Curious whether you've measured latency / memory deltas between the deterministic path and the fallback LLM path on the Pi 5 — that would be a great proof point.
I haven’t published a full benchmark table yet, but I do have a first Pi 5 data point now.
On the same natural-language request, the actual status skill execution was only ~150ms. The big delta was entirely in the LLM classification path: local Ollama with qwen2.5:0.5b took ~19.8s for route classification plus ~22.6s for skill classification, so about 42.5s end-to-end. The same flow with gpt-4o-mini was ~1.35s + ~1.77s, so about 3.3s end-to-end.
So yes, this is exactly why the design is deterministic-first: when a request can be routed directly, you avoid paying the classifier tax entirely. I still need to do a clean memory comparison, especially separating agent RSS from local model residency, but the latency difference is already pretty stark.
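For what it's worth, the per-stage numbers come from simple wall-clock timing around each step; a minimal sketch of that kind of wrapper (the stage functions here are stand-ins, not the real classifier calls):

```python
import time

def timed(fn, *args):
    """Run fn and return (result, elapsed_ms) so each routing stage
    can be measured separately."""
    t0 = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - t0) * 1000.0

# Stand-in stages; real code would call the LLM classifier and the skill.
route_kind, t_route = timed(lambda: "skill")
skill_name, t_skill = timed(lambda: "status")
output, t_exec = timed(lambda: "ok")
total_ms = t_route + t_skill + t_exec
```

Keeping the stages separate is what makes it obvious that the classifier, not the skill itself, dominates the latency.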
Deterministic-first routing is smart — most agent frameworks burn LLM tokens on tasks a regex could handle. Curious if you've measured the ratio of deterministic vs LLM-classified requests in practice.
Not yet as a published metric, but it’s something I want to expose.
Right now each routing decision already carries a mode (slash, explicit, alias, rule, llm), so breaking requests down into deterministic vs LLM-classified traffic is straightforward. I'd actually prefer to publish the full split rather than just a single ratio, because it shows how much is handled by direct commands/rules before the classifier is even touched.
Anecdotally, command-shaped and routine ops are exactly what the deterministic path is meant to absorb, and the LLM is there as the overflow path for natural-language requests. Turning that into a real counter/dashboard is probably the next telemetry improvement I should make.