<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Evgenii Engineer</title>
    <description>The latest articles on DEV Community by Evgenii Engineer (@evgenii_engineer).</description>
    <link>https://dev.to/evgenii_engineer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3824319%2F7333d0b6-3e6e-41f2-b728-6177a9d29646.jpg</url>
      <title>DEV Community: Evgenii Engineer</title>
      <link>https://dev.to/evgenii_engineer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/evgenii_engineer"/>
    <language>en</language>
    <item>
      <title>What I Learned Building a Lightweight Local AI Agent</title>
      <dc:creator>Evgenii Engineer</dc:creator>
      <pubDate>Fri, 08 May 2026 20:52:37 +0000</pubDate>
      <link>https://dev.to/evgenii_engineer/what-i-learned-building-a-lightweight-local-ai-agent-18nk</link>
      <guid>https://dev.to/evgenii_engineer/what-i-learned-building-a-lightweight-local-ai-agent-18nk</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkx4g7zyo4yrc1agernf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkx4g7zyo4yrc1agernf.png" alt="A Raspberry Pi sitting on top of a Mac mini — both running openLight, both small, both always on." width="800" height="1418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I wrote the first post about openLight in March, the project was a v0.0.1 commit, a Raspberry Pi, and a Telegram bot. I had a Pi running Tailscale, a small Matrix homeserver, and I was tired of &lt;code&gt;ssh pi@raspberrypi.local &amp;amp;&amp;amp; systemctl status …&lt;/code&gt; from a phone keyboard. So I wrote a Go binary that talked to a Telegram bot, kept state in SQLite, and could fall back to a local Ollama model when the rule-based router didn't recognize the request.&lt;/p&gt;

&lt;p&gt;Two months later the binary is still ~25 MB, still one config file, still SQLite. Almost everything underneath has been rewritten at least once. The identity I'd write today is shorter than anything in the v0.0.1 README:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;openLight is a lightweight operational layer for personal servers, not a generic AI assistant.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That sentence didn't exist in March. It took two months of building, deleting, and re-building to find it. This is the retrospective I wish I had read before starting.&lt;/p&gt;

&lt;h2&gt;Five moments where it earned its keep&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3rb32voldejqj5kesvm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3rb32voldejqj5kesvm.png" alt="Telegram alert: synapse status check failed, with Restart / Logs / Status / Ignore buttons." width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A train station, 11pm.&lt;/strong&gt; Synapse is down on the VPS. I tap &lt;code&gt;Restart&lt;/code&gt; on the Telegram alert. It comes back. I don't open a laptop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A grocery line.&lt;/strong&gt; Tailscale shows yellow. I tap &lt;code&gt;Logs&lt;/code&gt;, see a known transient peer issue, tap &lt;code&gt;Ignore&lt;/code&gt; for 15 minutes. Done before checkout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A flight.&lt;/strong&gt; I'm offline. The watch loop runs anyway. When I land, three resolved-incident messages are waiting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Mac mini at home.&lt;/strong&gt; It runs Ollama and a few Docker services. CPU &amp;gt; 90% for 5 minutes triggers a watch I'd forgotten about. My own background job is the problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A friend's homelab.&lt;/strong&gt; &lt;code&gt;/restart matrix&lt;/code&gt; from my couch hits a remote docker-compose service. Same UX as the local Pi.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture below only matters because of those moments.&lt;/p&gt;

&lt;h2&gt;What changed technically&lt;/h2&gt;

&lt;h3&gt;Routing: from flat to deterministic-first&lt;/h3&gt;

&lt;p&gt;The original router was flat: try slash commands, try a few regexes, fall through to Ollama, run whatever the model picked. This works for about a week. Then you notice that half your latency is the model warming up on a Pi, the model picks plausible-but-wrong tool names ~10–15% of the time, and you can't separate "I don't know" from "I'm 51% sure, do the thing."&lt;/p&gt;

&lt;p&gt;The current router has five layers before the LLM is ever consulted, and the LLM path itself is two-stage:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjy2nz26oi5wxhmdtezg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjy2nz26oi5wxhmdtezg.png" alt="Router before and after: a flat slash → regex → Ollama path on the left, a deterministic-first cascade with a two-stage LLM classifier on the right." width="800" height="746"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The semantic layer pays for itself within a week: &lt;code&gt;покажи логи tailscale&lt;/code&gt;, &lt;code&gt;show tailscale logs&lt;/code&gt;, and &lt;code&gt;tail -f tailscale&lt;/code&gt; all normalize to the same skill identifier without ever waking the model. On a Pi, that's the difference between sub-100ms and 2–5 seconds.&lt;/p&gt;
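
&lt;p&gt;A minimal sketch of what that layer can look like (the names &lt;code&gt;aliasTable&lt;/code&gt; and &lt;code&gt;Normalize&lt;/code&gt; are mine, not openLight's internals): the cheapest deterministic version is a normalized alias table consulted before any model call.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package router

import "strings"

// aliasTable maps normalized phrases to skill identifiers. The
// entries below are illustrative, not the project's real tables.
var aliasTable = map[string]string{
    "show tailscale logs":   "service_logs",
    "покажи логи tailscale": "service_logs",
    "tail -f tailscale":     "service_logs",
}

// Normalize lowercases the message and collapses whitespace, then
// checks the alias table. A hit routes without waking the model;
// a miss falls through to the LLM classifier.
func Normalize(msg string) (skill string, ok bool) {
    key := strings.Join(strings.Fields(strings.ToLower(msg)), " ")
    skill, ok = aliasTable[key]
    return skill, ok
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;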

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpjs3c1fojyszcuqxsuz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpjs3c1fojyszcuqxsuz.png" alt="Telegram: " width="800" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The screenshot above is the same &lt;code&gt;/status&lt;/code&gt; skill, reached without an LLM call: the Russian phrase normalizes deterministically.&lt;/p&gt;

&lt;p&gt;The two-stage split matters because the failure modes get easier to reason about. "Can't pick between &lt;code&gt;service_logs&lt;/code&gt; and &lt;code&gt;service_status&lt;/code&gt;" is not the same problem as "can't tell whether the user wants services or notes." Splitting the decision splits the latency cost, the error mode, and the prompt size.&lt;/p&gt;
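
&lt;p&gt;As a hedged sketch of that shape (the prompts and the &lt;code&gt;askLLM&lt;/code&gt; helper are stand-ins, not the project's real code): stage one sees only group names, stage two sees only the skills inside the chosen group.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package router

import "strings"

// classify is an illustrative two-stage router. askLLM stands in for
// one call to a small local model; groups maps each group name to the
// skill identifiers registered under it.
func classify(msg string, groups map[string][]string, askLLM func(string) string) string {
    // Stage 1: choose a group. The prompt lists group names only,
    // so it stays small no matter how many skills exist.
    names := make([]string, 0, len(groups))
    for g := range groups {
        names = append(names, g)
    }
    group := askLLM("Pick one group for: " + msg + "\nGroups: " + strings.Join(names, ", "))

    skills, ok := groups[group]
    if !ok {
        return "" // unknown group: treat as chat, never as a tool call
    }
    // Stage 2: choose a skill inside that group only.
    return askLLM("Pick one skill for: " + msg + "\nSkills: " + strings.Join(skills, ", "))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;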

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81mchbekjvqxw5n8xt3b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81mchbekjvqxw5n8xt3b.png" alt="CLI hitting the same runtime as the Telegram bot —  raw `skills` endraw ,  raw `watch list` endraw ,  raw `notes` endraw  against agent.test.yaml." width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The CLI subcommand wires up the same agent, the same registry, and the same auth as the Telegram path. This is also what the smoke harness drives in CI.&lt;/p&gt;

&lt;h3&gt;From localhost to named SSH nodes&lt;/h3&gt;

&lt;p&gt;v0.0.1 could only operate the box it ran on. A Telegram bot is a singleton, so deploying one openLight per host wasn't an option. The model became: one openLight, many &lt;em&gt;nodes&lt;/em&gt;, where a node is just a named SSH target in config. Service specs grew a &lt;code&gt;node:vps:compose:/opt/matrix/docker-compose.yml:synapse&lt;/code&gt; syntax that resolves to the right backend on the right host.&lt;/p&gt;
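
&lt;p&gt;Parsing that spec is a few lines; the &lt;code&gt;Target&lt;/code&gt; struct below is my guess at the shape, not the repository's actual type:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package services

import (
    "fmt"
    "strings"
)

// Target is an illustrative decoding of a spec such as
// node:vps:compose:/opt/matrix/docker-compose.yml:synapse
type Target struct {
    Node    string // named SSH target from config, e.g. "vps"
    Backend string // "systemd", "docker", or "compose"
    Path    string // compose file path (compose backend only)
    Service string // service name within the backend
}

// ParseSpec assumes the five-field compose layout shown above and
// that paths contain no colons; a real parser would need more care.
func ParseSpec(spec string) (Target, error) {
    p := strings.Split(spec, ":")
    if len(p) != 5 || p[0] != "node" {
        return Target{}, fmt.Errorf("unrecognized service spec: %q", spec)
    }
    return Target{Node: p[1], Backend: p[2], Path: p[3], Service: p[4]}, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;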

&lt;p&gt;This forced the services module to become an interface, not a function:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9mpl1s592pohx4u03520.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9mpl1s592pohx4u03520.png" alt="One user-facing command, one allowlist check, one resolver, six backends behind it: local systemd / docker / compose, plus remote variants over SSH." width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Six backends behind one skill. The user sees one command. Auditing sees one skill call. The temptation to expose the backend distinction was strong and wrong.&lt;/p&gt;
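
&lt;p&gt;Concretely, "an interface, not a function" can be as small as three methods. This is my sketch of the minimal surface those six backends would share, not the repo's actual definition:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Backend abstracts one service manager on one host. Local systemd,
// docker, and compose implement it directly; the remote variants wrap
// the same commands in an SSH session to a named node.
type Backend interface {
    Status(service string) (string, error)
    Restart(service string) error
    Logs(service string, lines int) (string, error)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;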

&lt;h3&gt;From request-response to a monitoring loop&lt;/h3&gt;

&lt;p&gt;The biggest behavioral change is &lt;code&gt;watch&lt;/code&gt;. v0.0.1 was reactive: I asked, it answered. v0.1.0 added a polling subsystem that holds rules, opens incidents, and &lt;em&gt;initiates&lt;/em&gt; messages with inline buttons: &lt;code&gt;Restart&lt;/code&gt;, &lt;code&gt;Logs&lt;/code&gt;, &lt;code&gt;Status&lt;/code&gt;, &lt;code&gt;Ignore&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The interesting part isn't the polling. The interesting part is that alert buttons reuse the existing skill surface. When I tap &lt;code&gt;Restart&lt;/code&gt; on an alert, the runtime calls the same &lt;code&gt;service_restart&lt;/code&gt; skill I'd call manually. One allowlist check, one audit row, one logging path. There is no separate "automation" surface.&lt;/p&gt;
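
&lt;p&gt;In sketch form (hypothetical names throughout, including &lt;code&gt;Registry.Call&lt;/code&gt;), a tapped button is nothing more than a pre-filled skill invocation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import "strings"

// handleCallback dispatches a tapped alert button. The callback data
// carries a skill name and argument, and execution goes through the
// same registry as a typed command: one allowlist check, one audit row.
func handleCallback(reg *Registry, user, data string) (string, error) {
    // data looks like "service_restart:synapse" on the Restart button.
    skill, arg, _ := strings.Cut(data, ":")
    return reg.Call(user, skill, arg)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;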

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmq0774t5xqkev7j6rizx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmq0774t5xqkev7j6rizx.png" alt="End-to-end incident: alert → Restart button → " width="800" height="1284"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I think this is the single design decision I'm proudest of. The fashionable direction in agent frameworks is to give the LLM a sandbox, a shell, and trust. The unfashionable direction — and the correct one for infra — is to make manual and automatic actions go through the &lt;em&gt;exact same&lt;/em&gt; validated code path. Automation is a button press, not a permission level.&lt;/p&gt;

&lt;h2&gt;What changed philosophically&lt;/h2&gt;

&lt;h3&gt;Skills are the only safety boundary&lt;/h3&gt;

&lt;p&gt;In v0.0.1 I had a &lt;code&gt;mutating_execute_threshold&lt;/code&gt; knob — a confidence floor for state-changing actions. By v0.0.2 it was gone, and I'd internalized the rule: &lt;strong&gt;the LLM can only choose among already-registered skills, and skills enforce their own allowlists in Go.&lt;/strong&gt; No threshold, no soft gate. Either there's a Go function the LLM can name, or there isn't.&lt;/p&gt;

&lt;p&gt;The model is a classifier of intent, not a holder of permissions. Permissions live in the code that runs after the classification.&lt;/p&gt;
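
&lt;p&gt;A minimal sketch of that rule, assuming a config struct that mirrors the YAML allowlist (all names illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import (
    "fmt"
    "slices"
)

// serviceRestart enforces its own allowlist in Go. The classifier can
// name this skill, but nothing it outputs can widen the allowed set;
// cfg.Services.Allowed mirrors services.allowed in the YAML config.
func serviceRestart(cfg Config, name string) error {
    if !slices.Contains(cfg.Services.Allowed, name) {
        return fmt.Errorf("service %q is not on the allowlist", name)
    }
    return backendFor(name).Restart(name) // backendFor is hypothetical
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;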

&lt;h3&gt;Core vs. Optional, made explicit&lt;/h3&gt;

&lt;p&gt;The README has two lists: core modules (always on) and optional ones (off by default). Every time I added a fun module — vision, voice, browser — I felt a pull to make it on by default, "because the demo is so cool." Three weeks later I'd be debugging why the Pi's memory was full of Playwright. Making "off by default" structural, not a config preference, is how I keep myself from drifting the project into a generic AI assistant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0ljybqh5nwx8c8yvs2a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj0ljybqh5nwx8c8yvs2a.png" alt="/skills response showing all groups: Chat, Notes, Memory, Files, Browser, Services, Watch, System, Core, Vision, OCR, Visual watch." width="800" height="744"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The same registry is exposed to the LLM classifier and to the user's &lt;code&gt;/skills&lt;/code&gt; reply. There is no parallel surface.&lt;/p&gt;

&lt;h3&gt;From a Raspberry Pi project to personal infrastructure&lt;/h3&gt;

&lt;p&gt;Somewhere during the third or fourth refactor I realized openLight was no longer about a Raspberry Pi. The Pi was just the smallest machine I happened to have. What I'd actually become interested in was the broader category: small, always-on computers running useful local software without cloud-scale tooling and without enterprise infrastructure. A Pi 4 in the closet. A Mac mini M1 on the shelf running local models. A used NUC behind the TV.&lt;/p&gt;

&lt;p&gt;These machines have something in common that mainstream infra tooling does not optimize for. They run on residential power. They reboot when the cleaner unplugs them. They have one operator, who is also the developer, who is also the on-call rotation. Kubernetes is not the answer. A Datadog agent is not the answer. What this hardware needs is a small, reliable, observable layer that gets out of the way until it's needed.&lt;/p&gt;

&lt;p&gt;The Mac mini in particular changed the project's scope. Once I started deploying to &lt;code&gt;darwin/arm64&lt;/code&gt; with &lt;code&gt;launchd&lt;/code&gt; instead of &lt;code&gt;systemd&lt;/code&gt;, openLight stopped being "a Telegram bot for Raspberry Pi" and started being a personal-infrastructure agent that happens to use Telegram. The Pi case became the smallest case of a more general thing, instead of the only case.&lt;/p&gt;

&lt;p&gt;I don't have a clean term for this category. "Homelab" skews hobbyist; "self-hosted" skews ideological. &lt;em&gt;Personal infrastructure&lt;/em&gt; is the working name. It's the layer between "my laptop" and "the cloud," and a non-trivial amount of useful software is going to be built for it.&lt;/p&gt;

&lt;h2&gt;What I got wrong&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The "alternative to OpenClaw" framing.&lt;/strong&gt; The original README defined openLight by what it wasn't. People who don't know OpenClaw don't care; people who do read it as defensive. Worse, it gave me a permission slip to make decisions by negation. Define yourself by what you are.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Structured tool calling" on the v0.0.1 roadmap.&lt;/strong&gt; The right vocabulary, the wrong target. What I needed wasn't more sophisticated tool calling — it was a stronger pre-LLM router. The skill set is small enough that 80% of intents are reachable by deterministic parsing, and the remaining 20% are better handled by &lt;em&gt;classification into an existing skill&lt;/em&gt; than by &lt;em&gt;generating a tool call&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dumping the full skill catalog into the LLM prompt.&lt;/strong&gt; Early on, the classifier saw every registered skill with full descriptions. This blew the prompt past 4K tokens, slowed routing, and made the model worse — too many near-duplicates. The two-stage classifier fixed it: stage 1 sees only groups, stage 2 sees only skills inside the chosen group. The model's input budget is also a design constraint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Files: too closed, then too open, then re-gated.&lt;/strong&gt; Three rewrites of the file skill, each triggered by me almost doing something stupid via Telegram on a tired night. None of those mistakes shipped. They came close enough.&lt;/p&gt;

&lt;h2&gt;What practice confirmed&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Telegram is the right interface for homelab ops.&lt;/strong&gt; Mobile-native, group-aware, real bot API, already in your pocket. I tried Slack briefly and a web UI for a weekend. Both lost to "I'm at the airport and Synapse is down."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQLite is enough.&lt;/strong&gt; Watches, incidents, settings, message history, skill calls — all of it lives in one file with &lt;code&gt;modernc.org/sqlite&lt;/code&gt; (no CGO). Backup is &lt;code&gt;cp&lt;/code&gt;.&lt;/p&gt;
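
&lt;p&gt;The no-CGO setup is genuinely small; a sketch (the driver name "sqlite" is how &lt;code&gt;modernc.org/sqlite&lt;/code&gt; registers itself):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package store

import (
    "database/sql"

    _ "modernc.org/sqlite" // pure-Go driver, no CGO
)

// Open opens (or creates) the single file that holds watches,
// incidents, settings, message history, and skill calls.
func Open(path string) (*sql.DB, error) {
    return sql.Open("sqlite", path)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;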

&lt;p&gt;&lt;strong&gt;Single Go binary is the right shape.&lt;/strong&gt; No runtime dependencies, no service mesh, no Helm chart, no Postgres. &lt;code&gt;scp&lt;/code&gt; and a systemd unit. Deploys in under a minute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local LLMs are good enough for routing.&lt;/strong&gt; A 0.5B Qwen on a Pi handles the 20% of intents that don't deterministically parse. I don't need GPT-4 to decide whether "show me what's broken" means &lt;code&gt;/status&lt;/code&gt; or &lt;code&gt;/watch list&lt;/code&gt;.&lt;/p&gt;
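
&lt;p&gt;For scale, the whole "LLM call" in this design can be one non-streaming request to a local Ollama endpoint. A sketch; the model tag and port are assumed defaults, not the project's configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package router

import (
    "bytes"
    "encoding/json"
    "net/http"
)

// askOllama sends one non-streaming generate request to a local
// Ollama instance and returns the raw model output.
func askOllama(prompt string) (string, error) {
    body, _ := json.Marshal(map[string]any{
        "model":  "qwen2.5:0.5b", // assumed tag for "a 0.5B Qwen"
        "prompt": prompt,
        "stream": false,
    })
    resp, err := http.Post("http://localhost:11434/api/generate",
        "application/json", bytes.NewReader(body))
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    var out struct {
        Response string `json:"response"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&amp;amp;out); err != nil {
        return "", err
    }
    return out.Response, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;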

&lt;p&gt;&lt;strong&gt;Allowlists beat permission dialogs.&lt;/strong&gt; Asking the user "are you sure?" feels safe and is theatre. Forcing the operator to write &lt;code&gt;services.allowed: [tailscale, synapse]&lt;/code&gt; in YAML is the actual safety boundary.&lt;/p&gt;

&lt;h2&gt;Where it's going, and why I think it matters&lt;/h2&gt;

&lt;p&gt;I don't have a v0.2.0 manifesto. The next things on the list — durable LLM memory, richer filesystem with tracked changes, on-device voice (ffmpeg + whisper-cpp), a read-only browser skill behind a strict allowlist — will all be off by default. If they start drifting toward "generic AI assistant" in code review, they don't ship.&lt;/p&gt;

&lt;p&gt;I want to end on something larger than the project, because I think the project is a small data point in a bigger argument.&lt;/p&gt;

&lt;p&gt;The dominant story about AI agents right now is enormous: cloud-scale models, autonomous multi-agent systems, infinite tool catalogs, generalized assistants that do everything for everyone. I think that story is going to underdeliver in a specific direction. The most useful AI infrastructure of the next few years is not going to be the cloud agent. It's going to be the small, boring, reliable system that runs close to the operator.&lt;/p&gt;

&lt;p&gt;Local-first, because the operator and the hardware are in the same place. Deterministic by default, because the LLM is a great classifier and a poor decision-maker, and infrastructure does not need a poet. Observable, because the operator is also the on-call rotation. Repairable, because something always breaks. Cheap enough to leave running forever, because the moment you can't, you stop running it.&lt;/p&gt;

&lt;p&gt;openLight is a single-person project trying to be a small, honest example of that. It's not trying to replace engineers and it's not trying to be smart. It's trying to reduce the friction between an engineer and the small systems they already operate.&lt;/p&gt;

&lt;p&gt;Code lives at &lt;code&gt;github.com/evgenii-engineer/openLight&lt;/code&gt;. The architecture doc is honest about the seams. The CHANGELOG doesn't oversell. If you run a homelab or a Mac mini or a closet full of small machines and you operate them from a phone, this might be useful. If not, it almost certainly isn't, and I'd rather you know that now than after you've cloned the repo.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>go</category>
    </item>
    <item>
      <title>Heavy AI agent frameworks were too slow for my Raspberry Pi. So I built a different one</title>
      <dc:creator>Evgenii Engineer</dc:creator>
      <pubDate>Sat, 14 Mar 2026 18:38:45 +0000</pubDate>
      <link>https://dev.to/evgenii_engineer/heavy-ai-agent-frameworks-were-too-slow-for-my-raspberry-pi-so-i-built-a-different-one-4iee</link>
      <guid>https://dev.to/evgenii_engineer/heavy-ai-agent-frameworks-were-too-slow-for-my-raspberry-pi-so-i-built-a-different-one-4iee</guid>
      <description>&lt;h2&gt;The problem&lt;/h2&gt;

&lt;p&gt;I’ve been experimenting with AI agents on a Raspberry Pi 5, and I kept hitting the same issue:&lt;/p&gt;

&lt;p&gt;most agent frameworks felt too heavy for small hardware.&lt;/p&gt;

&lt;p&gt;They often bring a full stack with multiple services, extra infrastructure, and a lot of moving parts. On a Raspberry Pi, that quickly turns into slow startup, more memory pressure, and too much complexity for simple tasks.&lt;/p&gt;

&lt;p&gt;I didn’t want that.&lt;/p&gt;

&lt;p&gt;I wanted something that would stay small and still be useful.&lt;/p&gt;

&lt;p&gt;So instead of building yet another agent framework, I started building a lightweight runtime with a different approach to routing.&lt;/p&gt;

&lt;p&gt;The project became openLight:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/evgenii-engineer/openLight" rel="noopener noreferrer"&gt;github repo&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The idea&lt;/h2&gt;

&lt;p&gt;What I wanted was not “LLM for everything”.&lt;/p&gt;

&lt;p&gt;For a lot of requests, an LLM is unnecessary.&lt;/p&gt;

&lt;p&gt;If a user wants to check CPU, disk, logs, or run a known action, that should go through a predictable path.&lt;/p&gt;

&lt;p&gt;So openLight is built around a mixed model:&lt;br&gt;
    • deterministic routing where possible&lt;br&gt;
    • LLM-based classification where needed&lt;br&gt;
    • validation before execution&lt;/p&gt;

&lt;p&gt;That keeps the system much more practical on small hardware.&lt;/p&gt;
&lt;h2&gt;How routing works&lt;/h2&gt;

&lt;p&gt;The flow looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Telegram message
   ↓
Auth
   ↓
Deterministic routing
   ├─ matched → execute skill
   └─ no match → LLM classifier
                     ↓
               chat / skill
                     ↓
                 validate
                     ↓
                 execute
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, that means:&lt;br&gt;
    • every Telegram message first goes through auth and persistence&lt;br&gt;
    • then the runtime tries deterministic routing&lt;br&gt;
    • if there is a direct match, the skill executes immediately&lt;br&gt;
    • if not, the system uses the LLM to decide whether the request is just chat or should be mapped to a skill&lt;br&gt;
    • skill execution is validated before running&lt;/p&gt;
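
&lt;p&gt;As a sketch in Go (every name here is illustrative, not openLight's actual code), the same flow reads like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// route mirrors the diagram above: auth first, deterministic match
// next, the LLM only on a miss, and validation before any execution.
func route(msg Message) error {
    if !authorized(msg.User) {
        return errDenied
    }
    if skill, ok := matchDeterministic(msg.Text); ok {
        return execute(skill) // direct match: no model involved
    }
    skill, isChat := classifyWithLLM(msg.Text)
    if isChat {
        return replyChat(msg)
    }
    if err := validate(skill); err != nil {
        return err // never run an unvalidated skill call
    }
    return execute(skill)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;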

&lt;p&gt;So the LLM is part of the system, but it is not the whole system.&lt;/p&gt;

&lt;p&gt;That was important to me from the start.&lt;/p&gt;

&lt;h2&gt;Why this works better on Raspberry Pi&lt;/h2&gt;

&lt;p&gt;On small hardware, every extra layer matters.&lt;/p&gt;

&lt;p&gt;If every request goes straight into an LLM-driven loop, the system becomes slower, less predictable, and more expensive to run.&lt;/p&gt;

&lt;p&gt;With this design:&lt;br&gt;
    • obvious commands stay fast&lt;br&gt;
    • known actions remain deterministic&lt;br&gt;
    • the LLM is only used where classification is actually useful&lt;br&gt;
    • validation reduces the chance of random execution paths&lt;/p&gt;

&lt;p&gt;For Raspberry Pi and homelab use, this feels much more natural than a heavy agent stack.&lt;/p&gt;

&lt;h2&gt;What openLight is trying to be&lt;/h2&gt;

&lt;p&gt;I don’t see it as a huge agent framework.&lt;/p&gt;

&lt;p&gt;It’s closer to a small runtime for personal infrastructure.&lt;/p&gt;

&lt;p&gt;Right now the main interface is Telegram, but the bigger idea is wider than that: a lightweight agent runtime that can combine deterministic skills with LLM-based interpretation without dragging in a huge platform around it.&lt;/p&gt;

&lt;h2&gt;Why I built it&lt;/h2&gt;

&lt;p&gt;Mostly because I like small tools that are easy to run and easy to understand.&lt;/p&gt;

&lt;p&gt;I wanted something that:&lt;br&gt;
    • works well on Raspberry Pi&lt;br&gt;
    • stays lightweight&lt;br&gt;
    • does not depend on a huge framework&lt;br&gt;
    • uses LLMs where they help, not everywhere by default&lt;/p&gt;

&lt;p&gt;That’s the direction behind openLight.&lt;/p&gt;

&lt;p&gt;If you want to take a look:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/evgenii-engineer/openLight" rel="noopener noreferrer"&gt;github repo&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>raspberrypi</category>
      <category>go</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
