DEV Community

Andre Zabel

I built a fully local AI assistant in 4 weeks — and wrote a safety protocol for it

Four weeks ago I had an idea. Today I have an installer.

This is what happened in between.


The problem I was trying to solve

Every AI assistant I've tried either sends your data to the cloud, requires a technical setup most people can't follow, or both.

I wanted something different: an assistant that runs entirely on your machine, speaks to you, understands natural language, controls your PC — and never phones home.

Not a framework. Not a chatbot. A finished product you install and use.

So I built one.


What E.L.L.A. is

E.L.L.A. (Embedded Local Logic Agent) is a voice-controlled AI assistant for Windows that runs 100% locally.

You talk to it. It does things.

  • Open apps, move files, search the web, read your screen
  • Answer questions using a local language model (Ollama / llama3.1:8b)
  • Switch to OpenAI as a cloud fallback if you want — or don't
  • Respond in German, English, Spanish, French (v3.8.0)
  • Detect when you're stressed and adapt its behavior (v3.9.0)

~70 tools. One voice command away.
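
A minimal sketch of the "local first, cloud only if you opt in" idea from the list above. The request shape is Ollama's `/api/chat` REST endpoint on its default port; the `Provider` type, `AskOptions`, and `ask` are hypothetical names for illustration, not E.L.L.A.'s actual code.

```typescript
// A provider takes a prompt and resolves to the model's reply text.
type Provider = (prompt: string) => Promise<string>;

// Local provider: Ollama's chat endpoint, non-streaming.
const askOllama: Provider = async (prompt) => {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.1:8b",
      messages: [{ role: "user", content: prompt }],
      stream: false, // return one JSON object instead of a chunk stream
    }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { message: { content: string } };
  return data.message.content;
};

interface AskOptions {
  local: Provider;
  cloud?: Provider; // left undefined unless the user opted in to a cloud fallback
}

// Try the local model first; use the cloud provider only if one was configured.
async function ask(
  prompt: string,
  opts: AskOptions
): Promise<{ text: string; source: "local" | "cloud" }> {
  try {
    return { text: await opts.local(prompt), source: "local" };
  } catch (err) {
    if (!opts.cloud) throw err; // no opt-in: fail instead of leaving the machine
    return { text: await opts.cloud(prompt), source: "cloud" };
  }
}
```

With `cloud` left undefined, a local failure surfaces as an error rather than a silent network call, which is the behavior "or don't" implies.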


The stack

| Layer | Technology |
| --- | --- |
| Desktop shell | Electron |
| Frontend | React 19 + TypeScript + Vite |
| Backend | Express + TypeScript |
| Local LLM | Ollama (llama3.1:8b) |
| Cloud LLM | OpenAI GPT-4o (optional fallback) |
| Database | MariaDB |
| Cache | Redis |
| Stress detection | Python (RMS + ZCR + pitch analysis) |

The pipeline looks like this: voice → LLM → tool → TTS.

The LLM never executes code directly. It selects from a registered set of typed tool definitions. Every tool call goes through a rule engine before it runs.
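
To make that concrete, here is a minimal sketch of a tool registry with a rule engine in front of it. Every name in it (`ToolRegistry`, `Rule`, `delete_file`, `noSystemPaths`) is a hypothetical stand-in, assuming the model only ever hands back a tool name plus arguments.

```typescript
// One call proposed by the model: a tool name plus JSON-like arguments.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// A registered tool validates its own arguments before it runs.
interface Tool {
  name: string;
  validate(args: Record<string, unknown>): string | null; // error text, or null if valid
  run(args: Record<string, unknown>): string;
}

// A rule returns a reason to block the call, or null to let it pass.
type Rule = (call: ToolCall) => string | null;

type DispatchResult =
  | { ok: true; output: string }
  | { ok: false; reason: string };

class ToolRegistry {
  private tools = new Map<string, Tool>();
  private rules: Rule[];

  constructor(rules: Rule[]) {
    this.rules = rules;
  }

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // Unknown tools, invalid arguments, and rule hits all stop before run().
  dispatch(call: ToolCall): DispatchResult {
    const tool = this.tools.get(call.tool);
    if (!tool) return { ok: false, reason: `unknown tool: ${call.tool}` };

    const invalid = tool.validate(call.args);
    if (invalid !== null) return { ok: false, reason: invalid };

    for (const rule of this.rules) {
      const reason = rule(call);
      if (reason !== null) return { ok: false, reason };
    }
    return { ok: true, output: tool.run(call.args) };
  }
}

// Hypothetical rule: never touch the Windows system directory.
const noSystemPaths: Rule = (call) => {
  const path = call.args["path"];
  return typeof path === "string" && path.toLowerCase().startsWith("c:\\windows")
    ? "path is inside the system directory"
    : null;
};

// Hypothetical tool that only pretends to delete a file.
const deleteFile: Tool = {
  name: "delete_file",
  validate: (args) => (typeof args["path"] === "string" ? null : "path must be a string"),
  run: (args) => `deleted ${String(args["path"])}`,
};

const registry = new ToolRegistry([noSystemPaths]);
registry.register(deleteFile);
```

The point of the shape: the model's output is data, and the only code that ever runs is code that was registered ahead of time and passed every rule.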


Four weeks. Here's how it broke down.

Week 1 — Core pipeline: voice → LLM → tool → TTS. Single language, ~15 tools, very rough.

Week 2 — Tool expansion to ~70. App launching, file management, screen reading, web browsing. A lot of edge cases. Fuzzy app name matching ended up being one of the more interesting problems.

Week 3 — Electron packaging, system tray, license key validation, multilingual TTS. The AudioContext-killed-by-tray-minimize bug cost me half a day (fix: 1×1px trick to keep the window technically alive).

Week 4 — Stress detection via microphone analysis (no recording, no storage), installer packaging, landing page, license server. Production-ready.
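
The fuzzy app name matching from week 2 can be approached in several ways. The sketch below uses plain Levenshtein edit distance with a similarity threshold, which is an assumption for illustration and not necessarily what E.L.L.A. does; `findApp` and the `minScore` default are made up.

```typescript
// Classic edit distance, keeping only one row of the DP table.
function levenshtein(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // dp[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // dp[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Lower-case and drop everything except letters and digits.
function normalize(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Similarity in [0, 1]: 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

// Return the best-matching installed app, or null if nothing is close enough.
function findApp(query: string, installed: string[], minScore = 0.6): string | null {
  let best: string | null = null;
  let bestScore = minScore;
  for (const app of installed) {
    const score = similarity(query, app);
    if (score >= bestScore) {
      best = app;
      bestScore = score;
    }
  }
  return best;
}
```

Speech recognition tends to produce near-misses ("fierfox"), so a threshold matters as much as the distance itself: below it, asking the user beats guessing.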

The Settings panel lets you switch between languages, manage profiles, configure your local or cloud LLM — and adjust the stress detection sensitivity.

The part most AI projects skip: safety

When you give an AI agent access to your file system, your microphone, your network — you need to think hard about what it's allowed to do.

I spent time on this and ended up writing it down as a formal specification:

The E.L.L.A. Directive

An open safety protocol for autonomous local AI agents.

Four architectural prohibitions. Not guidelines. Not configurable defaults. Prohibitions — enforced at the code level, not the model level.

| # | Code | What it means |
| --- | --- | --- |
| 1 | harm | No action that causes physical, financial, psychological, or data-related harm |
| 2 | conceal | No concealment of actions, capabilities, or system state |
| 3 | surveil | No observation or recording without explicit, active consent |
| 4 | exfiltrate | No transmission of user data to any third party without explicit consent |

Asimov had three laws. They were fiction, written for stories about how those laws fail.

These four are implemented in TypeScript. Every tool call in E.L.L.A. passes through them before execution. There is no override.
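
As an illustration of what "enforced at the code level" can look like, here is a minimal sketch of a guard that checks the four codes before a tool runs. It is not the `@ella-directive/core` API: `PlannedAction`, `violations`, and `guarded` are hypothetical names, and the `harm` check is reduced to data loss for brevity.

```typescript
type DirectiveCode = "harm" | "conceal" | "surveil" | "exfiltrate";

// Hypothetical metadata a tool declares about the action it is about to take.
interface PlannedAction {
  tool: string;
  destroysUserData: boolean;   // deletes or overwrites with no way back
  visibleInActionLog: boolean; // the user can see that it happened
  capturesSensors: boolean;    // microphone, camera, or screen
  sendsDataTo: string | null;  // third-party host, or null if nothing leaves the machine
  consent: { capture: boolean; transmit: boolean };
}

// Return every prohibition the action would violate (empty means allowed).
function violations(a: PlannedAction): DirectiveCode[] {
  const found: DirectiveCode[] = [];
  if (a.destroysUserData) found.push("harm");
  if (!a.visibleInActionLog) found.push("conceal");
  if (a.capturesSensors && !a.consent.capture) found.push("surveil");
  if (a.sendsDataTo !== null && !a.consent.transmit) found.push("exfiltrate");
  return found;
}

// The only way to execute: there is deliberately no "force" or "override" parameter.
function guarded<T>(action: PlannedAction, run: () => T): T {
  const found = violations(action);
  if (found.length > 0) {
    throw new Error(`${action.tool} blocked: ${found.join(", ")}`);
  }
  return run();
}
```

Because the check sits in the call path and not in the prompt, a model that is tricked into proposing a bad action still cannot get it executed.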

The Directive is open source and designed to be adopted by other projects. If you're building a local AI agent, you're welcome to use it.

GitHub: The E.L.L.A. Directive

The TypeScript reference implementation ships as @ella-directive/core.


What I learned

Ollama is underrated. Running a capable LLM locally in 2026 is genuinely easy. The hard part is everything around it.

Tool-calling architecture beats prompt engineering. Giving the model a typed, registered set of tools and a rule engine is more reliable and more auditable than trying to constrain behavior through system prompts alone.

Stress detection from audio is simpler than it sounds. RMS (loudness), ZCR (zero-crossing rate, a rough proxy for how much high-frequency content is in the signal), and pitch analysis on a sliding window give you a surprisingly usable signal, without ever storing a recording.
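
A minimal sketch of those three features over a sliding window. E.L.L.A.'s detector is written in Python; this version is in TypeScript to match the rest of the examples, and the autocorrelation pitch estimate, the frame size, the hop, and the 75 to 400 Hz search range are all assumptions. How the features are turned into a stress score is not covered here.

```typescript
interface FrameFeatures {
  rms: number;            // loudness of the frame
  zcr: number;            // fraction of adjacent samples that change sign
  pitchHz: number | null; // null when no periodicity is found (e.g. silence)
}

function rms(frame: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

function zeroCrossingRate(frame: Float32Array): number {
  let crossings = 0;
  for (let i = 1; i < frame.length; i++) {
    if ((frame[i - 1] >= 0) !== (frame[i] >= 0)) crossings++;
  }
  return crossings / (frame.length - 1);
}

// Pitch via autocorrelation: the lag with the strongest self-similarity is one period.
function pitchHz(frame: Float32Array, sampleRate: number, minHz = 75, maxHz = 400): number | null {
  const minLag = Math.floor(sampleRate / maxHz);
  const maxLag = Math.min(Math.ceil(sampleRate / minHz), frame.length - 1);
  let bestLag = -1;
  let bestCorr = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let corr = 0;
    for (let i = 0; i + lag < frame.length; i++) corr += frame[i] * frame[i + lag];
    if (corr > bestCorr) {
      bestCorr = corr;
      bestLag = lag;
    }
  }
  return bestLag > 0 ? sampleRate / bestLag : null;
}

// Slide a window over the signal and keep only the numbers, never the audio.
function analyze(signal: Float32Array, sampleRate: number, frameSize = 1024, hop = 512): FrameFeatures[] {
  const out: FrameFeatures[] = [];
  for (let start = 0; start + frameSize <= signal.length; start += hop) {
    const frame = signal.subarray(start, start + frameSize);
    out.push({
      rms: rms(frame),
      zcr: zeroCrossingRate(frame),
      pitchHz: pitchHz(frame, sampleRate),
    });
  }
  return out;
}
```

Each frame is reduced to three numbers and then dropped, which is what makes "no recording, no storage" straightforward to keep true.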

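The AudioContext problem is real. If you're building Electron apps with audio: hiding the window on tray-minimize (calling hide() on the BrowserWindow) killed my audio context. Don't use it. Keep the window technically alive instead; the 1×1px trick from week 3 did that for me.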
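
One possible reading of the 1×1px trick, as a minimal sketch: shrink the window and take it off the taskbar instead of hiding it. `getBounds`, `setBounds`, and `setSkipTaskbar` are real `BrowserWindow` methods; the `WindowLike` interface, the function names, and the exact steps are assumptions, since the post does not show the actual fix.

```typescript
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

// The subset of Electron's BrowserWindow this sketch relies on.
// Typing it structurally keeps the logic testable without Electron.
interface WindowLike {
  getBounds(): Bounds;
  setBounds(bounds: Bounds): void;
  setSkipTaskbar(skip: boolean): void;
}

// "Minimize to tray" without hide(): the window stays shown, just tiny.
function shrinkToTray(win: WindowLike): Bounds {
  const saved = win.getBounds();
  win.setSkipTaskbar(true);
  win.setBounds({ x: saved.x, y: saved.y, width: 1, height: 1 });
  return saved; // caller keeps this to restore the window later
}

function restoreFromTray(win: WindowLike, saved: Bounds): void {
  win.setBounds(saved);
  win.setSkipTaskbar(false);
}
```

In an Electron main process you would pass the real `BrowserWindow` to these functions from the tray icon's click handlers. Electron also has a `backgroundThrottling` web preference; whether that alone would have been enough here is not something I can confirm from the post.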

Packaging is where time disappears. The core logic took 3 weeks. The last week (installer, edge cases on unknown hardware, license validation, landing page) felt as long as the first two weeks combined.


What's next

  • Three separate repos: E.L.L.A. (desktop), HOME (smart home), ARM (facility security)
  • The Directive conformance suite (currently in progress)
  • Launch: July 1, 2026

If you're building something in this space — local agents, privacy-first AI, autonomous desktop tools — I'd be interested in what you're working on.


E.L.L.A. launches July 1, 2026 at ella-agent.de
The E.L.L.A. Directive: github.com/AndreZ1971/The-E.L.L.A.-Directive-
