DEV Community

floworkos

Flowork Agent — Self-Hosted AI Agent Operating System with Persistent Memory and Built-In Security

The Memory That Learns From Mistakes

Every Flowork agent carries a private SQLite brain with full-text search (FTS5). When an agent runs a task, it stores:

  • The prompt it was given
  • The exact error or correction it received
  • What it learned from that mistake (not shame — education)
  • When it happened (so it can recall under similar conditions)

If an agent hallucinates stock data and a user corrects it, that correction goes into the brain as a memory. Next time it sees "stock analysis," FTS5 retrieves similar past episodes, and the agent can tune its response. It doesn't memorize the user — it memorizes the domain.

This is not RAG with external documents. This is the agent's own experience, accessible at inference time through a brain.query(question) capability. An agent can even "dream" in idle time — reviewing its own logs and compiling lessons without being prompted.
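The store-and-recall loop described above can be sketched with SQLite's FTS5 from Python. This is a minimal illustration, not Flowork's implementation: the `episodes` table, its columns, and the `open_brain`, `remember`, and `query` helpers are all assumed names, and the real `brain.query` capability may behave differently.

```python
import re
import sqlite3
from datetime import datetime, timezone


def open_brain(path=":memory:"):
    """Open (or create) a per-agent memory store backed by an FTS5 table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS episodes "
        "USING fts5(prompt, error, lesson, created_at UNINDEXED)"
    )
    return db


def remember(db, prompt, error, lesson):
    """Store one corrected mistake: the prompt, the error, and the lesson."""
    now = datetime.now(timezone.utc).isoformat()
    db.execute(
        "INSERT INTO episodes (prompt, error, lesson, created_at) "
        "VALUES (?, ?, ?, ?)",
        (prompt, error, lesson, now),
    )
    db.commit()


def query(db, question, limit=3):
    """Return lessons from past episodes that share words with the question."""
    words = re.findall(r"\w+", question)
    if not words:
        return []
    # Quote each word so user text is matched as terms, not parsed as FTS5 syntax.
    match = " OR ".join(f'"{w}"' for w in words)
    rows = db.execute(
        "SELECT lesson FROM episodes WHERE episodes MATCH ? "
        "ORDER BY rank LIMIT ?",
        (match, limit),
    )
    return [lesson for (lesson,) in rows]


# Hypothetical episodes, for illustration only.
brain = open_brain()
remember(
    brain,
    "Give me a stock analysis of ACME",
    "Quoted a price that was never fetched",
    "Fetch live quotes before stating any stock price",
)
remember(
    brain,
    "Who wrote this song?",
    "Named the wrong composer",
    "Check the credits database before naming a composer",
)
lessons = query(brain, "stock analysis for another ticker")
```

Only the stock episode shares words with the new question, so only its lesson comes back. An agent would place that lesson in its prompt before answering.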


🎯 Security Without Theater

Flowork does not do security theater. It ships a real, built-in security radar:

  • WASM sandbox — each agent's bytecode runs in an isolated WebAssembly runtime (via wazero). No direct OS access. Agents cannot fork processes or read sibling files.
  • Capability-based grants — before an agent touches the filesystem, calls an LLM, or runs a tool, the kernel checks one grant per capability. You define which agents can call which tools; others get a clean "no."
  • Frozen kernel guarding — the microkernel code itself is protected. If someone or something tries to patch it at runtime, the Guardian detects tampering and triggers safe-mode. The agent is isolated; the system remains intact.
  • Built-in scanning arsenal — alongside sandbox & grant checks, you get pluggable security scanners (malware patterns, injection detection, drift monitors) that run before code is deployed as an agent.

No other agent framework I'm aware of ships this stack together. Most agent frameworks run agents as plain Python or JavaScript — no sandbox, no capability model, no guard. You're trusting the agent to behave. Flowork trusts the architecture.
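To make the capability-grant idea concrete, here is a small deny-by-default sketch. The `Kernel` class, the capability names (`llm.complete`, `fs.write`), and the agent name are hypothetical; Flowork's actual grant format is not shown in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    """Hypothetical stand-in for the kernel's grant table (deny by default)."""

    grants: dict = field(default_factory=dict)  # agent name -> set of capabilities

    def grant(self, agent, capability):
        self.grants.setdefault(agent, set()).add(capability)

    def call(self, agent, capability, fn, *args):
        # One check per capability, made before the tool function runs.
        if capability not in self.grants.get(agent, set()):
            raise PermissionError(f"{agent} has no grant for {capability}")
        return fn(*args)


def fake_llm(prompt):
    """Placeholder tool so the example needs no network access."""
    return f"echo: {prompt}"


kernel = Kernel()
kernel.grant("analyst", "llm.complete")

allowed = kernel.call("analyst", "llm.complete", fake_llm, "hello")

try:
    kernel.call("analyst", "fs.write", print, "should never run")
    denied = False
except PermissionError:
    denied = True
```

The point of the design is that the check lives in the caller's path, not in the tool: a tool that was never granted is never invoked at all.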


🔌 MCP: Both Client and Server

Flowork implements the Model Context Protocol — both sides.

As an MCP client, your agents can call external MCP servers:

Agent → Flowork MCP client → (connects to) → GitHub MCP server → get file from repo
Agent → Flowork MCP client → (connects to) → Filesystem MCP server → read /data

Drop the MCP server address into an agent's tool config, and the tool becomes available immediately. No recompile. No kernel patch.

As an MCP server, Flowork itself exposes your agents to Claude Desktop or Cursor:

Claude Desktop (MCP client) → Flowork MCP server → call agent A's brain query
                            → call agent B's tools
                            → call scanner X

This flips the relationship: instead of your agents calling the LLM, Claude can query your agents as a knowledge base, and they remember what you taught them.
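On the wire, MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below builds such a request and reads the text parts of a reply. The tool name `brain_query`, its arguments, and the canned response are assumptions for illustration; the tools a Flowork MCP server actually exposes may be named differently.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def tools_call_request(tool, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def text_from_result(response):
    """Join the text parts of a `tools/call` result, raising on errors."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "MCP error"))
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    parts = result.get("content", [])
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")


# Hypothetical tool name and arguments.
request = tools_call_request(
    "brain_query", {"agent": "agent-a", "question": "stock analysis"}
)
wire = json.dumps(request)  # what a client would send to the server

# Canned reply standing in for a real server.
sample_response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {
        "content": [
            {"type": "text", "text": "Fetch live quotes before stating any stock price"}
        ],
        "isError": False,
    },
}
answer = text_from_result(sample_response)
```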


🧩 117 Tools + Plug-and-Play Modules

Flowork ships with 117 built-in tools across nine command families:

  • Text — split, join, format, regex, hash, encode/decode
  • Network — HTTP, DNS, IP geolocation, port scan
  • Crypto — key generation, signing, encryption (libsodium)
  • Database — SQLite query, CSV transform
  • Time — schedule, cron, countdown
  • AI — LLM completion, embedding, vision (multi-model)
  • File — read, write, list (sandboxed within agent's folder)
  • Slack/Discord/Telegram — native channel integrations
  • Voice — offline STT (Whisper) + free TTS

Everything else is plug-and-play. Drop a new tool folder into tools/, a new agent folder into agents/, a scanner into scanners/, or a new channel binding into channels/ — the kernel auto-discovers it, no restart needed.
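Auto-discovery of this kind can be approximated by rescanning the four module directories and reporting folders not seen before. This is a sketch under that assumption; how the Flowork kernel actually watches for changes (polling, filesystem events, or something else) is not stated here, and `discover` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

MODULE_DIRS = ("tools", "agents", "scanners", "channels")


def discover(root, known):
    """Return module folders that appeared since the last scan.

    `known` is a set of (kind, name) pairs and is updated in place.
    """
    found = []
    for kind in MODULE_DIRS:
        base = Path(root) / kind
        if not base.is_dir():
            continue
        for entry in sorted(base.iterdir()):
            key = (kind, entry.name)
            if entry.is_dir() and key not in known:
                known.add(key)
                found.append(key)
    return found


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    known = set()
    (Path(root) / "tools" / "hash").mkdir(parents=True)
    first = discover(root, known)   # picks up tools/hash
    (Path(root) / "agents" / "mybot").mkdir(parents=True)
    second = discover(root, known)  # picks up only the new agent
```

Calling `discover` on a timer gives hot-loading without a restart: each pass returns only what is new, so already-loaded modules are left alone.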

Modules are versioned (Git SHA), so you can roll back a broken tool without touching other modules.


🚀 One Binary, No Dependencies

Flowork is written in pure Go 1.25 with no cgo. It compiles to a single static binary for Linux, macOS, or Windows. Drop it on any machine, run ./Flowork_Agent, and you have a full agent OS listening on http://127.0.0.1:1987.

  • No Python, no Node.js, no Docker.
  • No service dependencies (no Postgres, no Redis — everything is SQLite or in-process).
  • No telemetry. No phoning home. Your agents' thoughts stay in your folder.
  • Works fully offline (or with local Ollama, or with your own private LLM).

📡 Connectors: Telegram, Discord, Slack, WhatsApp, CLI, Web, Cron

Flowork agents listen to multiple entry points at once:

| Channel | How It Works | Auth |
|---|---|---|
| Telegram | Webhook or polling | BotToken + ChatID |
| Discord | Webhook | Bot token + channel ID |
| Slack | Event subscription | Slack App token |
| WhatsApp | Twilio integration | Twilio credentials |
| CLI | flowork ask --agent mybot "query" | Local (no auth needed) |
| Web | Dashboard + chat UI on :1987 | Optional JWT |
| Cron | Scheduled tasks | Built-in cron parser |
| MCP | Claude Desktop / Cursor | MCP server over stdio |

A single agent can listen to all of these at once. The same agent logic answers a Telegram user and a Discord server and runs a nightly batch job.


🎭 Self-Protecting & Tamper-Aware

The microkernel is frozen: it ships once and is never edited after deployment. This is radical simplicity: no configuration sprawl, no runtime patches, no kernel drift in production.

A Guardian process monitors the kernel's bytecode. If it detects a modification (a Trojan payload, a supply-chain attack, a bug fix gone wrong), it does not crash the system. Instead:

  1. The agent that attempted the patch is isolated (moved to safe-mode).
  2. A log entry is written (so you know what tried to happen).
  3. The rest of the agents continue working.

This is fault isolation without sacrificing integrity.
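The three steps above can be sketched as a baseline-hash check. This assumes the Guardian compares a SHA-256 digest of the kernel image against one recorded at startup, which is a common tamper-detection approach but not confirmed as Flowork's method. The `Guardian` class and the way the suspect agent is identified are hypothetical.

```python
import hashlib


def fingerprint(code: bytes) -> str:
    """SHA-256 digest of the kernel image."""
    return hashlib.sha256(code).hexdigest()


class Guardian:
    """Hypothetical guard: compares the kernel against a startup baseline."""

    def __init__(self, kernel_code: bytes):
        self.baseline = fingerprint(kernel_code)
        self.safe_mode = set()  # agents that have been isolated
        self.log = []

    def verify(self, kernel_code: bytes, suspect_agent: str) -> bool:
        if fingerprint(kernel_code) == self.baseline:
            return True
        # Isolate the agent and record the event; other agents keep running.
        self.safe_mode.add(suspect_agent)
        self.log.append(f"tamper detected; {suspect_agent} moved to safe-mode")
        return False


kernel_image = b"frozen kernel bytes"  # placeholder for the real kernel image
guardian = Guardian(kernel_image)

ok = guardian.verify(kernel_image, "agent-a")
tampered = guardian.verify(kernel_image + b"\x90", "agent-b")
```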


🧯 Mistakes → Lessons (Not Shame)

Flowork ships with an educational error model — a redemptive philosophy baked into the architecture. When an agent makes a mistake:

  1. Capture the error — store the exact input, the agent's reasoning, the mistake, and the correction.
  2. Teach in real-time — during the same conversation, the agent references this memory and adjusts.
  3. Learn offline — in idle time, the agent "dreams," reading its own logs and writing abstract lessons so future similar cases are handled better.
  4. Expose the lesson — via brain.query(), an agent can surface what it learned. A human can verify it, correct it, or approve it.

This is not zero-shot prompting. It's not fine-tuning. It's lived experience, stored and retrieved at inference time, with human oversight built in.
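The offline "dream" pass and the human review step can be sketched as follows. It is a deliberately simple stand-in: real lesson writing would presumably involve an LLM summarizing the logs, whereas this version only groups and de-duplicates corrections per domain. The `Lesson` type, the `pending` and `approved` statuses, and the sample corrections are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Lesson:
    domain: str
    text: str
    status: str = "pending"  # becomes "approved" after human review


def dream(corrections):
    """Idle-time pass: compile raw corrections into one pending lesson per domain."""
    by_domain = defaultdict(list)
    for domain, correction in corrections:
        if correction not in by_domain[domain]:  # drop exact repeats
            by_domain[domain].append(correction)
    return [
        Lesson(domain, "; ".join(items))
        for domain, items in sorted(by_domain.items())
    ]


def approve(lesson):
    """Human oversight step: mark a lesson as verified."""
    lesson.status = "approved"
    return lesson


# Hypothetical corrections captured during conversations.
corrections = [
    ("stocks", "fetch live quotes before stating a price"),
    ("stocks", "state the quote timestamp"),
    ("music", "check album credits before naming a composer"),
    ("stocks", "fetch live quotes before stating a price"),
]
lessons = dream(corrections)
approved = [approve(item) for item in lessons if item.domain == "stocks"]
```

Keeping lessons in a pending state until someone approves them is what separates this from silent self-modification: the agent proposes, a human confirms.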

Honest limitation: This works well for domain-specific agents (stock analysis, music trivia, promo routing) where mistakes are domain errors. It does not teach an agent to be honest if its base LLM is dishonest. It does catch hallucinations when a human corrects them, and it helps keep the same hallucination from recurring in similar contexts.


🏗️ Architecture Snapshot

┌─────────────────────────────────────────────────────────┐
│                   Flowork Microkernel                   │
│  (frozen once, guarded, sandboxing + capability router) │
└────────┬────────┬────────┬────────┬────────┬────────────┘
         │        │        │        │        │
      [Bus]    [loket]  [Guard]  [MCP]  [LLM Router]
         │        │        │        │        │
    ┌────┴────┐   │    ┌───┴────┬───┴──┬─────┴────────┐
    │ Channels│   │    │Scanner │MCP   │Local/Remote  │
    │(Tele/DC/│   │    │Radar   │Srv   │LLM (Claude,  │
    │Slack/WA)│   │    │        │      │Ollama, etc.) │
    └────┬────┘        └────────┼──────┴──────────────┘
         │                      │
    ┌────┴──────────────────────┴─────────────────┐
    │                                             │
    │      Agent Instances (WASM sandboxes)       │
    │                                             │
    │  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
    │  │ Agent A  │  │ Agent B  │  │ Agent C  │   │
    │  │(folder)  │  │(folder)  │  │(folder)  │   │
    │  │SQLite    │  │SQLite    │  │SQLite    │   │
    │  │brain +   │  │brain +   │  │brain +   │   │
    │  │tools     │  │tools     │  │tools     │   │
    │  └──────────┘  └──────────┘  └──────────┘   │
    │                                             │
    └─────────────────────────────────────────────┘

Each agent folder contains:

my-agent/
  ├── config.toml          # Agent personality, tools, grants
  ├── brain.db             # SQLite FTS5 memory (private)
  ├── tools/               # Custom tools (WASM or native)
  ├── log/                 # Conversation history
  └── schedule.cron        # (optional) Cron jobs

When the kernel restarts, agents resume with their memories intact. No session loss. No reset.


🎯 Real Strengths (and Trade-Offs)

Strengths

  • True local-first — agents run offline; no cloud dependency.
  • Memory that compounds — an agent with six months of corrected mistakes outperforms a fresh agent, not because it's fine-tuned, but because it recalls relevant context.
  • Plug-and-play modularity — break a tool, fix one folder; the kernel and all other agents are unaffected.
  • Single binary deploy — no DevOps, no containerization, no external service dependencies.
  • Radical transparency — kernel code frozen, all module versions tracked, all capability grants auditable.
  • Built-in security that's not theater — sandbox + capability model + guard, together, by design.

Trade-Offs

  • WASM sandbox overhead — agents are not as fast as native Python/JavaScript. If you need millisecond-scale performance, this is not your framework. For typical agent workloads (API calls, LLM inference, memory queries), the overhead is negligible.
  • Smaller module ecosystem (compared to, say, LangChain) — you get 117 built-in tools and plug-and-play extensibility, but not ten thousand community packages. This is intentional: fewer tools, each well-audited.
  • Educational errors work best in domains with clear ground truth — stock prices, music facts, promo rules. In open-ended creative work, "mistakes" are subjective; the error model is less useful there.
  • Learning is domain-specific, not general — agents learn from their own experience in their own domain. They do not transfer learning across agents or to a shared model. This is a privacy win but a generalization loss.

🛠️ Getting Started

  1. Clone the repo:
     git clone https://github.com/flowork-os/Flowork_Agent.git
     cd Flowork_Agent
  2. Read the handbook (plain Markdown):
     cat doc/handbook/getting-started.md
  3. Run:
     ./start.sh
     (or go run ./cmd/flowork if you want to build from source)
  4. Visit the dashboard:
     http://127.0.0.1:1987
  5. Add a channel (Telegram, Discord, etc.) by pasting tokens into the Web UI.
  6. Create your first agent via the dashboard or CLI, grant it some tools, and talk to it.


🤔 Flowork vs. OpenClaw / Hermes (Same Yard, Different Bets)

| Aspect | Flowork | OpenClaw | Hermes |
|---|---|---|---|
| Self-hosted? | ✅ Yes, full | ✅ Yes (Rust) | ✅ Yes (Python) |
| Memory | Private per-agent, FTS5, recall at inference | Shared context window | Prompt history only |
| Security | WASM sandbox + capability grants + kernel guard | Native Python (no sandbox) | Python (no sandbox) |
| MCP Support | Both client & server | Client only | Limited |
| Single Binary | ✅ Pure Go, no deps | Rust binary (smaller) | Python (requires Python) |
| Offline | ✅ Full | ✅ Mostly | Needs internet for most LLMs |
| Plug-and-Play Tools | ✅ 117 built-in, hot-load new ones | Manual tool registration | Manual setup |
| Multi-Agent | ✅ Seamless | Supported | Supported |
| Error Learning | ✅ Redemptive, FTS-backed | Mistakes in history | Not explicitly supported |

The bet Flowork makes: Simplicity, memory, and security matter more than ecosystem size. One frozen kernel, infinite modules. Agents learn from their own mistakes. Run offline. Own your data.


📝 The Handbook

Start here: doc/handbook/


📖 Further Reading


🤝 Contributing & Community

Flowork is MIT-licensed. Contributions welcome:

  • Bug reports — file an issue.
  • New tools — write a tool module, open a PR.
  • Integrations — new channels, new MCP servers.
  • Docs — the handbook is Markdown; help clarify.

The philosophy is radical simplicity — one frozen kernel, infinite well-audited modules. Keep that in mind when contributing.


🔗 Links

🔗 Flowork is open source — both products

💬 Join the Flowork community on Telegram: https://t.me/+55oqrk75lc43YWE1
