<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: MihaiBuilds</title>
    <description>The latest articles on DEV Community by MihaiBuilds (@mihaibuildsdev).</description>
    <link>https://dev.to/mihaibuildsdev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3904583%2F67fee610-d560-45d8-a994-78991737033d.jpeg</url>
      <title>DEV Community: MihaiBuilds</title>
      <link>https://dev.to/mihaibuildsdev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mihaibuildsdev"/>
    <language>en</language>
    <item>
      <title>The Brain v1.0 — building open-source workflow orchestration the boring way</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Tue, 16 Jun 2026 10:59:25 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/the-brain-v10-building-open-source-workflow-orchestration-the-boring-way-40m5</link>
      <guid>https://dev.to/mihaibuildsdev/the-brain-v10-building-open-source-workflow-orchestration-the-boring-way-40m5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/the-brain-v1-0-released.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I find a lot of this kind of work myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Memory Vault shipped five weeks ago as the substrate for the planned compounding stack. Memory is one half of "your AI assistant should know what you've done." The other half is "your tools should run when they should, against the right state, with a history you can audit." Cron lines on a server you half-remember and a Python file glued to a webhook glued to an LLM are not that. The Brain is the runtime underneath: Python-defined workflows, every run persisted, owned by you.&lt;/p&gt;

&lt;p&gt;The Brain is open-source, self-hosted, MIT-licensed. A Python file defines the workflow. Postgres holds every run. Four trigger types fire it (manual, cron, webhook, file). Four step types execute inside it (shell, LLM, Memory Vault REST, MCP). The whole thing is one multi-arch Docker image. Today it crosses the line from build-in-public project to v1.0 stable release.&lt;/p&gt;

&lt;h2&gt;
  
  
  What The Brain is
&lt;/h2&gt;

&lt;p&gt;A workflow orchestrator with persistent run history. You define a workflow as a Python file — a sequence of named steps. The Brain runs it on whichever of the four triggers you registered (a CLI invocation, a cron expression, an HMAC-signed webhook, or a filesystem change), persists every step's result to a single Postgres database, and stops on the first failure with a clear error. &lt;code&gt;brain history&lt;/code&gt; shows what ran. &lt;code&gt;brain show &amp;lt;run-id&amp;gt;&lt;/code&gt; shows the per-step detail. &lt;code&gt;brain diagnose&lt;/code&gt; produces a redacted zip ready to attach to a bug report.&lt;/p&gt;

&lt;p&gt;It runs entirely on your machine. No SaaS account. No telemetry. No cloud lock-in. &lt;code&gt;docker compose --profile api up -d&lt;/code&gt; and it's running. The CLI alone works against a Dockerized Postgres if you don't need the API or the watcher daemon.&lt;/p&gt;

&lt;p&gt;The biggest thing about The Brain is what it deliberately doesn't do: &lt;strong&gt;it doesn't decide anything on its own&lt;/strong&gt;. Workflows are scripted Python. The LLM step just transforms text. If a workflow wants "the LLM picks the next tool to call," you write that workflow — but the workflow file is doing the picking, not The Brain. The Brain is a workflow orchestrator, not an agent runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 actually does
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Four step types&lt;/strong&gt; — &lt;code&gt;ShellStep&lt;/code&gt; runs subprocess commands with stdout capture and exit-code semantics; &lt;code&gt;LLMStep&lt;/code&gt; calls an OpenAI-compatible chat API with per-step provider/model/key/timeout overrides; &lt;code&gt;MemoryVaultStep&lt;/code&gt; calls Memory Vault's REST API for the friction-free case; &lt;code&gt;McpToolStep&lt;/code&gt; calls any MCP tool over stdio with per-step spawn lifecycle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Four trigger types&lt;/strong&gt; — &lt;code&gt;manual&lt;/code&gt; (CLI), &lt;code&gt;cron&lt;/code&gt; (scheduler daemon), &lt;code&gt;webhook&lt;/code&gt; (HTTP POST with HMAC signature, verified constant-time), &lt;code&gt;file&lt;/code&gt; (filesystem watcher with debounced events).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Substitution between steps&lt;/strong&gt; — &lt;code&gt;{previous.step-name}&lt;/code&gt; references prior step output; &lt;code&gt;{trigger.body}&lt;/code&gt; / &lt;code&gt;{trigger.path}&lt;/code&gt; / &lt;code&gt;{trigger.headers.X}&lt;/code&gt; reference trigger payload data. Textual substitution, not eval'd Python — &lt;code&gt;str.format&lt;/code&gt; shape, strict-by-default errors when a reference doesn't resolve.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three processes, one database&lt;/strong&gt; — the CLI for one-shot runs, &lt;code&gt;brain serve&lt;/code&gt; for the HTTP API + webhook intake, &lt;code&gt;brain watch&lt;/code&gt; for cron and file triggers. Compose profiles let you run any subset.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured logging with run_id binding&lt;/strong&gt; — &lt;code&gt;bind_run_id()&lt;/code&gt; attaches the run UUID to every log line emitted during a run. &lt;code&gt;LOG_FORMAT=keyvalue&lt;/code&gt; for human-readable console output, &lt;code&gt;LOG_FORMAT=json&lt;/code&gt; for log aggregation pipelines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redacted diagnostic bundle&lt;/strong&gt; — &lt;code&gt;brain diagnose&lt;/code&gt; writes a timestamped zip with recent logs, environment, OS info, Brain version, and Docker container state. The redaction model is allow-list, not blocklist: nine env vars are recorded with values; four (&lt;code&gt;DB_PASSWORD&lt;/code&gt;, &lt;code&gt;LLM_API_KEY&lt;/code&gt;, &lt;code&gt;MEMORY_VAULT_TOKEN&lt;/code&gt;, &lt;code&gt;THE_BRAIN_API_TOKEN&lt;/code&gt;) are presence-only — name appears, value never appears in the bundle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-command Docker, multi-arch&lt;/strong&gt; — &lt;code&gt;linux/amd64&lt;/code&gt; and &lt;code&gt;linux/arm64&lt;/code&gt; images published to &lt;code&gt;ghcr.io/mihaibuilds/the-brain&lt;/code&gt;. Compose profiles for api and watcher. Derive your own image when you need to add MCP servers or extra binaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MIT-licensed, self-hosted&lt;/strong&gt; — your workflows, your data, your hardware. The whole thing is yours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;363 tests passing&lt;/strong&gt; — pytest with a real Postgres service container, no mocks at integration boundaries.&lt;/li&gt;
&lt;/ul&gt;
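&lt;p&gt;To make the "HMAC signature, verified constant-time" bullet concrete, here is a minimal sketch of that kind of check. It is not The Brain's code: HMAC-SHA256, hex encoding, and the &lt;code&gt;sign&lt;/code&gt; / &lt;code&gt;verify&lt;/code&gt; names are assumptions made for illustration. The one load-bearing detail is &lt;code&gt;hmac.compare_digest&lt;/code&gt;, which takes the same time whether the signatures differ in the first character or the last.&lt;/p&gt;

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare with hmac.compare_digest so timing does not leak a partial match."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected, received_signature)


secret = b"example-webhook-secret"  # hypothetical secret, illustration only
body = b'{"event": "push"}'
good_signature = sign(secret, body)

print(verify(secret, body, good_signature))
print(verify(secret, body, "0" * 64))
```

&lt;p&gt;Sign the raw bytes of the body as received, not a re-serialized copy, or a harmless whitespace change will fail verification.&lt;/p&gt;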

&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Postgres is the only state store.&lt;/strong&gt; No Redis. No queue. No second datastore. Run history, webhook secrets, and scheduler state all live in the same Postgres database, with &lt;code&gt;LISTEN&lt;/code&gt;/&lt;code&gt;NOTIFY&lt;/code&gt; for cross-process wakeup and plain SQL polling for the rest. You already know how to back up Postgres. You already know how to monitor it. Adding a queue or a Redis would double the backup story and the operational mental model for no win at the scale a self-hosted workflow runner actually runs at. When that stops being true, the migration path is sane. Until then, one database is the right answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three processes, not one.&lt;/strong&gt; The runner is synchronous per-run, the HTTP API is async per-request, the watcher is a long-running event loop. Splitting them means a crashing scheduler doesn't take the API down, and operators can run only the parts they need via Compose profiles. CLI-only users skip the daemons entirely. API-only users skip the watcher. The full stack runs all three. Process boundaries are cheaper than recovery procedures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-step subprocess spawn for shell and MCP.&lt;/strong&gt; A crashing shell command or a misbehaving MCP server should not take the runner down. &lt;code&gt;ShellStep&lt;/code&gt; always spawns a fresh subprocess per step. &lt;code&gt;McpToolStep&lt;/code&gt; spawns a fresh MCP server, runs the protocol handshake, makes one tool call, and tears it down at step end. No shared client. No pooling. The fork/exec cost is in the noise for the workflow shapes The Brain targets, and the isolation is real — if an MCP server crashes mid-call, only that one step fails and the next step gets a brand-new subprocess.&lt;/p&gt;
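&lt;p&gt;A rough sketch of the per-step spawn idea for shell commands, using only the standard library. &lt;code&gt;ShellResult&lt;/code&gt; and &lt;code&gt;run_shell_step&lt;/code&gt; are hypothetical names, not The Brain's API; the point is that each step gets its own process, its stdout is captured, and a non-zero exit code marks the step as failed.&lt;/p&gt;

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ShellResult:
    ok: bool
    output: str
    exit_code: int


def run_shell_step(command: list[str], timeout_seconds: float = 30.0) -> ShellResult:
    """Spawn a fresh subprocess for one step; nothing is shared between steps."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,
        timeout=timeout_seconds,
    )
    return ShellResult(
        ok=completed.returncode == 0,
        output=completed.stdout,
        exit_code=completed.returncode,
    )


# Two independent steps: the second one failing does not affect the first.
first = run_shell_step([sys.executable, "-c", "print('hello')"])
second = run_shell_step([sys.executable, "-c", "import sys; sys.exit(3)"])
print(first)
print(second)
```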

&lt;p&gt;&lt;strong&gt;Workflow files are trusted Python, not a YAML DSL.&lt;/strong&gt; Every YAML workflow language ends up needing escape hatches that turn back into code. Workflow-as-Python means full editor support, full type checking, zero new syntax to learn — at the cost of treating workflow files as part of your trust boundary, same as any other file in your repo. The threat model in &lt;code&gt;SECURITY.md&lt;/code&gt; says so up front rather than hiding it behind a "do not run untrusted workflows" footnote.&lt;/p&gt;

&lt;h2&gt;
  
  
  The first time the ecosystem composed end-to-end
&lt;/h2&gt;

&lt;p&gt;Memory Vault has been live since early May. The Brain has been under construction since mid-May. I've been calling them "the ecosystem" the whole time, but they were two completely separate products living in two completely separate places — separate Postgres databases, separate codebases, separate Docker images. They had never actually worked together end-to-end.&lt;/p&gt;

&lt;p&gt;v1.0 of The Brain is the first time the two compose in production shape. The pattern is what I call &lt;strong&gt;horizontal substrate, vertical products&lt;/strong&gt;: Memory Vault is the substrate that stores and retrieves; The Brain is the engine that runs workflows; both run independently, both compose together. A workflow on The Brain can ask Memory Vault for memories over MCP, pipe those memories into an LLM step that summarizes them, and write the summary to a file. Real Postgres on both sides. Real MCP stdio transport. Real LLM. Real file written.&lt;/p&gt;

&lt;p&gt;The integration path is the &lt;strong&gt;derive-pattern&lt;/strong&gt;: you write a Dockerfile that derives &lt;code&gt;FROM ghcr.io/mihaibuilds/the-brain:1.0&lt;/code&gt;, installs Memory Vault into the image, and the workflow step's &lt;code&gt;server_command&lt;/code&gt; spawns Memory Vault's MCP server inside the same container as a per-step subprocess. The Brain ships zero MCP servers in the stock image — that's deliberate. The Brain stays small. Memory Vault stays independent. The composition is opt-in and explicit. The same shape works for any MCP server: GitHub's, Sentry's, your own.&lt;/p&gt;

&lt;p&gt;This is the moment "the ecosystem" stops being something I write on a roadmap and becomes a system that genuinely exists. Two products. Two databases. One Docker network. Composing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The version-string story
&lt;/h2&gt;

&lt;p&gt;I tagged &lt;code&gt;v1.0.0&lt;/code&gt; from &lt;code&gt;main&lt;/code&gt; after the security audit pass. The release workflow fired, built the multi-arch image, pushed it to ghcr.io, and auto-published the GitHub Release from my annotated tag message. Two minutes twenty-nine seconds, end to end. Clean.&lt;/p&gt;

&lt;p&gt;Then I ran the smoke test. &lt;code&gt;docker pull ghcr.io/mihaibuilds/the-brain:1.0.0&lt;/code&gt; succeeded. Multi-arch manifest verified — &lt;code&gt;linux/amd64&lt;/code&gt; and &lt;code&gt;linux/arm64&lt;/code&gt; both present. I ran &lt;code&gt;brain --version&lt;/code&gt; inside the published image to confirm the version string matched the tag.&lt;/p&gt;

&lt;p&gt;It returned &lt;code&gt;brain, version 0.0.1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I never bumped &lt;code&gt;pyproject.toml&lt;/code&gt; from &lt;code&gt;0.0.1&lt;/code&gt;. The image tag said &lt;code&gt;1.0.0&lt;/code&gt;. The CLI inside the image said &lt;code&gt;0.0.1&lt;/code&gt;. The release was published, immutable, and visibly inconsistent on every fresh install.&lt;/p&gt;

&lt;p&gt;The fix was four lines. Three options were on the table: hotfix to &lt;code&gt;v1.0.1&lt;/code&gt;, delete the published tag and re-tag (which would break anyone who'd already pulled &lt;code&gt;:1.0.0&lt;/code&gt;), or leave it for the natural course of a &lt;code&gt;v1.0.x&lt;/code&gt;. The right call was the cheap one — hotfix. Bump &lt;code&gt;pyproject&lt;/code&gt; to &lt;code&gt;1.0.1&lt;/code&gt; (skipping &lt;code&gt;1.0.0&lt;/code&gt; because that tag is published and immutable), open a one-file PR, watch CI go green, merge, tag &lt;code&gt;v1.0.1&lt;/code&gt;, push. The release workflow re-fired and republished. Five minutes later, &lt;code&gt;brain --version&lt;/code&gt; inside &lt;code&gt;:1.0.1&lt;/code&gt; and &lt;code&gt;:latest&lt;/code&gt; both returned &lt;code&gt;brain, version 1.0.1&lt;/code&gt;. The digest is &lt;code&gt;sha256:41d0becdf7d036…&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The lesson isn't about version strings. It's that automated release infrastructure makes hotfix cycles cheap, and "cheap to do correctly the second time" is what gives you permission to do v1.0.0 with confidence rather than paralysis. The pre-tag checklist now has a &lt;code&gt;pyproject.version&lt;/code&gt; line. Future ecosystem product launches will hit this gate before the tag is pushed. The hotfix cost five minutes of CI; the lesson is permanent.&lt;/p&gt;

&lt;p&gt;The discipline that built the v1.0 ship survived its own first encounter with reality — by treating a real release-engineering bug like a real bug, not by quietly editing history.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 doesn't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;No retries on step failure.&lt;/strong&gt; First step failure halts the workflow. The persisted run row shows exactly where it stopped and why. Retry-on-failure with backoff is a v1.x candidate when real users ask for it, not a v1.0 default that hides bugs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No catch-up on missed cron windows.&lt;/strong&gt; If The Brain is down when a cron expression matches, that window is skipped — not deferred and replayed. Distributed time-shifted scheduling is its own product; The Brain assumes the host is online when the cron fires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No &lt;code&gt;tools/list&lt;/code&gt; discovery for MCP servers.&lt;/strong&gt; Workflow authors know the tool name and argument shape in advance, like they know which shell commands they're calling. Discovery is an interactive-IDE problem, not a workflow-orchestrator problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No MCP over HTTP transport.&lt;/strong&gt; Stdio only in v1.0. HTTP transport is a v1.x candidate if real demand surfaces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No LLM-driven tool orchestration.&lt;/strong&gt; &lt;code&gt;LLMStep&lt;/code&gt; is chat-completion only. If a workflow wants "the LLM picks the MCP tool to call," that's a two-step pattern: the LLM step produces a tool name, the substitution pipes it into a downstream &lt;code&gt;McpToolStep&lt;/code&gt;. The workflow is the agent. The Brain is the runtime that runs the workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No bundled MCP servers in the stock image.&lt;/strong&gt; The base image is intentionally minimal. Derive-your-own-image is the documented composition path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No multi-user.&lt;/strong&gt; v1.0 is single-tenant. The single &lt;code&gt;THE_BRAIN_API_TOKEN&lt;/code&gt; env var is the auth model. Multi-user with workspace isolation is part of the planned PRO tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No visual workflow builder.&lt;/strong&gt; Workflows are Python.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. Honest gaps documented up front build more trust than feature bullets that fall apart when someone actually tries them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The open-core model
&lt;/h2&gt;

&lt;p&gt;The Brain is and will always be MIT-licensed. The whole thing — the runner, the four step types, the four trigger types, the HTTP API, the CLI, the scheduler daemon, the watcher daemon, the structured logging, the diagnose bundle, the multi-arch Docker images. You can run it on your machine. You can fork it. You can use it inside a commercial product. The free tier is the full runtime, not a crippled demo of a paid tier.&lt;/p&gt;

&lt;p&gt;A paid PRO tier is planned: multi-user with workspace isolation, a hosted scheduler for people who don't want to run their own watcher daemon, a secrets vault, conflict resolution for shared workflow repos, additional integrations. The PRO tier is genuinely paid operational features — what teams running shared workflow infrastructure actually need, not what a solo developer on a laptop strictly needs. v1.x stays free forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull ghcr.io/mihaibuilds/the-brain:latest

git clone https://github.com/MihaiBuilds/the-brain
&lt;span class="nb"&gt;cd &lt;/span&gt;the-brain
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
docker compose &lt;span class="nt"&gt;--profile&lt;/span&gt; api up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:8001&lt;/code&gt; to confirm the HTTP API is running, then run &lt;code&gt;brain run examples/hello.py&lt;/code&gt; from the CLI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain/releases/tag/v1.0.1" rel="noopener noreferrer"&gt;GitHub release page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain#readme" rel="noopener noreferrer"&gt;README and quick start&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/MihaiBuilds/the-brain/blob/main/ARCHITECTURE.md" rel="noopener noreferrer"&gt;ARCHITECTURE.md&lt;/a&gt; — the design overview&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/MihaiBuilds/the-brain/blob/main/SECURITY.md" rel="noopener noreferrer"&gt;SECURITY.md&lt;/a&gt; — the threat model and the trust posture for workflow inputs&lt;/li&gt;
&lt;li&gt;Questions and bug reports: &lt;a href="https://github.com/MihaiBuilds/the-brain/issues" rel="noopener noreferrer"&gt;GitHub Issues&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/the-brain&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>ai</category>
      <category>selfhosted</category>
    </item>
    <item>
      <title>The Brain talks to everything now</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Fri, 12 Jun 2026 10:41:24 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/the-brain-talks-to-everything-now-3nl4</link>
      <guid>https://dev.to/mihaibuildsdev/the-brain-talks-to-everything-now-3nl4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/the-brain-talks-to-everything-now.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I read a lot of work like this myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A few days ago I shipped &lt;a href="https://dev.to/mihaibuildsdev/the-brain-reacts-now-13a8"&gt;the third milestone of The Brain&lt;/a&gt; — webhook triggers with HMAC auth, file watchers in their own container, the &lt;code&gt;{trigger.X}&lt;/code&gt; placeholder family for inbound payloads. That was M3. The Brain had the four classical trigger types: manual, scheduled, webhook, file.&lt;/p&gt;

&lt;p&gt;Today M4 is done. The Brain now talks to other tools — natively, over MCP — and the LLM step picks its own model per call.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;M1 was the runner. M2 made the runner work unattended. M3 made the runner reactive. &lt;strong&gt;M4 makes the runner ecosystem-aware.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before M4, The Brain was a workflow orchestrator that knew how to do three things on its own: run shell commands, call a local LLM through a fixed configured endpoint, and call Memory Vault over its REST API. Useful, but every integration with anything new required writing a custom adapter.&lt;/p&gt;

&lt;p&gt;After M4, The Brain can call any MCP server as a workflow step. Memory Vault's MCP server, GitHub's, Sentry's, your own. The stdio transport is the v1.0 commitment; the workflow file says "spawn this MCP server, call this tool, here are the arguments" and The Brain handles the lifecycle.&lt;/p&gt;

&lt;p&gt;The LLM step also got per-step overrides. Before M4, every workflow used one configured model server at one URL. Now each step can name its own provider URL, its own model, its own API key, its own timeout, its own max tokens. Mix a fast local model and a slow careful one in the same workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What M4 ships
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Per-step LLM overrides.&lt;/strong&gt; Each &lt;code&gt;LLMStep&lt;/code&gt; can override the global &lt;code&gt;LLM_BASE_URL&lt;/code&gt; / &lt;code&gt;LLM_API_KEY&lt;/code&gt; / &lt;code&gt;LLM_MODEL&lt;/code&gt; env vars per call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;LLMStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fast_summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Two sentences: {previous.recall}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mistralai/ministral-3-3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nc"&gt;LLMStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;careful_analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detailed breakdown of: {previous.fast_summary}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://other-host:1234/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic/claude-3-5-sonnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each field falls back to the corresponding env var when left as None. Tested against LM Studio only; other OpenAI-compatible providers (Ollama, vLLM, llama.cpp server, OpenAI proper) may work via the same wire format but are not promised in v1.0.&lt;/p&gt;
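&lt;p&gt;The fallback rule is small enough to sketch. The &lt;code&gt;resolve&lt;/code&gt; helper below is hypothetical and only illustrates the "per-step value wins, otherwise the env var" behaviour described above. The env var names come from this post; the values are made up.&lt;/p&gt;

```python
import os
from typing import Optional


def resolve(override: Optional[str], env_name: str) -> Optional[str]:
    """Use the per-step value when given, otherwise the global env var."""
    if override is not None:
        return override
    return os.environ.get(env_name)


# Illustrative global configuration.
os.environ["LLM_MODEL"] = "global-model"
os.environ["LLM_BASE_URL"] = "http://localhost:1234/v1"

model = resolve("mistralai/ministral-3-3b", "LLM_MODEL")  # step names its own model
base_url = resolve(None, "LLM_BASE_URL")                  # step leaves the URL unset

print(model)
print(base_url)
```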

&lt;p&gt;&lt;strong&gt;MCP tool calling as a step type.&lt;/strong&gt; A new &lt;code&gt;McpToolStep&lt;/code&gt; peer to the existing step types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;McpToolStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;server_command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python -m memory_vault.mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{previous.search_term}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;limit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;timeout_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;server_command&lt;/code&gt; and string values in &lt;code&gt;args&lt;/code&gt; accept &lt;code&gt;{previous.X}&lt;/code&gt; and &lt;code&gt;{trigger.X}&lt;/code&gt; placeholders the same way &lt;code&gt;ShellStep.command&lt;/code&gt; does. The &lt;code&gt;tool&lt;/code&gt; name and &lt;code&gt;args&lt;/code&gt; keys are never substituted — protocol-level identifiers, not user data. Non-string args values (ints, bools, nested dicts) pass through unchanged.&lt;/p&gt;

&lt;p&gt;stdio transport only in v1.0. &lt;code&gt;initialize&lt;/code&gt; + &lt;code&gt;tools/call&lt;/code&gt; only — no &lt;code&gt;tools/list&lt;/code&gt;, no resources, no prompts, no server-initiated notifications. Each step spawns the MCP server fresh, runs the handshake, calls one tool, and tears the subprocess down. No shared state. No pooling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The derive-your-own-image pattern.&lt;/strong&gt; The stock &lt;code&gt;mihaibuilds/the-brain&lt;/code&gt; image bundles zero MCP servers. The Brain is a workflow orchestrator; MCP servers are independent products. Coupling them would force users into installing things they don't need.&lt;/p&gt;

&lt;p&gt;If your workflow calls an MCP server, install that server in a derived image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; mihaibuilds/the-brain:latest&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&amp;lt;install-command-per-the-mcp-server-s-readme&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;examples/brain-with-mv-mcp/&lt;/code&gt; ships a complete worked composition with Memory Vault — Dockerfile, docker-compose.yml, a verify workflow, and a runbook README.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Per-step spawn lifecycle.&lt;/strong&gt; Every &lt;code&gt;McpToolStep&lt;/code&gt; spawns its MCP server subprocess at step start, runs the MCP &lt;code&gt;initialize&lt;/code&gt; handshake, calls one &lt;code&gt;tools/call&lt;/code&gt;, and kills the subprocess at step end. No shared client. No connection pool. Cold start cost per step is ~200-500ms for a server like MV's that loads sentence-transformers + spaCy + a pgvector connection on every spawn. The trade-off: isolation per call. A crashed MCP server kills only one step. A leaked file descriptor in the MCP server is cleaned up by the OS when we kill it. The next step gets a fresh subprocess. Per-run pooling is a future consideration if real latency complaints surface; v1.0 takes the isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;stdio transport, newline-delimited JSON, no Content-Length framing.&lt;/strong&gt; The MCP spec defines stdio framing as newline-delimited JSON: one JSON-RPC message per line, terminated by &lt;code&gt;\n&lt;/code&gt;, on both stdin and stdout, with no raw newlines inside a message. There are no Content-Length headers on stdio; those belong to HTTP-style framing. Streamable HTTP is a separate transport with its own auth concerns (Bearer / mTLS / OAuth). For v1.0, stdio is the more universal transport: Memory Vault's MCP server uses it, Claude Desktop uses it, and the reference MCP implementations support it. HTTP transport may come in a future version.&lt;/p&gt;
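&lt;p&gt;Newline-delimited framing fits in a few lines. The sketch below uses an in-memory stream in place of a real subprocess pipe, and the helper names are invented for illustration. &lt;code&gt;json.dumps&lt;/code&gt; escapes newlines inside string values, so a message always stays on one line. The &lt;code&gt;initialize&lt;/code&gt; params are abbreviated; a real handshake carries more fields.&lt;/p&gt;

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message onto a single line."""
    line = json.dumps(message, separators=(",", ":"))
    stream.write(line + "\n")


def read_message(stream):
    """Read one line and parse it; return None at end of stream."""
    line = stream.readline()
    if not line:
        return None
    return json.loads(line)


pipe = io.StringIO()  # stands in for the child's stdin/stdout
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
write_message(pipe, {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "recall", "arguments": {"query": "line one\nline two"}},
})

pipe.seek(0)
first = read_message(pipe)
second = read_message(pipe)
print(first["method"], second["method"])
```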

&lt;p&gt;&lt;strong&gt;Single-flight via &lt;code&gt;asyncio.Lock&lt;/code&gt;.&lt;/strong&gt; A single &lt;code&gt;StdioMcpClient&lt;/code&gt; instance serializes &lt;code&gt;call_tool&lt;/code&gt; invocations internally. The per-step-spawn lifecycle means concurrent calls per client never happen in normal use, but the lock removes a real foot-gun if someone hand-shares a client. Cheap insurance.&lt;/p&gt;
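&lt;p&gt;Here is roughly what single-flight behind an &lt;code&gt;asyncio.Lock&lt;/code&gt; looks like. &lt;code&gt;SingleFlightClient&lt;/code&gt; is a hypothetical stand-in, not the real &lt;code&gt;StdioMcpClient&lt;/code&gt;; the sleep replaces the request/response round trip, and the counter exists only to show that calls never overlap.&lt;/p&gt;

```python
import asyncio


class SingleFlightClient:
    """Hypothetical stand-in for a client that must handle one call at a time."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._in_flight = 0
        self.max_in_flight = 0

    async def call_tool(self, name: str) -> str:
        async with self._lock:  # later callers wait here until the lock is free
            self._in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self._in_flight)
            await asyncio.sleep(0.01)  # stands in for the request/response round trip
            self._in_flight -= 1
            return f"{name}: done"


async def main() -> tuple[list[str], int]:
    client = SingleFlightClient()
    # Three callers share one client on purpose, the foot-gun case.
    results = await asyncio.gather(*(client.call_tool(f"tool-{i}") for i in range(3)))
    return list(results), client.max_in_flight


results, max_in_flight = asyncio.run(main())
print(results, max_in_flight)
```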

&lt;p&gt;&lt;strong&gt;Eager handshake on connect.&lt;/strong&gt; The MCP &lt;code&gt;initialize&lt;/code&gt; handshake runs in &lt;code&gt;__aenter__&lt;/code&gt; / &lt;code&gt;connect&lt;/code&gt;, not lazily on first &lt;code&gt;call_tool&lt;/code&gt;. With a lazy handshake, the per-call timeout would cover handshake + tool call together from the caller's POV: if &lt;code&gt;initialize&lt;/code&gt; hasn't run yet when &lt;code&gt;call_tool&lt;/code&gt; fires, the caller's 30-second budget would silently include some unknown amount of handshake time. Eager handshake makes the budget actually mean what it says.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background stderr reader for pipe-fill resilience.&lt;/strong&gt; A continuous background task drains the subprocess's stderr pipe to a rolling ~1 KB tail. Without it, a chatty MCP server writing lots of stderr (say, a debug-build that logs everything) would fill the OS pipe buffer (~64 KB on macOS) and the subprocess would block waiting for someone to read stderr. Meanwhile The Brain would be waiting for stdout, deadlocking the whole call. The background reader prevents that. The captured tail is exposed via the &lt;code&gt;stderr_tail&lt;/code&gt; property for debug logging at step boundary — and never returned in &lt;code&gt;StepResult.output&lt;/code&gt;. Workflow data and debug data are different surfaces. A workflow author querying &lt;code&gt;{previous.recall}&lt;/code&gt; must never see stderr noise mixed into their workflow values.&lt;/p&gt;
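&lt;p&gt;The deadlock and its fix can be reproduced with a short script. This is a sketch under assumptions, not The Brain's implementation: the child is a throwaway Python one-liner that writes about 200 KB to stderr before answering on stdout, and &lt;code&gt;drain_stderr&lt;/code&gt; and &lt;code&gt;TAIL_BYTES&lt;/code&gt; are illustrative names.&lt;/p&gt;

```python
import asyncio
import sys

TAIL_BYTES = 1024  # roughly the ~1 KB tail described above


async def drain_stderr(stream: asyncio.StreamReader, tail: bytearray) -> None:
    """Keep reading until EOF so the child never blocks on a full pipe."""
    while True:
        chunk = await stream.read(4096)
        if not chunk:
            break
        tail.extend(chunk)
        del tail[:-TAIL_BYTES]  # keep only the newest TAIL_BYTES bytes


async def run_chatty_child() -> tuple[str, bytes]:
    # The child fills stderr well past a typical pipe buffer, then answers on stdout.
    code = "import sys; sys.stderr.write('x' * 200000); print('answer')"
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    tail = bytearray()
    reader = asyncio.create_task(drain_stderr(proc.stderr, tail))
    line = await proc.stdout.readline()  # would hang here without the drain task
    await proc.wait()
    await reader
    return line.decode().strip(), bytes(tail)


stdout_line, stderr_tail = asyncio.run(run_chatty_child())
print(stdout_line, len(stderr_tail))
```

&lt;p&gt;The tail is returned separately from the stdout line, mirroring the split between workflow data and debug data.&lt;/p&gt;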

&lt;p&gt;&lt;strong&gt;Substitution boundaries are sharp.&lt;/strong&gt; The runner's &lt;code&gt;_resolve_step&lt;/code&gt; function gains a new branch for &lt;code&gt;McpToolStep.args&lt;/code&gt; (a dict). It iterates dict values, substitutes string-typed values via &lt;code&gt;{previous.X}&lt;/code&gt; + &lt;code&gt;{trigger.X}&lt;/code&gt; resolvers, leaves non-strings and keys untouched. The &lt;code&gt;tool&lt;/code&gt; name is never substituted. Nested-dict args (&lt;code&gt;args={"filter": {"query": "{previous.X}"}}&lt;/code&gt;) are not recursively substituted — consistent with the &lt;code&gt;{trigger.body.foo}&lt;/code&gt; no-nesting rule from M3. Pinned by five separate substitution-boundary tests plus cross-PR pins in the audit-pass test file.&lt;/p&gt;
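
&lt;p&gt;Those boundaries can be sketched in a few lines. This is an illustration of the rule, not the real &lt;code&gt;_resolve_step&lt;/code&gt;: &lt;code&gt;resolve_args&lt;/code&gt; and &lt;code&gt;demo_resolver&lt;/code&gt; are hypothetical names, and the resolver here only knows one placeholder.&lt;/p&gt;

```python
def resolve_args(args: dict, resolve) -> dict:
    """Substitute placeholders in top-level string values only."""
    resolved = {}
    for key, value in args.items():
        if isinstance(value, str):
            # Top-level string value: run it through the resolver.
            resolved[key] = resolve(value)
        else:
            # Numbers, lists, nested dicts: passed through untouched.
            resolved[key] = value
    return resolved


def demo_resolver(text: str) -> str:
    # Hypothetical resolver; a real one would look up run and trigger state.
    return text.replace("{previous.recall}", "chunk-1")


args = {
    "query": "{previous.recall}",
    "limit": 5,
    "filter": {"query": "{previous.recall}"},
    "{previous.recall}": "literal key",
}
resolved = resolve_args(args, demo_resolver)
```

&lt;p&gt;Only &lt;code&gt;query&lt;/code&gt; changes. The nested &lt;code&gt;filter&lt;/code&gt; dict and the placeholder-shaped key come back exactly as written.&lt;/p&gt;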

&lt;p&gt;&lt;strong&gt;&lt;code&gt;isError: true&lt;/code&gt; becomes step failure.&lt;/strong&gt; When an MCP server returns a successful JSON-RPC response containing &lt;code&gt;isError: true&lt;/code&gt;, The Brain treats it as step failure — same shape as a non-zero shell exit code. The first text content block in the response becomes the step's error message. MCP-side tool errors flow through the same workflow-halt semantics as every other failure path, so workflow authors don't have to check &lt;code&gt;isError&lt;/code&gt; in every downstream step.&lt;/p&gt;
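
&lt;p&gt;Roughly, the mapping looks like this. &lt;code&gt;to_step_outcome&lt;/code&gt; is a hypothetical helper, and two details are assumptions rather than documented behavior: the fallback message when an error carries no text block, and returning the first text block on success.&lt;/p&gt;

```python
def to_step_outcome(result: dict):
    """Map an MCP tools/call result to an (ok, text) pair."""
    texts = [
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    ]
    first_text = texts[0] if texts else ""
    if result.get("isError"):
        # The JSON-RPC call succeeded, but the tool reported a failure:
        # treat it like a non-zero shell exit code.
        return False, first_text or "MCP tool reported an error"
    return True, first_text


failed = to_step_outcome({
    "content": [{"type": "text", "text": "index unavailable"}],
    "isError": True,
})
succeeded = to_step_outcome({
    "content": [
        {"type": "image", "data": "..."},
        {"type": "text", "text": "3 chunks"},
    ],
})
```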

&lt;p&gt;&lt;strong&gt;&lt;code&gt;MemoryVaultStep&lt;/code&gt; ↔ &lt;code&gt;McpToolStep&lt;/code&gt; coexistence.&lt;/strong&gt; Both ship in v1.0. Neither is deprecated. &lt;code&gt;MemoryVaultStep&lt;/code&gt; calls MV over its REST API with no extra setup — easy default for "I just want hybrid search from MV." &lt;code&gt;McpToolStep&lt;/code&gt; is the generic any-MCP-server mechanism — works for MV's MCP server (via the derive-pattern), GitHub's, Sentry's, your own. The deprecation question was considered and rejected — forcing users into the harder setup path right at v1.0 is the wrong direction.&lt;/p&gt;

&lt;h2&gt;
  
  
  The moment for the ecosystem
&lt;/h2&gt;

&lt;p&gt;I want to call this out separately because it matters more than either feature individually.&lt;/p&gt;

&lt;p&gt;Memory Vault went live two months ago. The Brain has been under construction since May. I've been calling them "the ecosystem" the whole time, but they were two completely separate projects living in two completely separate repositories. They had never actually worked together end-to-end.&lt;/p&gt;

&lt;p&gt;For M4's verify pass, I built a derived image with both projects installed, separate Postgres instances (Brain's tables + MV's pgvector tables), three containers in one Docker network. The verify workflow asks Memory Vault — over MCP — for memories matching a query. Memory Vault searches its pgvector index and returns chunks with similarity scores. The Brain pipes the chunks into a local LLM step. The LLM writes a digest. A shell step saves it.&lt;/p&gt;

&lt;p&gt;It worked. Real database, real hybrid search, real LLM call, real file written.&lt;/p&gt;

&lt;p&gt;I ran it twice. Once with Ministral-3B-Instruct loaded in LM Studio — about 4 seconds end-to-end. Once with Qwen3.5-9B, a reasoning-style model — about 2 minutes 13 seconds. Same workflow file. The only difference was three fields on the LLM step: &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;timeout_seconds&lt;/code&gt;, &lt;code&gt;max_tokens&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Both summaries were real. The fast model wrote a tight two-sentence digest. The reasoning model produced a longer, more comprehensive summary that captured more of the original context — at thirty times the wall-clock cost. Same per-step override mechanism made the swap trivial.&lt;/p&gt;

&lt;p&gt;This is the first time The Brain and Memory Vault have actually composed in production shape. The moment where "the ecosystem" stops being a roadmap word and starts being a system that exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 won't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The LLM step does not drive tool calling.&lt;/strong&gt; &lt;code&gt;LLMStep&lt;/code&gt; is chat-completion only — it produces text. If a workflow wants "LLM picks an MCP tool to call," it wires that explicitly: &lt;code&gt;LLMStep&lt;/code&gt; produces a tool name, &lt;code&gt;{previous.X}&lt;/code&gt; substitution puts that name into the next step's args (the &lt;code&gt;tool&lt;/code&gt; field itself is locked NOT-substituted, so the workflow author chains through args or uses separate branches). The workflow file is the orchestrator. The LLM transforms text. It does not decide. This is by design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No &lt;code&gt;tools/list&lt;/code&gt; discovery.&lt;/strong&gt; Workflow authors know the tool name and the args shape in advance, the same way they know what shell commands they're calling. If you want introspection, build it in a separate workflow step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP HTTP transport is not in v1.0.&lt;/strong&gt; Stdio only. HTTP transport (the streamable-HTTP MCP variant) brings its own auth surface. For v1.0, stdio is the deeper transport.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The stock image bundles zero MCP servers.&lt;/strong&gt; Per the ecosystem rule, the stock image stays lean. Derive-your-own-image is the documented path: each MCP server is a separate install in your derived Dockerfile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-run MCP server pooling is not implemented.&lt;/strong&gt; Per-step spawn is the v1.0 lifecycle. Two &lt;code&gt;McpToolStep&lt;/code&gt; calls to the same server in one workflow run produce two distinct subprocess PIDs. The cold-start cost is real; v1.0 takes the isolation guarantee.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No custom LLM auth schemes.&lt;/strong&gt; Bearer-only when an &lt;code&gt;api_key&lt;/code&gt; is set, no header when it isn't. If your provider needs something else, bake it into your derived image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No Docker-socket-mount for The Brain container.&lt;/strong&gt; Considered and rejected. A leaked webhook secret + a malicious payload substituted into &lt;code&gt;server_command&lt;/code&gt; would become a host escape. The derive-your-own-image pattern is the secure alternative — you control the contents of your derived image, not a runtime Docker socket.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasoning models need bigger budgets.&lt;/strong&gt; Reasoning-style LLMs (Qwen 3.x+, o1-style, R1-style, QwQ) consume token budget on internal reasoning before producing visible content. If you point a per-step LLM call at a reasoning model with default budgets, you may get empty visible output. The fix is bigger budgets: &lt;code&gt;timeout_seconds=600&lt;/code&gt; and &lt;code&gt;max_tokens=8000+&lt;/code&gt; are a reasonable starting point for a 9B reasoning model. Instruct models (Ministral, Mistral Instruct, Llama Instruct) don't have this behavior.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. M4 is the smallest correct ecosystem-aware surface, not the most ambitious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;Same audience as M1 + M2 + M3, with one addition: anyone building self-hosted workflow automation that needs to reach multiple specialized tools without writing a custom adapter for each one. The MCP ecosystem in 2026 has dozens of servers — for memory (Memory Vault), for code review (GitHub MCP), for observability (Sentry MCP), for filesystems, for databases, for browser control. M4 makes any of them callable from a Brain workflow step with the same shape.&lt;/p&gt;

&lt;p&gt;If you've ever wanted to wire an LLM workflow into multiple specialized backends without committing to LangChain — this is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Milestone 5 is the v1.0 launch milestone. It adds no new features: it covers continuous integration, a security audit, full docs, the public README polish, and the launch ritual. After M5 ships, The Brain is publicly v1.0: open-source, MIT, single-tenant, self-hosted, the same shape Memory Vault took at its own v1.0.&lt;/p&gt;

&lt;p&gt;There's no M5 dev-log post in this dev.to series. The next post will be the v1.0 launch post itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MihaiBuilds/the-brain
&lt;span class="nb"&gt;cd &lt;/span&gt;the-brain
&lt;span class="nv"&gt;THE_BRAIN_API_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;any-value docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# call any MCP server from a workflow (build your own derived image first&lt;/span&gt;
&lt;span class="c"&gt;# with the MCP server installed — see examples/brain-with-mv-mcp/)&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain run examples/mcp_recall_memory.py

&lt;span class="c"&gt;# or use per-step LLM overrides without any MCP setup&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain run examples/daily_digest.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The repo has the full README, the derive-pattern example with a complete runbook for composing The Brain with Memory Vault, and reference workflows for both &lt;code&gt;LLMStep&lt;/code&gt; and &lt;code&gt;McpToolStep&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;GitHub — The Brain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault" rel="noopener noreferrer"&gt;Memory Vault — the layer underneath&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/the-brain-reacts-now-13a8"&gt;M3 dev-log post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/the-brain-runs-on-a-schedule-now-31ch"&gt;M2 dev-log post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c"&gt;M1 debut post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/the-brain&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>mcp</category>
    </item>
    <item>
      <title>The Brain reacts now</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Wed, 10 Jun 2026 09:49:18 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/the-brain-reacts-now-13a8</link>
      <guid>https://dev.to/mihaibuildsdev/the-brain-reacts-now-13a8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/the-brain-reacts-now.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I read a lot of work like this myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Three weeks ago I shipped &lt;a href="https://dev.to/mihaibuildsdev/the-brain-runs-on-a-schedule-now-31ch"&gt;the second milestone of The Brain&lt;/a&gt; — the scheduler daemon, cron triggers, workflows that read their previous run, an opt-in HTTP endpoint. That was M2. The Brain could run unattended on a clock.&lt;/p&gt;

&lt;p&gt;Today M3 is done. The Brain now reacts — to HTTP requests, and to filesystem changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;M2 made The Brain worth running unattended on a schedule. M3 makes it react to things that happen. The hardest, most useful workflow automations are the reactive ones — the workflow that fires when a customer signs up, the workflow that processes a file the moment it lands on disk, the workflow that wakes up because another system has news.&lt;/p&gt;

&lt;p&gt;M1 was the runner. M2 made the runner unattended. M3 makes the runner reactive. M1 + M2 + M3 together is the trigger surface most people actually need.&lt;/p&gt;

&lt;h2&gt;
  
  
  What M3 ships
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Webhook triggers.&lt;/strong&gt; Register any workflow as a webhook endpoint. The Brain prints a secret once, you save it, and from that moment on, any HTTP caller with the secret can fire the workflow over the network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain register-webhook examples/webhook_handler.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI prints the HMAC secret exactly once — same caller-side-storage discipline as a GitHub personal access token. There's no &lt;code&gt;brain show-webhook-secret&lt;/code&gt; command by design; if you lose the secret, you unregister and re-register to issue a fresh one.&lt;/p&gt;

&lt;p&gt;Fire it from anywhere that can compute an HMAC signature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your-saved-secret&amp;gt;
&lt;span class="nv"&gt;BODY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{"hello":"world"}'&lt;/span&gt;
&lt;span class="nv"&gt;SIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sha256=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BODY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | openssl dgst &lt;span class="nt"&gt;-sha256&lt;/span&gt; &lt;span class="nt"&gt;-hmac&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SECRET&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $2}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8001/webhook/webhook-handler &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-Brain-Signature: &lt;/span&gt;&lt;span class="nv"&gt;$SIG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$BODY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;X-Brain-Signature: sha256=&amp;lt;hex&amp;gt;&lt;/code&gt; header follows the same convention as GitHub's &lt;code&gt;X-Hub-Signature-256&lt;/code&gt; (an HMAC-SHA256 of the raw body, hex-encoded, prefixed with &lt;code&gt;sha256=&lt;/code&gt;), so existing signing code carries over without translation. The endpoint runs the workflow synchronously and returns the run metadata. Wrong signature is 401. Unknown workflow name is 404, same shape as a disabled webhook, so the response code does not reveal whether a disabled webhook exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File watcher triggers.&lt;/strong&gt; Register a workflow to fire when something changes on disk. The Brain runs a separate watcher daemon that observes the directory and fires the workflow on filesystem events.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain-watcher brain register-watcher examples/markdown_watcher.py &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--path&lt;/span&gt; /data/watched &lt;span class="nt"&gt;--events&lt;/span&gt; modified
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--events&lt;/code&gt; accepts any combination of &lt;code&gt;created&lt;/code&gt;, &lt;code&gt;modified&lt;/code&gt;, &lt;code&gt;deleted&lt;/code&gt;. The watcher daemon picks up the new registration on its next 10-second sync. A 500ms debounce per (workflow, path) coalesces multiple filesystem events from a single editor save into one workflow run.&lt;/p&gt;

&lt;p&gt;The watcher runs in its own container behind the &lt;code&gt;watcher&lt;/code&gt; compose profile. If the watcher crashes, the scheduler from M2 keeps running. If the scheduler crashes, the watcher keeps watching. Isolation by container, not by retry loops.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A trigger placeholder family.&lt;/strong&gt; Workflows triggered by a webhook or file event can read the inbound payload via a new placeholder family:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;ShellStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;received&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;echo got event={trigger.event} body={trigger.body}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four placeholders are available wherever string substitution works: &lt;code&gt;{trigger.event}&lt;/code&gt; (the trigger mechanism), &lt;code&gt;{trigger.body}&lt;/code&gt; (the inbound body — parsed JSON stringified deterministically, or raw string fallback), &lt;code&gt;{trigger.headers.X}&lt;/code&gt; (case-insensitive HTTP header lookup, allowlist gated), &lt;code&gt;{trigger.path}&lt;/code&gt; (the file path for file-triggered runs). Referencing &lt;code&gt;{trigger.X}&lt;/code&gt; on a workflow you ran manually fails the step with a clear error — same strict-failure shape as M2's &lt;code&gt;{previous.X}&lt;/code&gt; placeholder.&lt;/p&gt;

&lt;p&gt;The four classical trigger types — manual, cron, webhook, file — are now all there. Same workflow model. Same persistence model. Same &lt;code&gt;brain history&lt;/code&gt; and &lt;code&gt;brain show&lt;/code&gt; view of every run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;HMAC verification is constant-time at every failure path.&lt;/strong&gt; Wrong prefix, wrong algorithm, malformed hex, length mismatch, non-string input — every failure shape runs through the same &lt;code&gt;hmac.compare_digest&lt;/code&gt; call against a placeholder digest. There is no early &lt;code&gt;len(a) != len(b)&lt;/code&gt; branch. The verifier is pinned by an end-to-end timing-attack regression test: it measures wall-clock variance between "wrong-length" and "right-length-wrong-value" failures across 2000 iterations and asserts the ratio stays under 10x. If a future refactor introduces a length-check shortcut, the test breaks loudly.&lt;/p&gt;
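
&lt;p&gt;One way to get that shape, sketched under assumptions: &lt;code&gt;verify_signature&lt;/code&gt; and the all-zero placeholder are illustrative and not the project's verifier. Malformed input is swapped for a fixed 32-byte placeholder, so every path ends in the same &lt;code&gt;hmac.compare_digest&lt;/code&gt; call on two equal-length values.&lt;/p&gt;

```python
import hashlib
import hmac

PREFIX = "sha256="
# Fixed-length dummy digest used whenever the header cannot be parsed.
PLACEHOLDER = bytes(32)


def verify_signature(secret: bytes, body: bytes, header) -> bool:
    """Check a 'sha256=hex' header against the HMAC of the body."""
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    candidate = PLACEHOLDER
    parsed = False
    if isinstance(header, str) and header.startswith(PREFIX):
        try:
            decoded = bytes.fromhex(header[len(PREFIX):])
        except ValueError:
            decoded = b""
        if len(decoded) == len(expected):
            candidate = decoded
            parsed = True
    # No early return above: wrong prefix, bad hex, wrong length and
    # non-string input all reach this comparison with the placeholder.
    matches = hmac.compare_digest(candidate, expected)
    return parsed and matches


secret = b"example-secret"
body = b'{"hello":"world"}'
good = PREFIX + hmac.new(secret, body, hashlib.sha256).hexdigest()
```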

&lt;p&gt;&lt;strong&gt;404 on unknown webhook is a locked v1.0 behavior, not a bug.&lt;/strong&gt; A probe CAN distinguish "unknown webhook" from "known but wrong signature" via response code. The webhook name is not a secret in this threat model: callers are single-token, server-to-server, and known, and if you can list webhooks you already have privileged access. The behavior is pinned by a regression test: any future refactor that adds a constant-time-equal lookup must break the test and surface as an explicit architectural decision, not a silent change. The same lock applies to the 404-for-disabled case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Watcher and scheduler heartbeats coexist via daemon_id suffix.&lt;/strong&gt; Both daemons UPSERT into the same &lt;code&gt;daemon_heartbeats&lt;/code&gt; table. The scheduler uses the container hostname as its daemon_id. The watcher appends &lt;code&gt;:watcher&lt;/code&gt;. The crash-recovery sweeps are mutually disjoint via the &lt;code&gt;trigger_context-&amp;gt;&amp;gt;'event'&lt;/code&gt; JSONB filter — the scheduler clears &lt;code&gt;running&lt;/code&gt; rows broadly, the watcher clears only file-triggered ones, and the two queries never overlap. Both daemons can be in the table at once without collision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 500ms debounce is in-memory.&lt;/strong&gt; A &lt;code&gt;dict[tuple[workflow_name, path], float]&lt;/code&gt; keyed by (workflow, path), with monotonic-clock timestamps as values. It is lost on daemon restart, which is fine because crash recovery re-fires from current FS state and any in-flight transient state is by definition stale. The boundary is exact and pinned at three layers: the unit &lt;code&gt;_should_fire&lt;/code&gt; function, the module constant &lt;code&gt;DEBOUNCE_SECONDS == 0.5&lt;/code&gt;, and an audit-pass test that pins 499ms blocks, 500ms fires, 501ms fires.&lt;/p&gt;
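
&lt;p&gt;A sketch of a debounce check with that boundary. The signature is hypothetical, and one detail is an assumption: the timestamp is only recorded when the event fires, so blocked events do not extend the window. In real use &lt;code&gt;now&lt;/code&gt; would come from &lt;code&gt;time.monotonic()&lt;/code&gt;; explicit values make the boundary visible.&lt;/p&gt;

```python
DEBOUNCE_SECONDS = 0.5


def should_fire(last_fired: dict, workflow: str, path: str, now: float) -> bool:
    """Return True and record the time if the event is outside the window."""
    key = (workflow, path)
    previous = last_fired.get(key)
    if previous is None or now - previous >= DEBOUNCE_SECONDS:
        last_fired[key] = now
        return True
    # Inside the window: coalesce with the run that already fired.
    return False


state = {}
first = should_fire(state, "markdown-watcher", "/data/watched/a.md", 10.0)
inside = should_fire(state, "markdown-watcher", "/data/watched/a.md", 10.499)
boundary = should_fire(state, "markdown-watcher", "/data/watched/a.md", 10.5)
other_path = should_fire(state, "markdown-watcher", "/data/watched/b.md", 10.5)
```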

&lt;p&gt;&lt;strong&gt;Sequential within process, concurrent across processes.&lt;/strong&gt; Two webhook calls to the same workflow queue inside the API process. Two file events to the same watcher queue inside the watcher process. But the API, scheduler, and watcher daemons all run workflows in parallel because they're separate processes against the same database. No global work queue. The database absorbs concurrent INSERTs at the workflow_runs row level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The &lt;code&gt;{trigger.body}&lt;/code&gt; resolver stringifies JSON deterministically.&lt;/strong&gt; &lt;code&gt;json.dumps(body, sort_keys=True, separators=(",", ":"))&lt;/code&gt;. So the same parsed payload produces the same substituted command every time — deterministic for cache invariants, deterministic for diff workflows. Raw string bodies pass through unchanged. &lt;strong&gt;Nested JSON access (&lt;code&gt;{trigger.body.foo}&lt;/code&gt;) is NOT supported in v1.0&lt;/strong&gt; — the body is a string after serialization; &lt;code&gt;body.foo&lt;/code&gt; is treated as an unknown trigger field. Pinned with a locked-behavior test.&lt;/p&gt;
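
&lt;p&gt;A quick illustration of why the sorted, compact form matters. &lt;code&gt;stringify_body&lt;/code&gt; is a hypothetical wrapper around the &lt;code&gt;json.dumps&lt;/code&gt; call quoted above: two payloads with the same content but different key order serialize to the same string.&lt;/p&gt;

```python
import json


def stringify_body(body) -> str:
    """Serialize parsed JSON deterministically; pass raw strings through."""
    if isinstance(body, str):
        return body
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


# Same content, different key order at both nesting levels.
a = stringify_body({"b": 1, "a": {"d": 2, "c": 3}})
b = stringify_body({"a": {"c": 3, "d": 2}, "b": 1})
raw = stringify_body("not json at all")
```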

&lt;p&gt;&lt;strong&gt;Header allowlist is hardcoded, not configurable.&lt;/strong&gt; The headers a workflow can read via &lt;code&gt;{trigger.headers.X}&lt;/code&gt; are bounded by a six-entry allowlist: &lt;code&gt;content-type&lt;/code&gt;, &lt;code&gt;user-agent&lt;/code&gt;, &lt;code&gt;x-github-event&lt;/code&gt;, &lt;code&gt;x-github-delivery&lt;/code&gt;, &lt;code&gt;x-stripe-event&lt;/code&gt;, &lt;code&gt;x-event-key&lt;/code&gt;. Authorization, &lt;code&gt;X-Brain-Signature&lt;/code&gt;, cookies, and infrastructure headers are never exposed to the workflow. If a step needs another header, that's a workflow-step concern that should be visible in the workflow source, not a configuration knob.&lt;/p&gt;
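
&lt;p&gt;A sketch of an allowlist-gated, case-insensitive lookup. &lt;code&gt;lookup_header&lt;/code&gt; and its error messages are hypothetical; only the six header names come from the list above.&lt;/p&gt;

```python
ALLOWED_HEADERS = frozenset({
    "content-type",
    "user-agent",
    "x-github-event",
    "x-github-delivery",
    "x-stripe-event",
    "x-event-key",
})


def lookup_header(headers: dict, name: str) -> str:
    """Return an allowlisted header value, matching names case-insensitively."""
    wanted = name.lower()
    if wanted not in ALLOWED_HEADERS:
        # Authorization, signatures, cookies and the rest stop here.
        raise KeyError(f"header not exposed to workflows: {name}")
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    raise KeyError(f"header not present: {name}")


inbound = {"X-GitHub-Event": "push", "Authorization": "Bearer abc"}
event = lookup_header(inbound, "x-github-event")
```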

&lt;h2&gt;
  
  
  What v1.0 won't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The watcher daemon is not highly available.&lt;/strong&gt; One watcher per host, same single-daemon-per-host invariant as the scheduler. Two watchers running in parallel would clobber each other's crash-recovery logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No nested JSON access via &lt;code&gt;{trigger.body.foo}&lt;/code&gt;.&lt;/strong&gt; Locked. The body is a string after serialization; &lt;code&gt;body.foo&lt;/code&gt; is treated as an unknown trigger field. If you need to pluck a field, do it in the workflow step (e.g. &lt;code&gt;echo {trigger.body} | jq -r .foo&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No recursive directory watching.&lt;/strong&gt; Single directory per watcher row, no globs. If you want to watch a tree, run multiple watchers. The hardest part of watcher correctness is bounding the work; bounding it explicitly via one-dir-per-row is the v1.0 choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No webhook API docs in production.&lt;/strong&gt; &lt;code&gt;/docs&lt;/code&gt;, &lt;code&gt;/redoc&lt;/code&gt;, &lt;code&gt;/openapi.json&lt;/code&gt; all 404 by design. The threat model is known-callers — anyone enumerating the endpoint shape is in scope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No replay protection on webhooks.&lt;/strong&gt; Idempotency is the workflow's concern, not the transport's. If your workflow can't be replayed safely, build the idempotency key check into the workflow itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No catching up on missed file events.&lt;/strong&gt; Filesystem events are not persistent. The watcher daemon sees current FS state at boot; events that happened during downtime are missed. Don't use file watchers for anything where missing events is unacceptable — use a cron schedule that reconciles state instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflows still execute one at a time per process.&lt;/strong&gt; Cross-process concurrency exists (API + scheduler + watcher in three containers can all run workflows simultaneously). Within-process concurrency is sequential by design for v1.0.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. M3 is the smallest correct reactive trigger surface, not the most ambitious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;Same audience as M1 + M2, with one addition: anyone building self-hosted automation against webhook senders (GitHub, Stripe, your own dashboards) who's tired of either rolling their own webhook server with no run history, or paying for a managed orchestrator that owns their auth.&lt;/p&gt;

&lt;p&gt;If you've ever wired a webhook to a tiny Flask app that calls a script and then forgotten about it for six months until something breaks — this is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Milestone 4 adds MCP tool calling as a step type, plus a pluggable LLM provider abstraction. That's the milestone where The Brain becomes ecosystem-aware — any MCP server in your environment becomes a callable step, not just Memory Vault. Each milestone gets a dev-log post here as it ships — one of four dev.to posts across the build period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MihaiBuilds/the-brain
&lt;span class="nb"&gt;cd &lt;/span&gt;the-brain
&lt;span class="nv"&gt;THE_BRAIN_API_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;any-value docker compose &lt;span class="nt"&gt;--profile&lt;/span&gt; api &lt;span class="nt"&gt;--profile&lt;/span&gt; watcher up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# register a webhook (prints the secret to stdout once; copy it now)&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain register-webhook examples/webhook_handler.py

&lt;span class="c"&gt;# register a file watcher (must run from inside the watcher container)&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain-watcher brain register-watcher examples/markdown_watcher.py &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--path&lt;/span&gt; /data/watched &lt;span class="nt"&gt;--events&lt;/span&gt; modified

&lt;span class="c"&gt;# see all your triggers in one place&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain list-triggers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop a file in &lt;code&gt;./watched&lt;/code&gt;, sign and POST to &lt;code&gt;http://localhost:8001/webhook/webhook-handler&lt;/code&gt;, and the runs land in &lt;code&gt;brain history&lt;/code&gt; alongside any manual or scheduled runs from M1 and M2. The repo has the longer version with the full HMAC signing recipe, the trigger-placeholder reference, and the lifecycle commands for both trigger types.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;GitHub — The Brain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault" rel="noopener noreferrer"&gt;Memory Vault — the layer underneath&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/the-brain-runs-on-a-schedule-now-31ch"&gt;M2 dev-log post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c"&gt;M1 debut post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/the-brain&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>webhooks</category>
      <category>devops</category>
    </item>
    <item>
      <title>The Brain runs on a schedule now</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Fri, 05 Jun 2026 12:52:02 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/the-brain-runs-on-a-schedule-now-31ch</link>
      <guid>https://dev.to/mihaibuildsdev/the-brain-runs-on-a-schedule-now-31ch</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/the-brain-runs-on-a-schedule-now.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I read a lot of work like this myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two weeks ago I shipped &lt;a href="https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c"&gt;the first milestone of The Brain&lt;/a&gt; — the bare runner. A Python file with a sequence of steps, &lt;code&gt;brain run path/to/workflow.py&lt;/code&gt;, the run lands in Postgres, you inspect it from the CLI. That was M1. It works, and you can run it on demand whenever you want.&lt;/p&gt;

&lt;p&gt;Today M2 is done. The Brain now runs on a schedule, on its own, without you in the loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The most useful workflow automation only kicks in when you stop having to babysit it — daily digests, scheduled exports, nightly summaries, anything that compounds. M1 proved the runner works. M2 is the milestone where leaving it alone is a reasonable thing to do.&lt;/p&gt;

&lt;p&gt;That's the whole point of M2 in one sentence. The rest of this post is what that looks like in practice — and what it deliberately doesn't try to do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What M2 ships
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Cron schedules.&lt;/strong&gt; Register a workflow on a standard 5-field cron expression. The Brain writes the schedule to Postgres next to your run history.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain register examples/daily_digest.py &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 9 * * 1-5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schedule validates the cron expression and the workflow file before it lands in the database. Duplicate schedule names are rejected — no silent overwrite. You can list everything that's registered, see when each one ran last and when it'll fire next, pause and resume schedules (idempotent), and unregister them when you're done. Same CLI you used in M1, against the same database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A scheduler daemon.&lt;/strong&gt; The container now runs a long-running daemon process that polls the schedule table every 10 seconds and fires whatever is due. On SIGTERM, the daemon finishes the currently running workflow and then exits cleanly. After a crash, any run that was in flight is recovered as a failed run with a clear error, so the run history never lies about what's running and what isn't.&lt;/p&gt;

&lt;p&gt;The daemon and the CLI are separate processes against the same database. You don't have to "stop the daemon to run a workflow" — &lt;code&gt;docker compose exec brain brain run ...&lt;/code&gt; still works exactly as it did in M1, in parallel with whatever the daemon is doing.&lt;/p&gt;

&lt;p&gt;A new &lt;code&gt;brain daemon-status&lt;/code&gt; command tells you whether the daemon is alive (&lt;code&gt;exit 0&lt;/code&gt; if it ticked within the last 30 seconds). Docker uses the same command as its container healthcheck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflows that read their previous run.&lt;/strong&gt; A step can write &lt;code&gt;{previous.&amp;lt;step_name&amp;gt;}&lt;/code&gt; in its prompt or command, and The Brain substitutes the named step's output from the last successful run of the same workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;LLMStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Yesterday&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s summary:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;{previous.summary}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Today&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s memories:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;{recent}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write today&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s summary.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the very first run, when there is no previous successful run, the step fails with a clear error rather than silently substituting an empty string. This is the same strict-failure shape as M1's intra-run &lt;code&gt;{step_name}&lt;/code&gt; placeholder: better to halt loudly than to leak unresolved braces into a shell command. Once one run has succeeded, every subsequent run sees its output via the placeholder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An opt-in HTTP endpoint.&lt;/strong&gt; &lt;code&gt;POST /run&lt;/code&gt; accepts a workflow path, runs the workflow, and returns the run's metadata as JSON. Bearer token from an environment variable; without the token in the environment, the service refuses to start. Designed for server-to-server, not browsers — no CORS, no public docs, single token. Opt in by bringing up the &lt;code&gt;api&lt;/code&gt; compose profile.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;THE_BRAIN_API_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-secret docker compose &lt;span class="nt"&gt;--profile&lt;/span&gt; api up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to fire workflows from another machine, this is the surface. If you don't, ignore the profile and nothing in M1 changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;daemon_tick(now)&lt;/code&gt; is the unit of behavior, not the polling loop.&lt;/strong&gt; The daemon does one thing well: a single async function takes a wall-clock moment and runs one poll cycle (heartbeat, look up due schedules, fire each sequentially, advance &lt;code&gt;next_run_at&lt;/code&gt;). A separate &lt;code&gt;run_daemon&lt;/code&gt; wraps it in a 10-second loop with signal handlers. Tests drive the cycle function directly with a frozen clock instead of spawning a real long-running process — the wrapper is dumb on purpose. Two hours of test-design savings every time a future scheduler concern needs a regression test.&lt;/p&gt;
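&lt;p&gt;To make the split concrete, here is a minimal sketch of a tick function driven by a frozen clock. It is not the repo's code: the &lt;code&gt;Schedule&lt;/code&gt; dataclass, the &lt;code&gt;fire&lt;/code&gt; callback, and the fixed interval standing in for a cron expression are all assumptions made for illustration. Only the names &lt;code&gt;daemon_tick&lt;/code&gt;, &lt;code&gt;run_daemon&lt;/code&gt;, and &lt;code&gt;next_run_at&lt;/code&gt; come from the description above.&lt;/p&gt;

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Schedule:
    """Hypothetical in-memory stand-in for a row in the schedule table."""
    name: str
    interval: timedelta      # simplification: fixed interval instead of cron
    next_run_at: datetime


async def daemon_tick(now, schedules, fire):
    """Run one poll cycle against an explicit clock value."""
    fired = []
    for schedule in schedules:
        if now >= schedule.next_run_at:
            await fire(schedule)            # sequential: one workflow at a time
            # Advance past `now` so the same boundary is never fired twice.
            while now >= schedule.next_run_at:
                schedule.next_run_at += schedule.interval
            fired.append(schedule.name)
    return fired


async def run_daemon(schedules, fire, poll_seconds=10.0):
    """Thin wrapper: the loop owns the real clock, the tick owns the logic."""
    while True:
        await daemon_tick(datetime.now(timezone.utc), schedules, fire)
        await asyncio.sleep(poll_seconds)


async def demo():
    start = datetime(2026, 6, 16, 9, 0, tzinfo=timezone.utc)
    schedules = [Schedule("digest", timedelta(minutes=1), start)]
    calls = []

    async def fake_fire(schedule):
        calls.append(schedule.name)

    # Drive the cycle with frozen timestamps; no sleeping, no real process.
    first = await daemon_tick(start, schedules, fake_fire)
    second = await daemon_tick(start + timedelta(seconds=10), schedules, fake_fire)
    return first, second, schedules[0].next_run_at


first, second, next_run = asyncio.run(demo())
```

&lt;p&gt;Because the clock is an argument, a regression test is two function calls with two timestamps. The loop wrapper never needs to run inside the test suite.&lt;/p&gt;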

&lt;p&gt;&lt;strong&gt;Skip, don't catch up.&lt;/strong&gt; If a workflow takes longer than its cron interval — say a 1-minute cron whose last run took 5 minutes — the daemon does not queue up four backlog fires for the boundaries it missed. It fires once, advances &lt;code&gt;next_run_at&lt;/code&gt; to the next cron boundary after right-now, and moves on. A schedule that fell six hours behind because the container was off fires once and continues on its current cadence. Catching up across a long outage is almost always wrong; it floods the system with stale work the moment it comes back.&lt;/p&gt;
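&lt;p&gt;The skip rule reduces to a small piece of arithmetic. The sketch below assumes a fixed interval in place of a real cron expression, and &lt;code&gt;next_boundary_after&lt;/code&gt; is a hypothetical helper name, not something from the repo. It shows the six-hour outage case: one fire, then a jump straight to the next boundary in the future.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone


def next_boundary_after(boundary, interval, now):
    """Return the first schedule boundary strictly after `now`.

    Simplification: a fixed interval stands in for a real cron expression.
    """
    if boundary > now:
        return boundary
    # Whole intervals elapsed since the stale boundary, plus one to land in the future.
    missed = (now - boundary) // interval
    return boundary + (missed + 1) * interval


utc = timezone.utc
stale = datetime(2026, 6, 16, 3, 0, tzinfo=utc)        # due six hours ago
now = datetime(2026, 6, 16, 9, 0, 30, tzinfo=utc)      # container just came back
interval = timedelta(minutes=1)

advanced = next_boundary_after(stale, interval, now)
skipped = (now - stale) // interval                     # boundaries not replayed
print(advanced.isoformat(), skipped)
```

&lt;p&gt;The stale boundary itself still triggers a single fire. Everything between it and &lt;code&gt;now&lt;/code&gt; is counted in &lt;code&gt;skipped&lt;/code&gt; and never queued.&lt;/p&gt;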

&lt;p&gt;&lt;strong&gt;Sequential within a poll cycle.&lt;/strong&gt; No concurrent workflow execution. A long-running workflow blocks the daemon from picking up other due workflows until it finishes. This is by design for v1.0 — parallel execution and a real work queue carry concurrency-control complexity that needs to wait until I have a real workload to optimize against, not a hypothetical one. v1.1 concern, called out in the explainer notes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Crash recovery on boot, not in-flight.&lt;/strong&gt; When &lt;code&gt;run_daemon&lt;/code&gt; starts, it sweeps &lt;code&gt;workflow_runs WHERE status='running'&lt;/code&gt; and marks them all failed with a fixed error message. Under the single-daemon-per-host invariant these are by definition orphans from a previous crash. No heartbeat liveness check, no leader election, no consensus protocol. Single daemon means single source of truth for what "in flight" means.&lt;/p&gt;
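&lt;p&gt;As a rough illustration of how small that sweep can be, the sketch below runs the same shape of &lt;code&gt;UPDATE&lt;/code&gt; against an in-memory SQLite database so it runs anywhere. The real project uses Postgres. The &lt;code&gt;error&lt;/code&gt; column, the message text, and the &lt;code&gt;recover_orphaned_runs&lt;/code&gt; name are assumptions for the example.&lt;/p&gt;

```python
import sqlite3

# Assumed wording; the post does not give the actual message.
ORPHAN_ERROR = "daemon restarted while this run was in flight"


def recover_orphaned_runs(conn):
    """Mark every run still flagged 'running' as failed; return how many."""
    cursor = conn.execute(
        "UPDATE workflow_runs SET status = 'failed', error = ? "
        "WHERE status = 'running'",
        (ORPHAN_ERROR,),
    )
    conn.commit()
    return cursor.rowcount


# SQLite stands in for Postgres here; the table is cut down to three columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflow_runs (id TEXT PRIMARY KEY, status TEXT, error TEXT)")
conn.executemany(
    "INSERT INTO workflow_runs VALUES (?, ?, ?)",
    [("a1", "success", None), ("b2", "running", None), ("c3", "failed", "boom")],
)

recovered = recover_orphaned_runs(conn)
rows = conn.execute("SELECT id, status, error FROM workflow_runs ORDER BY id").fetchall()
```

&lt;p&gt;Finished runs are untouched. Only the row left in &lt;code&gt;running&lt;/code&gt; is rewritten, which is safe only while a single daemon owns the table.&lt;/p&gt;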

&lt;p&gt;&lt;strong&gt;A &lt;code&gt;planned_steps&lt;/code&gt; JSONB snapshot on every run.&lt;/strong&gt; Each &lt;code&gt;workflow_runs&lt;/code&gt; row now stores the full step list as it stood at run-creation time: &lt;code&gt;[{"name": ..., "type": ...}, ...]&lt;/code&gt;. That lets a postmortem distinguish "step absent from &lt;code&gt;output&lt;/code&gt; because the run halted before reaching it" from "step never existed in this workflow version." It costs one extra &lt;code&gt;json.dumps&lt;/code&gt; per run and no extra query, which is rounding error next to the postmortem clarity. The idea came from a comment under the M1 dev.to post, and it was pinned to the schema before the analyzer existed.&lt;/p&gt;
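&lt;p&gt;A postmortem helper built on that snapshot could look roughly like this. The row is a plain dict with a hypothetical shape: the &lt;code&gt;step&lt;/code&gt; key inside &lt;code&gt;output&lt;/code&gt; entries and the &lt;code&gt;shell&lt;/code&gt; and &lt;code&gt;llm&lt;/code&gt; type values are assumptions. Only the &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;type&lt;/code&gt; keys of &lt;code&gt;planned_steps&lt;/code&gt; come from the post.&lt;/p&gt;

```python
import json


def classify_step(run, step_name):
    """Explain why a step is or is not present in a run's output."""
    planned = {step["name"] for step in run["planned_steps"]}
    executed = {entry["step"] for entry in run["output"]}
    if step_name in executed:
        return "executed"
    if step_name in planned:
        return "planned but never reached"
    return "not part of this workflow version"


# Hypothetical row shape; only planned_steps' name/type keys come from the post.
run = {
    "status": "failed",
    "planned_steps": json.loads(
        '[{"name": "fetch", "type": "shell"},'
        ' {"name": "summary", "type": "llm"},'
        ' {"name": "publish", "type": "shell"}]'
    ),
    "output": [
        {"step": "fetch", "output": "42 rows"},
        {"step": "summary", "error": "LLM timeout"},
    ],
}

verdicts = {name: classify_step(run, name) for name in ("fetch", "publish", "archive")}
```

&lt;p&gt;Without the snapshot, &lt;code&gt;publish&lt;/code&gt; and &lt;code&gt;archive&lt;/code&gt; would look identical: both are simply absent from &lt;code&gt;output&lt;/code&gt;.&lt;/p&gt;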

&lt;p&gt;&lt;strong&gt;&lt;code&gt;{previous.X}&lt;/code&gt; is a single indexed lookup.&lt;/strong&gt; A partial index on &lt;code&gt;workflow_runs (workflow_name, started_at DESC) WHERE status = 'success'&lt;/code&gt; turns the previous-run lookup into a single index scan that reads one row. The previous run's &lt;code&gt;output&lt;/code&gt; JSONB is decomposed into a step-name → output map at lookup time, which is what &lt;code&gt;{previous.X}&lt;/code&gt; resolves against. Strict on failure: a missing prior successful run and a step name missing from the previous run each fail THAT step with its own clear error. Two messages, two tests pinning them.&lt;/p&gt;
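&lt;p&gt;The resolution step can be sketched in a few lines. This is an illustration under assumptions, not the project's implementation: the &lt;code&gt;step&lt;/code&gt; and &lt;code&gt;output&lt;/code&gt; keys, the &lt;code&gt;PlaceholderError&lt;/code&gt; class, and both error messages are invented for the example. What it mirrors from the description is the step-name map and the two distinct failure cases.&lt;/p&gt;

```python
import re

PREVIOUS = re.compile(r"\{previous\.([A-Za-z_][A-Za-z0-9_]*)\}")


class PlaceholderError(Exception):
    """Raised when a {previous.X} token cannot be resolved."""


def resolve_previous(template, previous_output):
    """Substitute {previous.X} tokens from the last successful run's output.

    `previous_output` is the ordered output array of that run, or None when
    no successful run exists yet.
    """
    if not PREVIOUS.search(template):
        return template
    if previous_output is None:
        raise PlaceholderError("no previous successful run for this workflow")
    # Decompose the ordered array into a step-name to output map.
    by_step = {entry["step"]: entry["output"] for entry in previous_output}

    def substitute(match):
        name = match.group(1)
        if name not in by_step:
            raise PlaceholderError(f"step '{name}' not found in previous run")
        return by_step[name]

    return PREVIOUS.sub(substitute, template)


last_run = [{"step": "recent", "output": "3 notes"}, {"step": "summary", "output": "Quiet day."}]
prompt = resolve_previous("Yesterday: {previous.summary}\nOther text", last_run)
```

&lt;p&gt;Templates without a &lt;code&gt;{previous.X}&lt;/code&gt; token pass through untouched, so steps that don't use the feature never trigger the lookup failure.&lt;/p&gt;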

&lt;p&gt;&lt;strong&gt;HTTPBearer with &lt;code&gt;auto_error=False&lt;/code&gt;.&lt;/strong&gt; FastAPI's default &lt;code&gt;HTTPBearer&lt;/code&gt; returns 403 on a missing Authorization header. That's wrong — RFC 7235 says missing auth is 401, forbidden is 403. The explicit &lt;code&gt;auto_error=False&lt;/code&gt; + manual 401 raise corrects this. Small bug, but it's the kind of small bug that wastes a peer's afternoon when they're integrating against the endpoint and can't figure out why curl gets 403 from a missing-header request that should be 401. Pinned by three auth-branch tests.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 won't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The daemon is not highly available.&lt;/strong&gt; One daemon per host. Two running in parallel would clobber each other's crash-recovery logic. The single-daemon invariant is what lets the recovery sweep be a simple &lt;code&gt;UPDATE WHERE status='running'&lt;/code&gt;. Adding HA means leader election or run-level ownership — both v1.1+ concerns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There's no instant pickup.&lt;/strong&gt; New registrations and cron-boundary fires land within ten seconds of being due. Postgres LISTEN/NOTIFY would close that gap but adds complexity that 10s polling makes unnecessary for v1.0. Most workflows run on minute-or-coarser cron expressions; 10s is rounding error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There's no queue for missed fires.&lt;/strong&gt; Skip-don't-catch-up is the locked behavior. If you genuinely need every fire to land, write a workflow that does its own backfill — The Brain won't second-guess your cron expression.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The HTTP endpoint isn't a public API.&lt;/strong&gt; Single token, no CORS, opt-in, designed for known callers on the same network. Path allowlisting and per-caller scoping are v1.1+. The threat model is single-token-server-to-server; anyone with the token can execute arbitrary server-side Python by pointing the endpoint at any file on the host. The token is the only gate. Treat it like a database password.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflows still execute one at a time per host.&lt;/strong&gt; Sequential within a tick. Concurrent execution is v1.1 territory and brings concurrency-control problems that need to wait for a real workload to design against.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. M2 is the smallest correct unattended-runner, not the most ambitious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;Same audience as M1, with one addition: anyone who needs a scheduled workflow runner they can self-host and inspect end-to-end — and who's tired of either rolling their own cron-in-a-container with no run history, or paying for a managed orchestrator that owns their data.&lt;/p&gt;

&lt;p&gt;If you've ever written a Python script, wired it to a system cron entry, then realized a week later you have no record of which days it failed and why — this is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Milestone 3 is the reactive layer — webhook triggers and file-watcher triggers. That's when The Brain stops only firing on the clock and starts firing in response to things that happen.&lt;/p&gt;

&lt;p&gt;The full roadmap and milestone progress table live in the repo's README. Each milestone gets a dev-log post here as it ships — one of four dev.to posts across the build period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MihaiBuilds/the-brain
&lt;span class="nb"&gt;cd &lt;/span&gt;the-brain
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain daemon-status
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain register examples/hello.py &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"*/1 * * * *"&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait a minute, run &lt;code&gt;brain history&lt;/code&gt;, and you'll see the daemon-fired run sitting in there alongside any &lt;code&gt;brain run&lt;/code&gt; invocations from the M1 quickstart — same row shape, same inspection commands, same database. The repo has the longer version with state-across-runs and the HTTP endpoint walkthrough.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;GitHub — The Brain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault" rel="noopener noreferrer"&gt;Memory Vault — the layer underneath&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c"&gt;M1 debut post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/the-brain&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>postgres</category>
      <category>devops</category>
    </item>
    <item>
      <title>I built the memory, now I'm building the brain</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Thu, 28 May 2026 08:10:21 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c</link>
      <guid>https://dev.to/mihaibuildsdev/i-built-the-memory-now-im-building-the-brain-2c9c</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/i-built-the-memory-now-im-building-the-brain.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I read a lot of work like this myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Three weeks ago I shipped Memory Vault v1.0 — an open-source, self-hosted AI memory layer you run yourself. Postgres + pgvector under the hood, hybrid search on top, an MCP server so Claude can read and write to it directly. The first product in a planned compounding stack.&lt;/p&gt;

&lt;p&gt;Today the second product in that stack exists too. It's called The Brain.&lt;/p&gt;

&lt;p&gt;I'll get to what it is in a second. First, the honest part: I didn't announce it the day I started. I built the first milestone in private, on my own, with no audience watching. Three days of focused work, ten merged PRs, then a clean stop. Build-in-public is the long-term plan for this project the same way it was for Memory Vault. But the first week was head-down, because the riskiest part of a new product isn't the announcement — it's whether the thing actually works. Now that it does, I can tell you about it without hedging.&lt;/p&gt;

&lt;h2&gt;
  
  
  What The Brain is
&lt;/h2&gt;

&lt;p&gt;The Brain is a &lt;strong&gt;workflow orchestrator&lt;/strong&gt;, not an AI agent. It runs Python-defined workflows you author, with full visibility into every step. The intelligence is in the workflow you write; The Brain is the runtime that makes it repeatable and observable. It calls LLMs as steps when needed; it doesn't replace them.&lt;/p&gt;

&lt;p&gt;Concretely: you write a Python file that describes a sequence of steps. Each step is a shell command, a Memory Vault query, or a local LLM call. The Brain runs them top to bottom, passes output forward between them with named placeholders, and persists every run to Postgres. You inspect runs from the CLI. Successful runs exit 0; failed runs exit 1. It drops straight into cron jobs or CI pipelines.&lt;/p&gt;

&lt;p&gt;That's the whole pitch. There's no autonomous decision-making, no agent loop, no self-direction. It runs what you tell it to run, and it records what happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is a workflow orchestrator, not an agent
&lt;/h2&gt;

&lt;p&gt;The orchestration layer is too load-bearing to depend on someone else's framework. When the framework changes, your workflows break — and these frameworks change constantly. LangChain, LangGraph, CrewAI, AutoGen: they're all moving targets, and "agent autonomy" is a moving definition. Owned runtime, owned database, owned LLM client, owned everything. Five years from now this still runs.&lt;/p&gt;

&lt;p&gt;The other reason: build-in-public projects have an honesty constraint that pure-agent products don't. If The Brain claims to "decide" or "reason," I'd have to explain in every blog post what that means, what model it uses, and why the decision quality is what it is. Calling it a workflow orchestrator collapses that ambiguity. The user writes the logic. The Brain runs it. The output is reproducible. The behavior is auditable. The audience this is for — solo developers who use AI seriously and want their tools to be transparent — is allergic to the alternative.&lt;/p&gt;

&lt;h2&gt;
  
  
  What M1 ships, today
&lt;/h2&gt;

&lt;p&gt;M1 is called "Bare Runner." The name is the honest scope: it's the smallest thing that proves The Brain works end-to-end.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run Python-defined workflows from the CLI&lt;/strong&gt; with &lt;code&gt;brain run path/to/workflow.py&lt;/code&gt;. A workflow is a plain Python file exposing a module-level &lt;code&gt;workflow = Workflow(...)&lt;/code&gt;. Loaded with &lt;code&gt;importlib&lt;/code&gt; and validated at load time via Pydantic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three step types&lt;/strong&gt;: &lt;code&gt;ShellStep&lt;/code&gt; (subprocess + timeout), &lt;code&gt;MemoryVaultStep&lt;/code&gt; (Memory Vault REST), &lt;code&gt;LLMStep&lt;/code&gt; (OpenAI-compatible HTTP against LM Studio). Each lives in its own executor class; the runner dispatches by step type with no &lt;code&gt;isinstance&lt;/code&gt; chains.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Placeholder substitution&lt;/strong&gt; — steps pass output forward with &lt;code&gt;{step_name}&lt;/code&gt; tokens in any string field (&lt;code&gt;prompt&lt;/code&gt;, &lt;code&gt;command&lt;/code&gt;, &lt;code&gt;query&lt;/code&gt;). Strict: a placeholder that names no prior completed step fails THAT step with a clear error. Fail fast; never pass literal braces downstream.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistent run history in Postgres&lt;/strong&gt; — every run, every step, every output, every error. One &lt;code&gt;workflow_runs&lt;/code&gt; table; the run's full step-by-step output is stored as a JSONB array (not an object — JSONB doesn't preserve key order, and execution order is part of the data).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLI introspection&lt;/strong&gt; — &lt;code&gt;brain history&lt;/code&gt; lists past runs with &lt;code&gt;--limit&lt;/code&gt;/&lt;code&gt;--workflow&lt;/code&gt;/&lt;code&gt;--status&lt;/code&gt; filters; &lt;code&gt;brain show &amp;lt;run_id&amp;gt;&lt;/code&gt; shows full step-by-step detail for one run. Run IDs match by prefix (Memory Vault's &lt;code&gt;token revoke&lt;/code&gt; precedent).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict failure semantics&lt;/strong&gt; — a workflow halts on the first failed step; the run row always lands in Postgres with a terminal status, even if an executor raises unexpectedly. The runner catches every executor exception and persists. A run that started always ends with a terminal DB row; no exception escapes unpersisted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-command Docker&lt;/strong&gt;: &lt;code&gt;docker compose up -d&lt;/code&gt; brings up Postgres and The Brain together, and migrations run on boot via a hand-rolled migration runner in &lt;code&gt;src/db.py&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;46 hermetic tests&lt;/strong&gt; — pytest with a real Postgres test container, MV and LLM HTTP faked via &lt;code&gt;httpx.MockTransport&lt;/code&gt; (built-in, no &lt;code&gt;respx&lt;/code&gt; dependency). The suite is fast, deterministic, and runs anywhere with no external services.&lt;/li&gt;
&lt;/ul&gt;
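&lt;p&gt;The "dispatch by step type with no &lt;code&gt;isinstance&lt;/code&gt; chains" point from the list above can be illustrated with a lookup table. This is a sketch, not the repo's code: it uses plain dataclasses rather than Pydantic to stay dependency-free, the executor class names are hypothetical, and the executors return canned strings instead of running anything.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class ShellStep:
    name: str
    command: str


@dataclass
class LLMStep:
    name: str
    prompt: str


class ShellExecutor:
    def execute(self, step):
        # Stand-in: a real executor would run the command in a subprocess.
        return f"ran: {step.command}"


class LLMExecutor:
    def execute(self, step):
        # Stand-in: a real executor would call an OpenAI-compatible endpoint.
        return f"asked: {step.prompt}"


# One lookup table replaces an if/elif isinstance chain.
EXECUTORS = {ShellStep: ShellExecutor(), LLMStep: LLMExecutor()}


def run_step(step):
    executor = EXECUTORS.get(type(step))
    if executor is None:
        raise TypeError(f"no executor registered for {type(step).__name__}")
    return executor.execute(step)


results = [run_step(ShellStep("greeting", "echo hi")), run_step(LLMStep("summary", "Summarize"))]
```

&lt;p&gt;Adding a step type then means one new executor class and one new table entry, and the runner itself stays unchanged.&lt;/p&gt;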

&lt;p&gt;A run looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain run examples/hello.py
&lt;span class="go"&gt;Running workflow 'hello' (2 steps)
  ✓ greeting
  ✓ echo_it_back
Run c609f5e0 — success
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And inspecting it after:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain show c609f5e0
&lt;span class="go"&gt;Run:      c609f5e0-a8d6-4221-84c0-58c0b5d0460d
Workflow: hello
Status:   success
Started:  2026-05-22 19:54:58
Duration: 0.0s

Steps:
  ✓ greeting
      Hello from The Brain
  ✓ echo_it_back
      The previous step said: Hello from The Brain
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Functional/declarative workflow files, not class + decorator.&lt;/strong&gt; A workflow is a data structure: &lt;code&gt;workflow = Workflow(name=..., steps=[Step(...), Step(...)])&lt;/code&gt;. Easiest to introspect, easiest to serialize, easiest to register for cron in the next milestone. Class-with-decorators looks ergonomic at first and gets in the way the moment you try to load workflows dynamically. The declarative form is what every workflow tool I respect converges on for a reason.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single &lt;code&gt;workflow_runs&lt;/code&gt; table for M1, per-step granularity deferred to M2.&lt;/strong&gt; The whole run's step-by-step output goes in one JSONB column. Yes, a per-step table is the "right" long-term schema. But M2 is where state-between-runs lands, and that's the milestone where it actually pays for itself. Shipping the right table in M1 would be carrying schema complexity for a feature M1 doesn't have. Defer it; revisit when the use case lands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thin in-repo Memory Vault REST client (~30 LOC), no shared library.&lt;/strong&gt; The Brain talks to Memory Vault over HTTP. I could extract a shared &lt;code&gt;mihaibuilds-clients&lt;/code&gt; library now. I'd be over-engineering for a future I haven't reached. The right time to extract a client library is when there are three or more callers — not when there's one. Right now the entire client is &lt;code&gt;httpx.post(...)&lt;/code&gt;. When The Brain plus two or three addons all talk to Memory Vault, the duplication will tell me it's time to extract.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LM Studio only in v1.0, not LM Studio + Ollama.&lt;/strong&gt; This is the explicit lesson I'm carrying from Memory Vault. Memory Vault's marketing claimed both LM Studio and Ollama support; only LM Studio was end-to-end tested. The Brain ships LM Studio only in v1.0. Ollama probably works through the same OpenAI-compatible client shape, but "probably works" isn't a release guarantee. Only claim providers you've actually tested. This rule survives every product I build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Owned runtime, not LangChain/LangGraph/CrewAI wrapper.&lt;/strong&gt; Already covered above — but worth re-stating in the architecture section because it's the decision the rest of the codebase shape derives from. The Brain is ~1,500 lines of Python. A LangChain wrapper would be more code, more dependencies, and a runtime that breaks every time the upstream framework changes its API. Owned runtime is the simpler answer, not the more ambitious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 won't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;No autonomous decision-making.&lt;/strong&gt; The Brain runs the workflow you defined. It doesn't pick a different step at runtime. If you want branching, you write a workflow that branches. Rich conditional logic is deliberately out of scope for v1.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No multi-user / team workflows.&lt;/strong&gt; Single-tenant by design. Multi-user activation lives behind a PRO tier later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No managed cloud.&lt;/strong&gt; Self-hosted, MIT-licensed, runs on your laptop or your VPS. Always.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No visual workflow builder.&lt;/strong&gt; The workflow file is the source of truth. You read it like Python, you diff it like Python, you grep it like Python. Visual builders are a PRO concern, not a v1.0 concern.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. The Brain v1.0 is the smallest correct version, not the most ambitious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for
&lt;/h2&gt;

&lt;p&gt;Developers who run real workflows on their own machines and want LLMs as a step inside those workflows — not as the thing in charge. Solo builders stitching together memory, models, and shell tools who are tired of agent frameworks that change their API every quarter. Anyone who wants every run to be inspectable, every output persisted, and every decision their own to make.&lt;/p&gt;

&lt;p&gt;If you've ever written a Python script that calls an LLM, then bolted on a cron entry, then realized you have no record of what it did yesterday — this is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Milestone 2 is triggers and state — cron schedules, a long-running scheduler daemon, and workflows that read the previous run's output. M2 is the milestone where The Brain becomes worth running unattended.&lt;/p&gt;

&lt;p&gt;The full roadmap and milestone progress table live in the repo's README. Each milestone gets a dev-log post here as it ships — one of four dev.to posts across the build period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MihaiBuilds/the-brain
&lt;span class="nb"&gt;cd &lt;/span&gt;the-brain
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;brain brain run examples/hello.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The repo has the full quickstart with configuration, Memory Vault wiring, and the real-world digest example (recent memories → local LLM summary → markdown file, all in one Python file).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;GitHub — The Brain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault" rel="noopener noreferrer"&gt;Memory Vault — the layer underneath&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/the-brain" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/the-brain&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>postgres</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Memory Vault v1.0 — building open-source AI memory the boring way</title>
      <dc:creator>MihaiBuilds</dc:creator>
      <pubDate>Sat, 09 May 2026 13:15:56 +0000</pubDate>
      <link>https://dev.to/mihaibuildsdev/memory-vault-v10-building-open-source-ai-memory-the-boring-way-33ej</link>
      <guid>https://dev.to/mihaibuildsdev/memory-vault-v10-building-open-source-ai-memory-the-boring-way-33ej</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published on &lt;a href="https://mihaibuilds.com/blog/memory-vault-v1-0-released.html" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;. Cross-posting here because dev.to is where I find a lot of this kind of work myself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For the past year I kept hitting the same wall. I'd have a real conversation with Claude — work through a database design, debug something gnarly, agree on a convention I wanted to keep — and the next morning it was gone. Not summarized. Not searchable. Just gone. ChatGPT was the same. Every assistant I used had the long-term memory of a goldfish, and the workaround the industry settled on was "paste the relevant context back in every time." That's not memory. That's me being the memory.&lt;/p&gt;

&lt;p&gt;So I built one. Memory Vault is an open-source, self-hosted AI memory system you run yourself: Postgres with pgvector underneath, hybrid search on top, an MCP server so Claude can read and write to it directly, a knowledge graph that extracts entities without an LLM bill, a local LLM chat with retrieved-source citations, and a one-command Docker setup. Two days ago it crossed the line from "build-in-public project" to "v1.0 stable release." (v1.0.2 yesterday closed two security findings I caught after enabling branch protection — path-traversal + info-exposure on an internal stream handler.)&lt;/p&gt;

&lt;h2&gt;
  
  
  What Memory Vault is
&lt;/h2&gt;

&lt;p&gt;A long-term memory layer for AI assistants and the apps you build on top of them. You ingest text (markdown notes, conversation logs, anything plain) and it gets chunked, embedded, full-text indexed, and stored in a single Postgres database. Hybrid search (vector similarity + keyword tsvector + Reciprocal Rank Fusion) returns the most relevant chunks when you query. An MCP server exposes four tools (&lt;code&gt;recall&lt;/code&gt;, &lt;code&gt;remember&lt;/code&gt;, &lt;code&gt;forget&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;) that Claude Desktop or Claude Code can call directly, which means Claude can read and write to your memory inside any conversation without you copy-pasting context. A REST API exposes the same operations for any app you build. A dashboard gives you Search, Browse, Graph, Ingest, Stats, and Chat pages. A local LLM chat (LM Studio in v1.0) lets you talk to your memories with full source citations: every response shows which chunks it pulled from, each one clickable.&lt;/p&gt;

&lt;p&gt;It runs entirely on your machine. No API keys. No cloud. No telemetry. Postgres listens on port 5432, and the API and the dashboard share port 8000. &lt;code&gt;docker compose up&lt;/code&gt; and it's running.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fls3i6vlk1ep80b08npnf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fls3i6vlk1ep80b08npnf.png" alt="Memory Vault dashboard Chat page answering a question with the sources panel expanded showing retrieved chunks" width="800" height="686"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 actually does
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search&lt;/strong&gt; — pgvector HNSW for semantic + tsvector GIN for keyword + Reciprocal Rank Fusion to merge them. Vector-only search misses exact terms; keyword-only misses paraphrases. RRF gets both.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt; — four tools (&lt;code&gt;recall&lt;/code&gt;, &lt;code&gt;remember&lt;/code&gt;, &lt;code&gt;forget&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;) callable from Claude Desktop, Claude Code, or any MCP client. Claude reads and writes your memory in-conversation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge graph&lt;/strong&gt; — spaCy NER plus co-occurrence extracts entities (Person, Project, Tool, Concept) and &lt;code&gt;related_to&lt;/code&gt; relationships from every ingested chunk. No LLM, no per-token cost, rendered as an interactive Cytoscape force-directed graph.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory spaces&lt;/strong&gt; — namespacing for different contexts (work, personal, projects). Per-space dedup; cross-space isolation by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local LLM chat&lt;/strong&gt; — LM Studio native API with sources panel showing retrieved chunks for every answer. Every response is grounded and the grounding is visible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;REST API&lt;/strong&gt; — bearer-auth-protected, OpenAPI-documented at &lt;code&gt;/docs&lt;/code&gt;, every operation the dashboard does is also a documented endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-command Docker&lt;/strong&gt; — &lt;code&gt;docker compose up&lt;/code&gt;. Postgres, the app, and the spaCy model bundled into a single image at build time, no first-run download.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-hosted, MIT-licensed&lt;/strong&gt; — your data stays on your machine. The whole thing is yours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;170 tests passing&lt;/strong&gt; — pytest with a real Postgres + pgvector service container, no mocks of the database.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architectural decisions worth naming
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Postgres + pgvector instead of a dedicated vector database.&lt;/strong&gt; I run one database, not two. Operationally this matters more than the marginal performance of a purpose-built vector store at small scale. You already know how to back up Postgres. You already know how to monitor it. HNSW indexes plus tuned &lt;code&gt;maintenance_work_mem&lt;/code&gt; and &lt;code&gt;ef_search&lt;/code&gt; get you to "fast enough for hundreds of thousands of chunks on a laptop." When that stops being true, the migration path is sane. Until then, one database is the right answer for a self-hosted personal-memory tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid search instead of vector-only.&lt;/strong&gt; Pure vector search is great at paraphrase and concept. It's bad at exact terms: model names, error codes, file paths, anything where the literal string is the signal. Memory Vault stores both an embedding and a tsvector for every chunk and merges the two ranked result sets with Reciprocal Rank Fusion. RRF has a single smoothing constant (k, conventionally 60) that rarely needs tuning, doesn't require score normalization because it only looks at rank positions, and tends to beat either approach alone on the kind of mixed queries real users actually type.&lt;/p&gt;
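
&lt;p&gt;For readers who haven't met RRF before, here is a minimal sketch of the fusion step in plain Python. It is not Memory Vault's actual code: the function name, the chunk IDs, and the choice of &lt;code&gt;k=60&lt;/code&gt; are assumptions for illustration, and the two input lists stand in for whatever the vector and keyword queries return.&lt;/p&gt;

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk IDs into one fused ranking.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    Only rank positions are used, so raw scores never need normalizing.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk IDs: one list from vector search, one from keyword search.
vector_hits = ["c3", "c1", "c7"]
keyword_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

&lt;p&gt;Chunks that appear in both lists accumulate two contributions and rise to the top, while a chunk that only one retriever found still makes the fused list.&lt;/p&gt;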

&lt;p&gt;&lt;strong&gt;spaCy + co-occurrence for the knowledge graph, not an LLM.&lt;/strong&gt; The default move in this space is to feed every chunk through an LLM and ask it for entities and relationships. It works. It also costs money on every ingest, couples your graph quality to whichever model you happened to pick, and requires API keys for a tool whose entire pitch is no API keys. spaCy's &lt;code&gt;en_core_web_sm&lt;/code&gt; model plus a co-occurrence rule (two entities in the same chunk = a &lt;code&gt;related_to&lt;/code&gt; edge, weighted by frequency) gets you a useful graph for zero per-ingest cost. The honest limits — English only, context-dependent NER, no fuzzy matching — are documented up front rather than masked.&lt;/p&gt;
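
&lt;p&gt;The co-occurrence rule itself is small enough to sketch. The version below assumes the NER pass has already produced a list of entity names per chunk, and it assumes "weighted by frequency" means "number of chunks in which both entities appear"; the function name and sample data are hypothetical, not taken from the Memory Vault codebase.&lt;/p&gt;

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(chunks):
    """Count related_to edges from per-chunk entity lists.

    `chunks` is a list of entity-name lists, one per ingested chunk
    (in a real pipeline these would come from an NER pass).
    Returns a Counter mapping a sorted (entity_a, entity_b) pair to the
    number of chunks in which both entities appear.
    """
    edges = Counter()
    for entities in chunks:
        # De-duplicate within the chunk and sort so (a, b) equals (b, a).
        unique = sorted(set(entities))
        for pair in combinations(unique, 2):
            edges[pair] += 1
    return edges


# Hypothetical entity lists for three chunks.
chunks = [
    ["Postgres", "pgvector", "HNSW"],
    ["Postgres", "pgvector"],
    ["Claude", "MCP"],
]
edges = cooccurrence_edges(chunks)
print(edges.most_common(1))
```

&lt;p&gt;Because names are compared as exact strings, "PostgreSQL" and "Postgres" would produce separate nodes here too, which is the same no-fuzzy-matching limit the post calls out.&lt;/p&gt;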

&lt;p&gt;&lt;strong&gt;MCP-first, not REST-first.&lt;/strong&gt; Memory Vault was designed around the assumption that the primary user of this database is going to be Claude, not me. The MCP server isn't a wrapper around a REST API — it's a direct path into the same code that the REST API uses. Both are first-class. But the design starting point was "what does Claude need to call to make memory feel native," and then the REST API was the same operations exposed for human-driven apps. That ordering changes which tradeoffs are interesting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The PoolClosed story
&lt;/h2&gt;

&lt;p&gt;About a week before tag day, I added a CLI command called &lt;code&gt;memory-vault diagnose&lt;/code&gt;. It bundles app logs, database logs, status output, OS info, and redacted environment into a zip file users can attach to bug reports. Foundation work. Paid for once. The kind of thing that makes every future bug report ten times higher signal-to-noise.&lt;/p&gt;

&lt;p&gt;I shipped it. Then I ran the test suite. 163 passed, 52 errored. Every error was &lt;code&gt;psycopg_pool.PoolClosed&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First instinct: probably an &lt;code&gt;httpx&lt;/code&gt; lifespan thing. Modern &lt;code&gt;httpx&lt;/code&gt; has changed how it handles ASGI lifespan events between minor versions. The test suite uses &lt;code&gt;httpx.ASGITransport&lt;/code&gt; to drive the FastAPI app in-process, sharing a session-wide connection pool fixture. If the transport was firing shutdown events between tests, the pool would close mid-suite. Surely there was a kwarg for this. I added &lt;code&gt;lifespan="off"&lt;/code&gt; to the transport. &lt;code&gt;TypeError: ASGITransport.__init__() got an unexpected keyword argument 'lifespan'&lt;/code&gt;. The kwarg doesn't exist in 0.28.x. Reverted.&lt;/p&gt;

&lt;p&gt;Second instinct: walk the call graph. &lt;code&gt;memory-vault diagnose&lt;/code&gt; calls into the CLI's &lt;code&gt;_run_status&lt;/code&gt; helper to capture status output for the bundle. &lt;code&gt;_run_status&lt;/code&gt; was implemented as &lt;code&gt;asyncio.run(_cmd_status())&lt;/code&gt; — directly calling the CLI's status function in-process. &lt;code&gt;_cmd_status&lt;/code&gt; initializes a connection pool at the top of the function and closes it via a &lt;code&gt;finally&lt;/code&gt; block at the end. Which is correct behavior for the CLI. It's also exactly what you don't want when something else in the same process — like a session-wide test fixture — already owns a pool that's mid-flight.&lt;/p&gt;

&lt;p&gt;The fix was four lines. Replace the in-process &lt;code&gt;asyncio.run&lt;/code&gt; with &lt;code&gt;subprocess.run(["memory-vault", "status"])&lt;/code&gt;. The subprocess gets its own pool, lives its own lifecycle, exits cleanly, and the parent process's pool is never touched. 163 passed, 0 errored.&lt;/p&gt;
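
&lt;p&gt;The shape of that fix, as a rough sketch rather than the actual patch: run the status command in a child process and capture its output, so any pool it opens and closes lives and dies with the child. The helper name, the timeout, and the stand-in command are assumptions; the real call in the post is &lt;code&gt;subprocess.run(["memory-vault", "status"])&lt;/code&gt;.&lt;/p&gt;

```python
import subprocess
import sys


def capture_command_output(argv, timeout=30):
    """Run a command in a child process and return its stdout as text.

    The child owns whatever resources it opens (for example a connection
    pool) and releases them on exit, so nothing in the parent is touched.
    """
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # a failing status command should not abort the caller
    )
    return result.stdout


# Stand-in child process so the sketch runs anywhere; in the post the
# command is ["memory-vault", "status"].
output = capture_command_output([sys.executable, "-c", "print('status: ok')"])
print(output.strip())
```

&lt;p&gt;The cost is a process spawn per call, which is negligible for a diagnostic command that runs once per bug report.&lt;/p&gt;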

&lt;p&gt;The lesson isn't about pools or fixtures specifically. It's that "obvious" fixes (changing the test transport config) and root causes (one function quietly tearing down state owned by a different function) live in different parts of the code. The &lt;code&gt;lifespan="off"&lt;/code&gt; move would have masked the symptom in the tests and left the actual bug in the CLI, where users would have hit it. Almost the entire week's gap between "all my sub-steps look done" and "v1.0 is actually shippable" was the discipline of not bypassing this kind of thing when bypassing was easy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What v1.0 doesn't do, on purpose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;English-only NER.&lt;/strong&gt; The bundled spaCy model is &lt;code&gt;en_core_web_sm&lt;/code&gt;. Non-English content gets little to no useful entity extraction. Multilingual models exist, but they're heavier and slower; adding one will be driven by real user demand, not treated as a v1.0 must-have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No fuzzy entity matching.&lt;/strong&gt; "PostgreSQL" and "Postgres" are separate entities in the graph. No alias merging in v1.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No re-extraction on edit.&lt;/strong&gt; If you re-ingest a corrected version of a chunk, the new entities are added but the old ones aren't cleaned up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single-user.&lt;/strong&gt; v1.0 has bearer auth and one user behind it. The schema has &lt;code&gt;owner_id&lt;/code&gt; and &lt;code&gt;access_level&lt;/code&gt; columns from day one, but multi-user activation is part of the PRO tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LM Studio only for chat.&lt;/strong&gt; Ollama and llama.cpp use the same OpenAI-compatible client architecture under the hood, but the only end-to-end-tested path in v1.0 is LM Studio. Ollama support is not in v1.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No multi-conversation history in chat.&lt;/strong&gt; Single-thread chat. Driven by whether real users ask for it.&lt;/p&gt;

&lt;p&gt;These are deliberate trade-offs. Honest gaps documented up front build more trust than feature bullets that fall apart when someone actually tries them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The open-core model
&lt;/h2&gt;

&lt;p&gt;Memory Vault is and will always be MIT-licensed. The whole thing — search, MCP, graph, REST API, dashboard, local LLM chat, ingestion pipeline, the database schema, the Docker setup. You can run it on your machine. You can fork it. You can use it inside a commercial product. The free tier is genuinely useful — not a crippled demo of the paid tier.&lt;/p&gt;

&lt;p&gt;A paid PRO tier is on the roadmap for teams: dedup with importance decay, conflict resolution and supersede chains, multi-user activation, additional adapters (PDF, web pages), automated encrypted backups, and a fuller dashboard with analytics. These are operational tools that solo users on a laptop don't strictly need and that teams running shared knowledge bases really do. The split is honest by design.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this took to build
&lt;/h2&gt;

&lt;p&gt;Seven weeks of evenings and weekends across eight locked milestones, scope frozen on March 27. M1 was the announcement. M2 the core hybrid search. M3 the one-command Docker. M4 the MCP server. M5 the REST API. M6 the dashboard. M7 the knowledge graph. M8, this one, was local LLM chat plus the polish, CI/CD, security review, and release engineering that turn a build-in-public project into something other people can actually use.&lt;/p&gt;

&lt;p&gt;Two of those weeks were the kind of work nobody sees: structured JSON logging with request ID propagation, a diagnostic CLI that produces a redacted bundle for bug reports, GitHub Actions for lint and test and multi-arch Docker release, security audit (bandit, npm audit, Dependabot, CodeQL, plus a 15-test pentest pass with curl), Contributor Covenant Code of Conduct, threat model in SECURITY.md, branch protection rules, and the discipline to fix the actual root cause of a test failure instead of bypassing it. Unglamorous. Also the difference between v0.7 and v1.0.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Beyond.&lt;/strong&gt; Memory Vault is the first product in a planned compounding stack — The Brain is the next layer, building agents on top of this memory infrastructure. The memory layer is the one that has to be solid first. Today it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/MihaiBuilds/memory-vault
&lt;span class="nb"&gt;cd &lt;/span&gt;memory-vault
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;code&gt;http://localhost:8000&lt;/code&gt; and you're running.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault/releases/latest" rel="noopener noreferrer"&gt;GitHub — latest release&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault#readme" rel="noopener noreferrer"&gt;README and quick start&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MihaiBuilds/memory-vault#mcp-integration" rel="noopener noreferrer"&gt;MCP setup for Claude Desktop / Claude Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Questions and bug reports: &lt;a href="https://github.com/MihaiBuilds/memory-vault/issues" rel="noopener noreferrer"&gt;GitHub Issues&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;General discussion: &lt;a href="https://github.com/MihaiBuilds/memory-vault/discussions" rel="noopener noreferrer"&gt;GitHub Discussions&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Credits
&lt;/h2&gt;

&lt;p&gt;Three Postgres tuning tips landed during M6 and M7 that materially improved Memory Vault: &lt;a href="https://x.com/rivestack" rel="noopener noreferrer"&gt;@rivestack&lt;/a&gt; on &lt;code&gt;maintenance_work_mem&lt;/code&gt;, &lt;code&gt;ef_search&lt;/code&gt; as a runtime knob, and post-deploy cache warmup for HNSW indexes. The first ships in v1.0; I'll use the others when I get to them. Public credit, fair credit. Build-in-public works because builders with deeper expertise see what you're shipping and tell you what's wrong before production does.&lt;/p&gt;

&lt;p&gt;Beta tester Inevitable-Way-3916 ran the dashboard early, asked the architecture questions that forced the ARCHITECTURE.md doc to exist, and put bulk ingest on the list. Thanks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Twitter / X: &lt;a href="https://x.com/mihaibuilds" rel="noopener noreferrer"&gt;@mihaibuilds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Blog: &lt;a href="https://mihaibuilds.com" rel="noopener noreferrer"&gt;mihaibuilds.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/MihaiBuilds/memory-vault" rel="noopener noreferrer"&gt;github.com/MihaiBuilds/memory-vault&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>postgres</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
