DEV Community

Harish Kotra (he/him)

Building a Multi-Agent AI Swarm with Valkey as the Nervous System

AI agents need more than model calls. They need memory, coordination, and deterministic state transitions.

In this project, we built NeuroValkey Agents: a 3-agent Node.js swarm where Valkey is not a cache, but the central runtime substrate for orchestration.

  • Agent 1 (Researcher) generates facts and stores vector memory.
  • Agent 2 (Writer) retrieves semantic context with KNN search and drafts summary text.
  • Agent 3 (Editor) grades the draft and writes final output.

Everything is connected through Valkey primitives: Pub/Sub, Search (vector), JSON, Hashes.
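The Pub/Sub side can be pictured as a tiny message envelope passed between agents. A minimal sketch — the channel name, stage names, and field layout here are illustrative assumptions, not the repo's actual schema:

```javascript
// Hypothetical handoff envelope for the swarm's Pub/Sub channel.
// CHANNEL and all field names are assumptions for illustration.
const CHANNEL = 'swarm:events';

// Each agent publishes this when its stage completes, e.g.
// publisher.publish(CHANNEL, makeHandoff('research_done', runId)).
function makeHandoff(stage, runId, payload = {}) {
  return JSON.stringify({ stage, runId, ts: Date.now(), ...payload });
}

// Subscriber side: decode the message and route it to whichever agent
// is registered for that stage.
function dispatch(rawMessage, handlers) {
  const msg = JSON.parse(rawMessage);
  if (handlers[msg.stage]) handlers[msg.stage](msg);
  return msg;
}
```

Because the envelope is plain JSON on a named channel, any observer (including the dashboard) can subscribe and watch the orchestration happen.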


Why this architecture works

Most agent prototypes couple control flow tightly to app memory. That makes state invisible and difficult to debug live.

This design externalizes swarm state into Valkey:

  • Pub/Sub channels make orchestration explicit.
  • JSON keys hold durable run snapshots.
  • Hash keys hold vectorized facts.
  • Search index enables semantic retrieval with minimal dependencies.

The result is a fast, event-driven, inspectable system.


System architecture



Core implementation walkthrough

1) Valkey Search index creation via raw commands

To stay compatible across module variants, we issue raw commands through the client's generic call() method:

await this.commandClient.call(
  'FT.CREATE',
  this.indexName,
  'ON', 'HASH',
  'PREFIX', '1', 'fact:',
  'SCHEMA',
  'topic', 'TAG',
  'agent', 'TAG',
  'embedding', 'VECTOR', 'FLAT', '6',
  'TYPE', 'FLOAT32',
  'DIM', String(CONFIG.embeddingDim),
  'DISTANCE_METRIC', 'COSINE'
);
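One practical wrinkle: FT.CREATE is not idempotent, so re-running it against an existing index raises an error. A guard like the following keeps startup safe to repeat — ensureIndex and the error predicate are a sketch on our part, not code from the repo:

```javascript
// FT.CREATE errors if the index already exists; treat that one error
// as benign so startup can run repeatedly.
function isIndexExistsError(err) {
  return /index already exists/i.test(err?.message ?? '');
}

// createFn performs the raw FT.CREATE call shown above.
async function ensureIndex(createFn) {
  try {
    await createFn();
    return 'created';
  } catch (err) {
    if (isIndexExistsError(err)) return 'exists'; // safe to continue
    throw err; // any other failure is real
  }
}
```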

2) Writing vector memory into hash records

Each fact is embedded using OpenAI and written to fact:<uuid>:

await this.commandClient.hset(
  key,
  'text', text,
  'topic', normalizedTopic,
  'agent', 'researcher',
  'embedding', toFloat32Buffer(embedding)
);
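The toFloat32Buffer helper referenced above packs the embedding into the raw byte layout a VECTOR field of TYPE FLOAT32 expects. A minimal version might look like this:

```javascript
// Packs a JS number array into raw FLOAT32 bytes for the hash field.
// Float32Array uses platform byte order, which is little-endian on all
// mainstream platforms — the layout the Search module reads back.
function toFloat32Buffer(values) {
  return Buffer.from(new Float32Array(values).buffer);
}
```

The buffer's length must be exactly 4 × DIM bytes, or the index will reject the record.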

3) Writer semantic retrieval with KNN

const response = await this.commandClient.call(
  'FT.SEARCH',
  this.indexName,
  `*=>[KNN ${k} @embedding $vec AS score]`,
  'PARAMS', '2', 'vec', vector,
  'NOCONTENT',
  'DIALECT', '2'
);

Then we hydrate matched keys with HMGET to get text/topic/agent.
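With NOCONTENT, the raw reply is a flat array: the total count followed by the matched key names. A small parse-and-hydrate sketch (the helper names are our assumptions, and `client` is assumed to expose hmget like the commandClient above):

```javascript
// FT.SEARCH ... NOCONTENT replies with [total, key1, key2, ...].
function parseNoContentReply(reply) {
  const [total, ...keys] = reply;
  return { total, keys };
}

// Hydration step: fetch the stored fields for each matched fact key.
async function hydrate(client, keys) {
  return Promise.all(
    keys.map(async (key) => {
      const [text, topic, agent] = await client.hmget(key, 'text', 'topic', 'agent');
      return { key, text, topic, agent };
    })
  );
}
```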

4) Global state tracking with JSON

Manifest state changes are persisted as JSON, not process memory:

await this.commandClient.call('JSON.SET', key, '$', JSON.stringify(value));
const raw = await this.commandClient.call('JSON.GET', key, '$');

This enables reproducibility and UI introspection.
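One detail worth noting on the read path: JSON.GET with the '$' path returns a JSON-encoded array of matches (JSONPath can match multiple nodes), so a single document comes back wrapped. A small decode helper — the name is ours, not the repo's:

```javascript
// JSON.GET key '$' yields a string like '[{"status":"running"}]'.
// Unwrap the single-element array to get the document itself.
function decodeJsonPathReply(raw) {
  if (raw == null) return null; // key does not exist
  const matches = JSON.parse(raw);
  return matches[0] ?? null;
}
```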


The dashboard: making Valkey visible

Terminal output is useful but not audience-friendly for demos. The UI layer solves that by showing:

  • live keyspace map (swarm:*, fact:*) and key types
  • raw JSON for manifest/draft/final
  • vector fact records with embeddingBytes
  • Search index telemetry from FT.INFO
  • event timeline + process feed

Developers can correlate each event with exact state written to Valkey.


Engineering decisions and tradeoffs

  1. Polling over sockets for simplicity
     • Chosen: periodic /api/state + /api/logs polling
     • Tradeoff: slightly higher request volume
     • Benefit: zero extra infra, easy local demo reliability

  2. Hash + JSON hybrid model
     • Chosen: vectors in Hashes, workflow state in JSON
     • Tradeoff: two key representations
     • Benefit: better fit for each data access pattern

  3. Raw Search commands vs. an abstraction library
     • Chosen: raw FT.CREATE / FT.SEARCH
     • Tradeoff: lower-level API surface
     • Benefit: explicit control and compatibility visibility
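Decision 1 can be sketched as a small client-side loop. fetchJson is injected so the sketch stays self-contained; the real dashboard's transport details are an assumption on our part:

```javascript
// Minimal polling loop: pull state on an interval instead of holding a
// socket open. Returns a stop() function to cancel the loop.
function startPolling(fetchJson, onState, intervalMs = 1000) {
  let stopped = false;
  async function tick() {
    if (stopped) return;
    try {
      onState(await fetchJson('/api/state')); // assumed endpoint
    } catch (_) {
      // transient failures are fine; the next tick simply retries
    }
    if (!stopped) setTimeout(tick, intervalMs);
  }
  tick();
  return () => { stopped = true; };
}
```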


Running the project

docker compose up -d
npm install
cp .env.example .env
# set OPENAI_API_KEY
npm run ui

Open http://localhost:3055, launch a run, and watch the keyspace evolve.


Where to take this next

  • Convert polling to SSE/WebSockets for lower latency updates
  • Add run history table with previous manifests
  • Add configurable agent graph (DAG) in manifest
  • Add retries, backoff, and dead-letter channels
  • Add benchmark mode for throughput and latency stats
  • Add trace IDs and distributed telemetry

The important lesson is architectural, not cosmetic: LLMs are reasoning engines, but Valkey is the operational substrate that turns them into coordinated systems.

If you can observe your keyspace evolving in real time, you can trust, debug, and scale your swarm.


GitHub: https://github.com/harishkotra/neurovalkey-agents
