DEV Community

Harish Kotra (he/him)

Building a Multi-Agent AI Swarm with Valkey as the Nervous System

AI agents need more than model calls. They need memory, coordination, and deterministic state transitions.

In this project, we built NeuroValkey Agents: a 3-agent Node.js swarm where Valkey is not a cache, but the central runtime substrate for orchestration.

  • Agent 1 (Researcher) generates facts and stores vector memory.
  • Agent 2 (Writer) retrieves semantic context with KNN search and drafts summary text.
  • Agent 3 (Editor) grades the draft and writes final output.

Everything is connected through Valkey primitives: Pub/Sub, Search (vector), JSON, Hashes.
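The Pub/Sub side can be pictured as a tiny message envelope passed between agents. A minimal sketch — the channel name, stage names, and field layout here are illustrative assumptions, not the repo's actual schema:

```javascript
// Hypothetical handoff envelope for the swarm's Pub/Sub channel.
// CHANNEL and all field names are assumptions for illustration.
const CHANNEL = 'swarm:events';

// Each agent publishes this when its stage completes, e.g.
// publisher.publish(CHANNEL, makeHandoff('research_done', runId)).
function makeHandoff(stage, runId, payload = {}) {
  return JSON.stringify({ stage, runId, ts: Date.now(), ...payload });
}

// Subscriber side: decode the message and route it to whichever agent
// is registered for that stage.
function dispatch(rawMessage, handlers) {
  const msg = JSON.parse(rawMessage);
  if (handlers[msg.stage]) handlers[msg.stage](msg);
  return msg;
}
```

Because the envelope is plain JSON on a named channel, any observer (including the dashboard) can subscribe and watch the orchestration happen.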


Why this architecture works

Most agent prototypes couple control flow tightly to app memory. That makes state invisible and difficult to debug live.

This design externalizes swarm state into Valkey:

  • Pub/Sub channels make orchestration explicit.
  • JSON keys hold durable run snapshots.
  • Hash keys hold vectorized facts.
  • Search index enables semantic retrieval with minimal dependencies.

The result is a fast, event-driven, inspectable system.


System architecture



Core implementation walkthrough

1) Valkey Search index creation via raw commands

To stay compatible across module variants, we issue raw commands through the client's generic call() method:

await this.commandClient.call(
  'FT.CREATE',
  this.indexName,
  'ON', 'HASH',
  'PREFIX', '1', 'fact:',
  'SCHEMA',
  'topic', 'TAG',
  'agent', 'TAG',
  'embedding', 'VECTOR', 'FLAT', '6',
  'TYPE', 'FLOAT32',
  'DIM', String(CONFIG.embeddingDim),
  'DISTANCE_METRIC', 'COSINE'
);
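One practical wrinkle: FT.CREATE is not idempotent, so re-running it against an existing index raises an error. A guard like the following keeps startup safe to repeat — ensureIndex and the error predicate are a sketch on our part, not code from the repo:

```javascript
// FT.CREATE errors if the index already exists; treat that one error
// as benign so startup can run repeatedly.
function isIndexExistsError(err) {
  return /index already exists/i.test(err?.message ?? '');
}

// createFn performs the raw FT.CREATE call shown above.
async function ensureIndex(createFn) {
  try {
    await createFn();
    return 'created';
  } catch (err) {
    if (isIndexExistsError(err)) return 'exists'; // safe to continue
    throw err; // any other failure is real
  }
}
```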

2) Writing vector memory into hash records

Each fact is embedded using OpenAI and written to fact:<uuid>:

await this.commandClient.hset(
  key,
  'text', text,
  'topic', normalizedTopic,
  'agent', 'researcher',
  'embedding', toFloat32Buffer(embedding)
);
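The toFloat32Buffer helper referenced above packs the embedding into the raw byte layout a VECTOR field of TYPE FLOAT32 expects. A minimal version might look like this:

```javascript
// Packs a JS number array into raw FLOAT32 bytes for the hash field.
// Float32Array uses platform byte order, which is little-endian on all
// mainstream platforms — the layout the Search module reads back.
function toFloat32Buffer(values) {
  return Buffer.from(new Float32Array(values).buffer);
}
```

The buffer's length must be exactly 4 × DIM bytes, or the index will reject the record.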

3) Writer semantic retrieval with KNN

const response = await this.commandClient.call(
  'FT.SEARCH',
  this.indexName,
  `*=>[KNN ${k} @embedding $vec AS score]`,
  'PARAMS', '2', 'vec', vector,
  'NOCONTENT',
  'DIALECT', '2'
);

Then we hydrate matched keys with HMGET to get text/topic/agent.
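With NOCONTENT, the raw reply is a flat array: the total count followed by the matched key names. A small parse-and-hydrate sketch (the helper names are our assumptions, and `client` is assumed to expose hmget like the commandClient above):

```javascript
// FT.SEARCH ... NOCONTENT replies with [total, key1, key2, ...].
function parseNoContentReply(reply) {
  const [total, ...keys] = reply;
  return { total, keys };
}

// Hydration step: fetch the stored fields for each matched fact key.
async function hydrate(client, keys) {
  return Promise.all(
    keys.map(async (key) => {
      const [text, topic, agent] = await client.hmget(key, 'text', 'topic', 'agent');
      return { key, text, topic, agent };
    })
  );
}
```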

4) Global state tracking with JSON

Manifest state changes are persisted as JSON, not process memory:

await this.commandClient.call('JSON.SET', key, '$', JSON.stringify(value));
const raw = await this.commandClient.call('JSON.GET', key, '$');

This enables reproducibility and UI introspection.
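One detail worth noting on the read path: JSON.GET with the '$' path returns a JSON-encoded array of matches (JSONPath can match multiple nodes), so a single document comes back wrapped. A small decode helper — the name is ours, not the repo's:

```javascript
// JSON.GET key '$' yields a string like '[{"status":"running"}]'.
// Unwrap the single-element array to get the document itself.
function decodeJsonPathReply(raw) {
  if (raw == null) return null; // key does not exist
  const matches = JSON.parse(raw);
  return matches[0] ?? null;
}
```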


The dashboard: making Valkey visible

Terminal output is useful but not audience-friendly for demos. The UI layer solves that by showing:

  • live keyspace map (swarm:*, fact:*) and key types
  • raw JSON for manifest/draft/final
  • vector fact records with embeddingBytes
  • Search index telemetry from FT.INFO
  • event timeline + process feed

Developers can correlate each event with exact state written to Valkey.


Engineering decisions and tradeoffs

  1. Polling over sockets for simplicity
     • Chosen: periodic /api/state + /api/logs polling
     • Tradeoff: slightly higher request volume
     • Benefit: zero extra infra, easy local demo reliability

  2. Hash + JSON hybrid model
     • Chosen: vectors in Hashes, workflow state in JSON
     • Tradeoff: two key representations
     • Benefit: better fit for each data access pattern

  3. Raw Search commands vs. an abstraction library
     • Chosen: raw FT.CREATE / FT.SEARCH
     • Tradeoff: lower-level API surface
     • Benefit: explicit control and compatibility visibility
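Decision 1 can be sketched as a small client-side loop. fetchJson is injected so the sketch stays self-contained; the real dashboard's transport details are an assumption on our part:

```javascript
// Minimal polling loop: pull state on an interval instead of holding a
// socket open. Returns a stop() function to cancel the loop.
function startPolling(fetchJson, onState, intervalMs = 1000) {
  let stopped = false;
  async function tick() {
    if (stopped) return;
    try {
      onState(await fetchJson('/api/state')); // assumed endpoint
    } catch (_) {
      // transient failures are fine; the next tick simply retries
    }
    if (!stopped) setTimeout(tick, intervalMs);
  }
  tick();
  return () => { stopped = true; };
}
```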


Running the project

docker compose up -d
npm install
cp .env.example .env
# set OPENAI_API_KEY
npm run ui

Open http://localhost:3055, launch a run, and watch the keyspace evolve.


Where to take this next

  • Convert polling to SSE/WebSockets for lower latency updates
  • Add run history table with previous manifests
  • Add configurable agent graph (DAG) in manifest
  • Add retries, backoff, and dead-letter channels
  • Add benchmark mode for throughput and latency stats
  • Add trace IDs and distributed telemetry

The important lesson is architectural, not cosmetic: LLMs are reasoning engines, but Valkey is the operational substrate that turns them into coordinated systems.

If you can observe your keyspace evolving in real time, you can trust, debug, and scale your swarm.


GitHub: https://github.com/harishkotra/neurovalkey-agents
