AI agents need more than model calls. They need memory, coordination, and deterministic state transitions.
In this project, we built NeuroValkey Agents: a 3-agent Node.js swarm where Valkey is not a cache, but the central runtime substrate for orchestration.
- Agent 1 (Researcher) generates facts and stores vector memory.
- Agent 2 (Writer) retrieves semantic context with KNN search and drafts summary text.
- Agent 3 (Editor) grades the draft and writes final output.
Everything is connected through Valkey primitives: Pub/Sub, Search (vector), JSON, Hashes.
## Why this architecture works
Most agent prototypes couple control flow tightly to app memory. That makes state invisible and difficult to debug live.
This design externalizes swarm state into Valkey:
- Pub/Sub channels make orchestration explicit.
- JSON keys hold durable run snapshots.
- Hash keys hold vectorized facts.
- Search index enables semantic retrieval with minimal dependencies.
The result is a fast, event-driven, inspectable system.
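A minimal sketch of what that Pub/Sub handoff can look like with an ioredis-compatible client such as iovalkey. The channel names and message envelope here are illustrative assumptions, not the project's actual protocol:

```javascript
// Illustrative channel names and event envelope (assumptions, not the
// project's actual protocol).
const CHANNELS = {
  researchDone: 'swarm:events:research-done',
  draftReady: 'swarm:events:draft-ready',
  finalReady: 'swarm:events:final-ready',
};

// Build the JSON envelope an agent publishes when it finishes a step.
function buildEvent(agent, runId, payload) {
  return JSON.stringify({ agent, runId, payload, ts: Date.now() });
}

// With a connected client, the handoff is a publish on one side...
//   await pub.publish(CHANNELS.researchDone,
//     buildEvent('researcher', runId, { factCount: 12 }));
// ...and a subscription on the other:
//   await sub.subscribe(CHANNELS.researchDone);
//   sub.on('message', (channel, message) => {
//     const event = JSON.parse(message);
//     // kick off the Writer with event.runId
//   });
```

Because every handoff is a message on a named channel, any observer can subscribe and watch the orchestration live.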
## System architecture

The agents never communicate directly; every handoff flows through Valkey: Pub/Sub for events, Hashes for vectorized facts, JSON for run snapshots, and the Search index for retrieval.
## Core implementation walkthrough
### 1) Valkey Search index creation via raw commands

To keep compatibility with module variants, we use `valkey.call()` directly:
```javascript
await this.commandClient.call(
  'FT.CREATE',
  this.indexName,
  'ON', 'HASH',
  'PREFIX', '1', 'fact:',               // index every key under the fact: prefix
  'SCHEMA',
  'topic', 'TAG',
  'agent', 'TAG',
  'embedding', 'VECTOR', 'FLAT', '6',   // 6 = count of algorithm args that follow
  'TYPE', 'FLOAT32',
  'DIM', String(CONFIG.embeddingDim),
  'DISTANCE_METRIC', 'COSINE'
);
```
### 2) Writing vector memory into hash records

Each fact is embedded using OpenAI and written to `fact:<uuid>`:
```javascript
await this.commandClient.hset(
  key,
  'text', text,
  'topic', normalizedTopic,
  'agent', 'researcher',
  'embedding', toFloat32Buffer(embedding)  // raw FLOAT32 bytes for the vector field
);
```
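The `toFloat32Buffer` helper is not shown above; a plausible implementation is a one-liner, since Valkey Search expects the vector as a raw FLOAT32 byte blob (this version assumes a little-endian platform, which is what x86 and ARM use):

```javascript
// Convert a number[] embedding into the raw FLOAT32 bytes Valkey Search
// expects for a vector field. Float32Array uses native endianness, which
// is little-endian on common platforms (x86, ARM).
function toFloat32Buffer(embedding) {
  return Buffer.from(new Float32Array(embedding).buffer);
}
```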
### 3) Writer semantic retrieval with KNN
```javascript
const response = await this.commandClient.call(
  'FT.SEARCH',
  this.indexName,
  `*=>[KNN ${k} @embedding $vec AS score]`,  // top-k over the whole index
  'PARAMS', '2', 'vec', vector,              // vector is a FLOAT32 Buffer
  'NOCONTENT',                               // return only the matching keys
  'DIALECT', '2'                             // KNN syntax requires dialect 2
);
```
Then we hydrate the matched keys with `HMGET` to get `text`/`topic`/`agent`.
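With `NOCONTENT`, the RESP2 reply is a flat array: the total match count followed by the matching key names. A hedged sketch of the hydration step (the client and field names follow the snippets above; the helper names are ours):

```javascript
// Pull key names out of a NOCONTENT FT.SEARCH reply: [total, key1, key2, ...].
function extractKeys(searchReply) {
  return searchReply.slice(1);
}

// Hydrate each matched key with HMGET (field names follow the HSET above).
async function hydrate(client, keys) {
  return Promise.all(
    keys.map(async (key) => {
      const [text, topic, agent] = await client.hmget(key, 'text', 'topic', 'agent');
      return { key, text, topic, agent };
    })
  );
}
```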
### 4) Global state tracking with JSON

Manifest state changes are persisted as JSON in Valkey rather than held in process memory:
```javascript
await this.commandClient.call('JSON.SET', key, '$', JSON.stringify(value));
const raw = await this.commandClient.call('JSON.GET', key, '$');
```
This enables reproducibility and UI introspection.
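One detail worth noting when reading the snapshot back: with the `$` path, `JSON.GET` returns a JSON-encoded array of matches, so the manifest itself is element zero after parsing. A small sketch (the helper name is ours):

```javascript
// Unwrap a JSON.GET reply made with the '$' path: the raw reply is a JSON
// string encoding an array of matched values (or null if the key is absent).
function parseJsonGetReply(raw) {
  if (raw === null) return null;
  const matches = JSON.parse(raw);
  return matches[0] ?? null;
}
```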
## The dashboard: making Valkey visible
Terminal output is useful but not audience-friendly for demos. The UI layer solves that by showing:
- live keyspace map (`swarm:*`, `fact:*`) and key types
- raw JSON for manifest/draft/final
- vector fact records with `embeddingBytes`
- Search index telemetry from `FT.INFO`
- event timeline + process feed
Developers can correlate each event with the exact state written to Valkey.
## Engineering decisions and tradeoffs

**Polling over sockets for simplicity**

- Chosen: periodic `/api/state` + `/api/logs` polling
- Tradeoff: slightly higher request volume
- Benefit: zero extra infra, easy local demo reliability

**Hash + JSON hybrid model**

- Chosen: vectors in Hashes, workflow state in JSON
- Tradeoff: two key representations
- Benefit: better fit for each data access pattern

**Raw Search commands vs an abstraction library**

- Chosen: raw `FT.CREATE`/`FT.SEARCH`
- Tradeoff: lower-level API surface
- Benefit: explicit control and compatibility visibility
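The polling choice above can be sketched in a few lines (the `/api/state` path follows the article; the interval and helper name are assumptions):

```javascript
// Hypothetical polling loop for the dashboard's state endpoint.
async function pollState(baseUrl, onState, intervalMs = 1000) {
  const tick = async () => {
    const res = await fetch(`${baseUrl}/api/state`); // Node 18+ global fetch
    onState(await res.json());
  };
  await tick();                          // fetch once immediately
  return setInterval(tick, intervalMs);  // caller stops with clearInterval
}
```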
## Running the project
```shell
docker compose up -d
npm install
cp .env.example .env
# set OPENAI_API_KEY
npm run ui
```
Open http://localhost:3055, launch a run, and watch the keyspace evolve.
## Where to take this next
- Convert polling to SSE/WebSockets for lower latency updates
- Add run history table with previous manifests
- Add configurable agent graph (DAG) in manifest
- Add retries, backoff, and dead-letter channels
- Add benchmark mode for throughput and latency stats
- Add trace IDs and distributed telemetry
The important lesson is architectural, not cosmetic: LLMs are reasoning engines, but Valkey is the operational substrate that turns them into coordinated systems.
If you can observe your keyspace evolving in real time, you can trust, debug, and scale your swarm.