Harish Kotra (he/him)

Building Agent Arena: Using Valkey as the Nervous System for Multi-Agent AI

Most AI agent demos prove intelligence. Very few prove coordination.

In this project, we built Agent Arena: Fact or Fake, a real-time multiplayer game where four autonomous agents collaborate through one shared substrate: Valkey.

This post walks through the architecture, implementation details, tradeoffs, and developer patterns you can reuse.

Problem Statement

An LLM can generate content. But production-grade multi-agent systems need more:

  • Shared state across independent workers
  • Event-driven handoffs without tight coupling
  • Long-term memory that informs future behavior
  • Observability and recovery under failure

Without these, agent systems become brittle chains of API calls.

System Overview

Agents in this app:

  • Researcher: generates factual/misleading claim candidates (Ollama)
  • Writer: rewrites claim into player-facing question (Ollama)
  • Editor: validates truth + confidence (OpenAI)
  • Game Master: orchestrates timed rounds, scoring, leaderboard

Players join over WebSocket and answer FACT or FAKE in real time.

Core Design Principle

No direct agent-to-agent calls.

Every handoff is done through Valkey:

  • State -> JSON keys
  • Orchestration -> Pub/Sub events
  • Long-term recall -> vector index (FT.CREATE / FT.SEARCH)

Architecture Diagram

Players (WS) -> FastAPI -> Valkey (JSON + Pub/Sub + Vector)
                                |      |           |
                                v      v           v
                           Researcher Writer     Editor
                                   \      |      /
                                    \     |     /
                                     -> Game Master

Implementation Breakdown

1) Shared State (Valkey JSON)

Agent outputs and game state are written into namespaced keys:

  • game:state:{room_id}
  • agent:researcher:output:{room}
  • agent:writer:draft:{room}
  • agent:editor:review:{room}
  • game:round:{room}:{round}

# backend/services/state_store.py
async def set_game_state(self, room_id: str, state: dict[str, Any]) -> None:
    await self.valkey.set_json(f'game:state:{room_id}', state)

The set_json/get_json layer falls back to plain SET/GET with serialized JSON strings when a JSON module (such as RedisJSON) is not loaded on the server, which keeps local demos robust.
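To make the fallback concrete, here is a minimal sketch of how such a layer could look. It is not the project's actual implementation: JsonStore, FakeClient, and the use_json_module flag are hypothetical names, and the stand-in client simply rejects JSON.* commands so the example runs without a server.

```python
import asyncio
import json
from typing import Any, Optional


class FakeClient:
    """Hypothetical in-memory stand-in for a server with no JSON module loaded."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}

    async def execute_command(self, *args: Any) -> Any:
        # Mimic a server that does not know JSON.SET / JSON.GET.
        raise RuntimeError(f"unknown command '{args[0]}'")

    async def set(self, key: str, value: str) -> None:
        self.data[key] = value

    async def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


class JsonStore:
    """Tries JSON.SET / JSON.GET first, then falls back to plain strings."""

    def __init__(self, client: Any) -> None:
        self.client = client
        self.use_json_module = True  # switched off after the first failure

    async def set_json(self, key: str, value: dict[str, Any]) -> None:
        payload = json.dumps(value)
        if self.use_json_module:
            try:
                await self.client.execute_command("JSON.SET", key, "$", payload)
                return
            except Exception:  # assumption: narrow this to the client's error type
                self.use_json_module = False
        await self.client.set(key, payload)

    async def get_json(self, key: str) -> Optional[dict[str, Any]]:
        if self.use_json_module:
            try:
                raw = await self.client.execute_command("JSON.GET", key)
                return json.loads(raw) if raw is not None else None
            except Exception:
                self.use_json_module = False
        raw = await self.client.get(key)
        return json.loads(raw) if raw is not None else None


async def demo() -> Optional[dict[str, Any]]:
    store = JsonStore(FakeClient())
    await store.set_json("game:state:room-1", {"round": 1, "status": "active"})
    return await store.get_json("game:state:room-1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

One caveat with this shape: once the flag flips, every later read goes through GET, so it works best when the capability is detected once at startup rather than mid-game.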

2) Event-Driven Orchestration (Valkey Pub/Sub)

Every workflow transition publishes an event envelope:

# backend/services/event_bus.py
await self.valkey.publish(channel, envelope.model_dump_json())

Each agent subscribes only to channels it cares about and reacts to events.

This enables:

  • decoupled scaling
  • independent process restarts
  • clean failure boundaries
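The handoff pattern can be sketched without a server. The example below is an illustration under assumptions, not the app's code: EventEnvelope here is a stdlib dataclass standing in for the Pydantic model, LocalBus is a hypothetical in-process replacement for Pub/Sub, and the "events:..." channel names are made up for the demo.

```python
import json
import uuid
from collections import defaultdict
from dataclasses import asdict, dataclass, field
from typing import Any, Callable


@dataclass
class EventEnvelope:
    """Stdlib stand-in for the event envelope; field names are assumptions."""

    event_type: str
    room_id: str
    payload: dict[str, Any] = field(default_factory=dict)
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "EventEnvelope":
        return cls(**json.loads(raw))


class LocalBus:
    """In-process stand-in for Pub/Sub so the sketch runs without Valkey."""

    def __init__(self) -> None:
        self.handlers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self.handlers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Like PUBLISH, report how many subscribers received the message.
        receivers = list(self.handlers.get(channel, []))
        for handler in receivers:
            handler(message)
        return len(receivers)


def run_demo() -> list[str]:
    bus = LocalBus()
    seen: list[str] = []

    def writer(raw: str) -> None:
        # The writer only knows the channel and the envelope, not the researcher.
        event = EventEnvelope.from_json(raw)
        seen.append(event.event_type)
        draft = EventEnvelope("DRAFT_READY", event.room_id, {"question": "..."})
        bus.publish("events:DRAFT_READY", draft.to_json())

    def editor(raw: str) -> None:
        seen.append(EventEnvelope.from_json(raw).event_type)

    bus.subscribe("events:RESEARCH_DONE", writer)
    bus.subscribe("events:DRAFT_READY", editor)
    first = EventEnvelope("RESEARCH_DONE", "room-1", {"claim": "..."})
    bus.publish("events:RESEARCH_DONE", first.to_json())
    return seen


if __name__ == "__main__":
    print(run_demo())
```

Neither handler imports or calls the other. Swapping LocalBus for a real Pub/Sub client changes the transport but not the agents.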

3) Long-Term Memory (Valkey Search vectors)

Questions are embedded and stored as memory documents:

# backend/services/vector_memory.py
await self.valkey.set_json(f'memory:question:{round_id}', {
  'question': question,
  'topic': topic,
  'difficulty': difficulty,
  'player_accuracy': player_accuracy,
  'embedding': emb,
})

Vector search retrieves similar prior questions to reduce repetition and improve topic progression.
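In the app the nearest-neighbor lookup runs server-side through FT.SEARCH. To show what that lookup does, here is a local sketch of the same idea in plain Python. The helper names (cosine_similarity, most_similar, is_repeat) and the 0.9 threshold are assumptions for illustration, not values from the project.

```python
import math
from typing import Any


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: list[float], memories: list[dict[str, Any]], k: int = 3
) -> list[tuple[float, dict[str, Any]]]:
    """Return the k stored questions closest to the query embedding."""
    scored = [(cosine_similarity(query, m["embedding"]), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def is_repeat(
    query: list[float], memories: list[dict[str, Any]], threshold: float = 0.9
) -> bool:
    """Flag a candidate question that is too close to something already asked."""
    top = most_similar(query, memories, k=1)
    return bool(top) and top[0][0] >= threshold


if __name__ == "__main__":
    memories = [
        {"question": "Is the Great Wall visible from space?", "embedding": [1.0, 0.0]},
        {"question": "Do goldfish have a 3-second memory?", "embedding": [0.0, 1.0]},
    ]
    print(is_repeat([1.0, 0.0], memories))
    print(is_repeat([1.0, 1.0], memories))
```

A check like this could run before a candidate claim is handed to the Writer, so near-duplicates are dropped early.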

Round Lifecycle

The event chain per round:

  1. GAME_START / START_ROUND -> Researcher emits RESEARCH_DONE
  2. Writer reacts -> emits DRAFT_READY
  3. Editor reacts -> emits VALIDATION_COMPLETE
  4. Game Master launches round -> emits NEW_QUESTION
  5. Players answer via WebSocket
  6. Game Master emits ROUND_RESULT + LEADERBOARD_UPDATE + ROUND_COMPLETE
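The first four steps of the chain can be written down as a small transition table, which is a handy way to review or test the flow. This is only a sketch: the PIPELINE dict and trace helper are hypothetical, and the agent names are taken from the list above.

```python
# Maps an incoming event to (agent that reacts, event that agent emits).
PIPELINE: dict[str, tuple[str, str]] = {
    "GAME_START": ("researcher", "RESEARCH_DONE"),
    "START_ROUND": ("researcher", "RESEARCH_DONE"),
    "RESEARCH_DONE": ("writer", "DRAFT_READY"),
    "DRAFT_READY": ("editor", "VALIDATION_COMPLETE"),
    "VALIDATION_COMPLETE": ("game_master", "NEW_QUESTION"),
}


def trace(start_event: str) -> list[tuple[str, str, str]]:
    """Follow the chain from a starting event until no agent reacts."""
    steps: list[tuple[str, str, str]] = []
    event = start_event
    while event in PIPELINE:
        agent, emitted = PIPELINE[event]
        steps.append((agent, event, emitted))
        event = emitted
    return steps


if __name__ == "__main__":
    for agent, received, emitted in trace("START_ROUND"):
        print(f"{agent}: {received} -> {emitted}")
```

The player-facing steps (answers over WebSocket, then ROUND_RESULT, LEADERBOARD_UPDATE, and ROUND_COMPLETE) are timer-driven rather than purely event-driven, so they are left out of the table.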

We also added cycle_id propagation to guard against stale or duplicate downstream processing.
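Here is one way such a guard could work, as a sketch rather than the project's code. It assumes cycle_id is an integer counter per room and keeps the guard state in memory. CycleGuard and its method names are hypothetical.

```python
class CycleGuard:
    """Rejects events from an old cycle and events already handled in this cycle."""

    def __init__(self) -> None:
        self.current_cycle: dict[str, int] = {}
        self.seen: set[tuple[str, int, str]] = set()

    def start_cycle(self, room_id: str) -> int:
        """Begin a new round cycle for the room and return its id."""
        cycle_id = self.current_cycle.get(room_id, 0) + 1
        self.current_cycle[room_id] = cycle_id
        return cycle_id

    def should_process(self, room_id: str, cycle_id: int, event_type: str) -> bool:
        if cycle_id != self.current_cycle.get(room_id):
            return False  # stale: belongs to a cycle that is no longer active
        key = (room_id, cycle_id, event_type)
        if key in self.seen:
            return False  # duplicate: this event was already handled
        self.seen.add(key)
        return True


if __name__ == "__main__":
    guard = CycleGuard()
    first = guard.start_cycle("room-1")
    print(guard.should_process("room-1", first, "DRAFT_READY"))
    print(guard.should_process("room-1", first, "DRAFT_READY"))
    second = guard.start_cycle("room-1")
    print(guard.should_process("room-1", first, "VALIDATION_COMPLETE"))
```

With several agent processes, the same check would need shared storage (for example a key in Valkey) instead of a local set.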

Reliability Improvements We Added

  • Preserved player score on reconnect (HSETNX)
  • Reset scores on a new game start in the same room
  • Event handler safety: a per-event exception doesn’t kill the whole agent loop
  • WebSocket payload validation (invalid_json, invalid_round_id, round_not_active)
  • Health endpoint checks Valkey reachability + JSON/Search capability
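The handler-safety item is the easiest of these to illustrate. The sketch below assumes events arrive as dicts from an async iterator. run_agent_loop and the demo names are hypothetical, not functions from the repository.

```python
import asyncio
import logging
from typing import Any, AsyncIterator, Awaitable, Callable

logger = logging.getLogger("agent")

Event = dict[str, Any]


async def run_agent_loop(
    events: AsyncIterator[Event],
    handler: Callable[[Event], Awaitable[None]],
) -> int:
    """Process events one by one; a failing handler is logged, not fatal."""
    handled = 0
    async for event in events:
        try:
            await handler(event)
            handled += 1
        except Exception:
            # Task cancellation is not an Exception subclass, so shutdown still works.
            logger.exception("handler failed for %s", event.get("event_type"))
    return handled


async def demo() -> tuple[int, list[str]]:
    processed: list[str] = []

    async def source() -> AsyncIterator[Event]:
        for name in ("RESEARCH_DONE", "BROKEN", "DRAFT_READY"):
            yield {"event_type": name}

    async def handler(event: Event) -> None:
        if event["event_type"] == "BROKEN":
            raise ValueError("simulated handler bug")
        processed.append(event["event_type"])

    handled = await run_agent_loop(source(), handler)
    return handled, processed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The event after the failing one is still processed, which is the property the bullet describes.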

Operational Checks

Use this before demos:

curl -s http://127.0.0.1:8000/health | jq

The response reports three fields:

  • reachable
  • json_module
  • search_module
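As a rough idea of how those three fields could be derived, the sketch below builds the payload from an already-parsed MODULE LIST reply. Everything here is an assumption: build_health is a hypothetical helper, and the module names in the two sets may differ on your server, so check the actual MODULE LIST output.

```python
from typing import Any

# Assumed module names; verify against MODULE LIST on your own server.
JSON_MODULE_NAMES = {"json", "rejson"}
SEARCH_MODULE_NAMES = {"search"}


def build_health(reachable: bool, modules: list[dict[str, Any]]) -> dict[str, bool]:
    """Summarize connectivity and module capability for a health response."""
    names = {str(module.get("name", "")).lower() for module in modules}
    return {
        "reachable": reachable,
        "json_module": bool(names & JSON_MODULE_NAMES),
        "search_module": bool(names & SEARCH_MODULE_NAMES),
    }


if __name__ == "__main__":
    print(build_health(True, [{"name": "json"}, {"name": "search"}]))
    print(build_health(False, []))
```

If json_module comes back false, the app can still run on the string fallback described earlier, but vector memory needs the search module.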

Environment Management (Varlock-first)

The app loads settings from process environment variables, which makes it a good fit for Varlock-managed secrets/config.

Example runtime:

varlock run -- uvicorn main:app --reload --port 8000
varlock run -- python scripts/run_agents.py --agents researcher writer editor game_master

Test Strategy

A minimal integration test is included:

pytest -q tests/test_integration_round.py

It validates: start game -> first round -> leaderboard update.

Why This Pattern Matters

Compared with direct API chaining between agents, this design gives:

  • Better fault isolation
  • Better observability
  • Easier horizontal scaling
  • Simpler mental model for distributed workflows

What to Improve Next

  • Move orchestration from Pub/Sub to Valkey Streams (durable delivery)
  • Add event idempotency store + dead-letter handling
  • Add OpenTelemetry traces for event lifecycle
  • Add CI pipeline for contract tests + reliability tests

LLMs provide reasoning, but coordination makes systems reliable.

If you’re building multi-agent workflows, treat Valkey as your shared cognition fabric, not just a cache.

GitHub: https://github.com/harishkotra/neuroloop
