Most AI agent demos prove intelligence. Very few prove coordination.
In this project, we built Agent Arena: Fact or Fake, a real-time multiplayer game where four autonomous agents collaborate through one shared substrate: Valkey.
This post walks through the architecture, implementation details, tradeoffs, and developer patterns you can reuse.
Problem Statement
An LLM can generate content. But production-grade multi-agent systems need more:
- Shared state across independent workers
- Event-driven handoffs without tight coupling
- Long-term memory that informs future behavior
- Observability and recovery under failure
Without these, agent systems become brittle chains of API calls.
System Overview
Agents in this app:
- Researcher: generates factual/misleading claim candidates (Ollama)
- Writer: rewrites claim into player-facing question (Ollama)
- Editor: validates truth + confidence (OpenAI)
- Game Master: orchestrates timed rounds, scoring, leaderboard
Players join over WebSocket and answer FACT or FAKE in real time.
Core Design Principle
No direct agent-to-agent calls.
Every handoff is done through Valkey:
- State -> JSON keys
- Orchestration -> Pub/Sub events
- Long-term recall -> vector index (FT.CREATE / FT.SEARCH)
Architecture Diagram
```
Players (WS) -> FastAPI -> Valkey (JSON + Pub/Sub + Vector)
                              |        |        |
                              v        v        v
                         Researcher  Writer   Editor
                              \        |        /
                               \       |       /
                                -> Game Master
```
Implementation Breakdown
1) Shared State (Valkey JSON)
Agent outputs and game state are written into namespaced keys:
- game:state:{room_id}
- agent:researcher:output:{room}
- agent:writer:draft:{room}
- agent:editor:review:{room}
- game:round:{room}:{round}
```python
# backend/services/state_store.py
async def set_game_state(self, room_id: str, state: dict[str, Any]) -> None:
    await self.valkey.set_json(f'game:state:{room_id}', state)
```
The set_json/get_json layer falls back to plain SET/GET with serialized JSON strings when the RedisJSON module is unavailable, which keeps local demos robust.
2) Event-Driven Orchestration (Valkey Pub/Sub)
Every workflow transition publishes an event envelope:
```python
# backend/services/event_bus.py
await self.valkey.publish(channel, envelope.model_dump_json())
```
Each agent subscribes only to channels it cares about and reacts to events.
This enables:
- decoupled scaling
- independent process restarts
- clean failure boundaries
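A minimal sketch of the subscribe-and-dispatch side of this pattern. The envelope field names here are assumptions (the real project serializes a Pydantic model via model_dump_json), and the real agents run this inside an async subscribe loop:

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class EventEnvelope:
    # Assumed envelope shape; the real project uses a Pydantic model.
    event_type: str
    room_id: str
    cycle_id: str
    payload: dict[str, Any] = field(default_factory=dict)


def dispatch(raw: str, handlers: dict[str, Callable[[EventEnvelope], Any]]):
    """Decode one Pub/Sub message and route it to the matching handler.
    Agents register handlers only for the event types they care about."""
    envelope = EventEnvelope(**json.loads(raw))
    handler = handlers.get(envelope.event_type)
    return handler(envelope) if handler else None


# Example: a Writer agent that reacts only to RESEARCH_DONE.
handlers = {"RESEARCH_DONE": lambda env: f"drafting for room {env.room_id}"}
message = json.dumps({"event_type": "RESEARCH_DONE", "room_id": "room1",
                      "cycle_id": "c1", "payload": {"claim": "..."}})
```

Unknown event types are simply ignored, which is what lets each agent subscribe narrowly without knowing about the others.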
3) Long-Term Memory (ValkeySearch vectors)
Questions are embedded and stored as memory documents:
```python
# backend/services/vector_memory.py
await self.valkey.set_json(f'memory:question:{round_id}', {
    'question': question,
    'topic': topic,
    'difficulty': difficulty,
    'player_accuracy': player_accuracy,
    'embedding': emb,
})
```
Vector search retrieves similar prior questions to reduce repetition and improve topic progression.
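The similarity lookup boils down to an FT.SEARCH KNN query. Here is a sketch of building that command; the index name and the embedding field name are assumptions, and a FLOAT32 vector field is assumed in the index schema:

```python
import struct


def knn_query_args(index: str, vector: list[float], k: int = 5) -> list:
    """Build the raw FT.SEARCH argument list for a KNN lookup against a
    FLOAT32 vector field (assumed here to be named 'embedding')."""
    # Pack the query vector as a little-endian FLOAT32 blob.
    blob = struct.pack(f"{len(vector)}f", *vector)
    return [
        "FT.SEARCH", index,
        f"*=>[KNN {k} @embedding $vec AS score]",
        "PARAMS", "2", "vec", blob,
        "SORTBY", "score",
        "DIALECT", "2",
    ]


args = knn_query_args("idx:question_memory", [0.12, 0.34, 0.56], k=3)
# In the app this would be sent via client.execute_command(*args).
```

Sorting by the KNN distance alias and using query dialect 2 are the standard incantations for vector queries against a RediSearch-compatible index.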
Round Lifecycle
The event chain per round:
- GAME_START / START_ROUND -> Researcher emits RESEARCH_DONE
- Writer reacts -> emits DRAFT_READY
- Editor reacts -> emits VALIDATION_COMPLETE
- Game Master launches the round -> emits NEW_QUESTION
- Players answer via WebSocket
- Game Master emits ROUND_RESULT + LEADERBOARD_UPDATE + ROUND_COMPLETE
We also added cycle_id propagation to guard against stale or duplicate downstream processing.
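The guard itself can be as small as tracking the active cycle per room and dropping anything that doesn't match. CycleGuard is an illustrative name, not the project's actual class:

```python
class CycleGuard:
    """Track the active cycle_id per room; events carrying any other
    cycle_id are treated as stale and dropped before processing."""

    def __init__(self):
        self._active: dict[str, str] = {}

    def start_cycle(self, room_id: str, cycle_id: str) -> None:
        # Called when a new round cycle begins; invalidates older cycles.
        self._active[room_id] = cycle_id

    def is_stale(self, room_id: str, cycle_id: str) -> bool:
        return self._active.get(room_id) != cycle_id


guard = CycleGuard()
guard.start_cycle("room1", "cycle-2")
```

An agent checks is_stale() before acting on an event, so a late RESEARCH_DONE from a previous round can no longer trigger a duplicate draft.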
Reliability Improvements We Added
- Preserved player score on reconnect (HSETNX)
- Reset scores when a new game starts in the same room
- Event handler safety: per-event exceptions don't kill the whole agent loop
- WebSocket payload validation (invalid_json, invalid_round_id, round_not_active)
- Health endpoint checks Valkey reachability + JSON/Search capability
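The per-event safety boundary amounts to wrapping each handler call so that one bad event is logged and skipped instead of crashing the subscribe loop. A sketch, with an assumed helper name:

```python
import logging

logger = logging.getLogger("agent")


def handle_safely(handler, event) -> bool:
    """Run one event handler; swallow and log any exception so the
    agent's subscribe loop keeps consuming subsequent events."""
    try:
        handler(event)
        return True
    except Exception:
        logger.exception("handler failed for event %r", event)
        return False
```

The boolean return also gives the loop a hook for counting failures, which pairs naturally with the dead-letter handling suggested later in this post.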
Operational Checks
Use this before demos:
```shell
curl -s http://127.0.0.1:8000/health | jq
```
You’ll see three fields: reachable, json_module, and search_module.
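One way to implement those checks is to probe each capability with a cheap command and map failures to False. This is an assumed sketch, not the project's actual health code; FT._LIST and JSON.GET are real RediSearch/RedisJSON commands, but your module builds may differ:

```python
def health_report(client) -> dict[str, bool]:
    """Probe reachability plus JSON/Search capability by issuing one
    cheap command per feature and treating any exception as 'absent'."""

    def probe(*cmd) -> bool:
        try:
            client.execute_command(*cmd)
            return True
        except Exception:
            return False

    reachable = probe("PING")
    return {
        "reachable": reachable,
        # JSON.GET on a missing key returns nil rather than erroring
        # when the JSON module is loaded.
        "json_module": reachable and probe("JSON.GET", "__health_probe__"),
        "search_module": reachable and probe("FT._LIST"),
    }


class FakeValkey:
    """Stand-in that answers PING and JSON.GET but lacks the search module."""

    def execute_command(self, *cmd):
        if cmd[0] in ("PING", "JSON.GET"):
            return None
        raise RuntimeError("ERR unknown command")
```

Short-circuiting on reachability keeps the endpoint fast when the server is down instead of timing out three times.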
Environment Management (Varlock-first)
The app loads settings from process environment variables, which makes it a good fit for Varlock-managed secrets/config.
Example runtime:
```shell
varlock run -- uvicorn main:app --reload --port 8000
varlock run -- python scripts/run_agents.py --agents researcher writer editor game_master
```
Test Strategy
Minimal integration test included:
```shell
pytest -q tests/test_integration_round.py
```
It validates: start game -> first round -> leaderboard update.
Why This Pattern Matters
Compared with direct API chaining between agents, this design gives:
- Better fault isolation
- Better observability
- Easier horizontal scaling
- Simpler mental model for distributed workflows
What to Improve Next
- Move orchestration from Pub/Sub to Valkey Streams (durable delivery)
- Add event idempotency store + dead-letter handling
- Add OpenTelemetry traces for event lifecycle
- Add CI pipeline for contract tests + reliability tests
LLMs provide reasoning, but coordination makes systems reliable.
If you’re building multi-agent workflows, treat Valkey as your shared cognition fabric, not just a cache.
GitHub: https://github.com/harishkotra/neuroloop