A deep dive into building a real-time multi-agent simulation where AI agents think, collaborate, hire interns, and grow their team — all powered by local LLMs.
The Idea
What if AI agents didn't just respond to prompts — what if they lived somewhere? Had a desk, colleagues, tasks, memories, and the ability to grow their own team?
That's AgentOffice: a TypeScript monorepo that renders a pixel-art virtual office where AI agents powered by Ollama walk around, think, talk to each other, execute code, search the web, assign tasks, and even hire new team members — all in real-time.
The entire thing runs locally. No cloud APIs required. No vendor lock-in.
Architecture Overview
AgentOffice is built as a monorepo with 5 npm workspace packages:
┌─────────────────────────────────────────────────────────────┐
│ BROWSER │
│ Phaser.js (pixel rendering) + React (UI overlays) │
│ Chat · TaskBoard · SystemLog · Inspector · LayoutEditor │
└──────────────────────────┬──────────────────────────────────┘
│ WebSocket (Colyseus)
┌──────────────────────────▼──────────────────────────────────┐
│ SERVER │
│ Colyseus Room → Agent Think Loop → Action Dispatch │
│ ToolExecutor (code, search, notes, file I/O) │
│ MemoryStore (SQLite + Ollama Embeddings) │
└──────────────────────────┬──────────────────────────────────┘
│ HTTP API
┌──────────────────────────▼──────────────────────────────────┐
│ OLLAMA │
│ llama3.2 (chat completions + embeddings) │
└─────────────────────────────────────────────────────────────┘
The Core Loop
Every ~15 seconds, each agent runs this cycle:
- Perceive — Gather context: nearby agents, unread messages, recent memories, current task
- Think — Send all context to Ollama and get back a JSON decision
- Act — Execute the decision: move, talk, use a tool, or hire someone
- Remember — Store the thought as a memory with importance scoring
The LLM returns structured JSON like:
{
"thought": "Bob hasn't updated me on the auth module. I should ask him.",
"action": "talk",
"target": "Bob",
"message": "Hey Bob, what's the status on the auth module?"
}
Or when using tools:
{
"thought": "I need to check if this algorithm works.",
"action": "use_tool",
"toolCall": { "name": "code_execute", "params": { "code": "console.log(fibonacci(10))" } }
}
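The Act step then branches on the parsed decision. A minimal sketch of that dispatch (the `Decision` type and `dispatch` helper here are illustrative, not the project's actual code):

```typescript
// Hypothetical shape of the LLM's structured decision, matching the
// JSON examples above. The real project's types may differ.
type Decision =
  | { thought: string; action: 'talk'; target: string; message: string }
  | { thought: string; action: 'move'; target: string }
  | { thought: string; action: 'use_tool'; toolCall: { name: string; params: Record<string, unknown> } }
  | { thought: string; action: 'idle' };

// Branch on the action field; each case would trigger the matching
// server-side behavior (here reduced to a descriptive string).
function dispatch(d: Decision): string {
  switch (d.action) {
    case 'talk':
      return `say to ${d.target}: ${d.message}`;
    case 'move':
      return `walk to ${d.target}`;
    case 'use_tool':
      return `run tool ${d.toolCall.name}`;
    default:
      return 'idle';
  }
}
```

A discriminated union like this lets TypeScript verify that `toolCall` is only accessed when `action` is `'use_tool'`.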
Key Technical Decisions
1. Colyseus for Real-Time State
I chose Colyseus over Socket.io for its built-in state synchronization. The @colyseus/schema decorators automatically delta-compress state changes and sync them to all connected clients:
class AgentState extends Schema {
@type('string') name: string;
@type('number') x: number;
@type('number') y: number;
@type('string') action: string;
@type('string') thought: string;
}
When the server updates agent.x, Colyseus patches only the changed bytes to every browser. The Phaser UI listens via agent.onChange() and smoothly tweens the sprite.
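The smoothing itself can be sketched as a simple exponential ease: each frame, the sprite moves a fixed fraction of the remaining distance toward the last synced position (the names and the 0.2 factor are illustrative, not the project's actual client code):

```typescript
// Minimal sketch of client-side smoothing: on each Colyseus onChange,
// store the new server position as a target; in the render loop, ease
// the sprite toward it instead of snapping.
interface Sprite {
  x: number;
  y: number;
}

// Move a fraction `t` of the remaining distance per frame.
function easeToward(sprite: Sprite, targetX: number, targetY: number, t = 0.2): void {
  sprite.x += (targetX - sprite.x) * t;
  sprite.y += (targetY - sprite.y) * t;
}
```

Repeated per frame, this converges on the target while hiding the discrete ~15-second server updates behind continuous motion.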
2. Inversion of Control for LLM Providers
The InferenceAdapter interface decouples the agent from any specific LLM:
interface InferenceAdapter {
complete(request: CompletionRequest): Promise<CompletionResponse>;
}
Ollama, OpenAI, Gaia, Anthropic — any provider that speaks the chat completions format works as a drop-in adapter. The agent doesn't know or care where its thoughts come from.
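As one concrete example, an Ollama adapter might target Ollama's `/api/chat` endpoint. This is a hedged sketch under assumed request/response types (the real `CompletionRequest`/`CompletionResponse` shapes in the repo may differ):

```typescript
// Assumed minimal request/response shapes for illustration.
interface CompletionRequest {
  model: string;
  messages: { role: string; content: string }[];
}
interface CompletionResponse {
  content: string;
}
interface InferenceAdapter {
  complete(request: CompletionRequest): Promise<CompletionResponse>;
}

// Hypothetical adapter: forwards the request to Ollama's /api/chat
// endpoint (non-streaming) and unwraps the assistant message.
class OllamaAdapter implements InferenceAdapter {
  constructor(private baseUrl = 'http://localhost:11434') {}

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    const res = await fetch(`${this.baseUrl}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: request.model, messages: request.messages, stream: false }),
    });
    const data: any = await res.json();
    return { content: data.message.content };
  }
}
```

Swapping providers then means instantiating a different class that satisfies the same interface; nothing in the agent code changes.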
3. SQLite + Ollama Embeddings for Memory
Instead of using a vector database like ChromaDB, I went with a simpler approach:
- SQLite stores memories with an optional `embedding` column (JSON blob)
- Ollama's `/api/embeddings` endpoint generates vectors for important memories (importance ≥ 0.5)
- Cosine similarity in JavaScript ranks memories at retrieval time
async semanticSearch(agentId: string, query: string): Promise<MemoryEntry[]> {
const queryEmb = await this.generateEmbedding(query);
const allMemories = await this.db.all('SELECT * FROM memories WHERE embedding IS NOT NULL');
return allMemories
.map(m => ({ ...m, score: cosineSimilarity(queryEmb, JSON.parse(m.embedding)) }))
.sort((a, b) => b.score - a.score)
.slice(0, 5);
}
For a few hundred memories per agent, this is fast enough. For thousands, you'd swap in a proper vector index.
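The `cosineSimilarity` helper used above is a few lines of plain TypeScript; a sketch of what it might look like:

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|). Returns a value in [-1, 1];
// higher means the memory is semantically closer to the query.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

This is an O(n) scan per memory, which is exactly why it stays fast at hundreds of memories but warrants a real vector index at thousands.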
4. Dynamic Agent Hiring
This is where it gets fun. Agents have a hire_agent tool in their LLM capabilities. When the model decides the team needs help, it outputs:
{
"action": "use_tool",
"toolCall": { "name": "hire_agent", "params": { "name": "Charlie", "role": "Intern" } }
}
The server then:
- Creates a new Colyseus `AgentState` (the UI auto-renders the sprite via `onAdd`)
- Instantiates a new `Agent` with its own personality and system prompt
- Connects it to Ollama with a prompt saying "You were hired by Alice. Introduce yourself."
- Assigns it a desk and starts its think loop
The team grows organically. Capped at 7 agents to prevent LLM overload.
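The cap check itself is trivial to enforce server-side before any of the steps above run. A hedged sketch (the `Roster` type and `hireAgent` function are illustrative, not the repo's actual code):

```typescript
// Hypothetical server-side guard: refuse the hire_agent tool call
// once the roster reaches the cap, so the LLM can't grow the team
// past what local inference can keep up with.
interface Roster {
  agents: string[];
}

const MAX_AGENTS = 7;

function hireAgent(roster: Roster, name: string): boolean {
  if (roster.agents.length >= MAX_AGENTS) {
    return false; // tool call rejected; agent is told hiring failed
  }
  roster.agents.push(name);
  return true;
}
```

Returning a success flag (rather than throwing) lets the result be fed back into the hiring agent's next context, so it can react to a refused hire.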
5. Phaser.js + React Hybrid UI
The game canvas is Phaser, but all UI panels (chat, tasks, activity log, inspector, layout editor) are React components rendered as an HTML overlay on top of the canvas. They communicate via:
- Colyseus messages for server data (tasks, chat, state)
- A custom `EventTarget` (`eventBus`) for Phaser → React events (activity log, agent focus)
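The bus side of that bridge can be sketched with a shared `EventTarget` and a small `Event` subclass (the names `eventBus`, `agent-focus`, and `AgentFocusEvent` are assumptions for illustration):

```typescript
// Shared bus: Phaser dispatches, React listens. A plain EventTarget
// keeps the two libraries decoupled from each other.
const eventBus = new EventTarget();

// Carry the payload on an Event subclass.
class AgentFocusEvent extends Event {
  constructor(public agentId: string) {
    super('agent-focus');
  }
}

// React side: remember which agent the camera should follow.
const focusLog: string[] = [];
eventBus.addEventListener('agent-focus', (e) => {
  focusLog.push((e as AgentFocusEvent).agentId);
});

// Phaser side: clicking a sprite dispatches the focus event.
eventBus.dispatchEvent(new AgentFocusEvent('alice'));
```

Because `EventTarget` is standard in both browsers and Node, this layer needs no extra dependency.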
The Focus Mode is a good example: clicking an agent sprite in Phaser dispatches an agent-focus event, and the update() loop smoothly lerps the camera toward the followed agent:
if (this.followTarget) {
cam.scrollX += (targetX - cam.scrollX) * 0.08;
cam.scrollY += (targetY - cam.scrollY) * 0.08;
}
The Gotchas
Phaser steals keyboard input. When I added React input fields over the Phaser canvas, pressing Space would scroll the game instead of typing in the input. Fix: set input.keyboard.capture: [] in the Phaser config and disable the keyboard entirely when an <input> is focused.
TypeScript narrowing fails in closures. Colyseus's `forEach` callback assigns to an outer variable (`closest`), but TypeScript doesn't track mutations made inside callbacks, so after the null check it narrows the type to `never`. Fix: an explicit cast with `as { x: number; y: number }`.
LLM outputs are unpredictable. Sometimes the model returns `"action": "think"` instead of one of the valid options. The agent's `think()` method wraps the JSON parse in a try-catch and defaults to idle on failure.
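That defensive parse can be sketched in a few lines (the action names and `parseDecision` helper are illustrative; the repo's actual validation may differ):

```typescript
// Whitelist of actions the server knows how to execute.
const VALID_ACTIONS = new Set(['move', 'talk', 'use_tool', 'hire_agent', 'idle']);

// Parse the model's raw output; anything malformed or unrecognized
// (bad JSON, invented actions like "think") falls back to idle.
function parseDecision(raw: string): { action: string; thought?: string } {
  try {
    const parsed = JSON.parse(raw);
    if (typeof parsed.action === 'string' && VALID_ACTIONS.has(parsed.action)) {
      return parsed;
    }
  } catch {
    // not valid JSON; fall through to the safe default
  }
  return { action: 'idle' };
}
```

Defaulting to idle (rather than retrying or crashing) keeps a misbehaving model from stalling the whole think loop.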
Fork It & Build Your Own
The project is designed for forking. Here are some ideas:
| Fork Idea | What to Change |
|---|---|
| AI Classroom | Change roles to Teacher/Student, add quiz tools |
| Game NPC Engine | Strip the React UI, keep the agent brain + Phaser |
| Startup Simulator | Add revenue/burn metrics, strategy LLM prompts |
| DevOps War Room | Connect to real monitoring APIs as tools |
| Research Lab | Add a read_paper tool that fetches arXiv abstracts |
| Social Experiment | Vary personality traits and observe emergent behavior |
| Customer Support Sim | Route real tickets to agents, train them on your docs |
How to Fork:
- Fork the repo on GitHub
- Edit `OfficeRoom.ts` to change agents, roles, tools
- Add custom tools in `ToolExecutor.ts`
- Modify the Phaser tilemap in `Game.ts` for your theme
- Deploy with `docker compose up`
What's Next
- Voice mode — Agents speak via TTS (ElevenLabs / Coqui)
- GitHub integration — Agents create PRs and review code
- Slack bridge — Talk to your virtual team from Slack
- Plugin system — Drop-in behaviors without forking
- Multi-floor offices — Different departments on different floors
Stack
| Layer | Technology |
|---|---|
| Rendering | Phaser.js (pixel art, sprite animation) |
| UI Overlay | React + Vite |
| Real-time Sync | Colyseus (WebSocket + delta compression) |
| AI Inference | Ollama (local LLMs) |
| Memory Store | SQLite (structured) + Ollama Embeddings (semantic) |
| Language | TypeScript (strict mode) |
| Deployment | Docker Compose |
AgentOffice is open source under MIT. Star the repo if AI agents having their own office makes you smile.
GitHub →
Inspired by https://github.com/pablodelucca/pixel-agents