Harish Kotra (he/him)
How I Built AgentOffice: Self-Growing AI Teams in a Pixel-Art Virtual Office

A deep dive into building a real-time multi-agent simulation where AI agents think, collaborate, hire interns, and grow their team — all powered by local LLMs.

The Idea

What if AI agents didn't just respond to prompts — what if they lived somewhere? Had a desk, colleagues, tasks, memories, and the ability to grow their own team?

That's AgentOffice: a TypeScript monorepo that renders a pixel-art virtual office where AI agents powered by Ollama walk around, think, talk to each other, execute code, search the web, assign tasks, and even hire new team members — all in real-time.

The entire thing runs locally. No cloud APIs required. No vendor lock-in.


Architecture Overview

AgentOffice is built as a monorepo with 5 npm workspace packages:

┌─────────────────────────────────────────────────────────────┐
│                       BROWSER                                │
│   Phaser.js (pixel rendering)  +  React (UI overlays)       │
│   Chat · TaskBoard · SystemLog · Inspector · LayoutEditor    │
└──────────────────────────┬──────────────────────────────────┘
                           │ WebSocket (Colyseus)
┌──────────────────────────▼──────────────────────────────────┐
│                       SERVER                                 │
│   Colyseus Room → Agent Think Loop → Action Dispatch         │
│   ToolExecutor (code, search, notes, file I/O)               │
│   MemoryStore (SQLite + Ollama Embeddings)                   │
└──────────────────────────┬──────────────────────────────────┘
                           │ HTTP API
┌──────────────────────────▼──────────────────────────────────┐
│                       OLLAMA                                 │
│   llama3.2 (chat completions + embeddings)                   │
└─────────────────────────────────────────────────────────────┘

The Core Loop

Every ~15 seconds, each agent runs this cycle:

  1. Perceive — Gather context: nearby agents, unread messages, recent memories, current task
  2. Think — Send all context to Ollama and get back a JSON decision
  3. Act — Execute the decision: move, talk, use a tool, or hire someone
  4. Remember — Store the thought as a memory with importance scoring

The LLM returns structured JSON like:

{
  "thought": "Bob hasn't updated me on the auth module. I should ask him.",
  "action": "talk",
  "target": "Bob",
  "message": "Hey Bob, what's the status on the auth module?"
}

Or when using tools:

{
  "thought": "I need to check if this algorithm works.",
  "action": "use_tool",
  "toolCall": { "name": "code_execute", "params": { "code": "console.log(fibonacci(10))" } }
}
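Putting the four steps together, one tick of the loop can be sketched like this. This is a minimal sketch: perceive, decide, act, and remember are illustrative names, not the project's actual API.

```typescript
// Illustrative sketch of one agent tick. The helper names are mine, not
// AgentOffice's real methods; the Decision shape mirrors the JSON above.
type Decision = {
  thought: string;
  action: 'move' | 'talk' | 'use_tool' | 'idle';
  target?: string;
  message?: string;
  toolCall?: { name: string; params: Record<string, unknown> };
};

async function tick(
  perceive: () => string,                      // 1. nearby agents, messages, memories, task
  decide: (context: string) => Promise<Decision>, // 2. LLM returns structured JSON
  act: (d: Decision) => void,                  // 3. move / talk / tool / hire
  remember: (thought: string) => void,         // 4. store the thought as a memory
): Promise<void> {
  const context = perceive();
  const decision = await decide(context);
  act(decision);
  remember(decision.thought);
}
```

Running this every ~15 seconds per agent is what gives the office its heartbeat.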

Key Technical Decisions

1. Colyseus for Real-Time State

I chose Colyseus over Socket.io for its built-in state synchronization. The @colyseus/schema decorators automatically delta-compress state changes and sync them to all connected clients:

class AgentState extends Schema {
    @type('string') name: string;
    @type('number') x: number;
    @type('number') y: number;
    @type('string') action: string;
    @type('string') thought: string;
}

When the server updates agent.x, Colyseus patches only the changed bytes to every browser. The Phaser UI listens via agent.onChange() and smoothly tweens the sprite.

2. Inversion of Control for LLM Providers

The InferenceAdapter interface decouples the agent from any specific LLM:

interface InferenceAdapter {
    complete(request: CompletionRequest): Promise<CompletionResponse>;
}

Ollama, OpenAI, Gaia, Anthropic — any provider that speaks the chat completions format works as a drop-in adapter. The agent doesn't know or care where its thoughts come from.
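As a sketch of that pattern (the real CompletionRequest/CompletionResponse shapes are richer than shown here), an Ollama adapter and a canned test double both satisfy the same interface:

```typescript
// Simplified request/response shapes for illustration.
interface CompletionRequest { messages: { role: string; content: string }[]; }
interface CompletionResponse { content: string; }

interface InferenceAdapter {
  complete(request: CompletionRequest): Promise<CompletionResponse>;
}

// Any provider that speaks the chat-completions format slots in behind
// the interface. Ollama's /api/chat with stream:false returns
// { message: { content } }.
class OllamaAdapter implements InferenceAdapter {
  constructor(
    private baseUrl = 'http://localhost:11434',
    private model = 'llama3.2',
  ) {}
  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    const res = await fetch(`${this.baseUrl}/api/chat`, {
      method: 'POST',
      body: JSON.stringify({ model: this.model, messages: request.messages, stream: false }),
    });
    const data = await res.json();
    return { content: data.message.content };
  }
}

// A canned adapter is handy in tests -- the agent can't tell the difference.
class StubAdapter implements InferenceAdapter {
  async complete(_: CompletionRequest): Promise<CompletionResponse> {
    return { content: '{"action":"idle"}' };
  }
}
```

Swapping providers is a one-line change at the construction site; nothing inside the agent needs to know.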

3. SQLite + Ollama Embeddings for Memory

Instead of using a vector database like ChromaDB, I went with a simpler approach:

  • SQLite stores memories with an optional embedding column (JSON blob)
  • Ollama's /api/embeddings endpoint generates vectors for important memories (importance ≥ 0.5)
  • Cosine similarity in JavaScript ranks memories at retrieval time

Retrieval then looks like this:

async semanticSearch(agentId: string, query: string): Promise<MemoryEntry[]> {
    const queryEmb = await this.generateEmbedding(query);
    const allMemories = await this.db.all('SELECT * FROM memories WHERE embedding IS NOT NULL');
    return allMemories
        .map(m => ({ ...m, score: cosineSimilarity(queryEmb, JSON.parse(m.embedding)) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, 5);
}

For a few hundred memories per agent, this is fast enough. For thousands, you'd swap in a proper vector index.
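The cosineSimilarity helper used in semanticSearch is just a few lines of plain TypeScript:

```typescript
// Cosine similarity between two embedding vectors -- no vector database
// required. Assumes both vectors have the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // guard against zero vectors
}
```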

4. Dynamic Agent Hiring

This is where it gets fun. Agents have a hire_agent tool among their LLM-exposed capabilities. When the model decides the team needs help, it outputs:

{
  "action": "use_tool",
  "toolCall": { "name": "hire_agent", "params": { "name": "Charlie", "role": "Intern" } }
}

The server then:

  1. Creates a new Colyseus AgentState (the UI auto-renders the sprite via onAdd)
  2. Instantiates a new Agent with its own personality and system prompt
  3. Connects it to Ollama with a prompt saying "You were hired by Alice. Introduce yourself."
  4. Assigns it a desk and starts its think loop

The team grows organically. Capped at 7 agents to prevent LLM overload.

5. Phaser.js + React Hybrid UI

The game canvas is Phaser, but all UI panels (chat, tasks, activity log, inspector, layout editor) are React components rendered as an HTML overlay on top of the canvas. They communicate via:

  • Colyseus messages for server data (tasks, chat, state)
  • Custom EventTarget (eventBus) for Phaser → React events (activity log, agent focus)
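The bridge can be sketched as a tiny typed emitter. This is a portable stand-in for the custom EventTarget the project actually uses; the event name below matches the agent-focus example.

```typescript
// Minimal typed event bus for Phaser -> React communication. A stand-in
// for the browser EventTarget AgentOffice uses; payload shapes are mine.
type Handler<T> = (payload: T) => void;

class EventBus {
  private handlers = new Map<string, Set<Handler<any>>>();
  on<T>(event: string, fn: Handler<T>): void {
    if (!this.handlers.has(event)) this.handlers.set(event, new Set());
    this.handlers.get(event)!.add(fn);
  }
  emit<T>(event: string, payload: T): void {
    this.handlers.get(event)?.forEach((fn) => fn(payload));
  }
}

const eventBus = new EventBus();

// Phaser side: clicking a sprite dispatches the focus event, e.g.
//   eventBus.emit('agent-focus', { agentId: 'alice' });
// React side: a useEffect subscribes and updates panel state, e.g.
//   eventBus.on('agent-focus', ({ agentId }) => setFocused(agentId));
```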

The Focus Mode is a good example: clicking an agent sprite in Phaser dispatches an agent-focus event, and the update() loop smoothly lerps the camera toward the followed agent:

if (this.followTarget) {
    cam.scrollX += (targetX - cam.scrollX) * 0.08;
    cam.scrollY += (targetY - cam.scrollY) * 0.08;
}

The Gotchas

Phaser steals keyboard input. When I added React input fields over the Phaser canvas, pressing Space would scroll the game instead of typing into the input. Fix: set input.keyboard.capture to [] in the Phaser config and disable Phaser's keyboard handling entirely while an <input> is focused.
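Assuming a standard Phaser 3 game config, the relevant fragment looks like this (all other fields omitted):

```typescript
// Sketch of the Phaser 3 config fix: an empty `capture` list stops Phaser
// from calling preventDefault on keys like Space, so React inputs layered
// over the canvas keep receiving keystrokes.
const phaserConfig = {
  input: {
    keyboard: {
      capture: [], // don't swallow any key codes
    },
  },
};
```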

TypeScript narrowing fails in closures. Colyseus's forEach callback mutates a captured variable (closest), but TypeScript can't see the mutation and narrows the variable to never after the null check. Fix: an explicit cast with as { x: number; y: number }.

LLM outputs are unpredictable. Sometimes the model returns "action": "think" instead of one of the valid actions. The agent's think() method wraps the JSON parse in a try-catch and defaults to idle on failure.
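A defensive parse along those lines looks like this (a sketch, not the project's actual think() internals):

```typescript
// Defensive parse of the model's decision: malformed JSON, a missing
// action, or an action outside the allowed set all fall back to idling.
type Decision = { thought: string; action: string };

const VALID_ACTIONS = new Set(['move', 'talk', 'use_tool', 'idle']);
const IDLE: Decision = { thought: '', action: 'idle' };

function parseDecision(raw: string): Decision {
  try {
    const d = JSON.parse(raw);
    if (typeof d.action !== 'string' || !VALID_ACTIONS.has(d.action)) return IDLE;
    return { thought: String(d.thought ?? ''), action: d.action };
  } catch {
    return IDLE; // unparseable output: do nothing rather than crash
  }
}
```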


Fork It & Build Your Own

The project is designed for forking. Here are some ideas:

| Fork Idea | What to Change |
| --- | --- |
| AI Classroom | Change roles to Teacher/Student, add quiz tools |
| Game NPC Engine | Strip the React UI, keep the agent brain + Phaser |
| Startup Simulator | Add revenue/burn metrics, strategy LLM prompts |
| DevOps War Room | Connect to real monitoring APIs as tools |
| Research Lab | Add a read_paper tool that fetches arXiv abstracts |
| Social Experiment | Vary personality traits and observe emergent behavior |
| Customer Support Sim | Route real tickets to agents, train them on your docs |

How to Fork:

  1. Fork the repo on GitHub
  2. Edit OfficeRoom.ts to change agents, roles, tools
  3. Add custom tools in ToolExecutor.ts
  4. Modify the Phaser tilemap in Game.ts for your theme
  5. Deploy with docker compose up

What's Next

  • Voice mode — Agents speak via TTS (ElevenLabs / Coqui)
  • GitHub integration — Agents create PRs and review code
  • Slack bridge — Talk to your virtual team from Slack
  • Plugin system — Drop-in behaviors without forking
  • Multi-floor offices — Different departments on different floors

Stack

| Layer | Technology |
| --- | --- |
| Rendering | Phaser.js (pixel art, sprite animation) |
| UI Overlay | React + Vite |
| Real-time Sync | Colyseus (WebSocket + delta compression) |
| AI Inference | Ollama (local LLMs) |
| Memory Store | SQLite (structured) + Ollama embeddings (semantic) |
| Language | TypeScript (strict mode) |
| Deployment | Docker Compose |

AgentOffice is open source under MIT. Star the repo if AI agents having their own office makes you smile.

GitHub →
Inspired by https://github.com/pablodelucca/pixel-agents
