Harish Kotra (he/him)


Building Darwin.js: A Self-Evolving Agentic Bazaar with FastAPI, Next.js, ChromaDB, and Live Code Mutation

Darwin.js started from a simple prompt:

What if non-player characters could rewrite their own source code when players discovered exploits?

That idea turns into a live simulation with four moving parts:

  • a FastAPI backend that simulates a bazaar
  • merchants that execute Python logic from local files
  • a Governor that monitors trade telemetry
  • a frontend that makes the entire adaptation loop visible

This post walks through how the system works, the tradeoffs behind the architecture, and how we made it demoable end-to-end.

The Product Idea

The app presents a cyber-bazaar where merchants sell items, take losses, and get attacked by a player using exploit presets like:

  • integer overflow
  • re-entrancy attack
  • item duplication

When the losses cross a threshold, the system mutates the merchant’s local trade(context) function and hot-reloads the new behavior.

This is important: the mutation is not hidden in logs. The app exposes the entire adaptive loop:

  • exploit trigger
  • telemetry logging
  • anomaly detection
  • code rewrite
  • post-mutation diff

That visibility is the difference between “AI magic” and a real systems demo.

System Architecture


1. Merchant logic is loaded from disk

Each merchant owns a file path that points at its current logic blob.

That matters because mutation becomes tangible. We are not just changing in-memory rules. We are actually rewriting the file that defines behavior.

Simplified shape:

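from dataclasses import dataclass
from pathlib import Path


@dataclass
class MerchantAgent:
    id: str
    display_name: str
    gold: int
    inventory: dict[str, int]
    logic_blob_path: Path  # file that holds this merchant's current trade logic

    def load_logic(self) -> str:
        # Read the current source text from disk.
        return self.logic_blob_path.read_text(encoding="utf-8")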
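Because behavior lives in a file, one way to notice a rewrite is to fingerprint the file contents and reload only when the fingerprint changes. The following is a minimal sketch under that assumption; `logic_revision` and `LogicCache` are hypothetical names for illustration, not part of the Darwin.js codebase.

```python
import hashlib
from pathlib import Path


def logic_revision(path: Path) -> str:
    """Return a short content hash that changes whenever the file is rewritten."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]


class LogicCache:
    """Hypothetical helper: caches source text and reloads it when the hash changes."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.revision = ""
        self.source = ""

    def refresh(self) -> bool:
        """Reload from disk if needed; return True when a new revision was loaded."""
        current = logic_revision(self.path)
        if current == self.revision:
            return False
        self.source = self.path.read_text(encoding="utf-8")
        self.revision = current
        return True
```

Hashing the contents rather than checking the modification time keeps the revision stable when a file is touched but not actually changed.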

2. Trade logic runs in a restricted sandbox

Instead of blindly executing arbitrary Python, the backend parses the AST and blocks dangerous constructs.

import ast

# `code` is the merchant's source text; reject disallowed syntax before running it.
tree = ast.parse(code, mode="exec")
for node in ast.walk(tree):
    if isinstance(node, (ast.With, ast.Try, ast.ClassDef, ast.Global, ast.Nonlocal)):
        raise ValueError("Disallowed syntax")

We also strip __future__ imports before execution so merchants can keep ergonomic source files without tripping the restricted runtime.

That gives us a middle ground:

  • enough flexibility to demonstrate self-modifying logic
  • enough guardrails to avoid turning the demo into arbitrary code execution
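To make the pieces above concrete, here is a minimal sketch of how the AST check, the `__future__` stripping, and execution could fit together. The `SAFE_BUILTINS` allowlist, `strip_future_imports`, and `load_trade` are assumptions made for this example, not names from the project, and the approach should be read as a demo guardrail rather than a hardened security boundary.

```python
import ast

# Assumption: a small allowlist of builtins is enough for merchant logic.
SAFE_BUILTINS = {"len": len, "min": min, "max": max, "int": int, "abs": abs, "sum": sum}
BLOCKED_NODES = (ast.With, ast.Try, ast.ClassDef, ast.Global, ast.Nonlocal)


def strip_future_imports(tree: ast.Module) -> ast.Module:
    """Drop `from __future__ import ...` statements from the module body."""
    tree.body = [
        node for node in tree.body
        if not (isinstance(node, ast.ImportFrom) and node.module == "__future__")
    ]
    return tree


def load_trade(code: str):
    """Validate the source, run it with restricted builtins, and return trade()."""
    tree = strip_future_imports(ast.parse(code, mode="exec"))
    for node in ast.walk(tree):
        if isinstance(node, BLOCKED_NODES):
            raise ValueError(f"Disallowed syntax: {type(node).__name__}")
    # No __import__ in the namespace, so any remaining import fails at runtime.
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(compile(tree, filename="<merchant>", mode="exec"), namespace)
    trade = namespace.get("trade")
    if not callable(trade):
        raise ValueError("Merchant logic must define trade(context)")
    return trade
```

Calling `load_trade(source)` returns the merchant's `trade` function, which can then be invoked with a context dictionary.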

3. MarketMemory stores telemetry

Every trade is converted into structured metadata:

{
    "merchant_id": merchant_id,
    "player_id": player_id,
    "action": action,
    "item": item,
    "result": result,
    "loss_to_npc": loss_to_npc,
    "exploit_type": exploit_type or "none",
    "timestamp": utc_now(),
}

ChromaDB is used when available, but the system is intentionally resilient:

  • it can fall back to in-memory collections
  • it uses deterministic embeddings for stability in constrained environments

That last choice matters. In a pure demo setting, the worst outcome is a backend that fails because a local ONNX embedding pipeline cannot initialize. We optimized for a stable runtime over fancy embeddings.
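As a rough illustration of that tradeoff, a deterministic embedding can be as simple as a hash of the text, paired with a list-backed store for when no vector database is available. This is a sketch, not the project's implementation: `deterministic_embedding` and `InMemoryCollection` are hypothetical names, and it deliberately does not call ChromaDB.

```python
import hashlib


def deterministic_embedding(text: str, dim: int = 8) -> list[float]:
    """Map text to a fixed vector using SHA-256, so no model has to load."""
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Scale each byte into [0, 1]; the same text always yields the same vector.
    return [digest[i] / 255.0 for i in range(dim)]


class InMemoryCollection:
    """Hypothetical stand-in used when a vector store is unavailable."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    def add(self, record_id: str, document: str, metadata: dict) -> None:
        self.records.append({
            "id": record_id,
            "embedding": deterministic_embedding(document),
            "metadata": metadata,
        })

    def where(self, **filters) -> list[dict]:
        """Return metadata rows whose fields equal every given filter."""
        return [
            r["metadata"] for r in self.records
            if all(r["metadata"].get(k) == v for k, v in filters.items())
        ]
```

Hash-based vectors carry no semantic similarity, so a store like this is only useful for exact metadata filtering, which is all the anomaly checks in this kind of demo need.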

4. The Governor decides when to evolve

The Governor is the bridge between observability and adaptation.

It asks questions like:

  • How much gold has this merchant lost?
  • Is one item being abused repeatedly?
  • Is an exploit signature recurring?

Core logic:

trigger = merchant_loss > self.mutation_threshold or hottest_item_count > self.anomaly_limit

Once triggered, the Governor packages the latest telemetry and sends it to the evolution engine.
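The inputs to that trigger can be derived directly from the telemetry records. The sketch below shows one plausible way to compute them; the threshold defaults, the `should_mutate` name, and the choice to count only loss-making trades per item are assumptions for illustration.

```python
from collections import Counter


def should_mutate(records: list[dict], merchant_id: str,
                  mutation_threshold: int = 500, anomaly_limit: int = 3) -> bool:
    """Apply the loss-or-anomaly rule to one merchant's telemetry."""
    mine = [r for r in records if r["merchant_id"] == merchant_id]
    merchant_loss = sum(r["loss_to_npc"] for r in mine)
    # Count only trades that actually cost the merchant something.
    item_counts = Counter(r["item"] for r in mine if r["loss_to_npc"] > 0)
    hottest_item_count = max(item_counts.values(), default=0)
    return merchant_loss > mutation_threshold or hottest_item_count > anomaly_limit
```

With these defaults, four small losses on the same item trip the anomaly limit even though the total loss stays under the gold threshold.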

5. EvolutionService rewrites merchant code

In a production-grade system, this is where you would call a live model, such as Codex, or a hosted endpoint like the Responses API. In Darwin.js, the interface is already shaped that way, but the implementation is deterministic so the demo remains runnable without network access.

new_code = self.mutator.generate_logic(
    current_code=current_code,
    exploit_telemetry=exploit_telemetry,
    npc_state=npc_state,
)
logic_path.write_text(new_code, encoding="utf-8")

The generated code introduces defenses like:

  • price caps
  • duplication fingerprints
  • player blacklisting
  • cooldown locks

That means every mutation leaves a real artifact on disk and a visible diff in the UI.
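A deterministic mutator can be approximated with templates keyed on the observed exploit type, and the diff can come from the standard library. The sketch below assumes that design; the `VULNERABLE` source, the 10% gold cap, and `render_diff` are invented for the example, and only the `generate_logic` parameter names follow the call shown above.

```python
import difflib

# Hypothetical starting logic with no defenses.
VULNERABLE = '''def trade(context):
    total = context["price"] * context["quantity"]
    return {"status": "ok", "total": total}
'''


def generate_logic(current_code: str, exploit_telemetry: list[dict], npc_state: dict) -> str:
    """Pick a patch template based on the exploit types seen in telemetry."""
    exploit_types = {r["exploit_type"] for r in exploit_telemetry}
    if "integer_overflow" not in exploit_types:
        return current_code  # nothing this sketch knows how to patch
    # Assumption: never risk more than 10% of the merchant's gold on one trade.
    cap = max(1, npc_state["gold"] // 10)
    return (
        "def trade(context):\n"
        f'    total = min(context["price"] * context["quantity"], {cap})  # price cap\n'
        '    return {"status": "ok", "total": total}\n'
    )


def render_diff(old: str, new: str) -> str:
    """Produce the unified diff a UI could display after a mutation."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True), new.splitlines(keepends=True),
        fromfile="trade.py (before)", tofile="trade.py (after)",
    )
    return "".join(lines)
```

Swapping the template lookup for a model call later would not change the callers, since both return the new source as a string.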

Frontend Design

The key design goal was clarity under complexity.

We needed to show:

  • global system relationships
  • per-merchant state
  • exploit controls
  • mutation output

without turning the screen into mush.

Why React Flow was the right choice

The “God View” became the right abstraction because cards alone don’t explain causality.

A merchant card can show gold: 9800, but it cannot show:

  • the Governor supervising it
  • memory feeding anomaly signals back upstream
  • the player injecting exploits into a specific node

React Flow solves that cleanly.

<ReactFlow
  nodes={nodes}
  edges={edges}
  nodeTypes={nodeTypes}
  fitView
  nodesDraggable={false}
  nodesConnectable={false}
  elementsSelectable={false}
>
  <Background color="#0f2740" gap={24} />
  <Controls showInteractive={false} />
</ReactFlow>

Once the system graph existed, the rest of the UI could stay focused:

  • merchant cards explain local state
  • terminal logs explain narrative sequence
  • the diff viewer explains mutation output

UI Architecture


Example exploit flow

Let’s look at the “Integer Overflow” style demo path:

  1. The user selects a merchant in the UI.
  2. The frontend posts to /api/bazaar/exploit.
  3. The server translates that exploit into a payload like:
{
  "merchant_id": "merchant-01",
  "player_id": "judge-player",
  "exploit_type": "integer_overflow"
}
  4. The merchant executes vulnerable sell logic.
  5. The trade produces outsized loss.
  6. The Governor sees the anomaly.
  7. The evolution engine writes a patched trade(context) function.
  8. The frontend refreshes, highlighting a new merchant revision and showing the code diff.
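Python integers never overflow on their own, so an integer-overflow style exploit in a Python backend has to emulate fixed-width arithmetic somewhere. The following is a purely hypothetical illustration of how the vulnerable sell step could produce an outsized loss and how a patched version could close it; none of these function names come from the project.

```python
def wrap_int32(value: int) -> int:
    """Emulate signed 32-bit wraparound, since Python ints do not overflow."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def vulnerable_sell(price: int, quantity: int) -> int:
    """Return gold the player owes; a negative result means the merchant pays out."""
    return wrap_int32(price * quantity)


def patched_sell(price: int, quantity: int, max_quantity: int = 99) -> int:
    """Same trade with a quantity cap, the kind of defense a mutation might add."""
    return wrap_int32(price * min(quantity, max_quantity))


def loss_to_npc(total: int) -> int:
    """Translate a negative total into the loss figure recorded in telemetry."""
    return max(0, -total)
```

A large enough `price * quantity` wraps to a negative total in `vulnerable_sell`, which `loss_to_npc` turns into the kind of loss figure the Governor would react to, while `patched_sell` keeps the product in range.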

What was intentionally designed for demo stability

When building AI-heavy showcases, you have to decide what can fail and what absolutely cannot fail.

For Darwin.js, the core experience needed to survive offline and sandboxed environments.

That led to a few decisions:

  • local/system font stacks instead of remote font fetching
  • deterministic mutator instead of mandatory live API calls
  • ChromaDB with fallback-friendly behavior
  • test-client backend verification when port binding is restricted

This is a good pattern for developer demos in general:

Make the happy path real, but keep the runtime resilient enough that your demo doesn’t collapse when one dependency sneezes.

Code snippet: mutation-ready trade function

The generated merchant code intentionally returns structured risk flags:

if player["id"] in blacklist:
    risk_flags.append("blocked:blacklist")
    return {
        "status": "blocked",
        "reason": "known exploiter",
        "risk_flags": risk_flags,
    }

That becomes useful for both UI explanation and future analytics.
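For context, a complete function around that fragment might look like the sketch below, where every exit path returns its `risk_flags`. The context keys (`player`, `blacklist`, `price`, `quantity`), the quantity cap of 99, and the `summarize_flags` helper are assumptions made for this example.

```python
from collections import Counter


def trade(context: dict) -> dict:
    """Hypothetical generated logic: every exit path reports its risk flags."""
    player = context["player"]
    blacklist = context.get("blacklist", set())
    risk_flags: list[str] = []

    if player["id"] in blacklist:
        risk_flags.append("blocked:blacklist")
        return {"status": "blocked", "reason": "known exploiter", "risk_flags": risk_flags}

    quantity = context["quantity"]
    if quantity > 99:
        # Flag the anomaly but still complete a capped trade.
        risk_flags.append("capped:quantity")
        quantity = 99

    return {"status": "ok", "total": context["price"] * quantity, "risk_flags": risk_flags}


def summarize_flags(results: list[dict]) -> dict[str, int]:
    """Count flags across trade results, for example to feed a dashboard."""
    return dict(Counter(flag for r in results for flag in r["risk_flags"]))
```

Because the flags are plain strings with a `category:detail` shape, the same values can label a UI badge and be aggregated later without parsing free-form log text.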

Where this can go next

Darwin.js already demonstrates self-evolving NPC behavior, but there are several obvious next steps:

Better AI mutation

  • wire in a live model call
  • add mutation evaluation and rollback
  • compare multiple candidate patches

Richer simulation

  • merchant-to-merchant supply chains
  • faction economies
  • adversarial autonomous player agents

Better platform architecture

  • WebSocket streaming
  • persistent event log storage
  • replayable mutations
  • deployment automation

Stronger developer ergonomics

  • test suite for generated merchant logic
  • snapshot-based mutation regression tests
  • Dockerized local environment

Why this project matters

Darwin.js is interesting because it treats AI adaptation as a software architecture problem, not a chatbot problem.

It asks:

  • How do agents observe failure?
  • How do they mutate safely?
  • How do humans inspect what changed?
  • How do we keep the system legible?

Those questions show up everywhere in the next wave of AI products:

  • autonomous operations systems
  • AI game agents
  • adaptive security tooling
  • self-healing workflows

This project is a small but concrete blueprint for building those systems visibly and responsibly.

GitHub repo: https://github.com/harishkotra/darwin.js
