Loïc Baumann

Posted on Apr 5 • Originally published at nockawa.github.io

What Game Engines Know About Data That Databases Forgot

#csharp #database #ecs #gamedev

💡Typhon is an embedded, persistent, ACID database engine written in .NET that speaks the native language of game servers and real-time simulations: entities, components, and systems.

It delivers full transactional safety with MVCC snapshot isolation at sub-microsecond latency, powered by cache-line-aware storage, zero-copy access, and configurable durability.

Series: A Database That Thinks Like a Game Engine

Why I'm Building a Database Engine in C#

What Game Engines Know About Data That Databases Forgot (this post)

Microsecond Latency in a Managed Language (coming soon)

Game servers sit at an uncomfortable intersection. They need the raw throughput of a game engine — tens of thousands of entities updated every tick. But they also need what databases provide: transactions that don't corrupt state, queries that don't scan everything, and durability that survives crashes.

Today, game server teams pick one side and hack around the other: either an Entity-Component-System framework for speed, with manual serialization to a database for persistence, or a database for safety, with an impedance mismatch every time they touch game state.

Typhon draws from both traditions. It's a database engine that stores data the way game engines do — and provides the guarantees that game servers need. Here's why those two worlds aren't as far apart as they look.

Two Fields, One Problem

ECS architecture evolved in game engines. Relational databases evolved in enterprise software. They never talked to each other. But look at what they built:

ECS Concept	Database Concept	Shared Principle
Archetype	Table	Homogeneous, fixed-schema storage
Component	Column	Typed, blittable, bulk-iterable data
Entity	Row	Identity with dynamic composition
System	Query	Process all records matching a signature
Frame Budget (16ms)	Latency SLA	Hard real-time deadline

An ECS "archetype" is a table. A "component" is a column. A "system" is a query. The vocabulary is different, the underlying structure is the same. Two fields, separated by decades and industry boundaries, converged on structurally identical solutions because they were solving the same fundamental problem: managing structured data under performance constraints.

This convergence is why a synthesis is possible at all. It's no accident: both fields answer to the same hardware constraints. Data must be laid out for the CPU cache. Access patterns must be predictable. Latency budgets are real.

What We Learned From Game Engines

ECS taught the database world something important about how data should be stored. Three lessons Typhon draws directly from game engine architecture:

Cache locality by default. In a traditional row store, reading all player positions means loading entire rows — names, inventories, health, everything. Most of those bytes are wasted. In ECS, components are stored per type: all positions contiguous, all health values contiguous. Reading 10,000 positions is a linear memory scan where every byte is useful.

This matters more than most developers realize. An L1 cache hit costs roughly 1 nanosecond. A miss that falls through to DRAM costs 60-70 ns, a penalty of roughly 60-70x. When your database layout forces cache misses, no amount of algorithmic cleverness can save you.
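To make the layout difference concrete, here is a minimal sketch comparing a row-style array with a per-component array. `PlayerRow`, `Position`, and `LayoutDemo` are hypothetical names invented for this illustration, not Typhon types, and the row is far smaller than a real player record would be.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical component: just the data a movement system needs.
public struct Position
{
    public float X, Y, Z;
}

// Hypothetical row: everything about a player stored side by side.
public struct PlayerRow
{
    public Position Position;
    public int Health;
    public long AccountId;
    public long InventoryHandle;
}

public static class LayoutDemo
{
    // Row layout: the loop reads only Position.X, but it strides over
    // every other field of every row to get there.
    public static float SumXRows(PlayerRow[] rows)
    {
        float sum = 0f;
        for (int i = 0; i < rows.Length; i++)
            sum += rows[i].Position.X;
        return sum;
    }

    // Component layout: positions are contiguous, so the loop walks
    // a dense array that contains nothing but positions.
    public static float SumXColumn(Position[] positions)
    {
        float sum = 0f;
        for (int i = 0; i < positions.Length; i++)
            sum += positions[i].X;
        return sum;
    }

    // Bytes the loop has to stride across for `count` elements of type T.
    public static long BytesTouched<T>(int count) where T : unmanaged =>
        (long)count * Unsafe.SizeOf<T>();
}
```

Both loops compute the same sum. The difference is how much memory each one has to walk: `BytesTouched<PlayerRow>(n)` grows with every field added to the row, while `BytesTouched<Position>(n)` stays fixed.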

Zero-copy is the default, not the optimization. In a traditional database, reading a record means deserializing from a storage page into a language-level object. In ECS, a component is already in memory in its final layout — you just hand back a pointer. Typhon preserves this: components are blittable unmanaged structs read directly from pinned memory pages. No serialization, no managed heap allocation, no GC involvement.
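The idea can be sketched with standard .NET APIs. This is not Typhon's page manager: `ZeroCopyDemo` and its `Position` struct are hypothetical, and a pinned byte array stands in for a storage page (it assumes .NET 5 or later for `GC.AllocateArray`).

```csharp
using System;
using System.Runtime.InteropServices;

public struct Position
{
    public float X, Y, Z;
}

public static class ZeroCopyDemo
{
    // Stand-in for a storage page: a byte array on the pinned object heap,
    // so the GC never relocates it.
    public static byte[] AllocatePage(int bytes) =>
        GC.AllocateArray<byte>(bytes, pinned: true);

    // Reinterpret the page bytes as Position structs in place.
    // Nothing is copied and nothing is allocated on the managed heap.
    public static Span<Position> AsPositions(byte[] page) =>
        MemoryMarshal.Cast<byte, Position>(page.AsSpan());
}
```

Writing through the returned span changes the page bytes directly, and reading is just an indexed load from the same memory. `MemoryMarshal.Cast` only accepts element types without managed references, which is one reason the components have to be unmanaged structs.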

Entity as pure identity. In ECS, an entity is just an ID — a 64-bit number with no inherent structure. All data lives externally in component tables. This is the opposite of ORM thinking where the object is the entity. Typhon inherits this: EntityId is a lightweight value type, all state lives in typed component storage. This separation is what makes the rest of the architecture possible — per-component versioning, per-component storage modes, independent indexes per component type.
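A toy model of that separation, assuming nothing about Typhon's real `EntityId` or storage: `EntityKey` and `ComponentStore<T>` are hypothetical names, and a dictionary stands in for typed component storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an entity ID: a 64-bit value and nothing else.
public readonly struct EntityKey : IEquatable<EntityKey>
{
    public readonly ulong Value;
    public EntityKey(ulong value) => Value = value;
    public bool Equals(EntityKey other) => Value == other.Value;
    public override bool Equals(object obj) => obj is EntityKey k && Equals(k);
    public override int GetHashCode() => Value.GetHashCode();
}

// One store per component type. The entity owns no data of its own,
// so each store can keep its own bookkeeping (here: a write counter).
public sealed class ComponentStore<T> where T : unmanaged
{
    private readonly Dictionary<EntityKey, T> _data = new Dictionary<EntityKey, T>();

    public int Writes { get; private set; }

    public void Set(EntityKey entity, T value)
    {
        _data[entity] = value;
        Writes++;
    }

    public bool TryGet(EntityKey entity, out T value) =>
        _data.TryGetValue(entity, out value);
}
```

Because each store is independent, writing an entity's position twice leaves the bookkeeping of every other store untouched. That independence is what per-component versioning and per-component indexes build on.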

What We Learned From Databases

Traditional databases solved problems that ECS never had to face. Four capabilities Typhon draws from database architecture:

ACID transactions with per-component MVCC. Game engines typically have no isolation. Two systems modifying the same entity in the same tick is a race condition — and in a single-process game, you control the execution order so you can manage it. On a game server with concurrent player sessions, you can't.

Databases solved this decades ago with MVCC: snapshot isolation where readers never block writers, with conflict detection at commit time. Typhon brings this in — but with a twist. Traditional databases version entire rows. Typhon versions each component independently. An entity's PositionComponent and InventoryComponent each maintain their own revision chain: a circular buffer of 12-byte revision entries, each stamped with a 48-bit transaction sequence number.

// Simplified: finding the visible revision for a snapshot.
// Assumes WalkRevisions yields revisions newest first.
foreach (var rev in WalkRevisions(entityId))
{
    if (rev.IsolationFlag && rev.TSN != myTransactionTSN)
        continue;  // Skip uncommitted revisions from other transactions

    if (rev.TSN <= snapshotTSN)
        return rev; // Most recent revision visible to our snapshot
}

This means a transaction reading a player's position sees a consistent frozen point-in-time across all component types simultaneously — without locking any of them. Writers never block readers. And because revisions are per-component rather than per-entity, updating a player's position doesn't create a new version of their inventory. Less data copied, less garbage to collect.
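For readers who want to run the visibility rule, here is a self-contained model of it. The `Revision` struct below is a hypothetical simplification (a plain struct, not the packed 12-byte entry), and the chain is assumed to be ordered newest first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, unpacked model of one revision entry.
public readonly struct Revision
{
    public readonly long Tsn;           // transaction sequence number of the writer
    public readonly bool IsolationFlag; // true while the writing transaction is uncommitted
    public readonly int Payload;        // stand-in for the component data

    public Revision(long tsn, bool isolationFlag, int payload)
    {
        Tsn = tsn;
        IsolationFlag = isolationFlag;
        Payload = payload;
    }
}

public static class MvccDemo
{
    // `chain` must be ordered newest first. Returns the newest revision
    // the reader may see, or null when nothing is visible at that snapshot.
    public static Revision? FindVisible(IReadOnlyList<Revision> chain, long snapshotTsn, long myTsn)
    {
        foreach (var rev in chain)
        {
            if (rev.IsolationFlag && rev.Tsn != myTsn)
                continue; // uncommitted write from another transaction

            if (rev.Tsn <= snapshotTsn)
                return rev;
        }
        return null;
    }
}
```

With a chain of TSNs 30 (uncommitted), 20, and 10, a reader at snapshot 25 gets the TSN 20 revision, a reader at snapshot 15 gets TSN 10, and the writing transaction itself (TSN 30) sees its own pending write. In this model each component type would hold its own chain, so a position write adds an entry to the position chain only.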

Indexed selective access. This is the big one. ECS systems iterate everything matching a component signature every tick. That works brilliantly for particle simulations where every particle needs updating. But game servers often need to touch only a small fraction of the matching entities on any given tick:

Scenario	Total Entities	Processed Per Tick	Useful Work
Battle royale (per-client relevancy)	50,000 actors	500–2,000	1–4%
MMO area of interest	100,000	200–1,000	0.2–1%
Physics (awake bodies only)	All rigidbodies	Awake subset	5–20%

When you're processing 1–4% of your entities, scanning everything is doing 25–100x more work than necessary. ECS frameworks recognized this — Unity DOTS added enableable components, Flecs added group_by, Unreal MassEntity added LOD tiers. These are all clever workarounds for the same underlying issue: ECS was designed for bulk iteration, not selective access.

Databases solved this with indexes. B+Trees for value-based lookups, spatial trees for area-of-interest queries, selectivity estimation to decide when to scan versus when to seek. Typhon brings these into the component storage model — not as bolted-on workarounds, but as first-class citizens.
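The scan-versus-seek distinction is easy to show in miniature. In this sketch a sorted array with binary search stands in for a B+Tree, which is a deliberate simplification, and `IndexDemo` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class IndexDemo
{
    // Full scan: touches every entity regardless of how few match.
    public static List<int> ScanAtLeast(int[] levelByEntity, int minLevel)
    {
        var hits = new List<int>();
        for (int id = 0; id < levelByEntity.Length; id++)
            if (levelByEntity[id] >= minLevel)
                hits.Add(id);
        return hits;
    }

    // Build a (level, entity) index sorted by level, then by entity.
    public static (int Level, int Entity)[] BuildIndex(int[] levelByEntity)
    {
        var index = new (int Level, int Entity)[levelByEntity.Length];
        for (int id = 0; id < levelByEntity.Length; id++)
            index[id] = (levelByEntity[id], id);
        Array.Sort(index);
        return index;
    }

    // Seek: binary search for the first key >= minLevel,
    // then read only the matching tail of the index.
    public static List<int> SeekAtLeast((int Level, int Entity)[] index, int minLevel)
    {
        int lo = 0, hi = index.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (index[mid].Level < minLevel) lo = mid + 1;
            else hi = mid;
        }

        var hits = new List<int>();
        for (int i = lo; i < index.Length; i++)
            hits.Add(index[i].Entity);
        return hits;
    }
}
```

The scan costs one comparison per entity. The seek costs a logarithmic search plus one step per match, which is where the 25-100x gap comes from when only 1-4% of entities qualify.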

Spatial partitioning. For spatial access patterns specifically — the #1 selective access need in game servers — Typhon integrates a two-layer spatial index directly into the component storage:

Layer 1: Sparse hash map — maps coarse grid cells to entity counts. O(1) rejection of empty regions before the tree is even touched.
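Layer 2: Page-backed R-Tree for AABB (axis-aligned bounding box), radius, ray, frustum, and kNN (k-nearest-neighbor) queries. Same OLC-latched (optimistic lock coupling), SoA (structure-of-arrays) node architecture as the B+Trees.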

Both layers run inside the same transactional model as everything else. No external spatial hash bolted on alongside your ECS. No cache locality destroyed by chasing pointers into a separate data structure.
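The first layer is simple enough to sketch. `CoarseGrid` below is a hypothetical 2D illustration of the idea, not Typhon's implementation: it counts entities per coarse cell so that a query over empty space can be rejected before any tree is consulted.

```csharp
using System;
using System.Collections.Generic;

public sealed class CoarseGrid
{
    private readonly float _cellSize;
    private readonly Dictionary<(int, int), int> _counts = new Dictionary<(int, int), int>();

    public CoarseGrid(float cellSize) => _cellSize = cellSize;

    // Floor (not truncation) so negative coordinates map to the correct cell.
    private (int, int) CellOf(float x, float y) =>
        ((int)MathF.Floor(x / _cellSize), (int)MathF.Floor(y / _cellSize));

    public void Add(float x, float y)
    {
        var cell = CellOf(x, y);
        _counts.TryGetValue(cell, out int n); // n stays 0 when the cell is new
        _counts[cell] = n + 1;
    }

    public void Remove(float x, float y)
    {
        var cell = CellOf(x, y);
        if (!_counts.TryGetValue(cell, out int n)) return;
        if (n <= 1) _counts.Remove(cell); // keep the map sparse
        else _counts[cell] = n - 1;
    }

    // True when at least one occupied cell overlaps the query box.
    // False means the more expensive tree query can be skipped.
    public bool MayContain(float minX, float minY, float maxX, float maxY)
    {
        var (cx0, cy0) = CellOf(minX, minY);
        var (cx1, cy1) = CellOf(maxX, maxY);
        for (int cx = cx0; cx <= cx1; cx++)
            for (int cy = cy0; cy <= cy1; cy++)
                if (_counts.ContainsKey((cx, cy)))
                    return true;
        return false;
    }
}
```

Each cell lookup is a constant-time hash probe. A query box that spans several cells costs one probe per covered cell, so the cell size should be chosen relative to typical query sizes.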

Durability. A game client can afford to lose state on crash — reload the level. A game server cannot. Player inventories, economy state, progression data — all must survive process restarts and crashes. WAL-based crash recovery, checkpointing, configurable fsync — these are database fundamentals that game servers need but ECS frameworks never provided.

Query planning. When you have both indexes and sequential storage, someone needs to decide which access path to use. Databases have decades of work on cost-based query optimization — selectivity estimation, histogram statistics, index selection. Typhon brings a query planner into the ECS world: given a predicate on a component field, it automatically chooses full scan or B+Tree seek based on estimated selectivity.
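A minimal sketch of selectivity-based path selection, under loud assumptions: the equi-width histogram, the `PlannerDemo` names, and the 10% threshold are all invented for illustration and say nothing about how Typhon's planner actually estimates cost.

```csharp
using System;

public enum AccessPath { FullScan, IndexSeek }

public static class PlannerDemo
{
    // Estimate the fraction of values >= threshold from an equi-width histogram.
    // bucketCounts[i] counts values in [min + i*width, min + (i+1)*width).
    public static double EstimateSelectivityAtLeast(int[] bucketCounts, double min, double max, double threshold)
    {
        long total = 0;
        foreach (int count in bucketCounts) total += count;
        if (total == 0) return 0.0;
        if (threshold <= min) return 1.0;
        if (threshold >= max) return 0.0;

        double width = (max - min) / bucketCounts.Length;
        double matching = 0;
        for (int i = 0; i < bucketCounts.Length; i++)
        {
            double lo = min + i * width;
            double hi = lo + width;
            if (threshold <= lo)
                matching += bucketCounts[i];                             // whole bucket matches
            else if (threshold < hi)
                matching += bucketCounts[i] * (hi - threshold) / width;  // assume uniform spread inside the bucket
        }
        return matching / total;
    }

    // Hypothetical rule: seek when the predicate keeps fewer than `seekBelow` of the rows.
    public static AccessPath Choose(double selectivity, double seekBelow = 0.10) =>
        selectivity < seekBelow ? AccessPath.IndexSeek : AccessPath.FullScan;
}
```

With bucket counts of 900, 50, 30, and 20 over the range 0 to 100, a predicate of "value >= 50" is estimated to keep 5% of rows and gets a seek, while "value >= 10" keeps an estimated 64% and gets a scan.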

Purpose-Built for Game Servers

Typhon doesn't glue ECS and database concepts together with duct tape. It synthesizes them into a single model designed for game server workloads.

A component in Typhon is simultaneously an ECS component and a database schema:

[Component]
public struct PlayerComponent
{
    [Field]
    public String64 Name;

    [Field]
    [Index]                    // B+Tree for fast lookups
    public int AccountId;

    [Field]
    public float Experience;
}

Blittable, unmanaged, fixed-size, stored contiguously per type — that's the ECS side. Typed fields with automatic B+Tree indexes on marked fields — that's the database side. One declaration, both worlds.

The query API makes the synthesis concrete:

var topPlayers = db.Query<Player>()
    .Where(p => p.Level >= 50)
    .OrderByDescending(p => p.Level)
    .Take(10)
    .ExecuteOrdered(tx);

ECS-style typed component access. Database-style predicate filtering with automatic index selection. Inside a transaction with snapshot isolation. The query planner chooses scan vs B+Tree based on selectivity — the developer doesn't have to.

And because game servers have different durability needs for different operations, Typhon lets you choose per unit of work:

// Position ticks: game-engine speed, batched durability
using var uow = dbe.CreateUnitOfWork(DurabilityMode.Deferred);

// Legendary item drop: database safety, immediate fsync
using var uow = dbe.CreateUnitOfWork(DurabilityMode.Immediate);

Same engine, same API. Deferred mode gives game-engine-class commit latency for position updates that can be re-simulated on crash. Immediate mode gives database-class guarantees for a transaction that grants a rare item worth real money. The game server decides per operation — not globally.
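What the two modes trade off can be modeled with a tiny append-only log. `TinyWal` and `FlushPolicy` are hypothetical names for this sketch. It shows only the flush decision and omits checksums, recovery, and everything else a real write-ahead log needs.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public enum FlushPolicy { Deferred, Immediate } // hypothetical names for this sketch

public sealed class TinyWal : IDisposable
{
    private readonly FileStream _file;

    // Records appended since the last flush to disk.
    public int UnflushedRecords { get; private set; }

    public TinyWal(string path) =>
        _file = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read);

    // Append a length-prefixed record. Immediate forces the log to disk before
    // returning; Deferred leaves the record buffered until the next Flush call.
    public void Append(ReadOnlySpan<byte> payload, FlushPolicy policy)
    {
        Span<byte> header = stackalloc byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(header, payload.Length);
        _file.Write(header);
        _file.Write(payload);
        UnflushedRecords++;

        if (policy == FlushPolicy.Immediate)
            Flush();
    }

    // Flush(true) also asks the OS to write its buffers through to the device.
    // Every record appended so far becomes durable, including earlier deferred ones.
    public void Flush()
    {
        _file.Flush(flushToDisk: true);
        UnflushedRecords = 0;
    }

    public void Dispose() => _file.Dispose();
}
```

The deferred path returns as soon as the bytes are buffered, so its cost is a memory copy. The immediate path pays for a device flush on every call, which is why it makes sense to reserve it for the operations that cannot be replayed.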

Storage Modes: Not All Data Is Equal

A game server doesn't treat all data the same. Player positions change 60 times per second and can be re-simulated on crash. Inventory mutations are rare but must never be lost. AI runtime state — current targets, threat scores, pathfinding waypoints — is recomputed every tick and worthless after a restart.

Traditional databases treat all data identically. Traditional ECS keeps everything in memory with no durability distinction. Typhon lets you choose per component type:

Mode	MVCC History	Persisted	Change Tracking	Best For
Versioned	Full revision chains	Yes (WAL + checkpoint)	Via MVCC	Inventory, economy, progression
SingleVersion	Current state only	Yes (WAL + checkpoint)	DirtyBitmap	Positions, health, frequently-updated state
Transient	Current state only	No	DirtyBitmap	AI blackboard, threat scores, pathfinding scratch

SingleVersion components skip the revision chain overhead entirely: no circular buffer, no per-write allocation. They track changes through a DirtyBitmap instead: one bit per entity, flipped on write and scanned at the tick fence. This is how game engines track what changed, and it's the right model for data that updates every tick.
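The bitmap technique itself fits in a few lines. `DirtyBits` below is a hypothetical, single-threaded illustration of one bit per entity slot. It is not Typhon's DirtyBitmap, and a concurrent version would need atomic updates.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public sealed class DirtyBits
{
    private readonly ulong[] _words;

    public DirtyBits(int entityCapacity) =>
        _words = new ulong[(entityCapacity + 63) / 64];

    // Called on every write: set the bit for this entity slot.
    // Marking the same slot twice is harmless.
    public void Mark(int entityIndex) =>
        _words[entityIndex >> 6] |= 1UL << (entityIndex & 63);

    // Called at the tick fence: return dirty slots in ascending order and clear them.
    public List<int> DrainDirty()
    {
        var dirty = new List<int>();
        for (int w = 0; w < _words.Length; w++)
        {
            ulong bits = _words[w];
            while (bits != 0)
            {
                int bit = BitOperations.TrailingZeroCount(bits);
                dirty.Add((w << 6) + bit);
                bits &= bits - 1; // clear the lowest set bit
            }
            _words[w] = 0;
        }
        return dirty;
    }
}
```

A write costs one OR into a 64-bit word, and the scan skips 64 clean entities per zero word, so the cost of the fence tracks the number of changes rather than the number of entities.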

Versioned components get full MVCC with snapshot isolation — readers see consistent historical state, writers don't block readers, conflicts are detected at commit time. This is how databases protect critical data, and it's the right model for things that must never be corrupted.

Transient components never touch disk at all — no WAL, no checkpoint, no recovery. Pure in-memory storage with the same query and indexing API as everything else. AI blackboard data that's recomputed every tick has no business paying persistence overhead.

The same engine, the same transaction API, but the storage layer does exactly what each component type needs. This is what "purpose-built for game servers" means in practice.

Views: The Bridge Between ECS Systems and Database Queries

In ECS, a "system" runs every tick, processing all matching entities. In a database, a "materialized view" maintains a cached result set and refreshes it incrementally. Typhon's Views are both:

using var view = db.Query<ItemData>()
    .Where(i => i.Rarity >= 3)
    .ToView();

// Game loop
while (running)
{
    using var tx = dbe.CreateQuickTransaction();
    view.Refresh(tx);  // Microsecond incremental refresh

    // React to changes — like an ECS system, but only for what changed
    var delta = view.GetDelta();
    foreach (var pk in delta.Added)   SpawnVisual(pk);
    foreach (var pk in delta.Removed) DespawnVisual(pk);
    foreach (var pk in delta.Modified) UpdateVisual(pk);
    view.ClearDelta();
}

The initial ToView() runs a full query. After that, Refresh() drains a lock-free ring buffer of changes pushed by the commit path — only entities whose indexed fields actually changed are re-evaluated. If 100,000 entities match your view but only 12 changed since last refresh, you do 12 evaluations, not 100,000.

This is the iterate-everything problem solved from the database side: don't re-scan, track deltas.
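Here is a small model of delta-driven refresh. It is a sketch under assumptions: `IncrementalView` is a hypothetical class, a `ConcurrentQueue` stands in for the lock-free ring buffer, and each entity carries a single integer field that the predicate tests.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class IncrementalView
{
    private readonly Func<int, bool> _predicate;
    private readonly HashSet<long> _members = new HashSet<long>();
    private readonly ConcurrentQueue<(long Entity, int NewValue)> _changes =
        new ConcurrentQueue<(long Entity, int NewValue)>();

    public List<long> Added { get; } = new List<long>();
    public List<long> Removed { get; } = new List<long>();
    public List<long> Modified { get; } = new List<long>();

    public int Evaluations { get; private set; } // predicate calls made by Refresh
    public int Count => _members.Count;

    // The one full pass: evaluate the predicate against every entity.
    public IncrementalView(Func<int, bool> predicate, IEnumerable<(long Entity, int Value)> initial)
    {
        _predicate = predicate;
        foreach (var (entity, value) in initial)
            if (predicate(value))
                _members.Add(entity);
    }

    // Called by the commit path for each entity whose field changed.
    public void NotifyChanged(long entity, int newValue) =>
        _changes.Enqueue((entity, newValue));

    // Re-evaluate only the queued changes and classify each one.
    public void Refresh()
    {
        while (_changes.TryDequeue(out var change))
        {
            Evaluations++;
            bool was = _members.Contains(change.Entity);
            bool now = _predicate(change.NewValue);

            if (now && !was) { _members.Add(change.Entity); Added.Add(change.Entity); }
            else if (!now && was) { _members.Remove(change.Entity); Removed.Add(change.Entity); }
            else if (now) Modified.Add(change.Entity);
        }
    }

    public void ClearDelta()
    {
        Added.Clear();
        Removed.Clear();
        Modified.Clear();
    }
}
```

The `Evaluations` counter makes the cost visible: after the initial pass it grows by the number of queued changes per refresh, never by the size of the view.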

Trade-offs

Specializing for game servers means giving things up.

Blittable components only. No string, no object references, no variable-length arrays inside components. Text uses fixed-size types like String64. This is the price of zero-copy reads and cache-friendly storage — and it's a constraint game developers are already familiar with from ECS frameworks.
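C# can enforce most of this constraint on its own. The sketch below uses the `unmanaged` generic constraint, which is the closest compile-time check the language offers, plus a runtime check from `RuntimeHelpers`. `ComponentRules`, `Health`, and `BadComponent` are hypothetical names, and how Typhon itself validates components is not shown here.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Health
{
    public int Current;
    public int Max;
}

// Not storable under this rule: the string field is a managed reference.
public struct BadComponent
{
    public string Name;
}

public static class ComponentRules
{
    // The `unmanaged` constraint makes the compiler reject any struct that
    // contains references, so SizeOf<BadComponent>() does not compile.
    public static int SizeOf<T>() where T : unmanaged => Unsafe.SizeOf<T>();

    // Runtime equivalent for code that only has an unconstrained T.
    public static bool IsStorable<T>() =>
        !RuntimeHelpers.IsReferenceOrContainsReferences<T>();
}
```

A fixed, known size per component type is what lets storage compute an element's address by multiplication instead of following a pointer.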

Entity-centric relationships, not SQL JOINs. Typhon supports navigation links, 1:N and N:M relationships — but they follow entity references, closer to a graph database than a traditional SQL one. This matches how game servers naturally think about data (an entity has components, a guild contains members), but if your mental model is SELECT ... FROM a JOIN b ON a.x = b.y, it's a different paradigm.

Schema in code, not SQL. Components are C# structs with attributes, not DDL statements. Natural for game developers, unfamiliar territory for database administrators. If your team thinks in SQL, this is a paradigm shift.

What's Next

In the next post, I'll go deeper into the performance philosophy that makes all of this actually fast — data-oriented design, cache-line awareness, and zero-allocation hot paths. The principles that let a managed language hit microsecond-latency transactions.

If you want to follow along, the best way is to star the repo or subscribe to the RSS feed.
