TechLogStack

Posted on May 20 • Originally published at techlogstack.com on May 18

Figma's Database Grew 100x in Four Years — Here's How a Small Team Kept It From Toppling

#webdev #backend #architecture #database

100× database growth since 2020 — single Postgres instance to horizontal shards
Billions of rows per table; some tables at several terabytes
Postgres vacuum falling behind on largest tables — reliability incidents already occurring
9 months — migration timeline from design to production completion
DBProxy — custom Go query proxy with AST parser, built in-house
Zero-downtime logical → physical sharding; rollback possible even after physical split

In 2020, Figma ran on a single Postgres instance on AWS's largest available machine. Four years later, that database had grown nearly 100×. Some tables had swelled to several terabytes and billions of rows. The Postgres vacuum process was causing reliability incidents. They had months of runway left before hitting the IOPS ceiling. A small databases team had nine months to fix it.

The Story

We needed a bigger lever.

— Sammy Steele, Tech Lead, Figma Databases Team, via Figma Engineering Blog

Figma's database story follows a pattern familiar to every fast-growing product company, but with stakes that were unusually high and a timeline unusually compressed. In 2020, Figma ran on a single Postgres database on AWS's largest available RDS instance. By end of 2022, the team had done what most scaling playbooks suggest first: add read replicas, add a connection pooler (PgBouncer — a lightweight PostgreSQL connection pooler that sits between application code and the database, multiplexing many application connections down to a smaller pool of real connections), and vertically partition (splitting a single database into multiple smaller databases each containing a logical group of related tables — one database for file data, another for organisation data) into a dozen domain-specific shards. These steps bought runway. They did not buy enough runway.

Certain tables — the ones tracking Figma files, user activity, and collaboration state — were growing at rates that would soon exceed what RDS could support in IOPS. Some already contained several terabytes and billions of rows. At that size, Postgres's vacuum process (a critical background maintenance operation in Postgres that reclaims storage from deleted rows and prevents the database from running out of 32-bit transaction IDs — if vacuuming falls behind, it can cause severe performance degradation and, in extreme cases, force the database into read-only emergency mode) was beginning to cause reliability incidents — falling behind on the largest tables, unable to reclaim space fast enough to keep up with write volume. Vertical partitioning couldn't fix this: the smallest unit is a single table, and individual tables were the problem.

The Vacuum Problem at Scale

Postgres must periodically vacuum tables to reclaim space from deleted and updated rows. On tables with billions of rows and high write rates, the vacuum process can fall behind the rate of new writes — leading to bloated tables, degraded query plans, and in extreme cases the risk of transaction ID wraparound, a catastrophic condition where Postgres enters read-only emergency mode to prevent data corruption. Figma was seeing the early signs of this at scale. This was the alarm that confirmed horizontal sharding was urgent, not optional.

Problem

IOPS Ceiling and Vacuum Incidents Converge

By late 2022, Figma's largest tables had grown to several terabytes with billions of rows, and Postgres vacuum was causing reliability incidents on the highest-write tables. Projections showed they would exceed RDS maximum IOPS within months. Vertical partitioning — splitting databases by domain — couldn't help because individual tables were the bottleneck.

Cause

Single-Table Ceiling: The Vertical Partitioning Limit

Horizontal sharding (splitting a single large table's rows across multiple physical database instances based on a shard key — allowing any individual table to grow beyond the limits of a single machine) was the only viable path. But implementing it on a complex relational data model, with hundreds of engineers writing queries, required solving three hard problems simultaneously: routing queries correctly, maintaining developer productivity, and enabling rollback if something went wrong.

Solution

Colos + Logical Sharding + DBProxy

The team invented three interlocking abstractions: colos (colocation groups of related tables sharing a shard key), logical sharding via Postgres views (enabling safe percentage-based rollout without moving any data), and DBProxy (a custom Go query proxy with an AST parser routing queries to the correct physical shard). Together these allowed incremental, reversible rollout of horizontal sharding without disrupting product development.

Result

Nine Months, Near-Infinite Scalability

The migration completed in nine months with zero downtime and the ability to roll back at any step. Future shard splits at the physical level are now transparent to application developers — after the initial upfront work to make a table compatible with horizontal sharding, all subsequent scale-outs happen in the infrastructure layer without any product team involvement.

The Fix

The Three-Layer Solution

Figma's horizontal sharding solution had three distinct architectural components. Colos (colocations) were the conceptual layer — groups of related tables sharing the same sharding key and physical shard layout. Tables within a colo could be joined and queried transactionally as long as the join was on the sharding key. The sharding keys were drawn from a small set: user_id, file_id, or org_id — most tables at Figma could be naturally associated with one of these. Logical sharding was the rollout layer — using Postgres views to simulate sharding behaviour without moving any data. DBProxy was the execution layer — intercepting queries, parsing them into an AST, determining which logical shard the query targeted, and routing it to the appropriate physical database.

100× — database growth since 2020; the scale that made vertical partitioning insufficient
9 months — total migration timeline from design to production completion
90% — query coverage targeted by DBProxy's query engine; the pragmatic threshold that kept the proxy simple while covering the vast majority of production access patterns
0 — application layer changes required for future shard splits; after initial table compatibility work, all subsequent scale-outs are transparent to product engineers

-- Logical sharding via Postgres views: the key insight
-- No data moves during the logical sharding phase.
-- Tables behave as if sharded — but it's just views on the same physical table.

-- Physical table still holds all data:
-- CREATE TABLE figma_files (file_id UUID, org_id UUID, data JSONB, ...)

-- Logical shards created as views filtered by hash range:
CREATE VIEW figma_files_shard_0 AS
  SELECT * FROM figma_files
  WHERE hashtext(file_id::text) % 4 = 0;

CREATE VIEW figma_files_shard_1 AS
  SELECT * FROM figma_files
  WHERE hashtext(file_id::text) % 4 = 1;

-- Postgres views accept both reads AND writes:
-- INSERT INTO figma_files_shard_0 (file_id, data) VALUES (...);
-- → Postgres routes to the underlying table
-- → DBProxy validates the shard key is in the correct range

-- Rollback during logical phase: change the view definition, flip config, done.
-- No data to move back. Zero risk.

-- Physical sharding (Phase 2):
-- Data is ACTUALLY moved to separate RDS instances per shard.
-- DBProxy routing stays the same — application code unchanged.
-- Rollback: re-point physical shard back to original instance.

DBProxy: The Query Engine That Made It Transparent

DBProxy is a Go service sitting between the application layer and PgBouncer. Its query engine has three components: a query parser that transforms SQL into an AST, a logical planner that extracts query type and shard IDs from the AST, and a physical planner that maps logical shard IDs to physical database instances and rewrites queries accordingly. DBProxy also handles scatter-gather queries (fanning out to all shards and aggregating results), dynamic load-shedding, and database topology management. Building it took months — but it was the only way to make sharding transparent to application developers without requiring rewrites across hundreds of engineers' code.

Why not CockroachDB, TiDB, Spanner, or Vitess?

Figma's databases team evaluated all the obvious alternatives before committing to building their own sharding layer. CockroachDB, TiDB, Google Spanner, and Vitess were all on the list. All were rejected for the same core reason: switching to any of them would have required a complex data migration across two different database stores simultaneously — under a months-long deadline. The team had also accumulated significant operational expertise running RDS Postgres. That expertise would have to be rebuilt from scratch for any new system. They chose to build horizontal sharding on top of existing RDS Postgres infrastructure — not a generic solution, but one scoped precisely to Figma's data model and access patterns.

The shadow planning framework: designing DBProxy for real queries

Before building DBProxy's query engine, the team built a shadow planning framework that let engineers define potential sharding schemes for their tables, then ran those plans against live production traffic — logging queries and plans to Snowflake for offline analysis. This gave empirical data to design a query language covering the most common 90% of queries while deliberately excluding the rare worst-case patterns that would have made DBProxy impossibly complex. The 10% of queries excluded from support were known and manageable, not discovered as surprises in production.

Figma's three-phase database scaling journey:

Phase	Architecture	Bottleneck Addressed	Runway Gained
2020	Single RDS Postgres instance	Initial growth	Moderate
2021–2022	Vertical partitioning (12 domain DBs) + read replicas + PgBouncer	CPU, read load, connection pool	~1 year
2023–2024	Horizontal sharding via colos + DBProxy + logical/physical migration	Table-level IOPS ceiling, vacuum backlog, billions-of-row tables	Near-infinite

Architecture

The before-state had application services talking directly to PgBouncer, which connected to RDS Postgres. Vertical partitioning meant multiple databases, but each was still a single physical instance — and the largest individual tables had no mechanism to distribute rows across instances. DBProxy was inserted between the application and PgBouncer layers, adding query parsing and routing intelligence that made horizontal sharding possible without requiring application code changes.

Before Horizontal Sharding: Vertical Partitions Only

View interactive diagram on TechLogStack →

Interactive diagram available on TechLogStack (link above).

After: DBProxy + Logical/Physical Horizontal Sharding

View interactive diagram on TechLogStack →

Interactive diagram available on TechLogStack (link above).

Colos: The Developer-Facing Abstraction

The colo concept is what made horizontal sharding usable for product engineers. A colo is a named group of tables sharing a sharding key — for example, a files_colo contains figma_files, file_nodes, file_comments, and other tables all sharded by file_id. Within a colo, cross-table joins and full transactions are supported when restricted to a single shard key value. This matches how Figma's application code already accessed the database — most operations concerned a single file or a single user, not cross-colo data. The colo abstraction minimised the number of queries that needed to be refactored.

Scatter-gather: the necessary evil

Some queries don't have a shard key — a query like "get all recently modified files for an admin dashboard" has no natural file_id scope. DBProxy handles these with scatter-gather: fan the query out to every shard in parallel, collect results, merge and sort. It works correctly but is expensive — it multiplies read load by the number of shards. Figma's engineering team was explicit with product engineers about the scatter-gather tax, encouraging them to refactor access patterns before their tables were sharded. The shadow planning data showed exactly which queries would become scatter-gather — engineering teams had weeks to fix them before cutover.

Lessons

Separate logical sharding from physical sharding. Implementing sharding routing behaviour at the application layer — using views, config, or a proxy — before moving any physical data gives you weeks of production validation at essentially zero risk. When you flip to physical sharding, the routing is already proven correct. This two-phase approach is the biggest risk reducer in a horizontal sharding migration.
Colocations (groups of related tables sharing the same sharding key and physical shard layout, allowing cross-table joins and transactions within the group) are the abstraction that makes sharding survivable for product engineers. Without colos, horizontal sharding forces engineers to think about shard routing on every database query. With colos, most queries just work as they always did.
Use shadow traffic to define your query language before building your proxy. Figma's shadow planning framework let them empirically measure which query patterns existed in production before designing DBProxy. The proxy was built for real queries, not imagined ones — and the 10% excluded from support were known and manageable.
Know when existing tools don't fit your timeline. Figma evaluated CockroachDB, Spanner, TiDB, and Vitess — all good systems. They chose to build something custom because the migration risk to an unfamiliar storage layer under a months-long deadline was genuinely higher than building a scoped custom solution on their existing Postgres expertise. The build-vs-buy decision was made with real risk data, not intuition.
Design for rollback even after the migration completes. Figma maintained the ability to reverse physical shard splits after they happened. The unknown unknowns in a horizontal sharding migration are real — building in a reverse path at every phase is the engineering discipline that lets teams execute confidently rather than hold their breath.

Engineering Glossary

Colo (colocation group) — a named group of related tables that share the same sharding key and physical shard layout. Tables within a colo can be joined and queried transactionally as long as the join is on the sharding key. The developer-facing abstraction that makes horizontal sharding survivable without requiring query rewrites across the entire codebase.

DBProxy — Figma's custom Go query proxy with an AST parser, logical planner, and physical planner. Sits between the application layer and PgBouncer, intercepting queries, determining which logical shard they target, and routing them to the appropriate physical RDS instance. Built in-house because existing tools didn't fit Figma's query patterns and deadline.

Horizontal sharding — splitting a single large table's rows across multiple physical database instances based on a shard key. Allows any individual table to grow beyond the limits of a single machine. Contrasted with vertical partitioning, which splits tables across databases but leaves individual tables on single instances.

Logical sharding — implementing sharding routing behaviour at the application layer — via Postgres views, config, or proxy — without moving any physical data. Allows safe percentage-based rollout and instant rollback. The first phase of Figma's two-phase migration approach.

PgBouncer — a lightweight PostgreSQL connection pooler that sits between application code and the database, multiplexing many application connections down to a smaller pool of real database connections. Reduces connection overhead significantly at scale.

Physical sharding — actually moving data to separate database instances per shard. High risk; only attempted after logical sharding has been validated in production. In Figma's system, physical shard splits are reversible — physical shards can be re-pointed back to the original instance.

Scatter-gather — a query execution pattern where a query without a shard key is fanned out to all shards in parallel, with results collected and merged. Correct but expensive — multiplies read load by the number of shards. Should be minimised before sharding migration by refactoring access patterns.

Vacuum process (Postgres) — a critical background maintenance operation in Postgres that reclaims storage from deleted and updated rows. On tables with billions of rows and heavy write rates, vacuum can fall behind, causing bloated tables, degraded query plans, and in extreme cases transaction ID wraparound — a database-wide read-only emergency mode.

Vertical partitioning — splitting a single database into multiple smaller databases, each containing a logical group of related tables. Addresses cross-domain coupling and overall database size, but cannot help when individual tables are the bottleneck.

This case is a plain-English retelling of publicly available engineering material.

Read the full case on TechLogStack →

(Interactive diagrams, source links, and the full reader experience)

TechLogStack — built at scale, broken in public, rebuilt by engineers.

DEV Community