Figma · Databases · 18 May 2026
In 2020, Figma ran on a single Postgres instance on AWS's largest available machine. Four years later, that database had grown nearly 100x. Some tables had swelled to several terabytes and billions of rows. The Postgres vacuum process — the background job that keeps Postgres alive — was causing reliability incidents. They had months of runway left before hitting the IOPS ceiling. A small databases team had nine months to fix it.
- 100x DB growth since 2020
- Single instance → horizontal shards
- 9-month migration
- Billions of rows per table
- DBProxy built in Go
- Zero-downtime logical → physical sharding
The Story
We needed a bigger lever.
— — Sammy Steele, Tech Lead — Figma Databases Team, via Figma Engineering Blog
Figma's database story follows a pattern familiar to every fast-growing product company, but with stakes that were unusually high and a timeline that was unusually compressed. In 2020, Figma ran on a single Postgres database on AWS's largest available RDS instance. By the end of 2022, the team had done what most scaling playbooks suggest first: add read replicas, add a connection pooler (PgBouncer (a lightweight PostgreSQL connection pooler that sits between application code and the database, multiplexing many application connections down to a smaller pool of real database connections — reducing connection overhead significantly)), and vertically partition (splitting a single database into multiple smaller databases, each containing a logical group of related tables — for example, one database for Figma files data, another for organization data) the database into a dozen domain-specific shards. These steps bought them runway. They did not buy them enough runway.
The data was unambiguous. Certain tables — the ones tracking Figma files, user activity, and collaboration state — were growing at rates that would soon exceed what Amazon RDS could support in IOPS. Some of these tables already contained several terabytes and billions of rows. At that size, Postgres's vacuum process (a critical background maintenance operation in Postgres that reclaims storage from deleted rows and prevents the database from running out of 32-bit transaction IDs — if vacuuming falls behind, it can cause severe performance degradation and, in extreme cases, force the database offline) was beginning to cause reliability incidents — it was falling behind on the largest tables, unable to reclaim space fast enough to keep up with write volume. Vertical partitioning couldn't fix this: the smallest unit of vertical partitioning is a single table, and these individual tables were the problem.
THE VACUUM PROBLEM AT SCALE
Postgres must periodically vacuum tables to reclaim space from deleted and updated rows. This is not optional — if a table accumulates too many dead tuples, query performance degrades severely. On tables with billions of rows and high write rates, the vacuum process can fall behind the rate of new writes. When this happens, the database starts showing reliability symptoms: bloated tables, degraded query plans, and in extreme cases the risk of transaction ID wraparound — a catastrophic condition that forces Postgres into read-only emergency mode. Figma was seeing the early signs of this at scale.
Why Not CockroachDB, TiDB, Spanner, or Vitess?
Figma's databases team evaluated every obvious alternative before committing to building their own horizontal sharding layer. CockroachDB, TiDB, Google Spanner, and Vitess were all on the list. All were rejected for the same core reason: switching to any of them would have required a complex data migration across two different database stores simultaneously. With only months of runway remaining before hitting critical IOPS limits, a migration to an unfamiliar storage layer under deadline pressure was a risk the team couldn't accept. They had also accumulated significant operational expertise running RDS Postgres. That expertise would have to be rebuilt from scratch for any new system. The team instead chose to build horizontal sharding on top of their existing RDS Postgres infrastructure — not a generic solution, but one scoped precisely to Figma's data model and access patterns.
Problem
IOPS Ceiling and Vacuum Incidents Converge
By late 2022, Figma's largest tables had grown to several terabytes with billions of rows, and the Postgres vacuum process was causing reliability incidents on the highest-write tables. Projections showed the team would exceed RDS maximum IOPS within months. Vertical partitioning — splitting databases by domain — could not help because individual tables were the bottleneck, not cross-domain coupling.
Cause
Single-Table Ceiling: The Vertical Partitioning Limit
Horizontal sharding (splitting a single large table's rows across multiple physical database instances based on a shard key — allowing any individual table to grow beyond the limits of a single machine) was the only viable path. But implementing it on a complex relational data model, with hundreds of engineers writing queries, required solving three hard problems simultaneously: routing queries correctly, maintaining developer productivity, and enabling rollback if something went wrong.
Solution
Colos + Logical Sharding + DBProxy
The team invented three interlocking abstractions: 'colos' (colocation groups of related tables sharing a shard key), logical sharding via Postgres views (which allowed safe percentage-based rollout without moving any data), and DBProxy (a custom Go query proxy with an AST parser that routed queries to the correct physical shard). Together these allowed incremental, reversible rollout of horizontal sharding without disrupting product development.
Result
Nine Months, Nearly Infinite Scalability
The migration completed in nine months with zero downtime and the ability to roll back at any step. Future shard splits at the physical level are now transparent to application developers — after the initial upfront work to make a table compatible with horizontal sharding, all subsequent scale-outs happen in the infrastructure layer without any product team involvement.
🪄
The most elegant part of Figma's sharding approach was using standard Postgres views to implement logical sharding. A view like
CREATE VIEW table_shard1 AS SELECT * FROM table WHERE hash(shard_key) BETWEEN min AND maxlets Postgres behave as if data is already sharded — without any data moving. This made the logical sharding phase essentially free to roll back: change the view definition, flip the config, done.ℹ️
The Shadow Planning Framework
Before building DBProxy's query engine, the team needed to know which queries to support. They built a shadow planning framework that let engineers define potential sharding schemes for their tables, then ran those plans against live production traffic — logging the queries and plans to Snowflake for offline analysis. This gave them empirical data to design a query language covering the most common 90% of queries while deliberately excluding the rare worst-case patterns that would have made DBProxy impossibly complex.
The constraints the team placed on their query language were deliberate and principled. All range scan and point queries were supported. Cross-table joins were only allowed when both tables belonged to the same colo and the join was on the sharding key. Scatter-gather queries — those that must fan out to all shards because they lack a shard key — were supported but their use was actively discouraged because each scatter-gather effectively multiplies database load by the shard count. Application developers were encouraged to refactor scatter-gather access patterns before sharding their tables, using the shadow planning data to understand which of their queries fell into this category.
⚠️
The Five Goals They Refused to Compromise
Figma's team defined five non-negotiables before writing a line of sharding code: minimize developer impact (product engineers shouldn't need to rewrite queries), scale-out transparency (future shard splits invisible to application layer), no expensive backfills (no solution requiring moving terabytes before going live), incremental progress (percentage-based rollout at every step), and rollback at any stage — even after physical sharding. Every architectural decision was measured against these five goals.
ℹ️
The Postgres Vacuum Threat Nobody Talks About
Postgres uses a 32-bit transaction counter. Every write increments it. If the database ever gets close to the maximum 2^31 transactions without vacuuming reclaimed space, Postgres enters a read-only emergency mode called transaction ID wraparound — a database-wide shutdown to prevent data corruption. On tables with billions of rows and heavy write rates, falling behind on vacuuming is not a theoretical risk. Figma was experiencing real reliability incidents from vacuum lag on their largest tables. This was the alarm that confirmed horizontal sharding was urgent, not optional.
The Fix
The Three-Layer Solution
Figma's horizontal sharding solution had three distinct architectural components that worked together. Colos (colocations) were the conceptual layer — groups of related tables that shared the same sharding key and physical shard layout. Tables within a colo could be joined and queried transactionally as long as the join was on the sharding key. The sharding keys were chosen from a small set: user_id, file_id, or org_id — most tables at Figma could be naturally associated with one of these. Logical sharding was the rollout layer — using Postgres views to simulate sharding behavior without moving any data. DBProxy was the execution layer — intercepting queries, parsing them into an AST, determining which logical shard the query targeted, and routing it to the appropriate physical database.
- 100x — Database growth since 2020 — the scale that made vertical partitioning insufficient and horizontal sharding the only viable path forward
- 9 months — Total migration timeline from design to production completion — achieved with a small team under aggressive growth-driven deadline pressure
- 90% — Query coverage targeted by DBProxy's query engine — the pragmatic threshold that kept the proxy simple while covering the vast majority of production access patterns
- 0 — Application layer changes required for future shard splits — after initial table compatibility work, all subsequent scale-outs are transparent to product engineers
-- Logical sharding via Postgres views: the key insight
-- No data moves during logical sharding phase.
-- Tables behave as if sharded — just views on the same physical table.
-- Single physical table still holds all data:
-- CREATE TABLE figma_files (file_id UUID, org_id UUID, data JSONB, ...)
-- Logical shards created as views filtered by hash range:
CREATE VIEW figma_files_shard_0 AS
SELECT * FROM figma_files
WHERE hashtext(file_id::text) % 4 = 0;
CREATE VIEW figma_files_shard_1 AS
SELECT * FROM figma_files
WHERE hashtext(file_id::text) % 4 = 1;
-- Views accept both reads AND writes in Postgres:
-- INSERT INTO figma_files_shard_0 (file_id, data) VALUES (...);
-- → Postgres routes to the underlying table
-- → DBProxy validates the shard key is in the correct range
-- Physical sharding later:
-- Data is ACTUALLY moved to separate RDS instances per shard
-- DBProxy routing stays the same — application code unchanged
-- Rollback: re-point physical shard back to original instance
DBPROXY: THE QUERY ENGINE
DBProxy is a Go service sitting between the application layer and PgBouncer. Its query engine has three components: a query parser that transforms SQL into an AST, a logical planner that extracts query type and shard IDs from the AST, and a physical planner that maps logical shard IDs to physical database instances and rewrites queries accordingly. DBProxy also handles scatter-gather queries (fanning out to all shards and aggregating results), dynamic load-shedding, improved observability, and database topology management. Building it took months — but it was the only way to make sharding transparent to application developers.
ℹ️
Logical Before Physical: The Two-Phase Rollout
Figma's key migration insight was separating logical sharding from physical sharding. Logical sharding (Phase 1) makes the application behave as if tables are sharded — using views, updating DBProxy config — but all data still lives in one physical database. This can be rolled out as a percentage-based config change, validated against production traffic, and rolled back instantly. Physical sharding (Phase 2) actually moves data to separate RDS instances. Much higher risk — but by this point, the logical layer has been running in production for weeks, bugs are fixed, and the team has empirical confidence in the sharding correctness.
✅
The Rollback Guarantee: Even After Physical Sharding
Most horizontal sharding implementations are one-way migrations — once data is on separate physical instances, rolling back requires a complex reverse migration. Figma's team designed their system so that physical shard splits are reversible. They maintained the ability to point physical shards back to the original database instance while the new routing logic was validated. This reduced the risk of being stuck in a bad state when unknown unknowns inevitably occurred.
Figma's Three-Phase Database Scaling Journey: Before and After
| Phase | Architecture | Bottleneck Addressed | Runway Gained |
|---|---|---|---|
| 2020 | Single RDS Postgres instance | Initial growth | Moderate |
| 2021–2022 | Vertical partitioning (12 domain DBs) + read replicas + PgBouncer | CPU, read load, connection pool | ~1 year |
| 2023–2024 | Horizontal sharding via colos + DBProxy + logical/physical migration | Table-level IOPS ceiling, vacuum backlog, billions-of-row tables | Near-infinite scalability |
🔀
Scatter-Gather: The Necessary Evil
Some queries don't have a shard key — a query like 'get all recently modified files for an admin dashboard' has no natural file_id scope. DBProxy handles these with scatter-gather: fan the query out to every shard in parallel, collect results, merge and sort. It works correctly but is expensive. Figma's engineering team was explicit with product engineers about the scatter-gather tax , encouraging them to refactor access patterns before their tables were sharded. The shadow planning data showed exactly which queries would become scatter-gather — engineering teams had weeks to fix them before cutover.
Architecture
The before-state of Figma's architecture had application services talking directly to PgBouncer, which connected to RDS Postgres. Vertical partitioning meant multiple databases, but each database was still a single physical instance — and the largest individual tables still had no mechanism to distribute their rows across instances. DBProxy was inserted between the application and PgBouncer layers, adding the query parsing and routing intelligence that made horizontal sharding possible without requiring application code changes.
Before Horizontal Sharding: Vertical Partitions Only
View interactive diagram on TechLogStack →
Interactive diagram available on TechLogStack (link above).
After: DBProxy + Logical/Physical Horizontal Sharding
View interactive diagram on TechLogStack →
Interactive diagram available on TechLogStack (link above).
COLOS: THE DEVELOPER-FACING ABSTRACTION
The colo concept is what made horizontal sharding usable for product engineers. A colo is a named group of tables that share a sharding key — for example, the
files_colocontainsfigma_files,file_nodes,file_comments, and other tables all sharded byfile_id. Within a colo, cross-table joins and full transactions are supported when restricted to a single shard key value. This matches how Figma's application code already accessed the database — most operations concerned a single file or a single user, not cross-colo data. The colo abstraction minimized the number of queries that needed to be refactored.⚠️
The Scatter-Gather Tax
Queries without a shard key — those that need results from all shards — are handled by DBProxy's scatter-gather mechanism: the query is fanned out to all shards in parallel and results are merged. Scatter-gather is correct but expensive: it multiplies read load by the number of shards. Having too many scatter-gather queries would defeat the purpose of horizontal sharding. The shadow planning framework specifically identified scatter-gather patterns in the production query log before sharding, allowing teams to refactor the most frequent offenders before their tables were migrated.
✅
DBProxy: Six Months to Build, Indefinite Value
Building DBProxy — the Go service with an AST parser, logical planner, and physical planner — was the highest-risk engineering bet in the sharding project. It took months to build and required solving problems that existing tools had already solved in different ways. But the payoff was precise control: DBProxy understands Figma's specific query patterns , supports exactly the subset of SQL that Figma uses, and can be extended as Figma's needs evolve. A generic proxy would have required adapting Figma's code to its limitations. DBProxy was adapted to Figma's code.
Lessons
Figma's sharding story is widely cited because it did something genuinely hard — horizontally sharding a complex relational production database under deadline pressure — and documented the architecture decisions clearly enough for other teams to learn from. The lessons are about sequencing, abstraction, and the courage to build something custom when existing tools genuinely don't fit.
- 01. Separate logical sharding from physical sharding. Implementing sharding routing behavior at the application layer — using views, config, or a proxy — before moving any physical data gives you weeks of production validation at essentially zero risk. When you flip to physical sharding, the routing is already proven correct. This two-phase approach is the biggest risk reducer in a horizontal sharding migration.
- 02. Colocations (groups of related tables that share the same sharding key and physical shard layout, allowing cross-table joins and transactions within the group) are the abstraction that makes sharding survivable for product engineers. Without colos, horizontal sharding forces engineers to think about shard routing on every database query. With colos, most queries just work as they always did.
- 03. Use shadow traffic to define your query language before building your proxy. Figma's shadow planning framework let them empirically measure which query patterns existed in production before designing DBProxy. This meant the proxy was built for real queries, not imagined ones — and the 10% of queries excluded from support were known and manageable, not discovered as surprises in production.
- 04. Know when existing tools don't fit your timeline. Figma evaluated CockroachDB, Spanner, TiDB, and Vitess — all good systems. They chose to build something custom not out of arrogance but because the migration risk to an unfamiliar storage layer under a months-long deadline was genuinely higher than building a scoped custom solution on their existing Postgres expertise. The build-vs-buy decision was made with real risk data, not intuition.
- 05. Design for rollback even after the migration completes. Figma maintained the ability to reverse physical shard splits after they happened. The unknown unknowns in a horizontal sharding migration are real — building in a reverse path at every phase is the engineering discipline that lets teams execute confidently rather than hold their breath.
✅
The Post-Migration State: Scale Without Change
After the initial work to make a table horizontal-sharding-compatible, all future shard splits happen transparently. As a table grows again toward limits, the infrastructure team can split shards — updating the physical topology and DBProxy's routing config — without any product engineer touching their code. This is the payoff of the upfront investment: the database can now scale indefinitely at the infrastructure layer, decoupled from the application layer.
WHEN MONTHS OF RUNWAY MEANS NOW
Figma's team framed their problem as 'months of runway remaining' — meaning that if they did nothing, they would hit a hard scaling ceiling and likely experience database reliability incidents or outages within months. This framing was not catastrophizing; it was the math of their growth rate applied to their IOPS limit. The urgency drove the decision to build custom rather than migrate to an unfamiliar system. Teams facing similar trajectory should run this calculation early — months of runway sounds like plenty of time until the migration itself takes several months.
Figma's database grew 100x and a small team fixed it in nine months — which is either very good database engineering or very good use of Postgres views depending on who you ask, and the answer is both.
TechLogStack — built at scale, broken in public, rebuilt by engineers
This case is a plain-English retelling of publicly available engineering material.
Read the full case on TechLogStack → (interactive diagrams, source links, and the full reader experience).
Top comments (0)