- 2 years development time — from first internal prototype to full customer rollout
- Workspace-centric → org-wide — every DM, notification, search result, and unread count
- Thousands of APIs refactored to support org-level data access
- Rollout Sep 2023 → Mar 2024 — six-month controlled rollout after two years of development
- 3 features unlocked that were architecturally impossible before: unified DMs, org-wide Activity, Save it for Later
- Built within the existing Hack/PHP monolith, with no microservices extraction required
Slack was built for teams in single workspaces. Enterprise customers were using it across dozens of workspaces simultaneously — and the architecture had never been designed for that. Every major enterprise feature was a workaround on top of a foundation that assumed one workspace per person. Slack spent two years rebuilding the foundation.
The Story
All software is built atop a core set of assumptions. As new code is added and new use-cases emerge, software can become unmoored from those assumptions. When this happens, a fundamental tension arises between revisiting those foundational assumptions — which usually entails a lot of work — or trying to support new behavior atop the existing architecture.
— Slack Engineering, via 'Unified Grid: How We Re-Architected Slack for Our Largest Customers'
Slack launched in 2013 with a beautifully simple data model: users belong to workspaces, workspaces contain channels, channels contain messages. For small teams using a single workspace, this model was perfect. For large enterprises that had grown to 50, 100, or 200 workspaces across departments, geographies, and business units, it was a prison. Every DM, every notification, every unread count, every search result was siloed by workspace. A VP with access to 80 workspaces had to remember which workspace a conversation was in, click to it, check notifications, return, and repeat — dozens of times per day.
Slack's team had been papering over the workspace-centric limitation for years with increasingly complex workarounds. Slack Connect (which lets users in different Slack organisations message each other across workspace boundaries, and was built as an overlay on the workspace model), multi-workspace management tools, and org-wide settings all added complexity without fixing the fundamental architecture. The CTO and engineering leadership faced a classic build-it-now-or-keep-patching decision. They chose to build. The project was called Unified Grid, and it would require rebuilding the core data model, refactoring thousands of APIs, and redesigning both the backend and every client application at the same time.
Problem
Enterprise Users Drowning in Workspace Context-Switches
Slack's workspace-centric model forced enterprise users to manually navigate between dozens of workspaces to find conversations and check notifications. Key features like a unified DM inbox, an org-wide activity feed, and cross-workspace search were impossible within the existing architecture. They were not merely missing features; they were architecturally blocked.
Cause
The Foundation Assumption Was Wrong for Enterprise
Slack's data model had been built on the assumption that almost all user data is particular to a single workspace. Ten years of feature development had embedded this assumption deep into database schema, API contracts, and client rendering logic. Supporting org-wide views required either a rewrite or an ever-growing layer of workarounds.
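To make the embedded assumption concrete, here is a minimal sketch of how a workspace-scoped record might differ from an org-scoped one. The class names, field names, and the `lift_to_org` helper are all hypothetical and chosen for illustration; they are not Slack's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkspaceScopedDM:
    """Old shape (hypothetical): the workspace is part of the record's identity."""
    workspace_id: str
    user_id: str
    text: str


@dataclass(frozen=True)
class OrgScopedDM:
    """New shape (hypothetical): the org is the primary scope.

    The originating workspace survives only as optional metadata.
    """
    org_id: str
    user_id: str
    text: str
    origin_workspace_id: Optional[str] = None


def lift_to_org(dm: WorkspaceScopedDM, org_id: str) -> OrgScopedDM:
    """Re-key a workspace-scoped record under its parent org."""
    return OrgScopedDM(
        org_id=org_id,
        user_id=dm.user_id,
        text=dm.text,
        origin_workspace_id=dm.workspace_id,
    )


old = WorkspaceScopedDM(workspace_id="W1", user_id="U1", text="hello")
new = lift_to_org(old, org_id="O1")
```

Once the workspace stops being part of a record's identity, every query, API contract, and client view that keyed on it has to be revisited, which is why the change reached so far into the codebase.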
Solution
Prototype the Path: Build Incrementally, Prove Out
Rather than committing immediately to a full rewrite, Slack's team built a proof of concept of Unified Grid and ran it internally, with Slack's own employees using it daily. Only after the proof of concept validated the architecture and revealed how much work was required did the team commit to a full rollout.
Result
Shipped After 2 Years: Rollout Sep 2023 → Mar 2024
Unified Grid rolled out to customers starting in fall 2023 and completed in March 2024. The unified DMs tab, the org-wide Activity tab, and Save it for Later all became possible on the new foundation; none of them could have been built on the workspace-centric model.
The Fix
The Technical Work: Thousands of APIs, One New Foundation
Slack's codebase contained thousands of API endpoints, database queries, and client rendering paths that assumed workspace-scoped data. Each had to be evaluated: does it need to be org-aware? If so, what's the migration path? In many cases, a query that fetched a user's DMs from a single workspace had to be replaced with a query that could aggregate DMs from all of the user's workspaces efficiently.
- 2 years — development duration from first prototype to full customer rollout
- 1000s — APIs, database queries, and permission checks updated to support org-wide data access
- Mar 2024 — full rollout completion date
- 3 features — unified DMs tab, org-wide Activity tab, Save it for Later — all architecturally impossible on the old model
```python
# Simplified conceptual example of workspace-centric vs org-wide data access.
# Real Slack uses Hack/PHP and complex distributed data systems.
# `db`, `org_membership_service`, and `dm_service` are illustrative
# placeholders, not real Slack APIs.


# OLD: Workspace-centric DM fetch
# User must specify which workspace; data is completely siloed
def get_dms_old(user_id: str, workspace_id: str) -> list:
    return db.query(
        "SELECT * FROM direct_messages "
        "WHERE workspace_id = ? AND user_id = ?",
        workspace_id, user_id,  # workspace_id required: siloed
    )


# NEW: Org-aware DM fetch (Unified Grid)
# Returns DMs across all workspaces the user belongs to
def get_dms_unified(user_id: str, org_id: str) -> list:
    # Query all workspaces the user belongs to in this org
    workspaces = org_membership_service.get_workspaces(user_id, org_id)
    # Aggregate DMs across all workspaces into a unified inbox.
    # Sorted by recency, not by workspace: this is the user experience change
    return dm_service.get_org_wide(
        user_id=user_id,
        workspace_ids=[ws.id for ws in workspaces],
        sort_by='recency',
    )


# Permission checks also needed org-level understanding:
# Old: can_access(user, workspace, resource)
# New: can_access(user, org, workspace, resource), with layered org context
# Every permission check in the codebase required evaluation and update
```
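The conceptual example leaves open how DMs from many workspaces could be combined "efficiently". One possible approach, sketched here as an assumption rather than Slack's actual method, is a lazy k-way merge: if each workspace already returns its DMs newest-first, `heapq.merge` can interleave them by recency without sorting everything again. The `DM` tuple and `merge_by_recency` function are hypothetical names.

```python
import heapq
from itertools import islice
from typing import Iterable, List, NamedTuple


class DM(NamedTuple):
    ts: float          # message timestamp, larger means newer
    workspace_id: str
    text: str


def merge_by_recency(per_workspace: Iterable[List[DM]], limit: int) -> List[DM]:
    """Merge per-workspace DM lists (each already newest-first) into one inbox.

    heapq.merge is lazy, so only about `limit` items are pulled from the
    inputs rather than the full history of every workspace.
    """
    merged = heapq.merge(*per_workspace, key=lambda dm: dm.ts, reverse=True)
    return list(islice(merged, limit))


ws_a = [DM(30.0, "A", "third"), DM(10.0, "A", "first")]
ws_b = [DM(20.0, "B", "second")]
inbox = merge_by_recency([ws_a, ws_b], limit=2)
```

Here `inbox` holds the two newest messages regardless of which workspace they came from, which is the behaviour a unified DMs tab needs.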
The executive concern: is it worth the cost?
The Unified Grid blog post is unusually candid about the organisational challenge: execs and engineering leadership were genuinely concerned about the cost. Was rebuilding the core architecture worth potentially thousands of engineer-weeks of effort? The team's answer was to build the proof of concept first, use internal data to demonstrate the benefits, and then make the case for full investment — rather than asking for two years of resources upfront on a bet.
The monolith as change vehicle
Despite Slack's architectural evolution, the backend rewrite was implemented within the existing Hack/PHP monolith rather than as a separate service. This made incremental deployment easier: changes could be gated behind feature flags, rolled back quickly, and deployed through the existing CI/CD pipeline. The Unified Grid project is evidence that a monolith can accommodate fundamental architectural evolution without requiring a microservices extraction.
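Feature-flag gating of this kind can be sketched in a few lines. The flag store, the flag name `unified_grid`, and the function names below are hypothetical; a production system would use a real flag service, but the routing idea is the same.

```python
from typing import Callable, Dict, Set

# Hypothetical in-memory flag store: flag name -> set of enabled org IDs.
FLAGS: Dict[str, Set[str]] = {"unified_grid": {"O_internal"}}


def is_enabled(flag: str, org_id: str) -> bool:
    """Return True if the flag is switched on for this org."""
    return org_id in FLAGS.get(flag, set())


def fetch_dms(org_id: str,
              legacy: Callable[[], list],
              unified: Callable[[], list]) -> list:
    """Route to the org-wide code path only for orgs with the flag on."""
    if is_enabled("unified_grid", org_id):
        return unified()
    return legacy()


def rollback(flag: str) -> None:
    """Switch the flag off for every org without a redeploy."""
    FLAGS[flag] = set()


before = fetch_dms("O_internal", legacy=lambda: ["legacy"], unified=lambda: ["unified"])
other = fetch_dms("O_customer", legacy=lambda: ["legacy"], unified=lambda: ["unified"])
rollback("unified_grid")
after = fetch_dms("O_internal", legacy=lambda: ["legacy"], unified=lambda: ["unified"])
```

Because both code paths live in the same deployable unit, enabling the new path for one org or rolling it back for all of them is a data change rather than a release.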
The migration cost that can't be avoided
Unified Grid required updating existing customers' Slack configurations, data migrations for org-level constructs, and client-side state invalidation when users upgraded. Some features required users to re-learn workflows they had developed over years with the old model. There is no such thing as a transparent foundational architecture change at production scale — some user-visible change is inevitable, and Slack had to manage customer communication throughout the rollout.
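As an illustration of what a data migration for an org-level construct can look like, the sketch below backfills a new `org_id` column in small batches. The table names, column names, and the use of SQLite are assumptions made so the example is self-contained; they do not describe Slack's storage layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workspaces (id TEXT PRIMARY KEY, org_id TEXT NOT NULL);
    CREATE TABLE saved_items (
        id INTEGER PRIMARY KEY,
        workspace_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        org_id TEXT              -- new org-level column, NULL until backfilled
    );
    INSERT INTO workspaces VALUES ('W1', 'O1'), ('W2', 'O1');
    INSERT INTO saved_items (workspace_id, user_id)
        VALUES ('W1', 'U1'), ('W2', 'U1');
""")


def backfill_org_ids(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Fill org_id in batches and return the number of rows migrated.

    Only rows that are still NULL and have a known workspace are touched,
    so the function is safe to re-run and always terminates.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE saved_items
               SET org_id = (SELECT org_id FROM workspaces
                              WHERE workspaces.id = saved_items.workspace_id)
             WHERE id IN (SELECT id FROM saved_items
                           WHERE org_id IS NULL
                             AND workspace_id IN (SELECT id FROM workspaces)
                           LIMIT ?)
            """,
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


migrated = backfill_org_ids(conn, batch_size=1)
```

Batching keeps each transaction short, and restricting the update to rows that are still NULL means an interrupted run can simply be restarted.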
Architecture
Unified Grid's architecture changes span three layers of Slack's stack. The backend required new data models for org-level concepts, updated APIs with org-level context, and new query patterns that aggregate across workspaces. The desktop and mobile clients required redesigned rendering architectures that could display org-wide views alongside workspace-specific ones. The permission system required new layering to support org-level access controls on top of existing workspace-level controls. All three layers had to change simultaneously and stay in sync throughout the two-year project.
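The permission layering can be sketched as an org-level gate placed in front of the existing workspace-level gate. The `Org` and `Workspace` classes and this simplified `can_access` signature are hypothetical; a real check would also take the resource being accessed into account.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Org:
    id: str
    members: Set[str] = field(default_factory=set)


@dataclass
class Workspace:
    id: str
    org_id: str
    members: Set[str] = field(default_factory=set)


def can_access(user_id: str, org: Org, workspace: Workspace) -> bool:
    """Layered check: org-level gate first, then the workspace-level gate."""
    if workspace.org_id != org.id:
        return False                      # workspace is not part of this org
    if user_id not in org.members:
        return False                      # the org layer denies access
    return user_id in workspace.members   # the workspace layer still applies


org = Org("O1", {"U1", "U2"})
ws = Workspace("W1", "O1", {"U1"})
results = [can_access(u, org, ws) for u in ("U1", "U2", "U3")]
```

In this sketch org membership is necessary but not sufficient: `U2` belongs to the org but not to the workspace, so the workspace-level control still denies access.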
[Diagram: Before Unified Grid, workspace-centric data silos]
[Diagram: After Unified Grid, org-wide views across workspaces]
Lessons
The "avoid rewrites" truism is a default, not a law. When your architecture's foundational assumptions have drifted far enough from actual usage that every new feature requires a workaround, the accumulated technical debt of workarounds may exceed the cost of rebuilding the foundation. Evaluate honestly. Don't use "rewrites are bad" as a reason to avoid a decision that actually needs to be made.
Prototyping the path (building a working but incomplete implementation of a major change, using it internally to validate the direction before committing to full scope) is the engineering equivalent of a staged rollout for architectural decisions. You don't commit the full budget until you have production evidence that the direction is right. Slack's internal dogfooding gave leadership evidence rather than speculation.
Permission systems need to evolve in lockstep with data models. Org-level access controls cannot be bolted onto workspace-level permission systems. When your user model gains a new organisational layer, your permission model must gain it too. This work is unglamorous, invisible to users, and absolutely required for enterprise security.
Client and backend architecture must change together. You cannot ship an org-wide backend while keeping workspace-centric clients. The full change is end-to-end: data model, API contracts, permission systems, desktop client, mobile client, web client. Planning the delivery sequence for a change this wide is as important as designing the architecture itself.
When the architecture prevents the product from serving its largest customers, the rewrite decision has already been made by the market. Unified DMs, org-wide Activity, cross-workspace search — these were features enterprise contracts were being written around. The business case for the rewrite was not abstract technical cleanliness; it was that the features could not exist without it.
Engineering Glossary
Org-wide — a data scope in Slack's Unified Grid architecture where data is visible across all workspaces within an organisation, rather than being scoped to a single workspace. Enabled by lifting data out of the workspace silo and adding an org-level layer to the data model, permission system, and client rendering paths.
Prototyping the path — Slack's term for building a working but incomplete implementation of a major architectural change, deploying it to internal users for daily use, and letting real usage surface gaps before committing to full scope. Contrasted with designing the complete architecture first and then building it.
Unified Grid — Slack's internal name for the two-year project that rebuilt the core data model, APIs, permission system, and client architecture to support org-wide data access across all of an enterprise customer's workspaces simultaneously.
Workspace-centric model — Slack's original data model where almost all user data — messages, channels, DMs, notification preferences, user profiles, unread counts — was scoped to a single workspace. Baked into thousands of database queries and API responses. The foundational assumption that Unified Grid replaced.
This case is a plain-English retelling of publicly available engineering material.
Read the full case on TechLogStack →