TechLogStack

Originally published at techlogstack.com

Slack Rewrote Its Core Architecture for Enterprise — Because the Old One Was a Lie

Slack · Distributed Systems · 17 May 2026

Slack was built for teams in single workspaces. Enterprise customers were using it across dozens of workspaces simultaneously — and the architecture had never been designed for that. Every major enterprise feature was a workaround on top of a foundation that assumed one workspace per person. Slack spent two years rebuilding the foundation.

  • 2 years development time
  • Workspace-centric → org-wide
  • Thousands of APIs refactored
  • Beta → rollout Sep 2023–Mar 2024
  • Unified DMs, Activity, Save it for Later
  • Hack/PHP monolith + new grid clients

The Story

Slack launched in 2013 with a beautifully simple data model: users belong to workspaces, workspaces contain channels, channels contain messages. To view a different workspace, you click on it and the context switches entirely. For small teams using a single workspace, this model was perfect. For large enterprises that had grown to 50, 100, or 200 workspaces across departments, geographies, and business units, it was a prison. Every DM, every notification, every unread count, every search result was siloed by workspace. A VP with access to 80 workspaces had to remember which workspace a conversation was in, click to it, check notifications, return, and repeat, dozens of times per day. The architecture was working against the users it was supposed to serve.

All software is built atop a core set of assumptions. As new code is added and new use-cases emerge, software can become unmoored from those assumptions. When this happens, a fundamental tension arises between revisiting those foundational assumptions — which usually entails a lot of work — or trying to support new behavior atop the existing architecture.

Slack Engineering, via 'Unified Grid: How We Re-Architected Slack for Our Largest Customers'

Slack's team had been papering over the workspace-centric limitation for years with increasingly complex workarounds. Slack Connect (the feature that lets users in different Slack organizations message each other, built as an overlay on the workspace model), multi-workspace management tools, and org-wide settings all added complexity without fixing the fundamental architecture. The CTO and engineering leadership faced a classic rebuild-now-or-keep-patching decision. They chose to rebuild. The project was called Unified Grid, and it would require rebuilding the core data model, refactoring thousands of APIs, and redesigning both the backend and every client application simultaneously.

THE WORKSPACE-CENTRIC ASSUMPTION

In Slack's original architecture, almost all data was particular to a single workspace: messages, channels, DMs, notification preferences, user profiles, unread counts. This assumption was baked into thousands of database queries, API responses, and client rendering paths. To build Unified Grid, every piece of data that needed to be visible across workspaces had to be lifted out of the workspace silo, a change that touched nearly every system in the stack.

🏢

Slack's largest enterprise customers operate across dozens to hundreds of workspaces. Expecting those users to manually context-switch between workspaces to find conversations, check notifications, or respond to DMs was creating real productivity friction. The Unified Grid project was not a technical exercise — it was a direct response to enterprise customer feedback.

Problem

Enterprise Users Drowning in Workspace Context-Switches

Slack's workspace-centric model forced enterprise users to manually navigate between dozens of workspaces to find conversations and check notifications. Key features like a unified DM inbox, an org-wide activity feed, and cross-workspace search were impossible within the existing architecture — not missing features, architecturally blocked features.


Cause

The Foundation Assumption Was Wrong for Enterprise

Slack's data model had been built on the assumption that almost all user data is particular to a single workspace. Ten years of feature development had embedded this assumption deep into database schema, API contracts, and client rendering logic. Supporting org-wide views required either a rewrite or an ever-growing layer of workarounds.


Solution

Prototype the Path: Build Incrementally, Prove Out

Rather than committing immediately to a full rewrite, Slack's team built a proof of concept of Unified Grid and deployed it internally, where Slack's own employees used it daily. Only after the POC validated the architecture and revealed what work was required did the team commit to a full rollout. Slack calls this 'prototyping the path.'


Result

Shipped After 2 Years: Rollout Sep 2023 → Mar 2024

Unified Grid rolled out to customers starting in Fall 2023 and completed in March 2024. Features like the unified DMs tab, org-wide Activity tab, and Save it for Later became possible on the new foundation after being impossible on the workspace-centric model. The rewrite that everyone said you shouldn't do turned out to be necessary.


⚠️

The 'Avoid Rewrites' Truism — and When It Breaks

Conventional software engineering wisdom advises against large rewrites. Slack's team explicitly acknowledged this truism and then concluded that it did not apply to their situation. When the architecture of an application drifts far enough from how that application is used, rebuilding the core foundation becomes less risky than continuing to build complexity on top of a wrong foundation. The key question is not whether rewrites are bad but how far the drift has gone.

The Unified Grid project used a strategy Slack calls prototyping the path — building incrementally, proving out ideas in practice before committing to the full scope. Rather than designing the complete architecture and then building it, the team built a barely functional prototype of Unified Grid and deployed it to Slack's own internal teams. Using it daily for their own work surfaced what was broken, what the real user experience gaps were, and what the engineering challenges would be in production — all before the team had committed to building the entire thing. By Summer 2023, Unified Grid was stable enough that much of the company used it daily. By Fall 2023, external rollout began. By March 2024, it was complete.

🧪

Slack Dogfoods Its Own Infrastructure

Slack's engineers are among the heaviest users of Slack — including new features under development. The internal dogfooding of Unified Grid gave the team thousands of daily active users giving real feedback on a pre-production architectural overhaul. This feedback loop compressed the time between 'we thought this would work' and 'we know it doesn't work' from months to days.

The Internal Deployment Milestone

By Summer 2023, much of Slack was using Unified Grid for daily work. This internal milestone was not just a technical success but an organizational one. Having thousands of Slack employees use a pre-release architectural overhaul daily meant real bug reports, real performance data, and real confidence that the system was ready for external customers.

WHAT HAD BEEN TRIED BEFORE

Slack had tried to alleviate the workspace-switching problem incrementally before committing to Unified Grid. Shared channels allowed cross-workspace channel sharing, Slack Connect enabled messaging across organizations, and various UI improvements streamlined workspace switching. None of these fixed the fundamental architecture. They were features built on a wrong foundation that made the foundation slightly less painful without changing it.

🔍

Cross-Workspace Search: The Clearest User Pain

One of the most cited enterprise frustrations was search. Searching for a message required first knowing which workspace it was in, then switching to that workspace, then searching. Unified Grid enabled org-wide search that returned results from all accessible workspaces simultaneously. For users with 80 workspaces, this changed search from a manual process into a useful feature.
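As a rough illustration of the idea, org-wide search can be thought of as a fan-out over the workspaces a user can access, followed by a single ranking pass. The sketch below is a toy in-memory version; the `Message` type, the `search_org` function, and the dictionary-based index are all assumptions made for illustration, not Slack's actual search implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    workspace_id: str  # hypothetical field names
    timestamp: float
    text: str


def search_org(
    query: str,
    accessible: set[str],
    index: dict[str, list[Message]],
) -> list[Message]:
    """Search every workspace the user can access; return hits newest-first."""
    needle = query.lower()
    hits = [
        msg
        for workspace_id in accessible  # fan out over accessible workspaces only
        for msg in index.get(workspace_id, [])
        if needle in msg.text.lower()
    ]
    # One ranking pass across all workspaces instead of one search per workspace
    return sorted(hits, key=lambda m: m.timestamp, reverse=True)


index = {
    "W_ENG": [Message("W_ENG", 10.0, "Launch checklist draft")],
    "W_SALES": [Message("W_SALES", 20.0, "Launch pricing approved")],
    "W_LEGAL": [Message("W_LEGAL", 30.0, "Launch contract review")],
}

# The user belongs to two of the three workspaces
results = search_org("launch", {"W_ENG", "W_SALES"}, index)
print([m.workspace_id for m in results])
```

The user never names a workspace, and messages from workspaces they cannot access never enter the result set.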


The Fix

The Technical Work: Thousands of APIs, One New Foundation

The scale of the Unified Grid engineering effort is difficult to convey without concrete numbers. Slack's codebase contained thousands of API endpoints, database queries, and client rendering paths that assumed workspace-scoped data. Each of these had to be evaluated: does it need to be org-aware? If so, what's the migration path? In many cases, a query that fetched a user's DMs from a single workspace had to be replaced with a query that could aggregate DMs from all of the user's workspaces efficiently. In other cases, entirely new data structures had to be introduced to represent org-level concepts that had never existed before.

  • 2 years — Development duration from first prototype to full customer rollout — reflecting the depth of the architectural changes required
  • 1000s — APIs, database queries, and permission checks updated to support org-wide data access rather than workspace-scoped data access
  • Mar 2024 — Full rollout completion date — the project that began as a proof-of-concept in 2021-2022 became a production reality across Slack's entire customer base
  • 3 features — Core Unified Grid capabilities delivered: unified DMs tab, org-wide Activity tab, and Save it for Later — all architecturally impossible on the old model

# Simplified conceptual example of workspace-centric vs org-wide data access.
# Real Slack uses Hack/PHP and complex distributed data systems. The names
# db, org_membership_service, and dm_service are hypothetical stand-ins for
# a database handle and two backend services; they are not defined here.

# OLD: workspace-centric DM fetch.
# The caller must specify which workspace they want DMs from.
def get_dms_old(user_id: str, workspace_id: str) -> list:
    # Every query is scoped to a single workspace.
    return db.query(
        "SELECT * FROM direct_messages "
        "WHERE workspace_id = ? AND user_id = ?",
        workspace_id, user_id,  # workspace_id is required, so results are siloed
    )

# NEW: org-aware DM fetch (Unified Grid).
# Returns DMs across all workspaces the user belongs to.
def get_dms_unified(user_id: str, org_id: str) -> list:
    # Look up every workspace the user belongs to in this org.
    workspaces = org_membership_service.get_workspaces(user_id, org_id)

    # Aggregate DMs across those workspaces into one unified inbox,
    # sorted by recency rather than grouped by workspace.
    return dm_service.get_org_wide(
        user_id=user_id,
        workspace_ids=[ws.id for ws in workspaces],
        sort_by="recency",  # unified sort across workspace boundaries
    )

# Permission checks also needed org-level understanding:
# Old: can_access(user, workspace, resource)
# New: can_access(user, org, workspace, resource), with layered org context

PROTOTYPE THE PATH: HOW SLACK DE-RISKED THE REWRITE

Slack's most important process decision for Unified Grid was building a working prototype used internally before committing to full scope. This is 'prototyping the path' — not a throwaway prototype, but a real functioning implementation used by real users on real data. The feedback from internal use surfaced problems that would have been catastrophic if discovered post-rollout. It also gave leadership confidence to commit the full engineering resources needed for the project.

The New Features That Became Possible

Unified Grid unlocked product features that were architecturally impossible before the migration: a unified DMs tab showing all DMs across all workspaces, an org-wide Activity tab showing all notifications in chronological order regardless of workspace, and Save it for Later aggregating saved items across workspace boundaries. These aren't incremental improvements — they're features that required the foundation to be correct before they could exist.
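To make the "chronological order regardless of workspace" idea concrete, here is a minimal sketch of one way to combine per-workspace notification feeds into a single org-wide feed. It assumes each workspace feed is already sorted newest-first and uses a k-way merge from Python's standard library. The `Notification` type and `merge_activity` function are hypothetical names, and nothing here claims to mirror how Slack actually builds the Activity tab.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Notification:
    workspace_id: str  # hypothetical field names
    timestamp: float   # seconds since epoch
    text: str


def merge_activity(feeds: Iterable[Iterable[Notification]]) -> Iterator[Notification]:
    """Lazily merge per-workspace feeds, each already sorted newest-first."""
    # heapq.merge does a k-way merge; reverse=True expects descending inputs
    return heapq.merge(*feeds, key=lambda n: n.timestamp, reverse=True)


engineering = [
    Notification("W_ENG", 300.0, "deploy finished"),
    Notification("W_ENG", 100.0, "build started"),
]
sales = [
    Notification("W_SALES", 250.0, "deal closed"),
    Notification("W_SALES", 50.0, "lead assigned"),
]

org_feed = list(merge_activity([engineering, sales]))
print([n.text for n in org_feed])
# prints ['deploy finished', 'deal closed', 'build started', 'lead assigned']
```

Because the merge is lazy, a client that only shows the first screen of activity does not need to read every workspace's full history.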

⚠️

The Executive Concern: Is It Worth the Cost?

The Unified Grid blog post is unusually candid about the organizational challenge: execs and engineering leadership were genuinely concerned about the cost. Was rebuilding the core architecture worth potentially thousands of engineer-weeks of effort? The team's answer was to build the proof of concept first, use internal data to demonstrate the benefits, and then make the case for full investment — rather than asking for two years of resources upfront on a bet.

ℹ️

Rolling Rollout: September 2023 to March 2024

Unified Grid wasn't released all at once. Slack used a controlled rollout over six months, starting with early access customers in Fall 2023 and expanding to the full customer base by March 2024. This allowed the team to find bugs under real enterprise load before every customer was affected, and to build customer success resources in parallel with technical rollout. The phased rollout of a two-year engineering project was itself a significant coordination effort.
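A common way to implement a phased rollout like this is to give an explicit early-access list immediate access and assign every other organization a stable bucket that is compared against a rollout percentage. The sketch below shows that general technique; the function names, the salt string, and the org IDs are assumptions for illustration, and the source does not describe the mechanism Slack actually used.

```python
import hashlib


def rollout_bucket(org_id: str, salt: str = "unified-grid") -> int:
    """Map an org to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{org_id}".encode("utf-8")).digest()
    # The same org always hashes to the same bucket, so cohorts never reshuffle
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(
    org_id: str,
    percent: int,
    early_access: frozenset[str] = frozenset(),
) -> bool:
    """Early-access orgs are always in; others join as percent grows."""
    if org_id in early_access:
        return True
    return rollout_bucket(org_id) < percent


early = frozenset({"O_PILOT"})
for percent in (0, 25, 100):
    enabled = [o for o in ("O_PILOT", "O_A", "O_B") if is_enabled(o, percent, early)]
    print(percent, enabled)
```

Raising the percentage only ever adds organizations, so a customer who already has the new experience keeps it as the rollout widens.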

⚠️

The Migration Cost That Can't Be Avoided

Unified Grid required updating existing customers' Slack configurations, data migrations for org-level constructs, and client-side state invalidation when users upgraded. Some features required users to re-learn workflows they had developed over years with the old model. There is no such thing as a transparent foundational architecture change at production scale — some user-visible change is inevitable, and Slack had to manage customer communication throughout the rollout.


Architecture

Unified Grid's architecture changes span three layers of Slack's stack. The backend required new data models for org-level concepts, updated APIs with org-level context, and new query patterns that aggregate across workspaces. The desktop and mobile clients required redesigned rendering architectures that could display org-wide views alongside workspace-specific ones. The permission system required new layering to support org-level access controls on top of existing workspace-level access controls. All three layers had to change together and stay in sync throughout the two-year project.

Before Unified Grid: Workspace-Centric Data Silos

(Interactive diagram available on TechLogStack.)

After Unified Grid: Org-Wide Views Across Workspaces

(Interactive diagram available on TechLogStack.)

PERMISSION LAYERS: ORG + WORKSPACE

One of the hardest architectural changes in Unified Grid was the permission system. Old permissions were: can this user access this resource in this workspace? New permissions are: can this user access this resource in this workspace within this org? Org-level admin controls needed to cascade down to workspace-level controls, override in some cases, and defer in others. Building a correct, auditable, performant permission system that understood both levels required careful design — org-level permission bugs in a product used by enterprises have serious security implications.
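The cascade, override, and defer behavior described above can be sketched as a two-layer check: consult the org-level rule first, and fall through to the workspace-level grant only when the org has no opinion. This is a minimal illustration under assumed names (`OrgDecision`, `Policy`, and the in-memory stores are all hypothetical), not a description of Slack's permission system.

```python
from dataclasses import dataclass, field
from enum import Enum


class OrgDecision(Enum):
    ALLOW = "allow"  # org policy grants access outright
    DENY = "deny"    # org policy overrides any workspace grant
    DEFER = "defer"  # org has no opinion; the workspace decides


@dataclass
class Policy:
    # Hypothetical stores. Org rules are keyed by (org_id, user_id, resource_id)
    org_rules: dict[tuple[str, str, str], OrgDecision] = field(default_factory=dict)
    # Workspace grants are (user_id, workspace_id, resource_id) triples
    workspace_grants: set[tuple[str, str, str]] = field(default_factory=set)

    def can_access(
        self, user_id: str, org_id: str, workspace_id: str, resource_id: str
    ) -> bool:
        key = (org_id, user_id, resource_id)
        decision = self.org_rules.get(key, OrgDecision.DEFER)
        if decision is OrgDecision.DENY:
            return False  # org-level deny wins over workspace grants
        if decision is OrgDecision.ALLOW:
            return True   # org-level allow needs no workspace grant
        # No org rule: fall back to the workspace-level check
        return (user_id, workspace_id, resource_id) in self.workspace_grants


policy = Policy()
policy.workspace_grants.add(("U1", "W1", "C_general"))
policy.workspace_grants.add(("U1", "W1", "C_legal"))
policy.org_rules[("O1", "U1", "C_legal")] = OrgDecision.DENY
policy.org_rules[("O1", "U2", "C_announce")] = OrgDecision.ALLOW

print(policy.can_access("U1", "O1", "W1", "C_general"))   # True: org defers
print(policy.can_access("U1", "O1", "W1", "C_legal"))     # False: org denies
print(policy.can_access("U2", "O1", "W2", "C_announce"))  # True: org allows
```

Keeping the org decision as an explicit three-valued result makes the precedence order visible in one place, which helps when the rules need to be audited.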

ℹ️

The Client Architecture Challenge

Backend changes were only half the work. Slack's desktop and mobile clients had been designed to render one workspace at a time. Unified Grid required clients to maintain state across multiple workspaces simultaneously, merge data streams from different workspace backends, and render org-wide views alongside workspace views without confusion. The client architecture work was as extensive as the backend work, and it had to be shipped to every platform (Mac, Windows, Linux, iOS, Android) simultaneously.
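One way to picture the client-side change is a single state store that accepts events from every workspace and can answer both workspace-level and org-level questions. The sketch below uses unread counts as the example. `ClientState` and its methods are hypothetical, and Slack's real clients are not written in Python; this only illustrates the shape of the problem.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ClientState:
    # unread[workspace_id][channel_id] holds the latest unread count
    unread: dict[str, dict[str, int]] = field(
        default_factory=lambda: defaultdict(dict)
    )

    def apply_event(self, workspace_id: str, channel_id: str, count: int) -> None:
        """Apply an unread-count update arriving from any workspace's stream."""
        self.unread[workspace_id][channel_id] = count

    def workspace_unread(self, workspace_id: str) -> int:
        """Workspace-scoped view: what the old client rendered."""
        return sum(self.unread.get(workspace_id, {}).values())

    def org_unread(self) -> int:
        """Org-wide view: one total across every workspace."""
        return sum(sum(channels.values()) for channels in self.unread.values())


state = ClientState()
state.apply_event("W1", "C1", 2)
state.apply_event("W1", "C2", 1)
state.apply_event("W2", "C9", 5)
state.apply_event("W1", "C1", 0)  # the user read channel C1

print(state.workspace_unread("W1"), state.org_unread())
# prints 1 6
```

Both views read from the same store, so the org-wide badge and the per-workspace badge cannot drift apart.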

The Hack/PHP Monolith as Change Vehicle

Despite Slack's architectural evolution, the backend rewrite was implemented within the existing Hack/PHP monolith rather than as a separate service. This made incremental deployment easier: changes could be gated behind feature flags, rolled back quickly, and deployed through the existing CI/CD pipeline. The Unified Grid project is evidence that a monolith can accommodate fundamental architectural evolution without requiring a microservices extraction.
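Feature-flag gating inside a single codebase usually means the old and new code paths ship together and a flag selects one at request time. The sketch below illustrates that pattern with an in-memory flag store. `FlagStore`, the flag name `unified_grid`, and the sidebar functions are all made up for this example and do not reflect Slack's internal flag system.

```python
class FlagStore:
    """Hypothetical in-memory flag store keyed by (flag name, org id)."""

    def __init__(self) -> None:
        self._enabled: set[tuple[str, str]] = set()

    def enable(self, flag: str, org_id: str) -> None:
        self._enabled.add((flag, org_id))

    def disable(self, flag: str, org_id: str) -> None:
        # Rolling back is a flag flip, not a redeploy
        self._enabled.discard((flag, org_id))

    def is_on(self, flag: str, org_id: str) -> bool:
        return (flag, org_id) in self._enabled


def legacy_sidebar(org_id: str) -> list[str]:
    return ["workspace switcher", "channels", "DMs (this workspace)"]


def unified_sidebar(org_id: str) -> list[str]:
    return ["channels", "DMs (all workspaces)", "Activity", "Later"]


def render_sidebar(flags: FlagStore, org_id: str) -> list[str]:
    # Both code paths live in the same codebase; the flag picks one per org
    if flags.is_on("unified_grid", org_id):
        return unified_sidebar(org_id)
    return legacy_sidebar(org_id)


flags = FlagStore()
flags.enable("unified_grid", "O1")
print(render_sidebar(flags, "O1"))  # new path
print(render_sidebar(flags, "O2"))  # old path, flag not enabled

flags.disable("unified_grid", "O1")
print(render_sidebar(flags, "O1"))  # back on the old path after rollback
```

Because the legacy path stays in place until the flag is fully on, a bad change can be reverted for one organization without touching the others.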


Lessons

Unified Grid challenges the 'never do large rewrites' maxim with a documented counterexample. The lesson is not 'large rewrites are fine' — it's that the decision requires honest evaluation of how far architectural drift has gone and whether incremental patching is still viable.

  1. The 'avoid rewrites' truism is a default, not a law. When your architecture's foundational assumptions have drifted far enough from actual usage that every new feature requires a workaround, the accumulated technical debt of workarounds may exceed the cost of rebuilding the foundation. Evaluate honestly. Don't use 'rewrites are bad' as a reason to avoid a decision that actually needs to be made.
  2. Prototype the path before committing full resources to a rewrite. Build a working implementation, use it internally, and let real usage surface the gaps before asking for two years of engineering investment. Slack's internal dogfooding of Unified Grid gave leadership evidence rather than speculation to justify the project's scope.
  3. Permission systems need to evolve in lockstep with data models. Org-level access controls cannot be bolted onto workspace-level permission systems. When your user model gains a new organizational layer, your permission model must gain it too. This work is unglamorous, invisible to users, and absolutely required for enterprise security.
  4. Client and backend architecture must change together. You cannot ship an org-wide backend while keeping workspace-centric clients. The full change is end-to-end: data model, API contracts, permission systems, desktop client, mobile client, web client. Planning the delivery sequence for a change this wide is as important as designing the architecture itself.
  5. Prototyping the path is the engineering equivalent of a staged rollout for architectural decisions. The term is Slack's name for building a working but incomplete implementation of a major change and using it internally to validate the direction before committing to full scope. You don't commit the full budget until you have production evidence that the direction is right.

When 'Never Rewrite' Gets Overruled by 'Never Ship This Feature'

The decisive business case for Unified Grid was not abstract technical cleanliness — it was that Slack's largest enterprise customers could not get features they needed without the architecture change. Unified DMs, org-wide Activity, cross-workspace search — these were features enterprise contracts were being written around. When the architecture prevents the product from serving its largest customers, the rewrite decision has already been made by the market.

THE INCREMENTAL COMMITMENT MODEL

Unified Grid's development history — proof of concept, internal dogfood, limited beta, full rollout — illustrates an incremental commitment model for large engineering bets. At each stage, the team had evidence of progress before committing to the next stage's resource investment. This model de-risks large architectural bets by converting them from single go/no-go decisions into a series of smaller, evidence-gated decisions.

Slack spent two years building a feature that enterprise customers could have described in one sentence: 'show me all my messages, regardless of which workspace they're in' — and it turns out one sentence can hide a complete architectural overhaul.

TechLogStack — built at scale, broken in public, rebuilt by engineers


This case is a plain-English retelling of publicly available engineering material.

Read the full case on TechLogStack → (interactive diagrams, source links, and the full reader experience).
