DEV Community

Rizwan Saleem

Building a Feature Flag Control Plane for Modern Web Apps

A feature flag control plane lets you turn features on and off without redeploying your application, separate release timing from code shipping, and target behavior by user, segment, or environment. It is a strong system design topic because it touches low-latency reads, consistency, rollout safety, auditability, and operational guardrails.

Why this architecture matters

Feature flags are not just if statements in code; they become part of your delivery system once teams use them for staged rollout, experiments, and emergency kill switches. Microsoft’s guidance notes that flags separate feature release from deployment, while CloudBees highlights the need to keep toggle points and routers organized so the logic does not sprawl across the application.

A good control plane answers three questions quickly: “Is this feature enabled?”, “For whom?”, and “Why was this decision made?” It should do that with minimal request overhead and with enough traceability that you can debug a bad rollout after the fact.

Core requirements

Start by defining the product and operational requirements before choosing storage or architecture. A practical system should support global, environment-level, and user-specific flags, plus percentage rollouts, scheduled activation, and a safe expiration workflow so old flags do not become permanent technical debt.
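The scheduling and expiration requirements can be sketched as one small pure function. The field names `activatesAt` and `expiresAt` are assumptions made for illustration, and treating an expired flag as off is only one possible policy; some teams may prefer to alert on expiry rather than change behavior.

```typescript
// Hypothetical lifecycle fields; names are illustrative, not from a real SDK.
type FlagLifecycle = {
  enabled: boolean;
  activatesAt?: Date; // scheduled activation (assumed field)
  expiresAt?: Date;   // expiration for temporary flags (assumed field)
};

// Returns true only when the flag is switched on and `now` falls inside
// the [activatesAt, expiresAt) window. A missing bound is treated as open.
function isLive(flag: FlagLifecycle, now: Date): boolean {
  if (!flag.enabled) return false;
  if (flag.activatesAt && now.getTime() < flag.activatesAt.getTime()) return false;
  if (flag.expiresAt && now.getTime() >= flag.expiresAt.getTime()) return false;
  return true;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.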

Typical nonfunctional goals look like this:

  • Very fast read path, because flag checks may happen on every request.
  • Strong audit history, because flag changes affect production behavior.
  • Safe fallback behavior if the flag service is unavailable.
  • Low-friction rollout and rollback, including targeting and gradual exposure.

High-level design

A common architecture uses three layers: an admin API for writes, a storage layer for truth, and a delivery layer optimized for reads. The write path handles CRUD, approvals, audit logs, and validation, while the read path serves cached flag snapshots to app servers or edge clients with very low latency.

A clean mental model is:

  • Control plane, for creating and changing flags.
  • Distribution layer, for pushing new versions of flag data.
  • Evaluation layer, for deciding whether a request gets the feature.

That separation keeps writes flexible without making every request pay the cost of complex queries or joins.
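To make the three roles concrete, here is a minimal in-memory sketch. Every name in it (`ControlPlane`, `Evaluator`, the numeric `version`) is hypothetical, and a real system would put a database and a network between these pieces; the point is only that writes bump a version, snapshots are pushed, and reads touch local state.

```typescript
// All names here are hypothetical; this is an in-memory illustration only.
type FlagState = { enabled: boolean };
type Snapshot = { version: number; flags: Record<string, FlagState> };
type Subscriber = (snapshot: Snapshot) => void;

// Control plane: owns writes and bumps a version on every change.
class ControlPlane {
  private flags: Record<string, FlagState> = {};
  private version = 0;
  private subscribers: Subscriber[] = [];

  subscribe(fn: Subscriber): void {
    this.subscribers.push(fn);
  }

  setFlag(key: string, enabled: boolean): void {
    this.flags[key] = { enabled };
    this.version += 1;
    // Distribution layer: push a copy so later writes cannot mutate it.
    const copy: Record<string, FlagState> = {};
    for (const k of Object.keys(this.flags)) copy[k] = { ...this.flags[k] };
    const snapshot: Snapshot = { version: this.version, flags: copy };
    for (const fn of this.subscribers) fn(snapshot);
  }
}

// Evaluation layer: reads only from its local snapshot.
class Evaluator {
  private snapshot: Snapshot = { version: 0, flags: {} };

  apply(snapshot: Snapshot): void {
    // Ignore deliveries that are not newer than what we already hold.
    if (snapshot.version > this.snapshot.version) this.snapshot = snapshot;
  }

  isEnabled(key: string): boolean {
    return this.snapshot.flags[key]?.enabled ?? false;
  }
}
```

Wiring them together is one line, `controlPlane.subscribe((s) => evaluator.apply(s))`, after which every `setFlag` call becomes visible to `isEnabled` without the evaluator ever querying the control plane.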

Data model

Use a data model that supports both correctness and explainability. A minimal schema usually includes flags, flag_variants, segments, targets, and audit_events, with each change recorded as an immutable event for traceability.

Example relational schema, simplified to three tables (flags, rules, and audit records):

CREATE TABLE feature_flags (
  id UUID PRIMARY KEY,
  key TEXT UNIQUE NOT NULL,
  description TEXT NOT NULL,
  enabled BOOLEAN NOT NULL DEFAULT FALSE,
  scope TEXT NOT NULL,  -- global, env, tenant
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
  expires_at TIMESTAMP NULL
);

CREATE TABLE feature_flag_rules (
  id UUID PRIMARY KEY,
  flag_id UUID NOT NULL REFERENCES feature_flags(id),
  rule_type TEXT NOT NULL,  -- user, segment, percentage, country
  rule_value TEXT NOT NULL,
  priority INT NOT NULL DEFAULT 0
);

CREATE TABLE feature_flag_audit (
  id UUID PRIMARY KEY,
  flag_id UUID NOT NULL REFERENCES feature_flags(id),
  actor_id UUID NOT NULL,
  action TEXT NOT NULL,  -- create, update, disable, delete
  before_json JSONB,
  after_json JSONB,
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

The important part is not just the tables, but the separation between the current desired state and the historical record of changes. That makes it possible to debug unexpected behavior later, especially during incident review.
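One way to keep current state and history in step is to make every write produce its audit record in the same operation. The sketch below does this in memory; the `Flag` and `AuditEvent` shapes are assumptions that loosely mirror the tables above, and in a real database the two writes would belong in a single transaction.

```typescript
// Hypothetical in-memory shapes that loosely mirror the tables above.
type Flag = { id: string; key: string; enabled: boolean };
type AuditEvent = {
  flagId: string;
  actorId: string;
  action: "update" | "disable";
  before: Flag;
  after: Flag;
  createdAt: Date;
};

// Append-only by convention: entries are pushed, never edited or removed.
const auditLog: AuditEvent[] = [];

// Applies a change and records before/after copies in one step, so the
// current state and the history cannot drift apart.
function setEnabled(flag: Flag, enabled: boolean, actorId: string): Flag {
  const after: Flag = { ...flag, enabled };
  auditLog.push({
    flagId: flag.id,
    actorId,
    action: enabled ? "update" : "disable",
    before: { ...flag },
    after: { ...after },
    createdAt: new Date(),
  });
  return after;
}
```

Storing full before and after copies costs some space, but it means an incident reviewer can read one row and see exactly what changed without replaying earlier events.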

Evaluation flow

The evaluation engine should be deterministic and predictable. A typical order is: check kill switch, check environment, check explicit user target, check segment membership, then fall back to percentage rollout. This ordering keeps emergency disables fast and makes rollout behavior easier to reason about.

A simple evaluation example in TypeScript (environment and segment checks are omitted for brevity):

type Context = {
  userId: string;
  email?: string;
  tenantId?: string;
  env: "dev" | "staging" | "prod";
};

type FlagRule =
  | { type: "user"; value: string }
  | { type: "tenant"; value: string }
  | { type: "percentage"; value: number };

function hashToPercent(input: string): number {
  let hash = 0;
  for (const ch of input) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % 100;
}

function isEnabled(
  flag: { key: string; enabled: boolean; rules: FlagRule[] },
  ctx: Context
): boolean {
  // Kill switch: a disabled flag short-circuits every other rule.
  if (!flag.enabled) return false;

  for (const rule of flag.rules) {
    if (rule.type === "user" && rule.value === ctx.userId) return true;
    if (rule.type === "tenant" && rule.value === ctx.tenantId) return true;
    if (rule.type === "percentage") {
      // Hash the flag key with the user ID so each flag buckets users independently.
      const bucket = hashToPercent(`${flag.key}:${ctx.userId}`);
      if (bucket < rule.value) return true;
    }
  }

  return false;
}

The rule engine should be boring on purpose. Predictability matters more than cleverness because every hidden edge case becomes a release risk.
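Keeping the engine boring also makes it easy to answer "why was this decision made?". The following sketch walks the order described above (kill switch, environment, explicit user, segment, percentage) and returns a reason alongside the result. The `FlagConfig` shape and the reason strings are assumptions for illustration, not a standard format.

```typescript
type EvalContext = { userId: string; env: string; segments: string[] };

// Hypothetical flag shape covering each step of the evaluation order.
type FlagConfig = {
  key: string;
  enabled: boolean;       // kill switch
  environments: string[]; // environments where the flag may be on
  users: string[];        // explicit user targets
  segments: string[];     // targeted segments
  percentage: number;     // rollout percentage, 0 to 100
};

type Decision = { enabled: boolean; reason: string };

// Same simple string hash as the example above, restated so this block runs alone.
function bucketOf(input: string): number {
  let hash = 0;
  for (const ch of input) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % 100;
}

function evaluate(flag: FlagConfig, ctx: EvalContext): Decision {
  if (!flag.enabled) return { enabled: false, reason: "kill_switch" };
  if (!flag.environments.includes(ctx.env)) return { enabled: false, reason: "environment" };
  if (flag.users.includes(ctx.userId)) return { enabled: true, reason: "user_target" };
  if (ctx.segments.some((s) => flag.segments.includes(s))) return { enabled: true, reason: "segment" };
  const inRollout = bucketOf(`${flag.key}:${ctx.userId}`) < flag.percentage;
  return { enabled: inRollout, reason: inRollout ? "percentage" : "default_off" };
}
```

Logging the `reason` field next to the snapshot version is usually enough to reconstruct, after the fact, why a given user saw a given behavior.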

Read path strategy

The best flag systems minimize live database reads during application traffic. A common approach is to publish a versioned snapshot of all active flags into Redis, an in-memory store, or a signed config bundle that application instances refresh periodically.

You can use a local cache with short TTL plus background refresh, or a push model where the control plane notifies clients on changes. The key is that request-time evaluation should be fast enough that teams can put flag checks on hot paths without fear.

Example Node.js request middleware:

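type Snapshot = {
  version: string;
  flags: Record<string, { enabled: boolean }>;
};

let cache: Snapshot | null = null;

async function refreshFlags(): Promise<void> {
  const res = await fetch("https://flags.example.com/snapshot");
  if (!res.ok) throw new Error(`Snapshot fetch failed: ${res.status}`);
  const version = res.headers.get("x-flags-version") || "unknown";
  const flags = await res.json();
  cache = { version, flags };
}

// Refresh in the background; on failure, keep serving the last known snapshot.
// The 30-second interval is an arbitrary example value.
setInterval(() => {
  refreshFlags().catch((err) => console.error("flag refresh failed", err));
}, 30_000).unref();

export async function flagMiddleware(req: any, res: any, next: any) {
  if (!cache) {
    try {
      await refreshFlags();
    } catch {
      // No snapshot yet: fall through so every flag defaults to off.
    }
  }

  // Look up the flag by a fixed key ("checkout-revamp" is illustrative);
  // never let a request header choose which flag decides a feature.
  const flag = cache?.flags["checkout-revamp"];

  req.flags = {
    checkoutRevamp: Boolean(flag?.enabled),
  };

  next();
}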

This design is resilient because the app can continue using its last known snapshot if the control plane is temporarily unavailable. That fallback is a large part of what makes it practical to put flag checks on production request paths.
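The fallback rule can be written as a pure state transition, which makes it easy to test without a network. This is a sketch under assumptions: the numeric `version`, the `stale` marker, and the `FetchResult` shape are all invented here for illustration.

```typescript
// Hypothetical shapes; a numeric version is assumed so snapshots can be ordered.
type Snapshot = { version: number; flags: Record<string, { enabled: boolean }> };
type FetchResult = { ok: true; snapshot: Snapshot } | { ok: false; error: string };
type CacheState = { snapshot: Snapshot | null; stale: boolean };

// Computes the next cache state from the outcome of one refresh attempt.
function nextCacheState(current: CacheState, result: FetchResult): CacheState {
  if (!result.ok) {
    // Refresh failed: keep the last known snapshot and mark it stale.
    return { snapshot: current.snapshot, stale: true };
  }
  if (current.snapshot && result.snapshot.version < current.snapshot.version) {
    // An older snapshot arrived (for example from a lagging replica):
    // keep the current state as-is rather than moving backwards.
    return current;
  }
  return { snapshot: result.snapshot, stale: false };
}
```

Exposing the `stale` marker as a metric gives operators a signal that instances are running on old data, even though requests keep succeeding.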

Rollouts and targeting

Rollout logic is where feature flags become truly useful. Percentage rollout lets you expose a feature to, say, 5 percent of users, then 25 percent, then 100 percent, while user and tenant targeting supports internal testing and high-value customer enablement.

A practical rollout plan usually follows this sequence:

  1. Enable for internal staff.
  2. Enable for one tenant or one customer cohort.
  3. Increase percentage exposure gradually.
  4. Watch metrics and error rates.
  5. Promote or disable based on evidence.

A simple bucketing scheme should be stable across requests, which is why most systems derive the bucket from a user identifier plus flag key. That prevents users from “flipping” in and out of treatment as requests move between servers.
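Here is a small sketch of that bucketing idea. FNV-1a is used only as an example of a simple, stable string hash (this version hashes UTF-16 code units rather than bytes); the article does not prescribe a particular hash function, and the function names are illustrative.

```typescript
// 32-bit FNV-1a over UTF-16 code units. Chosen as an example only.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Math.imul keeps the multiplication in 32-bit integer space.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Maps a (flag, user) pair to a bucket in the range 0..99.
function bucket(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// A user is in the rollout when their bucket is below the percentage.
function inRollout(flagKey: string, userId: string, percentage: number): boolean {
  return bucket(flagKey, userId) < percentage;
}
```

Because the bucket depends only on the flag key and user ID, raising the percentage from 5 to 25 keeps every user who was already in the treatment group and only adds new ones. Including the flag key in the hash input also means a user who lands early in one rollout is not automatically early in every other.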

Safety and ops

Feature flags are operational tools, so they need guardrails. Cloud-native guidance emphasizes that flags let you activate features for specific users without redeploying, but that power should be paired with disciplined lifecycle management and cleanup.

Useful safeguards include:

  • Approval workflows for production flag changes.
  • Automatic expiration dates for temporary flags.
  • Audit logs for every change.
  • A kill switch that is easy to evaluate and hard to break.
  • Alerts when a flag has not been referenced or updated for a long time.
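The expiration and staleness safeguards can be sketched as a periodic scan over flag metadata. The `FlagMeta` shape and the idle threshold are assumptions; a real job would read these fields from the flag store and feed the result into an alerting or ticketing system.

```typescript
// Hypothetical metadata shape; field names are illustrative.
type FlagMeta = { key: string; updatedAt: Date; expiresAt?: Date };

// Returns the keys of flags that are past their expiry date or have not
// been updated within `maxIdleDays`.
function findCleanupCandidates(flags: FlagMeta[], now: Date, maxIdleDays: number): string[] {
  const maxIdleMs = maxIdleDays * 24 * 60 * 60 * 1000;
  return flags
    .filter((f) => {
      const expired = f.expiresAt !== undefined && f.expiresAt.getTime() <= now.getTime();
      const idle = now.getTime() - f.updatedAt.getTime() > maxIdleMs;
      return expired || idle;
    })
    .map((f) => f.key);
}
```

This only covers the "not updated" half of the alert. Detecting flags that are no longer referenced would need evaluation counts reported by the SDKs, which this sketch does not model.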

If a flag service fails, your application should default to a clearly defined safe state. For risky features, that often means “off”; for infrastructure-level safety toggles, it may mean using the last known good snapshot until recovery.
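A minimal sketch of that default policy follows, for the case where a flag is missing from the snapshot or no snapshot has been loaded at all. The flag keys and the `SAFE_DEFAULTS` table are hypothetical; the point is that the fallback value is chosen per flag and declared in code, not left implicit.

```typescript
type Flags = Record<string, { enabled: boolean }>;

// Hypothetical per-flag defaults, used only when the snapshot has no answer.
const SAFE_DEFAULTS: Record<string, boolean> = {
  "checkout-revamp": false, // risky product feature: off when unknown
  "rate-limiter": true,     // safety toggle: stays on when unknown
};

// Prefers the snapshot value; otherwise falls back to the declared default,
// and finally to "off" for flags with no declared default.
function flagOrDefault(snapshot: Flags | null, key: string): boolean {
  const flag = snapshot?.[key];
  if (flag !== undefined) return flag.enabled;
  return SAFE_DEFAULTS[key] ?? false;
}
```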

Failure modes

The hardest problems usually show up not in the first implementation but in the edge cases that appear under load or during incidents. Common failure modes include stale snapshots, split-brain config (instances evaluating different snapshot versions at the same time), inconsistent targeting, and flags that stay in the codebase long after the rollout is complete.

A good incident story often looks like this:

  • A flag was turned on for 10 percent of users.
  • Metrics worsened for a specific segment.
  • The team disabled the flag globally within seconds.
  • Instances that could not reach the control plane kept serving their last cached snapshot instead of failing hard.

That combination of fast control and safe fallback is the real value of the architecture.

Deployment example

A simple deployment topology can be:

  • PostgreSQL for source of truth.
  • Redis for the current snapshot and fast reads.
  • Background publisher for snapshot generation.
  • Admin web app for change management.
  • SDKs in each service for local evaluation and caching.

In practice, each service loads the latest signed snapshot at startup, refreshes it in the background, and evaluates flags locally. That avoids turning your flag service into a runtime bottleneck for every app request.
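The post does not say how snapshots are signed, so the following is one possible scheme, not the author's: an HMAC-SHA256 over the snapshot body with a secret shared between the publisher and the services. It uses Node's built-in `crypto` module; the function names are illustrative. With many untrusted clients, an asymmetric signature would likely be a better fit than a shared secret.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Publisher side: compute a hex signature over the serialized snapshot.
function signSnapshot(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Service side: recompute the signature and compare in constant time.
function verifySnapshot(body: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signSnapshot(body, secret), "hex");
  const actual = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  if (expected.length !== actual.length) return false;
  return timingSafeEqual(expected, actual);
}
```

A service would call `verifySnapshot` on the raw response body before parsing it, and keep its previous snapshot if verification fails.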

Build it step by step

If you are implementing this from scratch, a good order is:

  1. Model flags, rules, and audit events.
  2. Build the admin API and change validation.
  3. Add snapshot generation and versioning.
  4. Add SDK-side caching and local evaluation.
  5. Add targeting, percentage rollout, and expiration.
  6. Add dashboards, alerts, and cleanup automation.

That sequence gets you to a usable system quickly without overbuilding the hardest parts too early. It also keeps the read path simple while you harden governance around changes.

Practical checklist

Before calling the system production-ready, check these items:

  • Evaluation is deterministic for the same user and flag.
  • The app can function on the last known snapshot.
  • Every change is auditable.
  • Temporary flags have expiry dates.
  • Rollout percentages are stable and repeatable.
  • Flag names are consistent and documented.

A feature flag platform is successful when it makes releases safer without becoming a new source of outages. If the architecture is clean, teams will use it naturally; if it is slow or confusing, they will bypass it and reintroduce risk.

-

Rizwan Saleem | https://rizwansaleem.co
