sweet

Posted on Jun 19

Real User Monitoring: Measuring Web Performance in Production

#saas #analytics #performance #web

Lab tests (Lighthouse, CI benchmarks) tell you how your app performs on a test machine. Real User Monitoring (RUM) tells you how it performs for actual users on their own devices, networks, and locations. RUM surfaces problems that lab tests rarely reproduce: slow connections, memory pressure, ad blocker interference, and geographic variance. This guide covers the RUM implementation at tanstackship.com.

Lab vs Field Data

Aspect	Lab (Lighthouse)	Field (RUM)
Environment	Controlled (emulated mid-range phone, simulated slow 4G)	Real user devices
Network	Simulated throttling	Actual connections (5G, 4G, 3G, WiFi)
Location	Single location	Global (330+ Cloudflare locations)
Device	Fixed device profile	All devices and form factors
Sample size	Single run per PR	Every page load
Detects	Optimization opportunities	Actual user experience issues
Missing	What real users experience	Controlled comparison

The truth: Lab data tells you what to fix. Field data tells you what users actually experience. You need both.

RUM Data Collection

Setting Up Web Vitals Collection

// src/lib/rum.ts
import { onLCP, onCLS, onINP, onTTFB, onFCP } from "web-vitals/attribution"

type VitalName = "LCP" | "CLS" | "INP" | "TTFB" | "FCP"

interface VitalReport {
  name: VitalName
  value: number
  rating: "good" | "needs-improvement" | "poor"
  id: string
  metadata: Record<string, string>
  deviceType: string
  connectionType: string
}

export function initRUM() {
  // Each web-vitals function takes a callback that receives the metric.
  const vitals: Array<{ name: VitalName; fn: (report: (metric: any) => void) => void }> = [
    { name: "LCP", fn: onLCP },
    { name: "CLS", fn: onCLS },
    { name: "INP", fn: onINP },
    { name: "TTFB", fn: onTTFB },
    { name: "FCP", fn: onFCP },
  ]

  vitals.forEach(({ name, fn }) => {
    fn((metric) => {
      sendVital({
        name,
        value: metric.value,
        rating: metric.rating,
        id: metric.id,
        metadata: extractAttribution(metric),
        deviceType: getDeviceType(),
        connectionType: getConnectionType(),
      })
    })
  })
}

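function extractAttribution(metric: any): Record<string, string> {
  const attribution = metric.attribution
  // Only LCP carries the element and resource URL extracted here.
  if (!attribution || metric.name !== "LCP") return {}

  // The attribution build reports the LCP element as a CSS selector string,
  // not a DOM node. Depending on the web-vitals release the field is named
  // `element` or `target`, so read whichever is present.
  const selector = attribution.target ?? attribution.element
  const metadata: Record<string, string> = {}
  if (typeof selector === "string" && selector) metadata.lcpElement = selector
  if (typeof attribution.url === "string" && attribution.url) metadata.lcpUrl = attribution.url
  return metadata
}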

function getDeviceType(): string {
  const ua = navigator.userAgent
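  // Check tablets first: iPad user agents contain "Mobile", and Android
  // tablets contain "Android" without the "Mobile" token.
  if (/Tablet|iPad/i.test(ua) || (/Android/i.test(ua) && !/Mobile/i.test(ua))) return "tablet"
  if (/Mobi|Android/i.test(ua)) return "mobile"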
  return "desktop"
}

function getConnectionType(): string {
  const conn = (navigator as any).connection
  return conn?.effectiveType ?? "unknown"
}
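
The rating field above comes straight from web-vitals, which classifies each value against the published Core Web Vitals thresholds. If you want to re-derive or sanity-check ratings on the server, a minimal sketch of that mapping could look like this. The helper name rateVital is hypothetical and not part of the library.

```typescript
type VitalName = "LCP" | "CLS" | "INP" | "TTFB" | "FCP"
type Rating = "good" | "needs-improvement" | "poor"

// [upper bound for "good", upper bound for "needs-improvement"].
// Values are milliseconds, except CLS which is a unitless score.
const THRESHOLDS: Record<VitalName, [number, number]> = {
  LCP: [2500, 4000],
  CLS: [0.1, 0.25],
  INP: [200, 500],
  TTFB: [800, 1800],
  FCP: [1800, 3000],
}

// Hypothetical helper: classify a raw metric value.
function rateVital(name: VitalName, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[name]
  if (value <= good) return "good"
  if (value <= needsImprovement) return "needs-improvement"
  return "poor"
}
```

For example, rateVital("LCP", 2100) returns "good" and rateVital("CLS", 0.15) returns "needs-improvement".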

Sending RUM Data to the Backend

// Use sendBeacon for reliable delivery (survives page navigation)
function sendVital(report: VitalReport) {
  const payload = {
    ...report,
    pathname: window.location.pathname,
    timestamp: Date.now(),
  }

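  const body = JSON.stringify(payload)

  // sendBeacon returns false when the browser cannot queue the request,
  // so fall back to a keepalive fetch in that case as well.
  const queued = navigator.sendBeacon?.("/api/vitals", body) ?? false
  if (!queued) {
    fetch("/api/vitals", { method: "POST", body, keepalive: true }).catch(() => {})
  }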
}

Server-Side Storage

// server/rum.ts
export const reportVital = createServerFn({ method: "POST" }).handler(
  async ({ request, context }) => {
    const data = await request.json()

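    // Store in D1 for querying. OR REPLACE keeps the latest value when
    // web-vitals reports the same metric id again (CLS and INP can update).
    await context.env.DB.prepare(`
      INSERT OR REPLACE INTO rum_metrics (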
        id, name, value, rating, pathname,
        device_type, connection_type,
        country, metadata, created_at
      ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    `).bind(
      data.id,
      data.name,
      data.value,
      data.rating,
      data.pathname,
      data.deviceType,
      data.connectionType,
      request.cf?.country ?? "unknown",
      JSON.stringify(data.metadata),
      data.timestamp
    ).run()

    return { received: true }
  }
)
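
The endpoint accepts whatever the browser sends, so it is worth validating the payload before it reaches the database. The following is a minimal sketch of such a check, assuming the payload shape produced by sendVital. The function name parseVitalPayload and the length limits are assumptions for illustration, not part of the original implementation.

```typescript
type VitalName = "LCP" | "CLS" | "INP" | "TTFB" | "FCP"
type Rating = "good" | "needs-improvement" | "poor"

interface VitalPayload {
  id: string
  name: VitalName
  value: number
  rating: Rating
  pathname: string
  deviceType: string
  connectionType: string
  metadata: Record<string, unknown>
  timestamp: number
}

const NAMES: readonly string[] = ["LCP", "CLS", "INP", "TTFB", "FCP"]
const RATINGS: readonly string[] = ["good", "needs-improvement", "poor"]

// Hypothetical helper: returns a cleaned payload, or null if it is malformed.
function parseVitalPayload(input: unknown, nowMs: number = Date.now()): VitalPayload | null {
  if (typeof input !== "object" || input === null) return null
  const { id, name, value, rating, pathname, deviceType, connectionType, metadata, timestamp } =
    input as Record<string, unknown>

  if (typeof id !== "string" || id.length === 0 || id.length > 64) return null
  if (typeof name !== "string" || !NAMES.includes(name)) return null
  if (typeof value !== "number" || !Number.isFinite(value) || value < 0) return null
  if (typeof rating !== "string" || !RATINGS.includes(rating)) return null
  if (typeof pathname !== "string" || !pathname.startsWith("/")) return null

  return {
    id,
    name: name as VitalName,
    value,
    rating: rating as Rating,
    pathname: pathname.slice(0, 512), // assumed cap to keep rows small
    deviceType: typeof deviceType === "string" ? deviceType : "unknown",
    connectionType: typeof connectionType === "string" ? connectionType : "unknown",
    metadata: typeof metadata === "object" && metadata !== null ? (metadata as Record<string, unknown>) : {},
    // Fall back to server time when the client timestamp is missing or invalid.
    timestamp: typeof timestamp === "number" && Number.isFinite(timestamp) ? timestamp : nowMs,
  }
}
```

A handler could call this right after parsing the request body and return early when the result is null.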

Analyzing RUM Data

Querying by Metric

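export const getRumDashboard = createServerFn({ method: "GET" }).handler(
  async ({ context }) => {
    // SQLite (and therefore D1) does not ship a percentile aggregate by
    // default, so rank rows with CUME_DIST() and take the smallest value at
    // or above each cutoff. created_at holds epoch milliseconds, so the
    // 7-day window is computed in milliseconds as well.
    const overall = await context.env.DB.prepare(`
      WITH ranked AS (
        SELECT name, value, rating,
          CUME_DIST() OVER (PARTITION BY name ORDER BY value) AS cd
        FROM rum_metrics
        WHERE created_at > CAST(strftime('%s', 'now', '-7 days') AS INTEGER) * 1000
      )
      SELECT
        name,
        COUNT(*) as samples,
        MIN(CASE WHEN cd >= 0.5 THEN value END) as p50,
        MIN(CASE WHEN cd >= 0.75 THEN value END) as p75,
        MIN(CASE WHEN cd >= 0.95 THEN value END) as p95,
        SUM(CASE WHEN rating = 'good' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as good_pct
      FROM ranked
      GROUP BY name
    `).all()

    // Breakdown by pathname (top 10 slowest by median LCP).
    // samples counts metric rows, not page views: one page load emits several.
    const byPath = await context.env.DB.prepare(`
      WITH ranked AS (
        SELECT pathname, name, value,
          CUME_DIST() OVER (PARTITION BY pathname, name ORDER BY value) AS cd
        FROM rum_metrics
        WHERE created_at > CAST(strftime('%s', 'now', '-7 days') AS INTEGER) * 1000
      )
      SELECT
        pathname,
        MIN(CASE WHEN name = 'LCP' AND cd >= 0.5 THEN value END) as lcp_p50,
        MIN(CASE WHEN name = 'INP' AND cd >= 0.5 THEN value END) as inp_p50,
        COUNT(*) as samples
      FROM ranked
      GROUP BY pathname
      ORDER BY lcp_p50 DESC
      LIMIT 10
    `).all()

    return { overall: overall.results, slowestPaths: byPath.results }
  }
)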
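
If the percentile columns are unfamiliar: p75 is the value that 75% of samples are at or below. A small sketch of the nearest-rank definition in TypeScript makes that concrete. The function name percentile is illustrative, and a database may use a slightly different (for example, interpolated or approximate) definition.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p * 100 percent of all samples are less than or equal to it.
function percentile(values: readonly number[], p: number): number | undefined {
  if (p <= 0 || p > 1) throw new RangeError("p must be in (0, 1]")
  if (values.length === 0) return undefined
  const sorted = [...values].sort((a, b) => a - b)
  const rank = Math.ceil(p * sorted.length) // 1-based rank
  return sorted[rank - 1]
}

const lcpSamples = [1200, 900, 4500, 2100]
console.log(percentile(lcpSamples, 0.5))  // 1200
console.log(percentile(lcpSamples, 0.75)) // 2100
console.log(percentile(lcpSamples, 0.95)) // 4500
```

Note how a single slow sample dominates p95 with so few data points, which is one reason to look at sample counts next to the percentiles.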

RUM Dashboard

RUM Dashboard (Last 7 Days)

Web Vitals Overview:
┌────────┬──────────┬──────────┬──────────┬─────────┐
│ Metric │ P50      │ P75      │ P95      │ % Good  │
├────────┼──────────┼──────────┼──────────┼─────────┤
│ LCP    │ 1,200ms  │ 2,100ms  │ 4,500ms  │ 78%     │
│ CLS    │ 0.02     │ 0.05     │ 0.15     │ 85%     │
│ INP    │ 80ms     │ 150ms    │ 350ms    │ 82%     │
│ TTFB   │ 150ms    │ 350ms    │ 900ms    │ 88%     │
└────────┴──────────┴──────────┴──────────┴─────────┘

Performance by Geographic Region:
┌─────────────┬─────────┬──────────┬──────────────┐
│ Region      │ P50 LCP │ P95 LCP  │ Slow % (>3s) │
├─────────────┼─────────┼──────────┼──────────────┤
│ US East     │ 900ms   │ 2,100ms  │ 3%           │
│ US West     │ 1,100ms │ 2,800ms  │ 5%           │
│ Europe      │ 1,300ms │ 3,200ms  │ 8%           │
│ Asia Pacific│ 2,100ms │ 5,500ms  │ 20%          │
│ South America│ 2,400ms│ 6,000ms  │ 25%          │
└─────────────┴─────────┴──────────┴──────────────┘

Top 3 Slowest Pages:
1. /dashboard/analytics (p50 LCP: 4.2s) — heavy charts
2. /products/listing (p50 LCP: 3.8s) — large images
3. /reports/export (p50 LCP: 3.5s) — slow API

Alerts from RUM Data

export const checkRumAlerts = createServerFn({ method: "GET" }).handler(
  async ({ context }) => {
    const alerts = []

    // Alert if LCP good percentage drops below threshold
    const lcpQuality = await context.env.DB.prepare(`
      SELECT
        COUNT(*) as total,
        SUM(CASE WHEN rating = 'good' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as good_pct
      FROM rum_metrics
      -- created_at is stored as epoch milliseconds
      WHERE name = 'LCP' AND created_at > CAST(strftime('%s', 'now', '-1 hour') AS INTEGER) * 1000
    `).first()

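    // With no samples in the window good_pct is NULL, and Number(null) is 0,
    // so require at least one sample before comparing against the threshold.
    if (lcpQuality && Number(lcpQuality.total) > 0 && Number(lcpQuality.good_pct) < 70) {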
      alerts.push({
        type: "rum_degradation",
        metric: "LCP",
        goodPct: Math.round(Number(lcpQuality.good_pct)),
        threshold: 70,
        severity: "high",
      })
    }

    // Alert on pages with more than 10 LCP samples slower than 5s in the last hour
    const slowPages = await context.env.DB.prepare(`
      SELECT pathname, COUNT(*) as views
      FROM rum_metrics
      WHERE name = 'LCP'
        AND value > 5000
        AND created_at > CAST(strftime('%s', 'now', '-1 hour') AS INTEGER) * 1000
      GROUP BY pathname
      HAVING views > 10
      ORDER BY views DESC
      LIMIT 5
    `).all()

    if (slowPages.results.length > 0) {
      alerts.push({
        type: "slow_pages",
        pages: slowPages.results,
        severity: "medium",
      })
    }

    return alerts
  }
)
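
Keeping the threshold decision separate from the query makes the alert rule easy to unit test. Here is a minimal sketch of that idea. The names evaluateLcpAlert and minSamples are hypothetical, and the default of 20 samples is an assumed value you would tune to your traffic.

```typescript
interface LcpWindowStats {
  total: number          // number of LCP samples in the window
  goodPct: number | null // null when the window is empty
}

interface RumAlert {
  type: "rum_degradation"
  metric: "LCP"
  goodPct: number
  threshold: number
  severity: "high"
}

// Hypothetical helper: returns an alert, or null when nothing should fire.
function evaluateLcpAlert(stats: LcpWindowStats, threshold = 70, minSamples = 20): RumAlert | null {
  // Too little data: a handful of slow loads should not page anyone.
  if (stats.goodPct === null || stats.total < minSamples) return null
  if (stats.goodPct >= threshold) return null
  return {
    type: "rum_degradation",
    metric: "LCP",
    goodPct: Math.round(stats.goodPct),
    threshold,
    severity: "high",
  }
}
```

A quiet hour with zero samples then produces no alert instead of a false "0% good" reading.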

Using RUM to Drive Optimizations

RUM Signal	Investigation	Optimization
High LCP on mobile	Check hero image size	Serve AVIF, preload hero, reduce image size
High CLS on product page	Check dynamic content insertion	Reserve space, fix font swap layout shift
High INP on dashboard	Profile main thread activity	Break up long tasks, lazy load charts
Poor APAC LCP	Geographic latency issue	Edge caching, CDN optimization
Poor TTFB on auth pages	Auth middleware overhead	Optimize session lookup, cache auth state

RUM Data Schema

-- migrations/rum_metrics.sql
CREATE TABLE IF NOT EXISTS rum_metrics (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,         -- LCP, CLS, INP, TTFB, FCP
  value REAL NOT NULL,        -- Metric value in ms or score
  rating TEXT NOT NULL,       -- good, needs-improvement, poor
  pathname TEXT NOT NULL,     -- URL path
  device_type TEXT,           -- mobile, desktop, tablet
  connection_type TEXT,       -- 4g, 3g, 2g, slow-2g, or unknown
  country TEXT,               -- Two-letter country code
  metadata TEXT,              -- JSON with attribution data
  created_at INTEGER NOT NULL -- Unix epoch in milliseconds (Date.now())
);

CREATE INDEX IF NOT EXISTS idx_rum_name ON rum_metrics(name);
CREATE INDEX IF NOT EXISTS idx_rum_created ON rum_metrics(created_at);
CREATE INDEX IF NOT EXISTS idx_rum_path ON rum_metrics(pathname);
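
Because created_at is an INTEGER filled from Date.now(), any time-window filter has to compare against a number in the same unit (milliseconds). One way to keep that consistent is to compute the cutoff in TypeScript and bind it as a parameter. This is a small sketch, and the helper name windowStartMs is an assumption.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000

// Hypothetical helper: epoch-millisecond cutoff for "the last N days".
function windowStartMs(days: number, nowMs: number = Date.now()): number {
  return nowMs - days * DAY_MS
}

// Possible usage with a D1 prepared statement:
//   db.prepare("SELECT COUNT(*) AS n FROM rum_metrics WHERE created_at > ?")
//     .bind(windowStartMs(7))
//     .first()
```

The same helper can serve a retention job that deletes rows older than a chosen number of days.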

RUM Implementation Checklist

[ ] Web Vitals library installed and initialized on all pages
[ ] RUM data sent via sendBeacon for reliable delivery
[ ] Server endpoint stores metrics in D1 or Analytics Engine
[ ] Sample rate configured (100% for initial setup, then reduce to 10-25%)
[ ] Dashboard built for p50/p75/p95 metrics
[ ] Geographic breakdown visible in dashboard
[ ] Pathname-level aggregation for slow page detection
[ ] Automated alerts for RUM degradation
[ ] Device type segmentation (mobile vs desktop)
[ ] Connection type tracking for network-aware optimization
[ ] Integration with CI pipeline: compare lab results from each PR against production RUM
[ ] Historical data retention for trend analysis
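
For the sample-rate item in the checklist, a minimal sketch could look like the following. The helper names shouldSample and estimateTotal are hypothetical. The idea is to decide once per page load, so that a page reports either all of its metrics or none of them, and to scale counts back up when reading dashboards.

```typescript
// Decide once per page load whether this visit reports vitals.
// `random` is injectable so the decision can be tested deterministically.
function shouldSample(sampleRate: number, random: () => number = Math.random): boolean {
  if (sampleRate >= 1) return true
  if (sampleRate <= 0) return false
  return random() < sampleRate
}

// Estimate the true number of events from a sampled count.
function estimateTotal(sampledCount: number, sampleRate: number): number {
  if (sampleRate <= 0 || sampleRate > 1) throw new RangeError("sampleRate must be in (0, 1]")
  return Math.round(sampledCount / sampleRate)
}

// Possible usage on the client (initRUM comes from src/lib/rum.ts):
//   if (shouldSample(0.25)) initRUM()
```

Percentiles and "% good" stay valid under uniform sampling, while raw counts need the correction shown in estimateTotal.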

Conclusion

Real User Monitoring bridges the gap between what you test in development and what your users experience in production. Without RUM, you are optimizing based on assumptions. With RUM, every optimization decision is backed by data from actual users.

The implementation is straightforward:

Collect Web Vitals from every page load
Store them in D1 or Analytics Engine
Build dashboards for visualization
Set alerts for degradation
Use the data to prioritize optimization work

For a production SaaS with RUM implemented across all pages, see tanstackship.com.
