Streaming SSR Is Not a Free LCP Win

#nextjs #performance #react #webdev

Streaming SSR sounds like an obvious win. Instead of waiting for all server-side work to finish before sending any HTML, Next.js starts sending the page immediately and streams the rest as it becomes ready. Users see content sooner. LCP improves.

Except it doesn't always. The improvement depends entirely on where your LCP element lives in the component tree.

What streaming actually does

Without streaming, the server waits until the full page is rendered before sending the first byte of HTML. With streaming, Next.js sends an initial shell immediately, then flushes additional HTML chunks as Suspense boundaries resolve.

import { Suspense } from 'react';

export default async function ProductPage({ params }) {
  // This data is needed for the shell — it blocks the initial response
  const category = await getCategory(params.categoryId);

  return (
    <div>
      <CategoryHeader category={category} />
      {/* This Suspense boundary streams in separately */}
      <Suspense fallback={<ProductListSkeleton />}>
        <ProductList categoryId={params.categoryId} />
      </Suspense>
    </div>
  );
}

CategoryHeader is in the shell — it arrives with the first response. ProductList streams in after its data resolves. The browser can start rendering and displaying the header while the product list is still loading on the server.

This genuinely helps perceived performance. The page feels faster because something meaningful is visible sooner. Whether it helps LCP depends on what that "something" is.
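The flush order described above can be sketched without any framework. This is a simplified model under stated assumptions, not how Next.js or React implement streaming internally: `streamPage`, `DeferredSection`, and the delays are hypothetical names and numbers chosen for illustration.

```typescript
// Simplified model of streaming SSR: the shell goes out first, then each
// deferred section is flushed as soon as its own data is ready.
// All names here are hypothetical; this is not Next.js or React internals.

type DeferredSection = {
  id: string;
  render: () => Promise<string>; // resolves to the section's HTML
};

async function* streamPage(
  shellHtml: string,
  sections: DeferredSection[],
): AsyncGenerator<string> {
  // Kick off all deferred work up front so the sections load in parallel.
  const pending = new Map<string, Promise<{ id: string; html: string }>>();
  for (const section of sections) {
    pending.set(
      section.id,
      section.render().then((html) => ({ id: section.id, html })),
    );
  }

  // The shell (including any fallbacks) is sent without waiting for them.
  yield shellHtml;

  // Flush each section in the order it resolves, not in document order.
  while (pending.size > 0) {
    const done = await Promise.race(Array.from(pending.values()));
    pending.delete(done.id);
    yield `<section data-for="${done.id}">${done.html}</section>`;
  }
}

// Helper that simulates a slow data fetch.
const after = (ms: number, html: string) => () =>
  new Promise<string>((resolve) => setTimeout(() => resolve(html), ms));

async function demo(): Promise<string[]> {
  const chunks: string[] = [];
  const page = streamPage('<header>Category</header>', [
    { id: 'reviews', render: after(40, 'reviews') },
    { id: 'products', render: after(10, 'products') },
  ]);
  for await (const chunk of page) chunks.push(chunk);
  return chunks;
}
```

Calling `demo()` collects the shell first, then the `products` section, then `reviews`, even though `reviews` was declared first. Whatever sits in a deferred section can never arrive before its own data does.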

LCP only improves if the LCP element is in the shell

The browser records LCP when the largest element actually paints. If your LCP element (the hero image, the main heading, the product photo) is inside a Suspense boundary, the browser can't render it until that chunk streams in. Streaming didn't make it arrive any sooner. It only let other content arrive ahead of it.

I added Suspense boundaries to a product page expecting LCP to improve. It got worse by 400ms. The hero image was inside the product Suspense boundary, waiting for the same data that was blocking the page before. The browser couldn't start fetching the image until the streamed chunk arrived with the <img> tag. Streaming had delayed the only thing that mattered for LCP.

The rule is straightforward: the LCP element must be in the initial shell, or in a Suspense boundary that resolves before anything else. Everything else can stream in afterward.
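A rough way to reason about that rule is to ask when the server can first emit the HTML for the LCP element under each layout. The model below is a back-of-the-envelope sketch: the function name, the layout labels, and the timings are all hypothetical, and it assumes every query starts at request time and that network and image fetch costs are the same in each case.

```typescript
// Back-of-the-envelope model: when can the server first emit the HTML for
// the LCP element? All names and numbers are hypothetical.

type Layout = 'blocking' | 'lcp-in-shell' | 'lcp-in-boundary';

interface DataTimings {
  shellDataMs: number; // data the shell awaits before the first flush
  slowDataMs: number; // data awaited inside the Suspense boundary
}

function lcpHtmlReadyMs(layout: Layout, t: DataTimings): number {
  switch (layout) {
    case 'blocking':
      // No streaming: nothing is sent until every query has finished
      // (assumption: the queries run in parallel from request start).
      return Math.max(t.shellDataMs, t.slowDataMs);
    case 'lcp-in-shell':
      // The LCP element ships with the first flush.
      return t.shellDataMs;
    case 'lcp-in-boundary':
      // The shell goes out early, but the LCP element still waits for the
      // slow data, so it is ready no sooner than in the blocking case.
      return Math.max(t.shellDataMs, t.slowDataMs);
  }
}

const timings: DataTimings = { shellDataMs: 30, slowDataMs: 250 };
const results = {
  blocking: lcpHtmlReadyMs('blocking', timings),
  inShell: lcpHtmlReadyMs('lcp-in-shell', timings),
  inBoundary: lcpHtmlReadyMs('lcp-in-boundary', timings),
};
console.log(results); // { blocking: 250, inShell: 30, inBoundary: 250 }
```

With these made-up numbers, only the layout that keeps the LCP element in the shell moves the number. If the slow query only starts after the shell data resolves, the boundary case is worse than this model suggests.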

Suspense boundary placement is the whole game

Designing a streaming page means deciding what goes in the shell and what gets deferred. The shell should contain:

The LCP element — always
Navigation and layout structure
Above-the-fold content that's cheap to fetch

Everything that's slow to fetch, personalized, or below the fold is a candidate for a Suspense boundary.

import { Suspense } from 'react';

export default async function ProductPage({ params }) {
  // Fast query — product basics needed for LCP
  const product = await getProductBasics(params.id);

  return (
    <div>
      {/* Hero image is in the shell — LCP element renders immediately */}
      <ProductHero image={product.heroImage} title={product.name} />

      {/* Slow queries — personalized, below the fold */}
      <Suspense fallback={<ReviewsSkeleton />}>
        <ReviewSection productId={params.id} />
      </Suspense>

      <Suspense fallback={<RecommendationsSkeleton />}>
        <PersonalizedRecommendations userId={params.userId} />
      </Suspense>
    </div>
  );
}

getProductBasics fetches only what the shell needs — title, hero image, price. The slow queries for reviews and personalized recommendations happen in parallel behind their Suspense boundaries. The hero image is in the shell, so the browser can start loading it with the first response.

This requires splitting what was probably one database query into two: a fast query for above-the-fold data and a deferred query for everything else. That's the real cost of streaming — not technical complexity, but discipline about which data the shell actually needs.
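The split might look something like the sketch below. The data layer is hypothetical: `getProductDetails`, `loadProductPage`, the field names, and the `sleep` delays are stand-ins for real queries, and `getProductBasics` is stubbed rather than taken from any real codebase. The point is only that the fast query gates the shell while the slow one is started but not awaited.

```typescript
// Hypothetical data layer: one fast query for the shell, one slow query for
// deferred content. Delays stand in for real database latency.

interface ProductBasics {
  name: string;
  heroImage: string;
  price: number;
}

interface ProductDetails {
  reviewCount: number;
  relatedIds: string[];
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function getProductBasics(id: string): Promise<ProductBasics> {
  await sleep(5); // small lookup: only the fields the shell renders
  return { name: `Product ${id}`, heroImage: `/img/${id}.jpg`, price: 1999 };
}

async function getProductDetails(id: string): Promise<ProductDetails> {
  await sleep(60); // joins, aggregation, personalization
  return { reviewCount: 128, relatedIds: [`${id}-related`] };
}

async function loadProductPage(id: string) {
  // Start the slow query immediately, but do not await it here.
  const details = getProductDetails(id);
  // Only the fast query gates the shell.
  const basics = await getProductBasics(id);
  return { basics, details };
}
```

In a page component, `basics` would feed the hero in the shell, and the still-pending `details` promise would be awaited by whatever renders inside the Suspense boundary.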

Preloading resources before they stream

A hero image inside the shell still has to be fetched after the HTML arrives. On slow connections, that fetch adds meaningfully to LCP.

The <link rel="preload"> tag in the shell tells the browser to start fetching the image immediately, in parallel with parsing the rest of the HTML:

export default async function ProductPage({ params }) {
  const product = await getProductBasics(params.id);

  return (
    <>
      <link
        rel="preload"
        as="image"
        href={product.heroImage}
        fetchPriority="high"
      />
      <ProductHero image={product.heroImage} title={product.name} />
      {/* ... */}
    </>
  );
}

Next.js, through React's handling of <link> elements, places the tag in the <head> and flushes it ahead of the body content. The browser sees the preload hint before it encounters the <img> tag in the shell, so the fetch starts earlier. For large images on slow connections, this can move LCP by several hundred milliseconds.

Measuring LCP on streaming pages

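Lighthouse is a poor judge of streaming. It loads the page in a real browser, so streamed chunks are received as they arrive, but a run reflects one machine, one network profile, and one set of server response times. It reports a single LCP number, which can't tell you how quickly the shell arrives for real users or whether streaming is actually helping them.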

Field data from real users is what captures how streaming behaves in practice. A streaming page where the shell is fast will show LCP concentrated at a low value in the distribution. A streaming page where the LCP element was accidentally put inside a Suspense boundary will show LCP spread across a wide range: fast for users with low server latency, slow for everyone else.
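To make "concentrated" versus "spread" concrete, here is a small sketch that summarizes a batch of LCP samples. The samples themselves are invented for illustration; in a real page they could come from `largest-contentful-paint` entries read through a `PerformanceObserver`, or from the `onLCP` callback in the web-vitals library. `summarizeLcp` and `spreadMs` are hypothetical names, and the percentile uses the simple nearest-rank method.

```typescript
// Summarize LCP samples (milliseconds) so two page variants can be compared.
// The sample values below are invented for illustration.

interface LcpSummary {
  p50: number;
  p75: number;
  p95: number;
  spreadMs: number; // p95 minus p50: a crude measure of how wide the tail is
}

// Nearest-rank percentile on an array already sorted in ascending order.
function percentile(sortedAsc: number[], p: number): number {
  const rank = Math.max(1, Math.ceil((p / 100) * sortedAsc.length));
  return sortedAsc[rank - 1];
}

function summarizeLcp(samplesMs: number[]): LcpSummary {
  if (samplesMs.length === 0) throw new Error('no LCP samples');
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const p50 = percentile(sorted, 50);
  const p75 = percentile(sorted, 75);
  const p95 = percentile(sorted, 95);
  return { p50, p75, p95, spreadMs: p95 - p50 };
}

// LCP element in the shell: values cluster together.
const shellPage = summarizeLcp([1200, 1250, 1280, 1300, 1350, 1400, 1450, 1500]);

// LCP element behind a Suspense boundary: values track server latency.
const boundaryPage = summarizeLcp([1200, 1300, 1600, 2100, 2600, 3200, 3900, 4100]);

console.log(shellPage); // { p50: 1300, p75: 1400, p95: 1500, spreadMs: 200 }
console.log(boundaryPage); // { p50: 2100, p75: 3200, p95: 4100, spreadMs: 2000 }
```

The 75th percentile is the value Core Web Vitals assessments use, so it is a reasonable number to compare before and after a boundary change. The spread is the extra signal that hints at an LCP element stuck behind slow data.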

Watching LCP in production after changes to Suspense boundaries is essential. A boundary that seemed like an obvious improvement in development — where all database queries complete in under 5ms — can significantly delay LCP in production where the queries take 80ms.

LCP regressions from streaming changes are subtle. They don't always show up in aggregate metrics immediately, because the regression often only affects users on slower connections or in regions with higher server latency. RPAlert alerts on LCP threshold crossings from real browsers within 60 seconds — the kind of segmented, real-user signal that catches these regressions before they appear in a weekly CrUX report. Given that streaming bugs tend to be environment-specific, having a production alert is the difference between catching a regression on Tuesday and noticing it the following Monday.

The mental model that makes streaming work

Streaming is a tool for getting the right content to the browser at the right time. The shell should be fast and contain what users immediately see. Suspense boundaries should contain what's slow, personalized, or below the fold.

When that split is done correctly, streaming genuinely improves perceived performance and often improves LCP. When the LCP element ends up on the wrong side of a Suspense boundary, streaming adds latency without improving anything the user notices.

The questions to ask before adding a Suspense boundary: what is the LCP element on this page, and is it going to be in the shell? If not, restructure the data fetching before restructuring the component tree.
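One way to answer the "is it in the shell?" question empirically is to record the HTML chunks a response delivers and find where the LCP element's markup first shows up. The helper below is a hypothetical sketch: `ReceivedChunk`, `locateLcpMarkup`, and the sample chunks are made up, and it assumes the chunks were captured elsewhere (for example by reading a `fetch` response body and timestamping each read). Network chunk boundaries don't necessarily line up with server flushes, so treat the result as a rough check.

```typescript
// Find the chunk in which a marker (for example the hero image URL) first
// becomes fully visible in the received HTML. Names and data are hypothetical.

interface ReceivedChunk {
  atMs: number; // time since the request started
  html: string; // decoded text of this chunk
}

function locateLcpMarkup(
  chunks: ReceivedChunk[],
  marker: string,
): { index: number; atMs: number } | null {
  let seen = '';
  for (let i = 0; i < chunks.length; i++) {
    // Accumulate text so a marker split across two chunks is still found.
    seen += chunks[i].html;
    if (seen.indexOf(marker) !== -1) {
      return { index: i, atMs: chunks[i].atMs };
    }
  }
  return null; // marker never appeared in the HTML
}

const goodPage: ReceivedChunk[] = [
  { atMs: 45, html: '<header>Shop</header><img src="/img/hero.jpg">' },
  { atMs: 310, html: '<section>reviews</section>' },
];

const badPage: ReceivedChunk[] = [
  { atMs: 45, html: '<header>Shop</header><div class="skeleton"></div>' },
  { atMs: 310, html: '<img src="/img/hero.jpg"><section>reviews</section>' },
];

console.log(locateLcpMarkup(goodPage, '/img/hero.jpg')); // { index: 0, atMs: 45 }
console.log(locateLcpMarkup(badPage, '/img/hero.jpg')); // { index: 1, atMs: 310 }
```

An index of 0 suggests the element shipped with the first chunk. Anything later means the browser could not even discover it until that chunk arrived.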