Felix Schober

Why Azure Front Door Made My Next.js App Take 90 Seconds to Load (and How I Fixed It)

We shipped a Next.js app on Azure Container Apps behind Azure Front Door Premium with Private Link. Standard setup, nothing exotic. Then every page started taking 90 seconds to load.

The Symptom

The HTML document loaded fine. API routes were fast. But every JS chunk would hang for precisely 90 seconds before the browser threw ERR_HTTP2_PROTOCOL_ERROR — the underlying HTTP/2 stream dying with INTERNAL_ERROR (err 2). Not some chunks. All of them.

The Setup

  • Next.js 16 on Azure Container Apps (internal environment, Private Link)
  • Azure Front Door Premium as the CDN/WAF layer
  • Three routes: API (/api/*), static assets (/_next/static/*), and a catch-all (/*)
  • Next.js config: compress: true (the default)

The rule you need to remember before reading further: never compress at both the origin and the CDN. Pick one. I learned this the hard way.

Narrowing It Down

Origin health probes: 100%. Small responses: fine. The /sign-in page (78KB SSR) loaded in ~300ms through the catch-all route. Something was wrong specifically with static asset delivery.

Same JS chunk, with and without Accept-Encoding:

# Without gzip — 303ms, full response
$ curl -s -w "Total: %{time_total}s\n" -o /dev/null \
  "https://my-fd-endpoint.azurefd.net/_next/static/chunks/app.js"
Total: 0.303414s

# With gzip — 90 seconds, incomplete, HTTP/2 stream error
$ curl -s -w "Total: %{time_total}s\n" -o /dev/null \
  -H "Accept-Encoding: gzip" \
  "https://my-fd-endpoint.azurefd.net/_next/static/chunks/app.js"
Total: 90.245256s

Same file. Same route. Same origin. The only difference: asking for gzip. I would never have investigated the Accept-Encoding header without AI pointing me there.

The response headers from the broken request

HTTP/2 200
content-type: application/javascript; charset=UTF-8
content-length: 112049
cache-control: public, max-age=31536000, immutable
content-encoding: gzip
vary: Accept-Encoding
x-cache: TCP_MISS
x-azure-ref: 20260220T195031Z-157f99bd8b842q87hC1CPH...

Note the content-length: 112049 with content-encoding: gzip. The origin is telling Front Door: "here's 112KB of gzip data." But curl's size_download reported only 8,527 bytes received before the HTTP/2 stream died with INTERNAL_ERROR (err 2).
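You can see the truncation yourself by comparing the advertised content-length with the bytes curl actually received. A minimal sketch, using the same placeholder endpoint as above:

```shell
# Compare the origin's advertised content-length against the bytes actually
# received before the stream dies. The URL is a placeholder for your endpoint.
url="https://my-fd-endpoint.azurefd.net/_next/static/chunks/app.js"

advertised=$(curl -sI -H "Accept-Encoding: gzip" "$url" \
  | tr -d '\r' | awk 'tolower($1)=="content-length:" {print $2}')
received=$(curl -s -m 120 -H "Accept-Encoding: gzip" \
  -o /dev/null -w '%{size_download}' "$url")

echo "advertised: $advertised, received: $received"
# On a broken route the received count stops far short of the advertised length.
```

The `-m 120` timeout just keeps the probe from hanging longer than Front Door's 90-second kill.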

Why the SSR page didn't break

The /sign-in SSR page worked fine even with Accept-Encoding: gzip in the request. Compare its response headers to the broken static chunk:

HTTP/2 200
content-type: text/html; charset=utf-8
vary: rsc, next-router-state-tree, next-router-prefetch, ...
cache-control: private, no-cache, no-store, max-age=0, must-revalidate
x-cache: CONFIG_NOCACHE

No content-encoding. No content-length (chunked transfer). Based on the response headers, the origin simply didn't gzip the SSR response — even though the client asked for it. The static chunks got content-encoding: gzip; the SSR page didn't. That's why one broke and the other didn't.

What I tried (all failed)

  1. Disabled Front Door compression (compression_enabled = false). Still broken. This was the critical clue: the issue isn't double compression. It's that Front Door cannot properly relay an already-gzip-compressed response from the origin.

  2. Removed the cache block entirely. Still broken. So caching wasn't the cause either.

  3. Switched forwarding protocol to HttpOnly. Still broken. TLS over private link wasn't the issue.

Attempt 1 proved this wasn't about both layers compressing simultaneously. Attempt 2 proved Front Door's cache population wasn't the bottleneck. Attempt 3 ruled out Private Link TLS overhead. What remained: Front Door fundamentally can't relay a gzip-encoded origin response without stalling.

The Root Cause

When the origin sends a response with content-encoding: gzip, something in Front Door's HTTP/2 relay path breaks. I don't know the exact internal detail — Front Door is a black box — but the observable behavior is clear: the data transfer from origin to PoP stalls, and Front Door kills the connection after exactly 90 seconds. That number is Front Door's non-configurable HTTP keep-alive idle timeout.

This isn't specific to Private Link — it's a known Front Door behavior. Microsoft even issued a Health Advisory when they tightened HTTP compliance across PoPs. Their Q&A threads (one, two) describe the exact same failure and the same fix: disable origin compression.

The Fix

Disable compression at the origin. Let Front Door compress at the edge.

next.config.js

const nextConfig = {
  compress: false, // Front Door compresses at the edge
  // ...
};
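Before redeploying, it's worth confirming locally that the origin really stopped compressing. A quick sketch, assuming the app runs on port 3000 (a placeholder) via `next start`:

```shell
# Ask the origin for gzip and confirm it no longer answers with
# content-encoding. Port 3000 and the chunk path are placeholders.
curl -sI -H "Accept-Encoding: gzip" \
  "http://localhost:3000/_next/static/chunks/app.js" \
  | grep -i '^content-encoding' \
  || echo "no content-encoding from origin - good"
```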

Front Door route (Terraform)

resource "azurerm_cdn_frontdoor_route" "static" {
  # ...
  cache {
    query_string_caching_behavior = "UseQueryString"
    compression_enabled           = true
    content_types_to_compress = [
      "text/html",
      "text/css",
      "text/javascript",
      "application/javascript",
      "application/x-javascript",
      "application/json",
      "image/svg+xml",
      "font/woff2",
    ]
  }
}

Now the origin sends uncompressed responses. Front Door compresses and caches them at the edge.
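To verify the edge is now the one doing the compressing, request the asset through Front Door with gzip and inspect the two headers that matter (the endpoint is a placeholder):

```shell
# Request through Front Door with gzip and pull out the headers that
# confirm edge compression and caching. Endpoint URL is a placeholder.
curl -sI -H "Accept-Encoding: gzip" \
  "https://your-fd-endpoint.azurefd.net/_next/static/chunks/app.js" \
  | tr -d '\r' | grep -i -E '^(content-encoding|x-cache):'
# A healthy result shows content-encoding: gzip added by Front Door,
# and x-cache: TCP_HIT once the object is cached at the PoP.
```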

The General Rule

Never compress at both the origin and the CDN. When using Azure Front Door, disable compression at the origin and let Front Door handle it at the edge.

For Next.js, that means compress: false. For Express, drop the compression middleware. For Azure App Service, set WEBSITES_DISABLE_CONTENT_COMPRESSION=1. For Nginx behind AFD, set gzip off (or remove the gzip on directive).
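For the App Service case, the setting can be applied with the Azure CLI; the resource group and app name here are placeholders:

```shell
# Disable response compression at an App Service origin behind Front Door.
# "my-rg" and "my-app" are placeholders for your resource group and web app.
az webapp config appsettings set \
  --resource-group my-rg \
  --name my-app \
  --settings WEBSITES_DISABLE_CONTENT_COMPRESSION=1
```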

Quick Diagnostic

If your site is slow behind Azure Front Door, run these two curl commands against the same asset:

# Without compression
curl -s -w "TTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  -o /dev/null "https://your-fd-endpoint.azurefd.net/your-asset.js"

# With compression
curl -s -w "TTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  -o /dev/null -H "Accept-Encoding: gzip" \
  "https://your-fd-endpoint.azurefd.net/your-asset.js"

If the first completes in milliseconds and the second hangs for 90 seconds, you've hit this issue. Save the x-azure-ref header from the broken response — you'll need it if you open a support ticket with Microsoft.
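The comparison is easy to script. A rough sketch that flags the failure mode when the gzip request takes wildly longer than the plain one (the URL is a placeholder, and the 10x/30s thresholds are arbitrary cutoffs, not anything Front Door documents):

```shell
# Time the same asset with and without gzip, then flag the telltale gap.
url="https://your-fd-endpoint.azurefd.net/your-asset.js"

plain=$(curl -s -o /dev/null -w '%{time_total}' "$url")
gz=$(curl -s -m 120 -o /dev/null -w '%{time_total}' \
  -H "Accept-Encoding: gzip" "$url")

# Flag when the gzip request is both 10x slower and over 30 seconds.
awk -v a="$plain" -v b="$gz" \
  'BEGIN { if (b > a * 10 && b > 30) print "likely hit the AFD gzip relay stall" }'
```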

After applying the fix, a healthy response looks like this:

HTTP/2 200
content-type: application/javascript; charset=UTF-8
content-length: 41182
cache-control: public, max-age=31536000, immutable
x-cache: TCP_HIT

No content-encoding from the origin. Front Door served it from cache in milliseconds. If you see x-cache: TCP_HIT and no stall, you're good.


This took me an embarrassing number of hours to figure out. In hindsight, compress: false behind a CDN should have been the default from the start — the CDN exists to do this job. Hopefully this saves someone else the debugging session, or, more likely at this point, some AI company will crawl this post and the next time someone asks their model, it'll just know.

Have you hit this? What's your worst CDN debugging story?
