Muhammad Ahsan

When One Line of Code Works Everywhere - Except Behind CloudFront

How a Single CDN Addition Broke Our App, and What It Taught Us About the Invisible Contract Between Infrastructure and Code


Your code doesn't run in a vacuum. It runs on infrastructure. And infrastructure has opinions.


The Setup That Should Have Been Identical

We had a production-grade AI chatbot platform: a Next.js dashboard backed by a FastAPI server. Three AWS deployments, same codebase, same Docker image, same CI/CD pipeline.

Server A and Server B worked flawlessly. Users could open a chat, send a message, and get an AI response. Simple.

Server C did something different. When a user sent the very first message in a new chat:

  • The screen blinked
  • The message vanished
  • No response appeared
  • The network tab showed a cancelled request

But here's the maddening part: if the user refreshed the page, the chat appeared in the sidebar. The backend had received and processed the message. The data was there. The UI just... self-destructed before it could render.

Same code. Same image. Three servers. One broken.


The Debugging Spiral

We started where any engineer would: the code.

We checked the API layer. Intact. We checked the Redux store. Correct. We checked the authentication interceptors. No rogue logouts. We added debug logging to every layer: component mount/unmount, axios interceptors, Redux dispatches.

The logs told us something strange: the entire React component tree was unmounting and remounting during the first message send. Not a re-render. A full destruction and reconstruction. The Redux store was being wiped clean mid-API-call.

This wasn't a React bug. This was a navigation event.


The Invisible Culprit

Here's the code that worked perfectly on two servers and catastrophically failed on the third:

// ChatBlank.tsx: the "new chat" screen
const handleSend = async (text: string) => {
  // 1. Create optimistic thread in Redux
  dispatch(addThread(optimisticThread));
  dispatch(setCurrentThread(tempThreadId));

  // 2. Navigate to the thread view
  router.replace(`/dashboard/chat?threadKey=${tempThreadId}`);

  // 3. Fire the API call
  const response = await chatApi.sendMessage({ question: text, ... });

  // 4. Update with real data
  dispatch(replaceThreadId({ oldId: tempThreadId, newThread: response }));
};

On the surface, this is textbook optimistic UI. Create a temporary thread, navigate to it, fire the API, swap in real data when it arrives. And on Servers A and B, it worked exactly like that.

Server C had one infrastructure difference: CloudFront sat in front of Nginx.

Server A: Browser → Nginx → Next.js     ✅
Server B: Browser → Nginx → Next.js     ✅
Server C: Browser → CloudFront → Nginx → Next.js  💥

The Chain Reaction

Here's what router.replace() actually does under the hood in Next.js App Router (v14+):

  1. It sends an HTTP request to the server with special headers: RSC: 1, Next-Router-State-Tree, Next-Url
  2. The server responds with a lightweight RSC payload (text/x-component), not full HTML
  3. React reconciles the diff and updates the DOM in-place

This is the React Server Components flight protocol. It's elegant and fast when those headers reach the server intact.

CloudFront's default Origin Request Policy (Managed-HostHeaderOnly) forwards exactly one header to the origin: Host. Every other header, including RSC: 1, gets silently stripped.

So when router.replace() fired on Server C:

  1. Browser sends: GET /dashboard/chat?threadKey=abc with RSC: 1
  2. CloudFront strips RSC: 1, forwards a plain GET
  3. Next.js sees a normal page request, responds with full HTML (text/html)
  4. The browser receives HTML where it expected an RSC payload
  5. Next.js client-side router falls back to a hard navigation: a full page reload
  6. The Redux store is destroyed
  7. The in-flight sendMessage API call is cancelled
  8. The page remounts with a blank slate

The message reached the backend (it was already in-flight before the reload), but the response had nowhere to land. The component that dispatched it no longer existed.
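
The chain above can be modeled in a few lines. This is a toy simulation under stated assumptions, not Next.js or CloudFront code: `forwardThroughCdn`, `originContentType`, and `navigationMode` are hypothetical names, and the allow-list stands in for whatever header-forwarding policy the CDN applies.

```typescript
// Toy model of the chain reaction. All names are hypothetical;
// nothing here calls Next.js or CloudFront.
type HeaderMap = Record<string, string>;

// A CDN that forwards only allow-listed headers (compared case-insensitively).
function forwardThroughCdn(viewer: HeaderMap, allowList: string[]): HeaderMap {
  const allowed = new Set(allowList.map((name) => name.toLowerCase()));
  const forwarded: HeaderMap = {};
  for (const [name, value] of Object.entries(viewer)) {
    if (allowed.has(name.toLowerCase())) forwarded[name] = value;
  }
  return forwarded;
}

// The origin answers with an RSC payload only if it sees the RSC header.
function originContentType(received: HeaderMap): string {
  const sawRsc = Object.keys(received).some((name) => name.toLowerCase() === "rsc");
  return sawRsc ? "text/x-component" : "text/html";
}

// The client router expected an RSC payload; anything else means a full reload.
function navigationMode(contentType: string): "soft" | "hard" {
  return contentType.startsWith("text/x-component") ? "soft" : "hard";
}

const viewerHeaders: HeaderMap = { Host: "example.com", RSC: "1" };

// Servers A and B: nothing between the browser and the origin strips headers.
const direct = navigationMode(originContentType(viewerHeaders));

// Server C: only Host survives the CDN.
const viaCdn = navigationMode(
  originContentType(forwardThroughCdn(viewerHeaders, ["Host"]))
);

console.log({ direct, viaCdn });
```

In this model the direct path yields a soft navigation and the CDN path yields a hard one, which is the difference between Servers A/B and Server C.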


The Real Lesson: Infrastructure Is a Runtime Dependency

The fix wasn't a one-line change. It required rethinking the assumption that was baked into the code:

"Client-side navigation is always a lightweight, in-place update."

That assumption holds when your infrastructure is transparent to the framework's protocol. The moment a CDN, proxy, or WAF modifies headers, the contract breaks, and it breaks silently. No error. No warning. Just a fallback behavior that looks like a bug in your code.

The Fix: Decouple Navigation from API Calls

We moved the API call out of the component lifecycle entirely, into a Redux thunk. And we replaced router.replace() with window.history.replaceState(), a browser-native API that updates the URL without triggering any server request.

// ChatBlank.tsx: after fix
const handleSend = async (text: string) => {
  // 1. Create optimistic thread in Redux
  dispatch(addThread(optimisticThread));
  dispatch(setCurrentThread(tempThreadId));

  // 2. NO router.replace(): let React re-render via Redux state change
  // 3. Fire API call at Redux level (survives component unmount)
  dispatch(sendFirstMessage({ text, model, user, tempThreadId }));
};

// page.tsx: view switches on Redux state, not URL
const activeThreadId = threadKey || currentThreadId;

if (composeMode || (!threadKey && !currentThreadId)) {
  return <ChatBlank />;
}
return <ChatThread threadId={activeThreadId} />;

// thunks/chat.ts: API call + URL update happen at store level
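const sendFirstMessage = (params) => async (dispatch) => {
  const { tempThreadId } = params;
  const response = await chatApi.sendMessage({ ... });

  // Swap the optimistic thread for the real one returned by the API
  dispatch(replaceThreadId({ oldId: tempThreadId, newThread: response }));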
  dispatch(setCurrentThread(response.external_chat_id));

  // Update URL only after API completes, using browser API (no RSC fetch)
  window.history.replaceState(
    null, '', 
    `/dashboard/chat?threadKey=${response.external_chat_id}`
  );
};

What changed:

| Before | After |
| --- | --- |
| router.replace() triggers RSC fetch | window.history.replaceState() makes no server request |
| API call lives in component (dies on unmount) | API call lives in Redux thunk (survives any re-render) |
| View switches based on URL | View switches based on Redux state |
| URL updates reactively via useEffect | URL updates imperatively after API success |
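
One way to keep that ordering honest is to make it testable. The sketch below is a hypothetical, dependency-injected variant of the thunk, not the code we shipped: `makeSendFirstMessage`, `Deps`, and the action shapes are invented for illustration, and `replaceUrl` stands in for a thin wrapper around window.history.replaceState().

```typescript
// Sketch only: names and action shapes are assumptions for illustration.
interface ChatResponse {
  external_chat_id: string;
}

interface Deps {
  sendMessage: (question: string) => Promise<ChatResponse>;
  // In the browser this would wrap window.history.replaceState(null, "", url).
  replaceUrl: (url: string) => void;
}

type Action =
  | { type: "replaceThreadId"; oldId: string; newId: string }
  | { type: "setCurrentThread"; id: string };

function makeSendFirstMessage(deps: Deps) {
  return (text: string, tempThreadId: string) =>
    async (dispatch: (action: Action) => void): Promise<void> => {
      const response = await deps.sendMessage(text);
      const realId = response.external_chat_id;

      dispatch({ type: "replaceThreadId", oldId: tempThreadId, newId: realId });
      dispatch({ type: "setCurrentThread", id: realId });

      // The URL changes only after the API succeeds, and never through the router.
      deps.replaceUrl(`/dashboard/chat?threadKey=${encodeURIComponent(realId)}`);
    };
}

// Fakes standing in for the real API client and the History API.
const actions: Action[] = [];
const urls: string[] = [];

const sendFirstMessage = makeSendFirstMessage({
  sendMessage: async () => ({ external_chat_id: "chat-123" }),
  replaceUrl: (url) => urls.push(url),
});

// Calling the thunk starts the request; nothing touches the URL yet.
const done = sendFirstMessage("hello", "temp-1")((action) => actions.push(action));
```

Because nothing in the thunk references a component or the Next.js router, a remount while the request is in flight cannot cancel it or lose its result.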

The Broader Takeaway

This bug didn't appear in any test suite. It couldn't: it only manifested when a CDN sat between the browser and the server. The code was correct. The infrastructure was correctly configured for a traditional web app. But Next.js App Router isn't a traditional web app. It has an implicit runtime protocol (RSC flight) that requires specific headers to pass through every layer of your infrastructure.

Three rules we now follow:

1. Treat your deployment topology as a code dependency.
If your code assumes router.push() does a lightweight RSC fetch, your infrastructure must forward RSC headers. Document this. Test this. Don't discover it in production.

2. Never couple API calls to navigation events.
If a user action requires both "navigate somewhere" and "call an API," those should be independent operations. Navigation can fail, be intercepted, or behave differently across environments. Your API call shouldn't be collateral damage.

3. The same Docker image on different infrastructure is NOT the same deployment.
A CDN, a WAF, a reverse proxy: each one is a participant in your application's runtime behavior. "Works on my server" is the new "works on my machine."
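
Rule 1 says "test this", so here is a minimal sketch of a per-deployment smoke check. It assumes the origin behaves as described earlier (an RSC payload when it sees RSC: 1, full HTML otherwise). `rscPassesThrough` and `FetchLike` are hypothetical names; `FetchLike` is a deliberately small stand-in for the Fetch API so the check can be exercised with fakes.

```typescript
// Hypothetical smoke check: does an RSC request survive the CDN/proxy layers?
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ headers: { get(name: string): string | null } }>;

async function rscPassesThrough(url: string, fetchImpl: FetchLike): Promise<boolean> {
  const response = await fetchImpl(url, { headers: { RSC: "1" } });
  const contentType = response.headers.get("content-type") ?? "";
  // text/html here suggests something upstream dropped the header.
  return contentType.includes("text/x-component");
}

// Fakes modelling a transparent path and a header-stripping path.
const fakeFetch = (stripHeaders: boolean): FetchLike => async (_url, init) => {
  const sawRsc = !stripHeaders && init.headers["RSC"] === "1";
  const contentType = sawRsc ? "text/x-component" : "text/html; charset=utf-8";
  return {
    headers: {
      get: (name) => (name.toLowerCase() === "content-type" ? contentType : null),
    },
  };
};

const transparent = rscPassesThrough("https://example.com/dashboard/chat", fakeFetch(false));
const stripped = rscPassesThrough("https://example.com/dashboard/chat", fakeFetch(true));
```

Against a real deployment, the platform's own fetch could be passed as `fetchImpl` and the check run once per server in CI. Edge caching can mask the result, so treat a pass as evidence rather than proof.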


The Humbling Part

The CloudFront distribution was added to Server C as a performance optimization: faster static asset delivery, edge caching, DDoS protection. A routine infrastructure improvement. No one thought to check whether the CDN would interfere with a framework-level protocol that operates over HTTP headers.

It took a few hours of debugging across application code, Redux state, authentication flows, and network traces before we even looked at the infrastructure layer. The fix was ultimately ~40 lines of code. The diagnosis was the hard part.

Sometimes the most dangerous bugs aren't in the code you write. They're in the assumptions your code makes about the world it runs in.
