How a Single CDN Addition Broke Our App, and What It Taught Us About the Invisible Contract Between Infrastructure and Code
Your code doesn't run in a vacuum. It runs on infrastructure. And infrastructure has opinions.
The Setup That Should Have Been Identical
We had a production-grade AI chatbot platform: a Next.js dashboard backed by a FastAPI server. Three AWS deployments, same codebase, same Docker image, same CI/CD pipeline.
Server A and Server B worked flawlessly. Users could open a chat, send a message, and get an AI response. Simple.
Server C did something different. When a user sent the very first message in a new chat:
- The screen blinked
- The message vanished
- No response appeared
- The network tab showed a cancelled request
But here's the maddening part: if the user refreshed the page, the chat appeared in the sidebar. The backend had received and processed the message. The data was there. The UI just... self-destructed before it could render.
Same code. Same image. Three servers. One broken.
The Debugging Spiral
We started where any engineer would: the code.
We checked the API layer. Intact. We checked the Redux store. Correct. We checked the authentication interceptors. No rogue logouts. We added debug logging to every layer: component mount/unmount, axios interceptors, Redux dispatches.
The logs told us something strange: the entire React component tree was unmounting and remounting during the first message send. Not a re-render. A full destruction and reconstruction. The Redux store was being wiped clean mid-API-call.
This wasn't a React bug. This was a navigation event.
The Invisible Culprit
Here's the code that worked perfectly on two servers and catastrophically failed on the third:
// ChatBlank.tsx: the "new chat" screen
const handleSend = async (text: string) => {
  // 1. Create optimistic thread in Redux
  dispatch(addThread(optimisticThread));
  dispatch(setCurrentThread(tempThreadId));
  // 2. Navigate to the thread view
  router.replace(`/dashboard/chat?threadKey=${tempThreadId}`);
  // 3. Fire the API call
  const response = await chatApi.sendMessage({ question: text, ... });
  // 4. Update with real data
  dispatch(replaceThreadId({ oldId: tempThreadId, newThread: response }));
};
On the surface, this is textbook optimistic UI. Create a temporary thread, navigate to it, fire the API, swap in real data when it arrives. And on Servers A and B, it worked exactly like that.
Server C had one infrastructure difference: CloudFront sat in front of Nginx.
Server A: Browser → Nginx → Next.js ✅
Server B: Browser → Nginx → Next.js ✅
Server C: Browser → CloudFront → Nginx → Next.js 💥
The Chain Reaction
Here's what router.replace() actually does under the hood in Next.js App Router (v14+):
- It sends an HTTP request to the server with special headers: RSC: 1, Next-Router-State-Tree, Next-Url
- The server responds with a lightweight RSC payload (text/x-component), not full HTML
- React reconciles the diff and updates the DOM in-place
This is the React Server Components flight protocol. It's elegant and fast when those headers reach the server intact.
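One way to observe this exchange outside the browser is to send the RSC header by hand and look at the Content-Type that comes back. The sketch below is a minimal illustration, not part of our codebase: `classifyContentType` and `probeRsc` are hypothetical helper names, and it assumes that `RSC: 1` alone is enough for the server to answer with a flight payload (some Next.js versions may expect additional headers).

```typescript
// Sketch: ask a deployment for an RSC payload and report what came back.
// `classifyContentType` and `probeRsc` are hypothetical helper names.

type ProbeResult = "rsc" | "html" | "other";

// Map a Content-Type header value onto the two cases described above.
function classifyContentType(contentType: string | null): ProbeResult {
  const value = (contentType ?? "").toLowerCase();
  if (value.startsWith("text/x-component")) return "rsc";
  if (value.startsWith("text/html")) return "html";
  return "other";
}

// Minimal shape of the fetch function we rely on, so a fake can be injected.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ headers: { get(name: string): string | null } }>;

// Send a GET with the RSC header and classify the response.
async function probeRsc(url: string, fetchImpl: FetchLike): Promise<ProbeResult> {
  const response = await fetchImpl(url, { headers: { RSC: "1" } });
  return classifyContentType(response.headers.get("content-type"));
}
```

Calling `probeRsc(url, fetch)` against each deployment should return `"rsc"` when the header reaches Next.js; an `"html"` result suggests something in between rewrote the request.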
CloudFront's default Origin Request Policy (Managed-HostHeaderOnly) forwards exactly one header to the origin: Host. Every other header, including RSC: 1, gets silently stripped.
So when router.replace() fired on Server C:
- Browser sends: GET /dashboard/chat?threadKey=abc with RSC: 1
- CloudFront strips RSC: 1, forwards a plain GET
- Next.js sees a normal page request, responds with full HTML (text/html)
- The browser receives HTML where it expected an RSC payload
- Next.js client-side router falls back to a hard navigation: a full page reload
- The Redux store is destroyed
- The in-flight sendMessage API call is cancelled
- The page remounts with a blank slate
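The chain above can be condensed into a toy model. This is a simplified simulation under stated assumptions, not CloudFront's or Next.js's real logic: every function name is hypothetical, the proxy is reduced to a header allow-list, and the origin is reduced to "RSC header present or not".

```typescript
// Toy model of the chain reaction. All names here are hypothetical.

type HeaderMap = Record<string, string>;

// A proxy that forwards only allow-listed headers (case-insensitive match).
function forwardHeaders(incoming: HeaderMap, allowList: string[]): HeaderMap {
  const allowed = new Set(allowList.map((name) => name.toLowerCase()));
  const forwarded: HeaderMap = {};
  for (const [name, value] of Object.entries(incoming)) {
    if (allowed.has(name.toLowerCase())) forwarded[name] = value;
  }
  return forwarded;
}

// The origin picks a response type based on whether the RSC header arrived.
function originContentType(received: HeaderMap): string {
  const hasRsc = Object.keys(received).some((name) => name.toLowerCase() === "rsc");
  return hasRsc ? "text/x-component" : "text/html";
}

// The client router expected an RSC payload; anything else forces a reload.
function clientOutcome(contentType: string): "soft-navigation" | "hard-reload" {
  return contentType === "text/x-component" ? "soft-navigation" : "hard-reload";
}

const browserHeaders: HeaderMap = { Host: "example.com", RSC: "1" };

// Servers A and B: nothing on the path strips the header.
const direct = clientOutcome(originContentType(browserHeaders));

// Server C: only Host survives the CDN hop.
const viaCdn = clientOutcome(
  originContentType(forwardHeaders(browserHeaders, ["Host"]))
);

console.log({ direct, viaCdn });
```

The only difference between the two paths is the allow-list, which is the point: the application code is identical in both.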
The message reached the backend (it was already in-flight before the reload), but the response had nowhere to land. The component that dispatched it no longer existed.
The Real Lesson: Infrastructure Is a Runtime Dependency
The fix wasn't a one-line change. It required rethinking the assumption that was baked into the code:
"Client-side navigation is always a lightweight, in-place update."
That assumption holds when your infrastructure is transparent to the framework's protocol. The moment a CDN, proxy, or WAF modifies headers, the contract breaks, and it breaks silently. No error. No warning. Just a fallback behavior that looks like a bug in your code.
The Fix: Decouple Navigation from API Calls
We moved the API call out of the component lifecycle entirely, into a Redux thunk. And we replaced router.replace() with window.history.replaceState(), a browser-native API that updates the URL without triggering any server request.
// ChatBlank.tsx: after the fix
const handleSend = async (text: string) => {
  // 1. Create optimistic thread in Redux
  dispatch(addThread(optimisticThread));
  dispatch(setCurrentThread(tempThreadId));
  // 2. NO router.replace(): let React re-render via the Redux state change
  // 3. Fire the API call at the Redux level (survives component unmount)
  dispatch(sendFirstMessage({ text, model, user, tempThreadId }));
};

// page.tsx: view switches on Redux state, not URL
const activeThreadId = threadKey || currentThreadId;
if (composeMode || (!threadKey && !currentThreadId)) {
  return <ChatBlank />;
}
return <ChatThread threadId={activeThreadId} />;

// thunks/chat.ts: API call + URL update happen at the store level
const sendFirstMessage = (params) => async (dispatch) => {
  const { tempThreadId } = params;
  const response = await chatApi.sendMessage({ ... });
  // Swap the optimistic thread for the real one returned by the API
  dispatch(replaceThreadId({ oldId: tempThreadId, newThread: response }));
  dispatch(setCurrentThread(response.external_chat_id));
  // Update URL only after API completes, using browser API (no RSC fetch)
  window.history.replaceState(
    null, '',
    `/dashboard/chat?threadKey=${response.external_chat_id}`
  );
};
What changed:
| Before | After |
|---|---|
| router.replace() triggers RSC fetch | window.history.replaceState(): no server request |
| API call lives in component (dies on unmount) | API call lives in Redux thunk (survives any re-render) |
| View switches based on URL | View switches based on Redux state |
| URL updates reactively via useEffect | URL updates imperatively after API success |
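The URL update in the thunk interpolates the thread ID straight into a query string and touches `window` directly. A slightly more defensive variant might look like the sketch below. `buildThreadUrl`, `replaceUrl`, and `HistoryLike` are hypothetical names introduced for illustration, and the `/dashboard/chat` default simply mirrors the path used in this post.

```typescript
// Sketch: build the thread URL safely and update the address bar without a
// router-level navigation. All names here are hypothetical.

// Only the part of the History API this sketch needs, so tests can pass a fake.
interface HistoryLike {
  replaceState(data: unknown, unused: string, url?: string | null): void;
}

// URLSearchParams percent-encodes the value, so IDs containing "&" or "#"
// cannot break out of the query string.
function buildThreadUrl(threadKey: string, basePath = "/dashboard/chat"): string {
  const query = new URLSearchParams({ threadKey });
  return `${basePath}?${query.toString()}`;
}

// Returns false when there is no history object (for example during SSR).
function replaceUrl(history: HistoryLike | undefined, url: string): boolean {
  if (!history) return false;
  history.replaceState(null, "", url);
  return true;
}
```

In the thunk this could be called as `replaceUrl(typeof window === "undefined" ? undefined : window.history, buildThreadUrl(response.external_chat_id))`. Accepting the history object as a parameter keeps the helper testable without a browser.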
The Broader Takeaway
This bug didn't appear in any test suite. It couldn't: it only manifested when a CDN sat between the browser and the server. The code was correct. The infrastructure was correctly configured for a traditional web app. But Next.js App Router isn't a traditional web app. It has an implicit runtime protocol (RSC flight) that requires specific headers to pass through every layer of your infrastructure.
Three rules we now follow:
1. Treat your deployment topology as a code dependency.
If your code assumes router.push() does a lightweight RSC fetch, your infrastructure must forward RSC headers. Document this. Test this. Don't discover it in production.
2. Never couple API calls to navigation events.
If a user action requires both "navigate somewhere" and "call an API," those should be independent operations. Navigation can fail, be intercepted, or behave differently across environments. Your API call shouldn't be collateral damage.
3. The same Docker image on different infrastructure is NOT the same deployment.
A CDN, a WAF, a reverse proxy: each one is a participant in your application's runtime behavior. "Works on my server" is the new "works on my machine."
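Rule 2 can be illustrated without Redux. The sketch below is a framework-free stand-in, not our production thunk: `createChatStore` and its members are hypothetical names, and `send` stands in for whatever function performs the API call. The idea is that the store, not the component, awaits the request, so a component unmounting mid-request does not lose the reply.

```typescript
// Sketch of rule 2: the reply lands in a long-lived store, not in the
// component that started the request. All names here are hypothetical.

interface ChatState {
  status: "idle" | "sending" | "done";
  reply: string | null;
}

type Listener = (state: ChatState) => void;

function createChatStore(send: (text: string) => Promise<string>) {
  let state: ChatState = { status: "idle", reply: null };
  const listeners = new Set<Listener>();

  // Replace the state and notify whoever is currently subscribed.
  const setState = (next: ChatState) => {
    state = next;
    listeners.forEach((listener) => listener(state));
  };

  return {
    getState: () => state,
    // Returns an unsubscribe function, like a component cleanup.
    subscribe(listener: Listener) {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    // The store awaits the request; no component holds the promise.
    async sendFirstMessage(text: string) {
      setState({ status: "sending", reply: null });
      const reply = await send(text);
      setState({ status: "done", reply });
    },
  };
}
```

A component would subscribe on mount and unsubscribe on unmount, while `getState()` still reflects the reply once it arrives. This protects against unmounts and re-renders only; a full page reload tears down the store as well, which is why the fix also had to avoid the navigation that triggered the reload.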
The Humbling Part
The CloudFront distribution was added to Server C as a performance optimization: faster static asset delivery, edge caching, DDoS protection. A routine infrastructure improvement. No one thought to check whether the CDN would interfere with a framework-level protocol that operates over HTTP headers.
It took a few hours of debugging across application code, Redux state, authentication flows, and network traces before we even looked at the infrastructure layer. The fix was ultimately ~40 lines of code. The diagnosis was the hard part.
Sometimes the most dangerous bugs aren't in the code you write. They're in the assumptions your code makes about the world it runs in.