We Learn From Mistakes: A Real-World CI/CD Debugging Story
When you're a developer working with modern stacks like Strapi, Next.js, and Docker, setting up a CI/CD pipeline feels like a milestone. But sometimes, that smooth automation dream turns into a long night of debugging. This is the story of how I chased down a strange API error—and what I learned along the way.
⚙️ The Setup
Our stack:
- Strapi backend
- Next.js frontend
- Dockerized deployment
- CI/CD pipeline for auto-deployments
- Self-hosted Linux server
Everything ran smoothly in local development, and even on Vercel (for testing the frontend alone). But once deployed on our self-hosted server, the app broke.
🧨 The Error
Next.js was throwing a cryptic error when trying to fetch data from Strapi:
API or validation error: TypeError: fetch failed
at async s (.next/server/chunks/507.js:1:806)
at async m (.next/server/app/categories/page.js:1:747) {
[cause]: [Error [ConnectTimeoutError]: Connect Timeout Error] {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
It didn't make sense. Why would fetch fail only on the deployed version and not on local or Vercel?
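One detail worth pulling out of that trace is the code on `cause`, which hints at the layer where the request failed. Below is a minimal sketch of a helper that reads it. The name `describeFetchFailure` and the hint wording are my own. `UND_ERR_CONNECT_TIMEOUT` comes from the trace above (it is raised by undici, the HTTP client behind Node's `fetch`). The other two codes are standard Node network error codes added for contrast, and they did not appear in my logs.

```typescript
// Hypothetical helper: maps the low-level code on `error.cause` to a debugging hint.
export type FetchFailure = { code?: string; hint: string };

export function describeFetchFailure(err: unknown): FetchFailure {
  // Node's fetch reports "fetch failed" and puts the underlying network error on `cause`.
  const cause = (err as { cause?: { code?: unknown } } | null | undefined)?.cause;
  const code = typeof cause?.code === "string" ? cause.code : undefined;

  switch (code) {
    case "UND_ERR_CONNECT_TIMEOUT":
      // The code from the trace above: the connection was never established.
      return { code, hint: "connection attempt timed out; the target may be unreachable or packets are being dropped" };
    case "ENOTFOUND":
      // Not seen in this story; included for contrast.
      return { code, hint: "hostname did not resolve; check DNS or the service name" };
    case "ECONNREFUSED":
      // Not seen in this story; included for contrast.
      return { code, hint: "host was reached but nothing accepted the connection on that port" };
    default:
      return { code, hint: "no recognized network error code on error.cause" };
  }
}
```

Calling this from the `catch` block around the fetch and logging the result makes it easier to tell a timeout apart from a name that fails to resolve.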
🔍 The Investigation
Here's how I tackled it step-by-step:
✅ Step 1: Reproduce Locally
I ran the same Next.js app with Strapi in Docker locally and everything worked fine—even when using Postman or curl to test the API. So, it wasn't a code issue.
🧱 Step 2: Check Docker Networking
Maybe Docker containers couldn't talk to each other? I made sure both services were on the same Docker network and updated API calls to use:
http://strapi:1337
But the error persisted. 😩
🔎 Step 3: Look Into the Build Artifacts
Next.js stores route logic inside .next/server/app. I inspected .next/server/app/categories/page.js to see if something was off. But everything seemed normal.
⏱️ Step 4: Increase Timeouts and CORS Configs
I suspected timeout issues and CORS misconfigurations. So I:
- Increased the fetch timeout in Next.js
- Rechecked Strapi's middleware.js for proper CORS headers
Still nothing.
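For reference, an explicit per-request timeout can be sketched with an `AbortController`. This is not my exact code, and the helper name `fetchWithTimeout` and its injectable `fetchImpl` parameter are assumptions made for illustration. It caps how long a single request may take overall, so it could not help in my case, where the connection itself was never established.

```typescript
// Hypothetical helper: aborts the request if it has not completed within `timeoutMs`.
export async function fetchWithTimeout(
  url: string,
  timeoutMs: number,
  fetchImpl: typeof fetch = fetch, // injectable so the helper can be exercised without a network
): Promise<Response> {
  const controller = new AbortController();
  // Abort the in-flight request once the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

A call such as `fetchWithTimeout("http://strapi:1337", 15000)` rejects if no response arrives within 15 seconds.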
🧪 Step 5: Strip Docker from the Equation
Out of desperation, I ran the entire stack without Docker on the same self-hosted server. And guess what? It worked.
🎯 The Real Culprit: Network Restrictions
This narrowed it down: the issue wasn't with the application code, the Docker images, or the CI/CD pipeline. It was a network access issue on the host server. Specifically, the server had restrictions that blocked container-to-container communication or DNS resolution for internal Docker service names.
🛠️ The Fix
The solution was simple once I understood the problem:
✅ I configured Strapi with a real domain name, like https://api.myapp.com, and updated the frontend to fetch from that instead of http://strapi:1337.
After that, I redeployed using the same Docker + CI/CD pipeline—and everything worked. 🥳
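To make this kind of switch a configuration change rather than a code change, the base URL can be read from the environment. Here is a minimal sketch, assuming a variable named `STRAPI_API_URL`. That name, the helper names, and the `/api/categories` path in the usage note are illustrative and not taken from my actual setup.

```typescript
// Hypothetical variable name: the frontend reads the Strapi base URL from here.
const ENV_KEY = "STRAPI_API_URL";

type Env = Record<string, string | undefined>;

export function strapiBaseUrl(env: Env): string {
  const raw = env[ENV_KEY];
  if (!raw) throw new Error(`${ENV_KEY} is not set`);

  const url = new URL(raw); // throws on a malformed URL
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`${ENV_KEY} must use http or https`);
  }
  // Drop trailing slashes so callers can append paths safely.
  return url.origin + url.pathname.replace(/\/+$/, "");
}

export function strapiEndpoint(env: Env, path: string): string {
  // Strip leading slashes so the result has exactly one separator.
  return `${strapiBaseUrl(env)}/${path.replace(/^\/+/, "")}`;
}
```

With `STRAPI_API_URL=https://api.myapp.com`, calling `strapiEndpoint(process.env, "/api/categories")` yields `https://api.myapp.com/api/categories`, and pointing the variable back at `http://strapi:1337` needs no code change.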
🔚 Summary
| Problem | Solution |
|---|---|
| fetch failed error with timeout on deployed Next.js | Diagnosed as a network-level issue on self-hosted server |
| Docker service name (strapi) didn't resolve | Replaced with real domain name |
| CI/CD + Docker setup was fine | But host server's networking caused the fetch to fail |
📝 Key Takeaways
- Docker networking can behave differently across environments. What works locally might not work in production.
- Don't overlook your host's firewall, DNS, or networking rules.
- Real domain names can often solve connectivity issues, especially across containers and services.
- Debugging isn't always about code—sometimes, it's about the infrastructure.
Hope this saves someone else a few hours (or days) of head-scratching!
Let me know if you've hit similar issues—I’d love to hear how you solved them.