You've spun up two containers on a custom bridge network. DNS works. Ping works. But curl to your application returns "Connection refused" or just hangs. I've debugged this exact scenario a dozen times across ML inference APIs talking to Redis, FastAPI services querying vector databases, and monitoring sidecars trying to scrape metrics.
The problem isn't networking — it's that your application isn't actually listening where you think it is.
## Why ping works but HTTP doesn't
When you `ping redis` from the app container, Docker's embedded DNS resolver translates that name to the container's IP on the bridge network. ICMP packets flow through without issue because ping operates at the network layer. No ports, no listeners, just "is this IP reachable?"
HTTP requires a process actively listening on a specific port. If your application binds to `127.0.0.1:8000` instead of `0.0.0.0:8000`, it only accepts connections from localhost inside that container. Traffic from another container hits the network interface, finds nothing listening, and the kernel sends back a TCP RST.
Here's what actually happens when you run `curl http://app:8000` from the redis container:

1. DNS resolves `app` to something like `172.18.0.2`
2. A TCP SYN packet travels to that IP on port 8000
3. If the app is bound to `127.0.0.1:8000`, the kernel checks: "Is there a socket listening on `172.18.0.2:8000`?" Answer: no.
4. The kernel replies with RST (connection refused) or drops the packet (timeout)
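You can reproduce that last step with a few lines of Python. This is a standalone sketch on loopback, not tied to Docker: connecting to a port with no listening socket gets the kernel's RST, which Python surfaces as `ConnectionRefusedError` — the same failure curl reports as "Connection refused".

```python
import socket

# Grab a free loopback port by binding a listener, then close it so
# nothing is listening there anymore.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))        # port 0 = kernel picks a free port
port = probe.getsockname()[1]
probe.close()

# The SYN arrives, no socket matches, the kernel answers with RST.
try:
    socket.create_connection(("127.0.0.1", port), timeout=2)
    result = "connected"
except ConnectionRefusedError:
    result = "refused"
print(result)   # -> refused
```

A timeout instead of an immediate error usually means something (a firewall, a dropped route) is discarding the packet rather than refusing it.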
## Verify what your application is actually bound to
Exec into your app container and check what's listening:
```shell
docker exec -it app netstat -tlnp
```
You'll see output like:
```
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:8000          0.0.0.0:*               LISTEN      1/python
```
That `127.0.0.1:8000` is your problem. The application is only reachable from inside its own container. You need `0.0.0.0:8000`:
```
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      1/python
```
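Minimal images (alpine, slim) often ship without netstat or ss. If the container has Python, a rough fallback is to parse `/proc/net/tcp` directly. A hedged sketch, Linux and IPv4 only — addresses there are little-endian hex:

```python
def listening_sockets(proc_net_tcp: str):
    """Parse /proc/net/tcp text, return (ip, port) pairs in LISTEN state.

    Local addresses are little-endian hex: 0100007F:1F40 is
    127.0.0.1:8000. State 0A is TCP_LISTEN. IPv6 listeners live in
    /proc/net/tcp6 and need separate handling.
    """
    listeners = []
    for line in proc_net_tcp.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != "0A":
            continue
        hex_ip, hex_port = fields[1].split(":")
        octets = [str(int(hex_ip[i:i + 2], 16)) for i in (6, 4, 2, 0)]
        listeners.append((".".join(octets), int(hex_port, 16)))
    return listeners

sample = (
    "  sl  local_address rem_address   st\n"
    "   0: 0100007F:1F40 00000000:0000 0A\n"
)
print(listening_sockets(sample))   # -> [('127.0.0.1', 8000)]
```

Inside the container you'd feed it `open("/proc/net/tcp").read()` instead of the sample text.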
If you're running a FastAPI app with Uvicorn, the default host is `127.0.0.1`. You must explicitly set it:
```python
# main.py
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "healthy"}

if __name__ == "__main__":
    # This will NOT work for inter-container communication:
    # uvicorn.run(app, host="127.0.0.1", port=8000)

    # This binds to all interfaces:
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
Flask, Django's runserver, and most development servers have the same issue. Flask's `app.run()` defaults to localhost. Django requires `python manage.py runserver 0.0.0.0:8000`.
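The same distinction shows up even in Python's built-in `http.server`, which makes for a dependency-free way to see it. A sketch, not the post's FastAPI stack — the first element of the server address tuple is the bind host:

```python
from http.server import HTTPServer, BaseHTTPRequestHandler

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"status": "healthy"}')

# ("127.0.0.1", 8000) would only accept connections from inside this
# container; ("0.0.0.0", 8000) accepts from any interface, including
# the bridge network. Port 0 here just avoids clashes while testing.
server = HTTPServer(("0.0.0.0", 0), Health)
print(server.server_address[0])   # -> 0.0.0.0
# server.serve_forever()          # blocks; uncomment to actually serve
server.server_close()
```

Whatever the framework, the pattern is the same: the bind host is a parameter, and the safe-feeling default is the one that breaks inter-container traffic.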
## The `ports` mapping red herring
The `ports: - "8000:8000"` line in your compose file publishes the container's port 8000 to the host's port 8000. This is for external access — like hitting `http://localhost:8000` from your laptop.
Inter-container communication on the same network bypasses port publishing entirely. Containers talk directly via the bridge network's private IP space. If you removed `ports: - "8000:8000"`, containers could still reach each other (assuming the app binds to `0.0.0.0`).
I've seen engineers spend hours tweaking port mappings when the issue is purely the bind address.
## Real debugging session
You're inside the redis container trying to reach the app:
```shell
# This works (DNS resolution)
nslookup app

# This works (network layer)
ping app

# This fails (application layer)
curl http://app:8000
# curl: (7) Failed to connect to app port 8000: Connection refused
```
Now exec into the app container and check listeners:
```shell
docker exec -it app sh
netstat -tlnp | grep 8000
# tcp        0      0 127.0.0.1:8000   0.0.0.0:*   LISTEN   1/python
```
There it is. Fix the bind address in your application code, rebuild the image, restart the container. Run netstat again:
```shell
netstat -tlnp | grep 8000
# tcp        0      0 0.0.0.0:8000     0.0.0.0:*   LISTEN   1/python
```
Now curl from redis works.
## Other causes (less common but real)
**Firewall rules inside the container.** If you're running `iptables` or `ufw` inside a container (don't), they can block incoming traffic even when the app binds correctly. I've seen this in custom ML inference images where someone copied firewall configs from a VM setup.
**Application-level issues.** Your app might be crashing on startup, listening briefly, then dying. Check logs: `docker logs app`. If you see the server start message followed by a Python traceback, that's your issue — not networking.
**Wrong protocol.** This sounds dumb but I've debugged it twice: your app listens on HTTPS (TLS required), you're curling plain HTTP. Or the app expects HTTP/2 and your client sends HTTP/1.1. Both scenarios time out or fail in confusing ways.
**SELinux or AppArmor.** On some Linux distributions, mandatory access controls can block container-to-container traffic even on the same network. Check `dmesg | grep -i denied` after a failed connection attempt.
## The correct compose file
Here's what your setup should look like for a typical FastAPI + Redis stack:
```yaml
services:
  app:
    build: .
    container_name: app
    networks:
      - mynetwork
    ports:
      - "8000:8000"  # Host access only
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis

  redis:
    image: redis:alpine
    container_name: redis
    networks:
      - mynetwork
    # No ports needed unless you want host access to Redis

networks:
  mynetwork:
    driver: bridge
```
And your application must bind to `0.0.0.0`. For Uvicorn in production, I run it via command override:
```yaml
app:
  build: .
  command: uvicorn main:app --host 0.0.0.0 --port 8000
```
This makes the bind address explicit in the deployment config, not buried in application code where the next developer might miss it.
## Why this matters for AI infrastructure
Every LLM inference API I've deployed follows this pattern: FastAPI frontend talking to a vector database (Qdrant, Milvus), a Redis cache, and sometimes multiple model containers. When one component can't reach another, the entire request pipeline fails.
The symptom — "Connection refused" — looks like a networking problem. The fix is almost always a bind address configuration in your Python code. I've watched engineers add custom network configs, adjust MTU settings, and rebuild Docker networks when they needed to change one line in `uvicorn.run()`.
Test inter-container communication immediately after writing your compose file. Don't wait until you're debugging a failed inference request in production.
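A minimal smoke test for that, sketched in Python with only the standard library — the `"redis"` and `"app"` host names below are placeholders for whatever your compose services are called:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:                  # refused, timed out, or DNS failure
        return False

# Run from inside the app container, before hitting real endpoints:
# assert can_connect("redis", 6379), "app cannot reach redis"
# And from the redis container (if it has python):
# assert can_connect("app", 8000), "redis cannot reach app"
print(can_connect("127.0.0.1", 1))   # port 1 is almost never open
```

If the assertion that maps to your app service fails while DNS and ping succeed, go straight to checking the bind address.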
This post is an excerpt from Practical AI Infrastructure Engineering — a production handbook covering Docker, GPU infrastructure, vector databases, and LLM APIs. Full book with 4 hands-on capstone projects available at https://activ8ted.gumroad.com/l/ssmfkx
Originally published at fivenineslab.com