1 — Health check path mismatch (most common)
ELB is checking /health, but instance A:
- doesn’t serve that path,
- serves it under a different route (e.g., /status),
- or requires authentication to access it.
→ Instance returns 404 / 403 / 500 → ELB marks it unhealthy.
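One way to see what the health check sees is to request the path yourself from a host that can reach the instance. The following is a minimal sketch using only the Python standard library; the function names `probe` and `explain` are made up for illustration, and treating any 2xx as healthy is an assumption (match it to whatever success codes your health check is configured with).

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return the HTTP status for a GET on url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout: no HTTP status at all.
        return None


def explain(status):
    """Map a status code to the likely cause from the list above."""
    if status is None:
        return "no HTTP response (check port, protocol, network)"
    if 200 <= status < 300:
        return "healthy response"
    if status == 404:
        return "path not served (check the route)"
    if status in (401, 403):
        return "path requires authentication"
    if status >= 500:
        return "application error"
    return "unexpected status"
```

For example, `explain(probe("http://<instance-ip>:<port>/health"))` run from inside the VPC gives a quick first guess at which of the cases above applies.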
2 — Wrong port or protocol
- ELB checks port 80 (HTTP), but app on instance A listens on 8080.
- Or ELB expects HTTP, but instance only responds on HTTPS.
→ ELB sees connection refused / timeout.
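A plain TCP connect tells you whether anything is listening on the port the health check targets. This is a rough sketch with a hypothetical helper name, `check_port`. As a rule of thumb (not a guarantee), "refused" usually means nothing is listening on that port, while "timeout" more often points to packets being dropped along the way.

```python
import socket


def check_port(host, port, timeout=3.0):
    """Try a TCP connect and report 'open', 'refused', 'timeout', or an error string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout; traffic may be dropped on the way.
        return "timeout"
    except OSError as err:
        return f"error: {err}"
```

Comparing `check_port(host, 80)` with `check_port(host, 8080)` against instance A is a quick way to spot the port mismatch described above.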
3 — Security group or NACL blocking traffic
ELB’s health check traffic can’t reach instance A because:
- Instance’s security group doesn’t allow inbound traffic from the ELB on the health check port.
- Or NACLs block it.
→ Health check packets dropped.
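The security group case can be checked mechanically once you have the instance's inbound rules. The sketch below assumes the rules are in the `IpPermissions` shape that the EC2 DescribeSecurityGroups API returns, and that the load balancer has its own security group referenced by ID. It only looks at group-to-group rules, not CIDR rules or NACLs, and the function name and the `sg-...` IDs are placeholders.

```python
def allows_health_check(ip_permissions, elb_sg_id, port):
    """Return True if an inbound rule admits TCP traffic on `port` from `elb_sg_id`."""
    for rule in ip_permissions:
        proto = rule.get("IpProtocol")
        if proto not in ("tcp", "-1"):  # "-1" means all protocols
            continue
        if proto == "tcp":
            # Port range only applies to protocol-specific rules.
            if not rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535):
                continue
        pairs = rule.get("UserIdGroupPairs", [])
        if any(pair.get("GroupId") == elb_sg_id for pair in pairs):
            return True
    return False


# Illustrative rules: port 80 is open to the ELB's group, port 8080 is not.
rules = [
    {
        "IpProtocol": "tcp",
        "FromPort": 80,
        "ToPort": 80,
        "UserIdGroupPairs": [{"GroupId": "sg-elb-example"}],
    }
]
print(allows_health_check(rules, "sg-elb-example", 80))    # True
print(allows_health_check(rules, "sg-elb-example", 8080))  # False
```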
4 — Application slow or erroring under load
App on instance A:
- Responds too slowly (longer than ELB’s health check timeout).
- Returns 5xx errors intermittently (crash, memory leak, DB issue).
→ Health check fails while other instances may still pass.
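To tell "slow" apart from "erroring", it helps to time several probes and compare each against the health check timeout. Here is a small sketch: `send_probe` is any callable you supply that returns a truthy value on success (for instance, a function wrapping an HTTP request to the health path), and `time_probes` is a hypothetical name. It measures elapsed time after the fact rather than enforcing the timeout itself.

```python
import time


def time_probes(send_probe, count=5, timeout=5.0):
    """Run send_probe `count` times; return a list of (passed, seconds) tuples.

    A probe passes only if it succeeds AND finishes within `timeout` seconds.
    """
    results = []
    for _ in range(count):
        start = time.monotonic()
        try:
            ok = bool(send_probe())
        except Exception:
            # Any exception from the probe counts as a failed check.
            ok = False
        elapsed = time.monotonic() - start
        results.append((ok and elapsed <= timeout, elapsed))
    return results
```

A mix of passes and failures in the result suggests intermittent errors; consistent failures with long durations suggest the app is simply slower than the timeout allows.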
5 — Instance-specific misconfiguration
Instance A might have:
- Wrong version of the app deployed.
- Dependency missing.
- Config file pointing to wrong DB.
- Local firewall (iptables/ufw) blocking health check traffic.
→ Only that one instance fails health checks.
6 — Health check thresholds
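- ELB requires a configured number of consecutive successful responses (the healthy threshold) before marking an instance “healthy”.
- If instance A is flaky and fails, say, 2 of every 3 probes, it never builds that streak and stays marked unhealthy.
- Other instances might pass consistently.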
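The effect of consecutive-count thresholds is easier to see by replaying a sequence of probe results. The following is a simplified model, not ELB's actual implementation; the function name and the default threshold values are illustrative assumptions.

```python
def simulate(probes, healthy_threshold=3, unhealthy_threshold=2, state="unhealthy"):
    """Replay probe results (True = pass) and return the state after each probe."""
    passes = fails = 0
    history = []
    for ok in probes:
        if ok:
            passes, fails = passes + 1, 0
            if passes >= healthy_threshold:
                state = "healthy"
        else:
            fails, passes = fails + 1, 0
            if fails >= unhealthy_threshold:
                state = "unhealthy"
        history.append(state)
    return history


# A flaky instance: every third probe fails, so the pass streak never reaches 3.
flaky = simulate([True, True, False, True, True, False])
print(flaky)  # six times "unhealthy"

# A steady instance becomes healthy on the third consecutive pass.
steady = simulate([True, True, True])
print(steady)  # ['unhealthy', 'unhealthy', 'healthy']
```

Note that in this model the flaky instance passes two thirds of its probes and still never becomes healthy, because each failure resets the streak.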
7 — OS or networking issue
Instance A’s OS/network stack may be unhealthy:
- High CPU load.
- Network interface issues.
- Misconfigured route table.
→ ELB can connect to others but not A.
✅ Key takeaway
If ELB marks instance A unhealthy, even though you think it’s “fine”:
- Start with health check configuration (path, port, protocol).
- Then verify network access (SG, NACL, firewall).
- Finally, check application logs on that instance for errors, slowness, or path mismatches.