DevOps Day 0
- Networks are unreliable by default
- Packets can be lost, delayed, duplicated, or reordered
- Reliability is something protocols add, not something networks guarantee
This single idea explains:
Retries
Timeouts
Slowness
“Random” failures
- IP addresses identify machines
Every device on a network needs a unique IP
IP = who the machine is
Private IPs are used internally
Public IPs are used on the internet
Routers use NAT to map many private IPs to one public IP
If two machines on the same network share an IP → communication breaks.
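To make the private/public split concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `describe_ip` and the sample addresses are illustrative assumptions, not part of the original notes.

```python
import ipaddress

def describe_ip(text: str) -> str:
    """Classify an address as 'loopback', 'private', or 'public'."""
    addr = ipaddress.ip_address(text)
    # Loopback addresses also count as private, so check them first.
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"

# Sample addresses chosen for illustration only.
for sample in ("192.168.1.10", "10.0.0.5", "127.0.0.1", "8.8.8.8"):
    print(sample, "->", describe_ip(sample))
```

A private result means the address is only meaningful inside its own network; reaching it from the internet would need NAT or some other forwarding in between.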
- Ports identify services
One machine can run many services
Each service listens on a port
Only one service can listen on a given port at a time (per IP address and protocol)
Addressing a service means:
IP + Port
Example:
192.168.1.10:443
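The "one service per port" rule can be observed directly. The sketch below is a rough illustration, assuming default socket options (no `SO_REUSEADDR` or `SO_REUSEPORT`); the helper `try_listen` is a hypothetical name. It opens a listener on a free port, then tries to open a second one on the same IP + port.

```python
import socket
from typing import Optional

def try_listen(host: str, port: int) -> Optional[socket.socket]:
    """Return a listening TCP socket, or None if the address is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError:
        # Typically "Address already in use": another listener owns this IP + port.
        sock.close()
        return None

first = try_listen("127.0.0.1", 0)       # port 0 asks the OS to pick a free port
port = first.getsockname()[1]
second = try_listen("127.0.0.1", port)   # same IP + port while the first is still listening
print(f"first listener on port {port}; second listener created: {second is not None}")
first.close()
```

This is the same error a web server reports at startup when something else is already bound to its port.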
- Communication requires three things
For one machine to talk to another:
Correct IP (machine)
Correct Port (service)
Open network path (routing + firewall)
If any fail → no communication.
- TCP vs UDP (two reliability strategies)
UDP
Fast
No delivery guarantee
No retries
No ordering
Low overhead
Used when:
Speed matters more than complete delivery
Retrying is cheap
Examples:
DNS
Live video and voice (calls, real-time streams)
Metrics
TCP
Reliable
Acknowledged delivery (data arrives, or the connection reports an error)
Ordered data
Retries lost packets
Slower due to overhead
Used when:
Correctness matters
Examples:
HTTP/HTTPS
Databases
APIs
SSH
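The difference between the two strategies shows up even on a single machine. The following sketch sends the same payload to a local port where nothing is listening, once over UDP and once over TCP. The helper names (`free_port`, `udp_send`, `tcp_send`) are hypothetical, and the behaviour described assumes an ordinary loopback interface.

```python
import socket

def free_port() -> int:
    """Ask the OS for an unused port, then release it so nothing is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def udp_send(host: str, port: int, payload: bytes) -> int:
    """Fire and forget: returns bytes handed to the OS, not bytes delivered."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.sendto(payload, (host, port))

def tcp_send(host: str, port: int, payload: bytes, timeout: float = 2.0) -> bool:
    """Returns True only if a connection was established and the payload was written to it."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(payload)
            return True
    except OSError:
        return False

port = free_port()
print("UDP bytes handed to the OS:", udp_send("127.0.0.1", port, b"hello"))
print("TCP send succeeded:", tcp_send("127.0.0.1", port, b"hello"))
```

UDP reports the send as done even though nobody received it, while TCP refuses to pretend: without a connection there is no send. Any reliability on top of UDP has to be built by the application.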
- How failures look in real systems
UDP failures
Missing data
Flaky behaviour
Fast failures
TCP failures
Increased latency
Hanging requests
Silent slowness
UDP fails fast. TCP fails slow.
- DNS uses UDP by default
DNS chooses UDP because:
Requests are small
Connection setup is expensive
Retrying is cheaper than waiting
DNS can fall back to TCP when needed (large responses, DNSSEC).
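"Retrying is cheaper than waiting" can be sketched as a small send-and-resend loop. This is not a DNS client; it only illustrates the retry-on-timeout pattern that UDP-based protocols rely on. The names `udp_request` and `start_lossy_echo_server`, the timeout, and the attempt count are all assumptions made for the demo.

```python
import socket
import threading

def udp_request(server, payload: bytes, timeout: float = 0.5, attempts: int = 3):
    """Send payload over UDP and wait for one reply, resending on timeout.

    Returns (reply_bytes, attempts_used). Raises TimeoutError if no attempt gets a reply.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for attempt in range(1, attempts + 1):
            sock.sendto(payload, server)
            try:
                reply, _addr = sock.recvfrom(4096)
                return reply, attempt
            except socket.timeout:
                continue  # no reply in time: assume the request or reply was lost
    raise TimeoutError(f"no reply from {server} after {attempts} attempts")

def start_lossy_echo_server(drop_first: int = 1):
    """Demo helper: local UDP echo server that ignores the first `drop_first` datagrams."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(("127.0.0.1", 0))

    def serve():
        seen = 0
        while True:
            data, addr = srv.recvfrom(4096)
            seen += 1
            if seen > drop_first:      # simulate packet loss for the early datagrams
                srv.sendto(data, addr)

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()

if __name__ == "__main__":
    server = start_lossy_echo_server(drop_first=1)
    print(udp_request(server, b"ping"))
```

With one dropped datagram the caller pays a single timeout and one resend, with no connection setup at all. That trade-off is why small request/response protocols often prefer UDP.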
- Three possible connection outcomes (critical)
When connecting to IP:PORT:
✅ Success
Service is up
Network path is open
❌ Connection refused
Machine reachable
No service listening on that port
➡ App / port issue
⏳ Connection timeout
Service unreachable
Packet blocked or lost
➡ Network / firewall issue
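The three outcomes map onto distinct exceptions in most socket APIs. Here is a minimal sketch in Python, assuming a plain TCP connect; the function name `check_port` and the 3-second timeout are illustrative choices.

```python
import socket

def check_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and name which outcome occurred."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "success"          # service is up and the path is open
    except ConnectionRefusedError:
        return "refused"              # machine answered, nothing listening: app / port issue
    except (socket.timeout, TimeoutError):
        return "timeout"              # no answer at all: network / firewall issue
    except OSError as exc:
        return f"other error: {exc}"  # e.g. no route to host, name resolution failure

if __name__ == "__main__":
    # Example target only; substitute the IP and port you are debugging.
    print(check_port("127.0.0.1", 8080))
```

A "refused" result comes back almost immediately, while a "timeout" takes the full timeout to appear. That difference in speed is itself a useful debugging signal.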
- Listening scope matters
A service can listen on:
localhost (127.0.0.1) → accessible only from the same machine
0.0.0.0 → accessible on every network interface, so other machines can reach it (if the firewall allows)
This causes:
“It works on my machine”
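The listening scope is set by the address passed to `bind`. The sketch below opens one listener of each kind and prints the address it is bound to; the helper `listen_on` is a hypothetical name, and port 0 simply lets the OS pick a free port.

```python
import socket

def listen_on(bind_host: str) -> socket.socket:
    """Open a TCP listener on bind_host, using any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((bind_host, 0))
    sock.listen()
    return sock

local_only = listen_on("127.0.0.1")     # reachable only from this machine
all_interfaces = listen_on("0.0.0.0")   # reachable through every network interface

print("loopback only :", local_only.getsockname())
print("all interfaces:", all_interfaces.getsockname())

local_only.close()
all_interfaces.close()
```

Many development servers default to the loopback address, so a service that answers on the developer's own machine can still be unreachable from any other host until its bind address is changed.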
Core Day 0 Mental Model (remember this)
IP = who
Port = what
Protocol = how
Firewall = whether
DNS = convenience
If you can reason through these, you can debug most outages.