On-call is brutal at small scale. Every engineer takes 1 week in 5. You get woken up once a week. Burnout is weeks away.
Here's what works at 5 engineers, from someone who's been there.
Accept the reality
You cannot build a 'rested, follow-the-sun, healthy' on-call rotation with 5 people. Stop trying to mimic Google. Build for small-team reality.
The 3 things that help
1. Aggressively reduce alerts. When you have 5 engineers, you cannot afford 50 alerts/day. Cut mercilessly. Target: 1-2 pages per week for whoever is on call. Yes, you might miss things. You'll miss more by being exhausted.
2. Kill pager fatigue with business-hours routing. Non-urgent alerts go to a ticket, not a page. Only 'user-facing impact right now' alerts wake someone up. Everything else waits for morning.
3. Pay for on-call. $500-$1000/week for primary. Yes, you can afford it. If you can't, your company is too small for 24/7 on-call. Just accept overnight delays.
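Here is a minimal Python sketch of how those three rules could be written down. It is an illustration under assumptions, not a real paging tool's API: the `Alert` class, its `user_facing_now` field, and the function names are all hypothetical. The 2-page budget and the weekly pay rates come from the numbers above.

```python
from dataclasses import dataclass

# Upper end of the 1-2 pages/week target from rule 1.
PAGE_BUDGET_PER_WEEK = 2


@dataclass
class Alert:
    name: str
    # Hypothetical flag: is there user-facing impact right now?
    user_facing_now: bool


def route(alert: Alert) -> str:
    """Rule 2: page only for live user-facing impact, ticket everything else."""
    return "page" if alert.user_facing_now else "ticket"


def over_budget(pages_this_week: int) -> bool:
    """Rule 1: True when the week's pages exceed the target, a cue to cut alerts."""
    return pages_this_week > PAGE_BUDGET_PER_WEEK


def annual_primary_pay(weekly_rate: int, weeks: int = 52) -> int:
    """Rule 3: yearly cost of paying the primary, summed across the rotation."""
    return weekly_rate * weeks


if __name__ == "__main__":
    print(route(Alert("checkout-5xx", user_facing_now=True)))
    print(route(Alert("disk-70-percent", user_facing_now=False)))
    print(over_budget(3))
    print(annual_primary_pay(500), annual_primary_pay(1000))
```

At $500-$1000/week, paying the primary comes to $26,000-$52,000 a year for the whole rotation, which is the number to weigh against the cost of losing an engineer to burnout.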
What doesn't help
- 'Just be better at triage' (not a system fix)
- Bringing in contractors for on-call (they don't know your system)
- Unplanned time off after a rough week (too late, damage done)
The emotional side
The hardest part of small-team on-call isn't the pages. It's the feeling that the company rests on you personally. Fight that narrative.
- Take real vacations. Block the week. No Slack.
- Rotate the 'primary' role explicitly so nobody becomes the default expert
- Document everything so anyone can handle anything
The growth path
As you hire, protect the on-call ratio. Don't add 3 engineers and immediately expand the services they're responsible for. Use growth to shrink individual load first. Then expand scope.
On-call with 5 engineers is survivable. On-call with 7 engineers and the same scope is comfortable. Plan for the second, suffer the first.
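To put rough numbers on that difference, here is a small Python sketch. It assumes an evenly shared weekly rotation and a 52-week year. The function name is made up for illustration.

```python
def weeks_on_call_per_year(engineers: int, weeks_in_year: int = 52) -> float:
    """Weeks each engineer carries the pager, assuming an even weekly rotation."""
    return weeks_in_year / engineers


if __name__ == "__main__":
    for team_size in (5, 7):
        weeks = weeks_on_call_per_year(team_size)
        print(f"{team_size} engineers: {weeks:.1f} on-call weeks each per year")
```

Under those assumptions, going from 5 to 7 engineers with the same scope drops each person from about 10.4 on-call weeks a year to about 7.4, roughly three fewer weeks carrying the pager.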
Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com