As Ben pointed out in his post, "I've been de facto on call 24 hours a day since starting dev.to but this weekend I'm going camping", we clearly don't yet have a fair schedule for handling late-night or weekend emergencies.
I know some teams rotate weekend duty and get a weekday off here or there, but I'd really like some real-world recommendations on what to do or not to do.
Top comments (16)
Our team runs a Tuesday to Tuesday weekly schedule for a primary and secondary on-call person. We have around 8 people on call, so we get a 2 week break in-between.
If we have something important on while on-call, other team members have been great at picking those days up.
It's also important to add, we are paid for being on call, whether we get a call or not.
If someone is on vacation, they are on vacation, no on-call duty for them :)
Well, I work remotely. It really depends on what the actual issue is, as our teams are split into different subcategories of what we do. Generally, if a machine fails and stops sending data, we all get a notification on our phones saying it's gone down. After 10 minutes, if no one has responded and more machines start to go down, an alarm goes off on my phone that screams an alert in my face. If the time comes and we can't be on for this, we're able to set it so it won't alert us at all, but only if we're not near our computer.
What team do you have? In my office, critical moments like that might happen when the server is down or there are bugs that keep a site from running properly. Usually, all members of the team are on hand when things like that happen. And when someone is on vacation, there are always other team members willing to cover.
We believe that a vacation is a necessity and makes someone more productive, so when members of our team are on vacation, we avoid disturbing them as much as possible. So take your 'me time' :)
Here are a few recommendations that have worked well in places I've worked in the past:
My team had a frank discussion with management about overtime, and management decided that it could afford 10 hours of overtime/dev/year. This amounts to "we fix problems during regular business hours".
That's ok for us, because we're the government and our publicly facing software generally serves a group of people who have 10 days to get us the information. Assuming they don't wait until day 9 on a Saturday, slow weekend response is ok.
We do pay for a 24 hour help desk with a binder full of answers to common questions that people can call, which reduces the need for devs.
We talked about what we would do if 24/7, 365 responsiveness ever became necessary. Our feeling was that having enough devs to do 3 shifts, a la real 24-hour factory operations, was probably the optimal solution, but failing the budget for that, rotating 2 weeks on, 2 weeks off on-call duty seemed next best.
Every dev goes on-call 24/7 for one week.
Every dev is on-call one week every N weeks, where N = devs on team.
Exceptions are made as necessary, usually just one-day swaps but occasionally full-week swaps are necessary.
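A rotation like the one just described (one week each, coming around every N weeks, with the occasional swap) could be sketched roughly as follows. This is only an illustration: the names, the start date, and the `swaps` mapping are made-up placeholders, and it only models full-week swaps, not one-day ones.

```python
from datetime import date, timedelta


def weekly_rotation(devs, start, weeks, swaps=None):
    """Return a list of (week_start, dev) pairs for a simple round-robin.

    devs  -- ordered list of names (hypothetical placeholders)
    start -- date of the first shift
    weeks -- how many weeks to generate
    swaps -- optional {week_start: dev} overrides for full-week swaps
    """
    swaps = swaps or {}
    schedule = []
    for i in range(weeks):
        week_start = start + timedelta(weeks=i)
        # Each dev comes up once every len(devs) weeks, unless swapped.
        dev = swaps.get(week_start, devs[i % len(devs)])
        schedule.append((week_start, dev))
    return schedule


devs = ["ana", "bo", "cy"]  # placeholder names
rota = weekly_rotation(devs, date(2024, 1, 1), 6,
                       swaps={date(2024, 1, 8): "cy"})
for week_start, dev in rota:
    print(week_start.isoformat(), dev)
```

Keeping swaps as overrides on top of the base rotation means the underlying order never shifts, so everyone can still predict their regular weeks.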
Managers at my workplace are on a rotating weekly schedule with just one individual on call after hours. The person in possession of the on-call cell phone is responsible for triaging issues. If something can wait until the next business day, then a normal support ticket gets created on behalf of the caller with the incident particulars. On the other hand, if the matter is a true emergency, then our internal knowledge base contains a series of "who to call if X happens" lists to get the correct non-management people in play.
The call tree with "who to call if X happens" is very helpful for incident response!
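For anyone who wants to try the triage idea above, here is a minimal sketch of it as a lookup. Everything in it is assumed for illustration: the issue names, the contact names, and the `triage` function are hypothetical, not the actual knowledge base described in the comment.

```python
# Hypothetical "who to call if X happens" lists; all names are placeholders.
CALL_TREE = {
    "database down": ["dba_on_call", "backend_lead"],
    "site unreachable": ["ops_on_call"],
}


def triage(issue, is_emergency):
    """Return (action, contacts) for an after-hours call."""
    if not is_emergency:
        # Can wait until the next business day: file a normal ticket.
        return ("ticket", [])
    # True emergency: look up who to call. An empty list means the
    # call tree has no entry for this issue yet.
    return ("call", CALL_TREE.get(issue, []))


print(triage("database down", is_emergency=True))
print(triage("typo on homepage", is_emergency=False))
```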
At one of my last companies we used to get an extra day's pay in our pay checks every month, and we would rotate being "on call": every weekend it would be someone else's turn. At my last company we used to trade weekend work days for PTO (paid time off), so if you worked weekends for a few weeks straight you could save up days and get an extra vacation. One of our team members took something like a month off with saved-up PTO.
I'm currently on a team that supports a critical government app responsible for people's pay cheques. We have a phone that the on-call person carries, and it only gets calls from the help desk when they get a call. We do two weeks on and have anywhere from 2 to 5 people on rotation depending on the current team size. There is extra pay for "standby" and also hourly payment for doing work outside of business hours. It works well and all the devs are on board.
Same as many here. On call roster is 1 week shifts of being secondary followed by 1 week shift of primary followed by a week off. It is really easy to manage via PagerDuty. They have calendar integrations and an easy way to manage overrides for particular days. It is important to have levels of on call so that people know that in an emergency there are other people to rely on as well.
We all switch around weekends a lot when people have different events to go to. I think it's best to allow flexibility but keep a solid schedule.
Also, pay people for the days they are on-call. These are extra hours that they have to work and they deserve compensation for it. It also makes it easier to switch out days since there is an incentive.
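The secondary, then primary, then off pattern mentioned above can be sketched in a few lines. This is a rough illustration with placeholder names, not PagerDuty's API or configuration; it assumes at least two people on the team, and with three people it gives exactly one week off between turns.

```python
def roles_for_week(team, week_index):
    """Return (primary, secondary) for a given week number.

    Each person is secondary one week, primary the next, then off
    until their turn comes around again. Assumes len(team) >= 2.
    """
    n = len(team)
    primary = team[week_index % n]
    # Next week's primary shadows as this week's secondary.
    secondary = team[(week_index + 1) % n]
    return primary, secondary


team = ["dev_a", "dev_b", "dev_c"]  # placeholder names
for week in range(4):
    primary, secondary = roles_for_week(team, week)
    off = [p for p in team if p not in (primary, secondary)]
    print(f"week {week}: primary={primary} secondary={secondary} off={off}")
```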
We started our on-call shifts about 2 weeks ago. Previously, I didn't agree with having on-call shifts, preferring a "mutual responsibility" approach. But I've started to feel that's bad for everyone, as it means no one can ever "disconnect" from work. So we discussed what the on-call shifts should look like: 2 days per shift, one week per shift, etc. We're a team of 10, but only 5 are being put on call for now. In the end we decided on 2-day shifts, as 1 week seemed too long for one person to take on.
We have a Python script that prints the on-call roster for the following week, but we plan to schedule one month ahead, as it's easier to plan your days when you know which days you'll be on call that month.
p/s: I'm currently on-call, that's why I came here 😁
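A roster script along the lines of the one mentioned above might look something like this. It is a guess at the general shape, not the commenter's actual script: the names, start date, and 30-day window are placeholders, with 2-day shifts across 5 people as described.

```python
from datetime import date, timedelta


def build_roster(people, start, days, shift_days=2):
    """Return a list of (day, person), rotating every shift_days days."""
    roster = []
    for offset in range(days):
        # Integer division groups consecutive days into one shift.
        shift_number = offset // shift_days
        person = people[shift_number % len(people)]
        roster.append((start + timedelta(days=offset), person))
    return roster


people = ["p1", "p2", "p3", "p4", "p5"]  # placeholder names
roster = build_roster(people, date(2024, 3, 1), days=30)
for day, person in roster:
    print(day.strftime("%a %Y-%m-%d"), person)
```

Printing the weekday next to each date makes it easy to spot who lands on weekends, which is usually where the swap requests come from.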
I mainly sleep during my shift, mainly because I don't have work lol
We have no real system for this at my work. It's whoever is available, willing, and most capable that usually gets called.
There's no magic recipe, I guess; the trick is to have a big team working on rotation. There have been times when I had to support two technologies, handling issues for both at the same time 😏
I don't have anything to add to the conversation, just wanted to point you in the direction of this publication which has an entire issue on "on call": increment.com/on-call/