For me, being on call is sort of nice and fun, but I think it's hard to make plans and sometimes affects my work-life balance. What do you miss about being on call?
Cofounder of honeycomb.io, coauthor of Database Reliability Engineering. Operations engineer, DBA, systems engineer, SRE, devops, etc. On call since I was 17.
The only good diff is a red diff.
A guilty little secret of mine is that I love firefighting. Some people jump out of airplanes ... I love the panic and glee of knowing everything is on fire, the company might not survive if I don't fix it correctly. I love high pressure and high stakes.
... But I will normally deny this (if sober), because in modern day infrastructure you are supposed to hate firefighting and want to write software all day.
On call sucks for a lot of people. It's a shame. I really believe it doesn't have to; it should be a net plus. Sounds like you're at least partway there. :)
Hah, totally feel you about loving firefighting. There's a real thrill in it for me, too, and it's really fun just being on the edge of your seat, where every solution you push out might be the one!!!
I might be partway there, not sure yet honestly. I can never reason out how it can be a net plus though, since (at least in my mind) it's a net-plus only if something broke and it was solved, right?
Net plus? Because for a week you have justifiable cause to run down any rabbit hole, do extravagant perf tuning, and other shit that normally isn't important enough to preempt your project work. :)
I kind of also love firefighting. What's one of the worst fires you've dealt with? What did you learn?
Um.. the worst outage of my life was probably at Second Life around 7 years ago. We tried upgrading the primary from 4.1 to 5.0; all the secondaries had been upgraded painlessly, and all the benchmarks said 5.0 was faster. When we upgraded, the grid stumbled to recover; we ended up being mostly down for over 24 hours, and losing all that data when we had to roll back to the last good 4.1 secondary (due to binary incompatibility, couldn't roll back in place).
I spent a year developing capture/replay software for MySQL and testing various configurations and workloads before finally upgrading successfully.
And what did I learn? To be desperately paranoid of all database upgrades, and assume that anything that can go wrong, will go wrong.