DEV Community

Discussion on: I'm Charity Majors, Ask Me Anything! [FINISHED]

Charity Majors

A guilty little secret of mine is that I love firefighting. Some people jump out of airplanes ... I love the panic and glee of knowing everything is on fire, the company might not survive if I don't fix it correctly. I love high pressure and high stakes.

... But I will normally deny this (if sober), because in modern-day infrastructure you are supposed to hate firefighting and want to write software all day.

On call sucks for a lot of people. It's a shame. I really believe it doesn't have to; it should be a net plus. Sounds like you're at least partway there. :)

Andy Zhao (he/him)

Hah, totally feel you about loving firefighting. There's a real thrill of it for me, too, and it's really fun just being on the edge of the seat, where every solution you push out might be the one!!!

I might be partway there, not sure yet honestly. I can never reason out how it can be a net plus though, since (at least in my mind) it's a net-plus only if something broke and it was solved, right?

Charity Majors

Net plus? Because for a week you have justifiable cause to run down any rabbit hole, do extravagant perf tuning, and other shit that normally isn't important enough to preempt your project work. :)

Andrea Gaither

I kind of also love firefighting. What's one of the worst fires you've dealt with? What did you learn?

Charity Majors

Um... the worst outage of my life was probably at Second Life around 7 years ago. We tried upgrading the primary from 4.1 to 5.0; all the secondaries had been upgraded painlessly, and all the benchmarks said 5.0 was faster. When we upgraded, the grid struggled to recover; we ended up being mostly down for over 24 hours, and losing all that data when we had to roll back to the last good 4.1 secondary (due to binary incompatibility, we couldn't roll back in place).

I spent a year developing capture/replay software for MySQL and testing various configurations and workloads before finally upgrading successfully.

And what did I learn? To be desperately paranoid of all database upgrades, and assume that anything that can go wrong, will go wrong.