DEV Community

sudolecture

SWEs are ruining SRE

Preface: This is written from an anonymous account. Why? In some way, I fear retribution. I'd like to think our industry is better than that, but that's not always the case.

I am a Systems Engineer living in a city with a growing tech environment. Throughout my career I've had the opportunity to work in traditional Network Engineering as well as to help build fairly large-scale systems with some nifty automation. I made it my goal to solve pain for developers so they could deliver faster and with more confidence. I like to think I've succeeded in that: even when my ideas weren't picked up immediately, organizations usually adopted them eventually.

That said, I have been relegated to my city. My city lacks a strong startup culture and therefore lacks many of the stock option and bonus packages that are increasingly handed to engineers in the tech space. These are important to a tech worker's ability to retire; a simple 401k often isn't enough, especially if you only started making market average around 30 years old. In this last round of interviewing, that is what I set my sights on.

Before I started applying to companies, I knew Software Engineers (SWEs) in Site Reliability Engineering (SRE) were a problem. I got the first inkling of that when I interviewed at a large professional social network and was failed because I didn't know how to daemonize a Python application. Now that I write in Go day to day, I don't know why a competent software engineer would even ask this, but that is the question that failed me.

Further concern arose when I was at DevOps Days and witnessed a talk by an SRE-SWE about their SRE journey at a large Bitcoin firm. When asked, this SRE said that Systems Engineers (SEs) didn't have a place in SRE because he thought not being able to code is a disqualifier.

For those who don't know, Systems Engineers certainly know how to code, but we're probably a bit clumsy about TDD, and we probably can't architect a huge application for you. I spend a lot of my time writing CLIs, I've even added features to microservices, and to be honest, lately I've done a great deal of backend development. That said, it's not what I'm great at. I like building automation for things; often the mechanics of software in that realm are much simpler. I can listen to an event loop or I can run a job, and it'll compensate for the weak points in any given stack.

While I would like to report that things have changed, they have only gotten worse. Companies have latched on to the saying, "SRE is operations through the lens of a software engineer" with zero additional context. At one company I was told I executed TDD wrong, and they were right. I made the code readable before I wrote tests for it, so I didn't know whether what I was doing would alter the behavior of the code. To a software engineer that ordering is obviously wrong. To a Systems Engineer it's not so obvious, but it could be corrected with some slight nudging. I was given no Systems Engineering test.
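For readers who haven't run into it, the ordering that interviewer presumably wanted can be sketched in a few lines of Python: pin down the current behavior with a test first, and only then make the code readable. This is a minimal illustration, not the exercise I was actually given. The functions `parse_kv_legacy`, `parse_kv_refactored`, and `check_behaviour` are hypothetical names made up for this example.

```python
def parse_kv_legacy(line):
    # Hypothetical hard-to-read original: parses "key=value" pairs
    # separated by commas and silently skips anything without "=".
    r = {}
    for p in line.split(","):
        if "=" in p:
            k, v = p.split("=", 1)
            r[k.strip()] = v.strip()
    return r


def check_behaviour(parse):
    # Characterization test: records what the code does today,
    # including the odd cases, before anyone touches it.
    assert parse("a=1, b=2") == {"a": "1", "b": "2"}
    assert parse("a=1,junk,b=x=y") == {"a": "1", "b": "x=y"}
    assert parse("") == {}


def parse_kv_refactored(line):
    # Readable rewrite, done only after the test above existed.
    pairs = (part.split("=", 1) for part in line.split(",") if "=" in part)
    return {key.strip(): value.strip() for key, value in pairs}


check_behaviour(parse_kv_legacy)      # passes before the refactor
check_behaviour(parse_kv_refactored)  # must still pass after it
```

If the second call fails, the cleanup changed behavior, which is exactly what I had no way of knowing in that interview.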

The next company soured my grapes quite a bit. I was quite literally given a Connect Four (it was connect three) algorithm to play. Later I was given an architecture whiteboarding session where, inside the interview, I was given zero feedback. Afterward, the feedback I was given was that I'd overcomplicated something and that I failed the programming portion. Where, in the real world, do we play Connect Four in infrastructure? And when is architecture anything less than a conversation among engineers? Architecture has always been a back and forth conversation between proposal, rejection, and acceptance. Architecture without real-time feedback is just playing arts and crafts on a whiteboard in an echo chamber of biases.

What matters here is how an SRE-SE reacts to concerns (which, ironically, helps keep SRE proactive): do they ignore them, or do they formulate a plan? SRE is often not a one-stop shop for solutions; instead, it often requires many iterations before a problem is totally solved.

Systems Engineers need to know how to code. They need to know testing practices. I fully support that. But passing on a Systems Engineering job candidate because they can't perfectly craft the internals of a Java application, complete with perfect TDD, OOP, Factories and project structure, is foolish.

While it may not seem obvious to some, these are symptoms of a club mentality. The idea that SRE only consists of Software Engineers needs to die in a fire, and quickly. I have never gone into an interview seeking to stump a Software Engineer with a test they would never see in the course of their work; could you imagine if I had some CS graduate change the shell of a user on Linux? Maybe even make them explain to me the difference between the Bourne Shell and the Bourne Again Shell? What about asking a Senior Software Engineer to explain what Linux containers consist of or how every namespace works?

That'd be stupid and petty. So, stop doing the same thing to Systems Engineers.

The goal of interviewing is to make the candidate shine with as many colors as possible. That way, you are choosing between two candidates whose best sides you've seen, not between one who sat through an algorithm class and one who fought back a panic attack while you made them live-process and solve an algorithm exercise for the first time in front of you while you dangled the carrot of compensation and benefits.

The long-term effect here is that I, and others, are kept from jobs that reward with RSUs and bonuses. It protects an ivory tower of sorts, and in this vein it's what groups championing diversity have been clamoring about this entire time. Your methods for interviewing and hiring aren't just ineffective, they're grossly incompetent.

In the short term, companies lose out on those who might understand why getting namespaces right is important. The people who can show you the nuances of why, in some situations, a heavy base container image with smaller application layers is more effective than targeting a small overall image size are now being weeded out by questions about matrices and imaginary line drawing. Try asking your top software engineer to debug an issue with two nodes being unable to communicate across a network, and then ask an average systems engineer to do the same. Now do it with containers networked by iptables. Tell me your findings.

The Software and Systems Engineering skillsets are built on the same foundation but diverge in specialization. SWEs can do my job given enough time. I can do their job given enough time. My best experience of balancing quality and productivity was being paired, as a Systems Engineer, with an SWE. In that setting we performed magic; our occupational flaws were overridden by common goals and a new, bigger toolbox to pull tricks from. One without the other is simply a new version of the same broken machine we had before.

Top comments (1)

Michael Larson

We software engineers face similar obstacles in interviewing. The sad truth is most of the industry sucks at interviewing, and the best thing that they can come up with is putting you in a room with a whiteboard and 45 minutes to solve a problem. We are effectively optimizing for people who can solve coding challenges on a whiteboard in 45 minutes.

There are companies that are trying different things, like relying more on behavioral questions or even having a day where you can work with the people you would be working with.

When I am interviewing, I never use gotcha questions. I am more interested in whether I can work with this person on a team to build great software.

While I am a big proponent of TDD, I am not a fanatic. If someone shows aptitude and is willing to learn, that is what I am looking for.