DEV Community

Luke Bearl

The Bus Factor

This article originally appears on Luke's blog

I've been meaning to write this for quite a while, as the bus factor is something I've (literally) run into in my career. For those of you not familiar with it, the "Bus Factor" is an informal measure of how resilient a project is to the loss of one or more key members. It's the programming version of the old adage "Don't put all your eggs in one basket".

Story Time

Some years ago I was a software development intern at a large company in Milwaukee, Wisconsin. The team I was on was broken into a U.S. development team, an offshore dev team in India and an offshore QA team in China. We had a daily scrum meeting at 8 AM so that the U.S. and Indian teams could both participate in one meeting. One day we got word that one of the senior-most developers had literally been hit by a bus while crossing the street. Thankfully he made a full recovery, but it certainly slowed down that part of the team, as he was out for 8 weeks or so.

How to Reduce the Bus Factor

I'm sure people could write (and probably have written) entire books on the subject of reducing the bus factor and spreading knowledge through the entire team. Spreading knowledge is really the key element in bus factor reduction.

How many people currently work on a team where one or two people are basically the wizards whose secret spells make critical things happen (like deployments, provisioning infrastructure assets, SSL certificates, or any of the other million things that need to be done in order to make software work)? I know I've worked on several teams where that happened. I've also worked with people who wanted to increase the bus factor because they thought it gave them better job security (a notion I strongly disagree with).

In my experience, one of the best ways to reduce the bus factor is to maintain an internal wiki where developers and administrators can document the process for anything they are going to do more than once (and sometimes it's good to document things that are done only once as well). Another great idea is to regularly schedule cross training (n+1 isn't only a good idea for infrastructure; developers and admins should have a bit of redundancy as well).

Ethics

I personally feel that there is an ethical responsibility for all engineers to be transparent in what they do. I never want to be the only person capable of doing something. Instead, I do my best to make sure that anything I do that may ever need to be done again is documented at least well enough that someone can piece it together. Doing this ensures that if I am ever hit by a bus, the rest of my team won't have to figure out the magical incantations I have developed.

Conclusion

At the end of the day, reducing the bus factor is good for your team. You never know when you or one of your colleagues is going to end up no longer being available to work (they might be hit by a bus, or it might be something more mundane like taking a new job, or leaving for a few months for a sabbatical or maternity/paternity leave). As an engineer and a member of a team, you have an ethical obligation both to share the processes and techniques you've developed with your colleagues and to learn theirs in turn.

Top comments (21)

Bob van Hoove

Thanks for posting. I'm familiar with the issue.

In my experience, a healthy level of knowledge sharing is one of the hardest things to achieve within a team. Especially if you start using 'unfamiliar' techniques or libraries.

I'm a fan of README files at the solution level, containing:

  • a brief description of the project
  • prerequisites
  • instructions on how to get the project running (if anything special)
  • information on how to deploy it / where it resides

With this information you can run the project and make a change. And when the time comes to make changes in the unfamiliar part you can learn it by copying and pasting from Stack Overflow :P

Wrapping up, README won't solve your 'bus-factor' problem but it's a cheap way to alleviate some of the pain.

Luke Bearl

README files are good, but they have the same issue as any other documentation: developers don't like updating them (obviously there are exceptions to that generalization).

Yann Rabiller

I wrote an article about how to get developers to update their documentation frequently and how to regularly detect the bus factor.

Hope it can bring some new ideas :)

Luke Bearl

Thanks! That's a fantastic idea and one I'll definitely bring up with my team.

Bob van Hoove • Edited

I think that README suggestion mostly stems from past annoyances with projects that make you figure it out all yourself :)

You mentioned your experience with wikis. I wonder, how did that work out? Was it well adopted? What sort of information would you put in there?

For one thing I like that it's hypertext.

Luke Bearl

Overall, wikis have been good. We use the full Atlassian Suite, so Confluence interops with everything else. The big problem is that the wiki isn't linked to the code at all, so things can change and people always forget to update it. The only real solution we've found is to try to be vigilant about documentation. Since we do code review on all branches, it might make sense to have the reviewers also catch any documentation updates that are required.
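As a rough sketch of how a review step could flag missing documentation updates: the function below looks at the list of files changed on a branch and reports when code was touched but no documentation was. It assumes the documentation lives in the same repository as the code, which is not the Confluence setup described above. The `src/` and `docs/` prefixes and the function name are hypothetical.

```python
from typing import Iterable

# Hypothetical repository layout; adjust to wherever code and docs actually live.
CODE_PREFIXES = ("src/",)
DOC_PREFIXES = ("docs/", "README")


def needs_doc_reminder(changed_paths: Iterable[str]) -> bool:
    """Return True when a change set touches code but no documentation."""
    paths = list(changed_paths)
    touches_code = any(path.startswith(CODE_PREFIXES) for path in paths)
    touches_docs = any(path.startswith(DOC_PREFIXES) for path in paths)
    return touches_code and not touches_docs


if __name__ == "__main__":
    # The list of paths would typically come from the review tool or from git.
    if needs_doc_reminder(["src/deploy.py"]):
        print("Code changed without a docs change: does the documentation need an update?")
```

A check like this can only remind the reviewer to look; it cannot tell whether the documentation is actually still accurate.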

Tryggvi Björgvinsson

I used to use the term bus factor but people have found that too gloomy (they don't want to think about death). I've therefore started using the term lottery factor instead when talking about other people's bus factor: "If people win the lottery and walk out the door with not a care in the world".

Although, I still think about it as the bus factor when thinking about my own work.

David Lord

My mentor calls it "getting hit by the lotto bus".

Kim Arnett 

Great thoughts!

Having a team wiki is great for tasks like generating certificates, team accounts etc.

Also, I'm a fan of team collaboration. Code reviews, open seating, whatever works for your team. Have the junior devs sit with the senior devs and talk about what each are doing. Have demos. Be open. You should never be completely in the dark about anything going on throughout the team. :)

Ben Halpern

Jess often utters the phrase "bus factor" to me, as a reminder either to get caught up on documentation etc. or of my propensity to walk around New York City in a cloud of thought as buses whiz by.

In Canada the cars just stop for you 🇨🇦 😝

Ashe

Only if you're a pedestrian 😝

edA‑qa mort‑ora‑y

I'm not a big fan of technical documentation on code except where absolutely necessary. What's worse than having no documentation is having misleading and outdated documentation, especially when you can't ask the author anymore.

To reduce the bus factor, and just because it's good coding, I recommend:

  • automate all tasks that are done frequently and can be reasonably automated (for manual steps, have the automated one write them out)
  • unit tests and integration tests of course
  • small branches, so in the worst case you lose only a bit of work
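
The first point, automating what can be automated and having the automation write out the manual steps, might look something like the minimal sketch below. The `Step` and `run_runbook` names and the deployment steps are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One step of a runbook; `action` is None for steps a person must do."""
    description: str
    action: Optional[Callable[[], None]] = None


def run_runbook(steps: List[Step]) -> List[str]:
    """Run the automated steps, write out the manual ones, and return the log."""
    log = []
    for number, step in enumerate(steps, start=1):
        if step.action is not None:
            step.action()
            log.append(f"{number}. [done] {step.description}")
        else:
            log.append(f"{number}. [MANUAL] {step.description}")
    for line in log:
        print(line)
    return log


if __name__ == "__main__":
    # Hypothetical deployment steps, for illustration only.
    progress = []
    run_runbook([
        Step("Build the release artifact", lambda: progress.append("built")),
        Step("Ask the on-call admin to approve the release"),
        Step("Upload the artifact", lambda: progress.append("uploaded")),
    ])
```

Because the manual steps live in the same script as the automated ones, the written instructions are exercised every time the task runs, which makes them less likely to go stale than a separate document.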
 
Ted Fernandes

Good topic Luke Bearl.
I really liked this post.

The idea of documentation is crucial in an organization.
Many people don't like it. But it's important.

It ends up being more important to ourselves.

Another technique that can help reduce the bus factor is "Parity". Working in pairs can help reduce this bus factor.

Sometimes it seems wasteful of resources, having two people in one task. But for critical cases it's worth it.

Recently, I wrote about the common mistakes people make on the daily scrum.
Take a look at this:

tfsolucoes-tecnologicas.com/14-com...

I hope you like it. Maybe it will help with good communication in your team's work. Communication is crucial.

Luke Bearl

Hi Ted, very good points. I agree that every team should look at every technique that they can to see if there is anything out there that will work well for that team in order to reduce the reliance on a few people.

I enjoyed your article on issues with daily scrum as well :).

Nick Ebbitt

I like the term “inverse bus factor”, in other words, “how many people need to be hit by a bus before the project starts to run smoothly?” 😂

Luke Bearl

Fingers crossed most of us don't work in organizations like that.

Andrei Damian-Fekete

The term is to "increase the bus factor", not reduce it.

Weston Wedding

Guess it depends on the definition.

If the "bus factor" is how vulnerable your organization or project is to someone leaving or becoming unavailable, I think you'd want to reduce that factor.

Andrei Damian-Fekete

Do you have a definition other than en.m.wikipedia.org/wiki/Bus_factor ?

Weston Wedding • Edited

Yes, the one I just described. The one the article also appears to be operating on.

Generally speaking, I don't think it is a rigorously defined concept. I've never heard it used in the way the Wikipedia article suggests; indeed, the "rare alternative definition" it suggests is more common parlance in the few times it has come up in any kind of conversation.

Walker Harrison • Edited

what you call the bus factor some people call job security:
twitter.com/thepracticaldev/status...