I've written about early integration and why it leads to better features. While this is true, you might not want to roll out changes to all users at once. For instance, you might be working on a feature for a very specific target group and want to make sure that there are no major hiccups once you release it for everyone. At the same time, you don't want the source code to diverge and rot on a long-lived branch. Enter stage: feature toggles.
Feature toggles allow us to reintegrate new features early but only activate them for certain customers at a time. This can be used for beta testing, A/B testing, and more.
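To make this concrete, here is a minimal sketch of what such a toggle could look like. Everything in it is an assumption for illustration: the `FeatureToggle` class, the `render_dashboard` function, and the customer IDs are hypothetical, and real toggle systems usually read their configuration from a service or a config file instead of hard-coding it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureToggle:
    """Hypothetical toggle that is active only for an allow-list of customers."""

    name: str
    enabled_for: set[str] = field(default_factory=set)

    def is_active(self, customer_id: str) -> bool:
        return customer_id in self.enabled_for


def render_dashboard(customer_id: str, toggle: FeatureToggle) -> str:
    # The new code is merged and deployed for everyone,
    # but it only runs for customers on the allow-list.
    if toggle.is_active(customer_id):
        return "new dashboard"
    return "old dashboard"


# Hypothetical configuration: one beta customer sees the new feature.
new_dashboard = FeatureToggle("new-dashboard", enabled_for={"beta-customer"})

print(render_dashboard("beta-customer", new_dashboard))
print(render_dashboard("regular-customer", new_dashboard))
```

Note that the old and the new code path both live in the codebase at the same time, and both need to keep working.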
Toggles worked well for us until some of our product owners (POs) required every new feature to be behind a toggle. This might sound like a reasonable idea, but let me explain why I think you should try to keep the number of feature toggles to a minimum.
This post is not anti-feature-toggles. It's about being aware that feature toggles aren't the silver bullet that people want them to be. While they can be extremely useful, they can also be used to amplify bad behavior in a broken system.
When you work with feature toggles, you'll need to be aware that:
- feature toggles introduce another layer of complexity into your system,
- they can hide that more work is in progress than you think, and
- you should favor incremental changes over big features.
Have you heard about cyclomatic complexity (also known as McCabe complexity)? In a nutshell, this metric counts the linearly independent paths through your code. For instance, if you have one if-statement, then there are two possible paths.
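As a small illustration, the function below contains exactly one if-statement and therefore has two paths. The `shipping_cost` function and its numbers are made up for this sketch.

```python
def shipping_cost(order_total: float) -> float:
    # One decision point, so the cyclomatic complexity is 1 + 1 = 2.
    if order_total >= 50:
        return 0.0  # path 1: free shipping
    return 4.99  # path 2: flat fee


# Covering both paths takes at least two test cases.
print(shipping_cost(60))
print(shipping_cost(10))
```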
I think the most basic lecture I give to less experienced engineers is how to reduce the number of if-statements in code, because the more conditionals there are, the harder it is to reason about the code.
Now, if you're on the lookout for this kind of complexity on a method level, then you should definitely be on the lookout on the feature level. Each feature you're toggling adds a condition to your application: one path where the feature is active and another one where it isn't.
What this means is that you'll need to add integration tests for all possible scenarios where different features overlap. This can add significant effort to your development process. It can also increase the chance of defects that are harder to trace because your production system might have many different feature configurations.
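To get a feeling for how quickly this grows, here is a rough sketch that enumerates every configuration for a handful of toggles. It assumes that each toggle is an independent on/off switch, and the toggle names are hypothetical.

```python
from itertools import product


def all_configurations(toggles: list) -> list:
    """Return every on/off combination of the given toggle names."""
    return [
        dict(zip(toggles, states))
        for states in product([False, True], repeat=len(toggles))
    ]


# Hypothetical toggle names, only for illustration.
toggles = ["new-checkout", "dark-mode", "bulk-export"]
configurations = all_configurations(toggles)

# Each additional toggle doubles the number of configurations: 2 ** 3 == 8.
print(len(configurations))
```

With ten such toggles you'd already be looking at 1,024 possible configurations. In practice, not every combination overlaps in a meaningful way, but somebody has to decide which ones do.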
When you're not releasing a feature for everyone, this usually means that it is not completely done. Maybe you want to evaluate whether a certain hypothesis holds, or you need to check whether the performance of the feature is at a level that you're confident with.
Unfortunately, toggled features do not take up space in most development boards I've seen teams use. That means that there is no WIP limit on non-GA features. Oof. If that's the case, then we might have removed bottlenecks from our development process, but we've added a monster of a bottleneck at the end. This is problematic because we're doing a lot of work and not all users can see it. If not all users can use it, you're going to miss edge cases. When you're missing edge cases, you create defects.
Long story short: a feature that is toggled cannot be considered done. When a feature is not done this should prevent the team from starting something else. If this isn't the case you have a problem.
Before you add a feature toggle so that you can hide a new, big feature behind it, consider the following:
You probably should not use feature toggles to hide the fact that you're doing big bang releases. Yes, you're already integrating the feature code back into your codebase. But if no user can use it and, therefore, nobody runs the code, then how much integration is really happening?
Use a feature toggle when you're unsure whether to add the feature at all. Then build a minimal version and test it with a dedicated user group. Based on that result you either throw it away or continue (in which case you can probably already remove the initial toggle). If you're simply hiding regular development behind a toggle then I'd say you're doing it wrong.
Is the feature disruptive? Are you, for instance, altering a core user flow inside your application? Then that's a good reason to not just roll it out to everybody but to do a small test run (see the section above).
However, if you're adding a new feature which does not impact the general user flow, then why are you hiding it? You might prevent your users from being more productive faster. If there's nothing to test, then don't hide it.
Are you not sure whether the feature makes sense at all? Then why are you already writing code for it? That's a pretty expensive experiment you're running! Paper prototypes or tools like Figma for high-fidelity mock-ups might be a better alternative.
Oh boy. I guess this combines both previous sections. Where should I start?
🚨 Feature toggles will not fix a broken development process! 🚨
That felt good... If you have a problem with too many bugs in production then I'd recommend doing one or more of the following:
- define a zero-bug policy,
- do TDD (not preaching, read the article 😉),
- make sure that continuous improvement is part of your process,
- fix problems instead of symptoms,
- integrate as early as possible
I'm running out of posts to link here, but that list is not exhaustive. The key takeaway is that holding back features (either by not having the code online or by hiding it behind a toggle) will not solve your bug problem.
What do you think? Am I crazy? Have you seen feature toggles being used for the wrong reasons? I'd love to hear your stories! Reach out to @philgiese on Twitter!