Feature Flags and Analytics: Putting your best features forward.

#devops #testing #monitoring #analytics

Feature flags give you the flexibility to decouple releases from deployments. They give you the ability to toggle features on and off based on any number of criteria. But the question often remains, how do you know when it’s time to use them? How do you know when it’s safe to roll a feature out to the next target segment or all users? How do you know that a feature is misbehaving and it’s time to disable it? Application monitoring and analytics can help answer these questions and greatly increase the efficacy of your feature flag strategy. Let’s explore a few common feature flag use cases and how analytics can help developers make fast and informed decisions.

Early testing. One of the best uses of feature flags is to test features early in the development process. Effective testing against real user data in actual live end-user environments is critical. All the synthetic tools in the world can’t replace the real thing. When a feature is ready for limited availability or preview (or even Beta), feature flags can be used to make that feature available in production to a targeted subset of users. Often companies choose to segment first by internal employees and QA. The next subset would likely be those who have opted in as Beta users. From there you would advance to a percentage rollout (5%, 10%, 50%, then 100%, for example), optionally restricted to a group defined by criteria such as new users from North America who use a specific feature in your application.
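To make the staged targeting concrete, here is a minimal sketch in Python. It is not the API of any particular feature flag product: the `User` fields, the stage names and the `is_enabled` helper are all assumptions made for illustration. Hashing the flag name together with the user ID gives each user a stable bucket, so someone who is in the 5% group stays in as the percentage grows.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# Hypothetical stage names, in the order described above.
STAGES = ("internal", "beta", "percentage")


@dataclass
class User:
    user_id: str
    is_employee: bool = False
    is_beta: bool = False
    region: str = ""


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in 0..99 for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user: User, stage: str,
               percent: int = 0, region: Optional[str] = None) -> bool:
    """Decide whether a user sees the feature at the given rollout stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if user.is_employee:
        return True   # employees and QA see the feature at every stage
    if stage == "internal":
        return False
    if user.is_beta:
        return True   # opted-in Beta users see it from the beta stage onward
    if stage == "beta":
        return False
    # Percentage stage: optional region criterion, then the stable bucket.
    if region is not None and user.region != region:
        return False
    return bucket(flag_name, user.user_id) < percent
```

A call such as `is_enabled("product-tour", user, "percentage", percent=10, region="NA")` would then expose the feature to roughly one in ten North American users, plus everyone admitted in the earlier stages.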

Late-breaking integrations. We’ve all been there. A feature makes it into a release by the skin of its teeth. Maybe you didn’t get the chance to test it as rigorously as you’d hoped before the release dropped. Some find the idea of testing in production exciting, but if that’s not you, never fear. You can wrap that late-breaking feature in a feature flag. If you discover buggy behavior, simply use the kill switch to turn the feature off or pause it. This saves you from having to roll back an entire release because of one feature you knew might be shaky.
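Wrapping a feature in a flag can be as small as the sketch below. `FlagStore` is a hypothetical in-memory stand-in for whatever flag service you use, and `run_guarded` is an illustrative helper name, not a real library call. The point is that the old code path stays in the build, and that an unknown flag defaults to off.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class FlagStore:
    """In-memory stand-in for a flag service; every flag defaults to off."""

    def __init__(self) -> None:
        self._flags: Dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_on(self, name: str) -> bool:
        return self._flags.get(name, False)


def run_guarded(flags: FlagStore, name: str,
                new_path: Callable[[], T], old_path: Callable[[], T]) -> T:
    """Run the new code path only while its flag is on; otherwise use the old one."""
    if flags.is_on(name):
        return new_path()
    return old_path()
```

Flipping the flag with `flags.set("late-feature", False)` acts as the kill switch: the next call to `run_guarded` takes the old path again, with no redeploy.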

The analytics edge. For both these use cases you can use analytics to understand the user experience. Is the feature performing like you expect? When traffic starts hitting it, is the feature responsive? Does the feature improve or hurt my KPIs? Understanding these facets of user experience will help shape the way you iterate on features before deploying to the larger audience at GA. A good APM solution can give you this insight.

Canary deployment. In a canary deployment scenario, a feature is released to a small targeted subset of users to see if it’s ready for prime time. The share of users on the canary is gradually increased, with testing at each step along the way. If something goes awry, simply reroute the traffic back to the other version. Once all the users have been migrated, the canary deployment becomes the current software version. CloudBees Rollout supports this with gradual rollout.

The analytics edge. How do you know if something has gone awry? You want to know what’s going wrong and how systemic it is before pushing everyone to this release. Moreover, you really need to understand root cause. You can use APM and log analytics to help you understand metrics such as crash rates, error rates and latency. A good log analytics solution will be able to correlate metrics such as these to uncover root cause.
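One way to turn those metrics into a decision is a simple gate that compares the canary against the current version. The sketch below is an assumption-laden illustration: the `Sample` shape, the thresholds and the three outcomes ("hold", "rollback", "advance") are made up for the example, and in practice the numbers would come from your APM or log analytics tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Metrics for one version over the same time window (hypothetical shape)."""
    requests: int
    errors: int
    latencies_ms: List[float]


def error_rate(s: Sample) -> float:
    return s.errors / s.requests if s.requests else 0.0


def p95(latencies: List[float]) -> float:
    """95th percentile using the nearest-rank method; 0.0 for an empty list."""
    if not latencies:
        return 0.0
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def canary_decision(baseline: Sample, canary: Sample,
                    max_error_ratio: float = 1.5,
                    max_latency_ratio: float = 1.2,
                    error_floor: float = 0.001,
                    min_requests: int = 100) -> str:
    """Return 'hold', 'rollback' or 'advance' for the next rollout step."""
    if canary.requests < min_requests:
        return "hold"   # too little canary traffic to judge either way
    # Allow a small absolute floor so a zero-error baseline is not unbeatable.
    allowed_errors = max(error_rate(baseline) * max_error_ratio, error_floor)
    if error_rate(canary) > allowed_errors:
        return "rollback"
    if p95(canary.latencies_ms) > p95(baseline.latencies_ms) * max_latency_ratio:
        return "rollback"
    return "advance"
```

A gate like this only tells you that something has gone wrong. Finding the root cause still means going back to the correlated logs and traces.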

A/B testing. Similar to a canary deployment, A/B testing begins by testing a feature with a subset of users and, based on performance, rolls it out to the rest of the population. However, a canary release is more about feature stability and readiness, while A/B testing tests a hypothesis about user behavior. For instance, I may have a hypothesis that placing a product tour in a trial experience before a user connects their application will result in more conversions to a paying customer than if I were to place the same product tour after the user connects their application. This is an experience I can test with a feature flag. The experience that has a causal relationship with the outcome I desire is what gets rolled out en masse.

The analytics edge. Analytics for A/B testing can help you identify the causal relationship you are looking for. This is an especially exciting topic when you consider the application of A/B testing with behavior-based cohorting. One of our favorite tools, Amplitude, can create user segments by watching how users interact with your product rather than by title, industry, etc. It allows us to test groups by how they behave, not who we think they are.
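For the product tour example, one common way to check whether a difference in conversions is more than noise is a two-proportion z-test. The sketch below uses only the Python standard library. The function name and the sample counts are made up for illustration, and the test only supports a causal reading if users were assigned to the two variants at random.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for conversion rate B versus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0   # no variation at all, so nothing to distinguish
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative numbers only: tour after connecting (A) vs. before connecting (B).
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
ship_variant_b = z > 0 and p < 0.05
```

The 0.05 cutoff is a convention rather than a rule. Decide on the threshold and the sample size before the experiment starts, not after looking at the results.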

Analytics is an important part of a good feature flag management strategy. Whether you are integrating APM solutions, log analytics or business intelligence software with your feature flags, you are light-years ahead of those still using feature flags only as the “Oh sh*t! Turn it off! Turn it off!” button.

Top comments (2)

Mike Mahon • May 21 '20

let me get this straight, I can define criteria for feature acceptance that can gradually rollout exposure of my API's functionality to a percentage of my traffic's response based on individual interaction with the feature and +/- to my criteria? This is suddenly getting inception-y for me ... that's a good case for cost savings and feature justification too. Save costs killing off less traveled routes or identify & improve site traffic flow. Lots of potential there to both improve & explore uxui and save potentially on ops cost.