Product experimentation is becoming increasingly relevant in today’s data-driven world. Teams continuously experiment with different versions of their features and products to better understand the user experience and to improve it. This can translate to more items in a customer’s cart, more time spent on a page, and ultimately more revenue for your company. Gathering data on how a new or existing feature impacts your KPIs will help you achieve your goals and make sure you release the best possible version to your users.
But experimentation can be a scary word. It sounds like something that will be costly to implement and difficult to do well. Whether that’s true or not (we built our experimentation platform to be easy to use and to drive trustworthy results, so you never have to worry about cost and quality), how do you know if implementing experimentation is right for your organization, right now? Luckily, there are signs you can watch out for that will help you make this decision. Let’s go through them.
Using data to drive the decisions you make in your product is the foundation of experimentation. You start with a hypothesis, based on prior experiments or releases, and look to validate it. That hypothesis draws on feature requests, discussions with the team, features in your backlog, and your own intuition, knowledge, and experience. You use this information to build a theory of how a feature will perform and impact your metrics, and then you decide whether or not to release the feature based on the data you collect.
The advantage of this approach is that you can deliver much smaller iterations and learn from the incoming data. Think of the painted door concept. Let’s say you have two doors: one yellow and one blue. The yellow door represents an existing feature that is already in production. The blue door is just a placeholder for a feature that you are thinking about building. Instead of building a complete feature from end to end, you build only the suggestion of such a feature, the blue door, and measure how many people try to use it and interact with it. This will give you an idea of whether it’s worth the effort to actually build out the feature.
For example, let’s say you are the product manager of an e-commerce site. Currently in production, you have a “Contact Us” form that prompts users to enter their info and submit it. You want to add a livechat feature to be able to help users in real time. Instead of searching for a third-party livechat tool, having your developers build it, and going through a whole release process, you can create a button at the bottom of the page that says “Live Chat” or “Chat with Us” and see how many people click on it. In reality, it would initially just take them to the contact page, but it would give you a better idea of whether people are actually interested in using livechat.
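To make “see how many people click on it” concrete, here is a minimal sketch of how you might summarize a painted door test. The visitor and click counts are made-up numbers, and `wilson_interval` is a hypothetical helper written for this illustration, not part of any particular platform. It reports the click-through rate along with an approximate 95% range, so you don’t read too much into a small sample.

```python
import math


def wilson_interval(clicks: int, impressions: int, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate 95% Wilson score interval for a click-through rate."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    if not 0 <= clicks <= impressions:
        raise ValueError("clicks must be between 0 and impressions")

    rate = clicks / impressions
    denom = 1 + z ** 2 / impressions
    centre = (rate + z ** 2 / (2 * impressions)) / denom
    half_width = (
        z
        * math.sqrt(rate * (1 - rate) / impressions + z ** 2 / (4 * impressions ** 2))
        / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical numbers: 2,000 visitors saw the "Live Chat" button and 130 clicked it.
clicks, impressions = 130, 2000
low, high = wilson_interval(clicks, impressions)
print(f"Click-through rate: {clicks / impressions:.1%} (95% interval {low:.1%} to {high:.1%})")
```

If even the low end of that range clears whatever bar you set for interest, that is a reasonable signal the real feature is worth building.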
In a do no harm test, you look at the experiment results to better understand what is happening with a feature. While experimentation typically involves deciding whether or not to roll out a feature to your entire userbase, do no harm tests allow you to look at the results and simply stay informed about the impact a feature is having. Your reason for measuring a release in this way may not be because you have a hypothesis to prove. You are simply trying to understand the impact of a release and are planning to roll it out regardless of the results. These results are not intended to drive whether or not you fully roll a feature out.
There are two advantages to this approach. The first is that you can leverage guardrails. You check whether harm was done (whether there was a degradation in your metrics), find where the issues occurred, and identify the aspects of the implementation that are causing problems and need to be addressed.
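A guardrail check can be as simple as comparing each metric for users who received the release against users who did not. The sketch below is only an illustration under a few assumptions: the metric names and values are invented, every metric is treated as “higher is better,” and the 2% tolerance is an arbitrary choice. A real platform would also account for statistical noise.

```python
def guardrail_violations(
    control: dict[str, float],
    treatment: dict[str, float],
    max_relative_drop: float = 0.02,
) -> dict[str, float]:
    """Return metrics whose relative drop versus control exceeds the tolerance.

    Assumes higher values are better for every metric passed in.
    """
    violations = {}
    for metric, baseline in control.items():
        if metric not in treatment or baseline == 0:
            continue  # nothing to compare against
        relative_change = (treatment[metric] - baseline) / baseline
        if relative_change < -max_relative_drop:
            violations[metric] = relative_change
    return violations


# Hypothetical metric values for users without and with the new release.
control = {"conversion_rate": 0.040, "avg_session_minutes": 6.0}
treatment = {"conversion_rate": 0.036, "avg_session_minutes": 6.1}

for metric, change in guardrail_violations(control, treatment).items():
    print(f"Guardrail tripped: {metric} changed by {change:.1%}")
```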
The second advantage is that you are able to quantify the actual impact of your release, both to look good for your bosses (driving metrics, showing increased traffic through KPIs, etc.) and to shape where the product goes next. This is helpful because the next time you want to steer things in a particular direction, you can decide to do more in that area. For example, if you monitor your metrics in a do no harm experiment and learn that a particular feature is doing great things for retention, then the next time you want to do something around retention, you can use what you learned here as a starting point. Even though the results didn’t drive your release decision, they help you make more informed plans in the future.
With experimentation, you are able to make informed decisions by exposing a new feature to only a small portion of your userbase, and by doing so, you are able to release faster and continuously iterate on your releases. You feel comfortable with failure because you treat it as a learning experience. This gives you a safety net because you know what’s happening with your releases and use that to drive your release decisions. If you measure each release, you know with confidence when a release is having an issue and, consequently, can ramp down that release without guessing where the issue occurred.
Let’s take an example of a percentage rollout to 5% of your users. Chances are, you won’t see a large increase in revenue from that small percentage, but by comparing the metrics of users who received the release with those of users who did not, you are able to see the differences much more clearly and identify any issues without having to expose your whole userbase. Then, when the time comes to ramp up the percentage, you will do so much more confidently.
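One common way to implement a percentage rollout is to hash each user into a stable bucket, so the same user always gets the same experience and ramping up only ever adds users. The sketch below assumes that approach; the function names, the flag name `live_chat`, and the user IDs are hypothetical, and a feature flag platform would normally handle this for you.

```python
import hashlib


def bucket(user_id: str, flag_name: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def in_rollout(user_id: str, flag_name: str, percentage: int) -> bool:
    """Return True if the user falls inside the rollout percentage (0 to 100)."""
    return bucket(user_id, flag_name) < percentage


# Hypothetical user IDs; roughly 5% of them should land in the rollout.
users = [f"user-{i}" for i in range(10_000)]
exposed = [u for u in users if in_rollout(u, "live_chat", 5)]
print(f"{len(exposed)} of {len(users)} users see the new feature")
```

Because a user’s bucket never changes, everyone in the 5% group stays in the rollout when you ramp up to 20% or 50%, which keeps your measurements consistent.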
If we look back at the painted door concept, you can make much smaller changes and validate whether something is worth building without having to invest significant time in it. You can commit just hours of engineering work rather than spending months planning it all out without knowing if it will even drive interest among your users. When you use a kill switch with this approach, you become more comfortable moving faster because you don’t need a code deploy to revert a change in production – you can turn anything on or off with the click of a button in case something goes wrong.
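Here is a rough sketch of why a kill switch doesn’t need a deploy: the code checks a flag at runtime, so flipping the flag changes behavior immediately. `FlagStore` is a hypothetical in-memory stand-in for a feature flag service, and the flag name is invented for this example.

```python
class FlagStore:
    """Hypothetical in-memory stand-in for a remote feature flag service."""

    def __init__(self) -> None:
        self._flags: dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str, default: bool = False) -> bool:
        # Unknown flags fall back to the safe default (off).
        return self._flags.get(name, default)


def contact_options(flags: FlagStore) -> list[str]:
    """Decide at request time which contact options to render on the page."""
    options = ["contact_form"]
    if flags.is_enabled("live_chat_button"):
        options.append("live_chat_button")
    return options


flags = FlagStore()
flags.set("live_chat_button", True)
print(contact_options(flags))  # the button is shown alongside the form

# Something goes wrong: flip the kill switch, with no code change or deploy.
flags.set("live_chat_button", False)
print(contact_options(flags))  # back to the contact form only
```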
The benefits of feature flags are clear: you slowly ramp releases, release safely, and choose when to make releases happen. If you’re using feature flags, you are already making decisions based on how your feature performed in the early release stages. You target a specific subset of users (generally your internal testing team), test with them, and make decisions based on that. But if you’re doing that work by hand to see whether a release is going well for a subset of your population, it takes a lot of effort to determine if an issue is tied to a particular release. Why put in that unnecessary effort?
You should use an experimentation platform to combine the experimentation data with the feature release data itself so that you get an easily digestible dashboard. Here, you’ll be able to use statistics to understand your release in an informed way. It’s much easier, and the results can be tied directly to whether you decide to kill or ramp up your feature in the future.
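To give a flavor of the statistics involved, the sketch below runs a standard two-proportion z-test on conversion counts for users with the feature off and on. This is one textbook approach chosen for illustration, not a description of how any specific platform computes its results, and the counts are invented.

```python
import math


def two_proportion_z_test(
    conversions_a: int, users_a: int, conversions_b: int, users_b: int
) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_error == 0:
        raise ValueError("cannot test when the pooled rate is 0 or 1")
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: feature off (A) versus feature on (B).
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be random noise, which is the kind of evidence you want before deciding to ramp up or kill a feature.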
At the end of the day, you can never completely guess exactly what a user will do in your application. Collecting data and making data-driven decisions will make sure you provide the best possible user experience. When you use feature flags, you are already measuring releases and you’re already doing experimentation. Don’t make it harder for yourself!
Here at Split, feature flags and experimentation go hand-in-hand! These powerful tools are the future of software development, and our platform empowers organizations to drive clear business impact every day. Ready to learn more? Check out these resources:
- The Benefits of Feature Flags in Software Development
- Understanding Experimentation Platforms E-Book
- Pros and Cons of Canary Release and Feature Flags in Continuous Delivery