Out-of-the-Box Experimentation on High-Volume Data with Amazon S3 and Split

#features #datapipeline #datadriven #experimentation

At Split, we know feature flags and data are key to serving the evolving needs of today’s consumers. Companies must deliver new features more safely and quickly while also using all their data to measure whether each feature is actually meeting customer needs. However, when data comes in from every corner of your business, combining it all to extract customer insights becomes exceedingly difficult. Unifying feature flags and data is core to how Split helps our customers succeed as software companies. Our platform plugs into your existing data pipelines, making it easy to bring all your data together, no matter where it lives.

Split + Amazon S3

Split’s new data integration with Amazon S3 is the first of its kind, allowing companies to bring high-volume customer data directly from their S3 bucket into a feature delivery platform. Once configured, Split can reliably and safely ingest millions of customer events per minute as Parquet files from S3. This saves engineers the trouble of managing large data transfers, which often involves manually repackaging data into smaller chunks to facilitate ingestion via an API. With a backup copy of every data file in S3, engineers can also avoid losing data in transit and easily reconcile what data Split has received if a pipeline breaks.

Upon its arrival from S3, your customer data (such as engagement, behavioral, and transactional data) is automatically combined with feature flag data using our patented attribution logic to measure the impact of every new feature on all your metrics. Whether you’re tracking adoption metrics to determine the usability of a new feature or conversion metrics to compare outcomes between feature flag treatments in an experiment, Split will calculate statistically rigorous results your team can confidently take action on.
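To make the idea of combining customer events with feature flag data concrete, here is a minimal sketch in Python. It is not Split’s patented attribution logic, which is not described in this post. It assumes a simple rule: each event is credited to the treatment the same user most recently saw at or before the event’s timestamp. The function name, the tuple layouts, and the sample data are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def attribute_events(impressions, events):
    """Credit each event to the treatment its user most recently saw.

    Hypothetical input shapes (assumptions, not a Split schema):
      impressions: iterable of (user_key, timestamp, treatment)
      events:      iterable of (user_key, timestamp, value)
    Returns {treatment: [values, ...]}. Events from users with no
    impression at or before the event time are skipped.
    """
    # Group impressions by user.
    by_user = defaultdict(list)
    for key, ts, treatment in impressions:
        by_user[key].append((ts, treatment))

    # Per user, keep parallel lists of sorted timestamps and treatments.
    timeline = {}
    for key, seen in by_user.items():
        seen.sort(key=lambda pair: pair[0])
        timeline[key] = ([ts for ts, _ in seen], [t for _, t in seen])

    attributed = defaultdict(list)
    for key, ts, value in events:
        if key not in timeline:
            continue
        times, treatments = timeline[key]
        # Index of the last impression at or before the event time.
        idx = bisect_right(times, ts) - 1
        if idx < 0:
            continue  # the event happened before the user saw any treatment
        attributed[treatments[idx]].append(value)
    return dict(attributed)


impressions = [("u1", 100, "on"), ("u2", 100, "off"), ("u1", 300, "off")]
events = [
    ("u1", 150, 20.0),  # after u1 saw "on"
    ("u2", 200, 5.0),   # after u2 saw "off"
    ("u1", 350, 7.0),   # after u1 switched to "off"
    ("u3", 120, 9.0),   # u3 never saw the flag
    ("u2", 50, 1.0),    # before u2's first impression
]
result = attribute_events(impressions, events)
print(result)  # {'on': [20.0], 'off': [5.0, 7.0]}
```

A real attribution system also has to handle concerns this sketch ignores, such as attribution windows and users who move between treatments.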

Attribution Logic with Split

Our attribution logic can even consume pre-existing customer data from S3, enabling out-of-the-box experimentation for feature flag users just getting started with experimentation. Using historical data, Split can retroactively calculate metrics on feature flag treatments. This also means you can ask new questions of running or concluded experiments without reconciling data, rebuilding analytical models, or losing time restarting the experiment. Whether you forgot to add a metric before or decided to add one later, Split can ingest the underlying data from S3 and generate statistically rigorous results in minutes.
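As an illustration of what calculating a metric retroactively can involve, the sketch below compares one metric between two treatments once historical values have been grouped by treatment. It uses Welch’s t statistic with a normal approximation for the p-value, which is a textbook method chosen for this example and not necessarily what Split’s statistics engine does. The function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_treatments(control, treatment):
    """Compare a metric between two treatments.

    Returns the difference in means, Welch's t statistic, and a two-sided
    p-value from a normal approximation. The approximation is only
    reasonable for large samples; each group needs at least two values
    and the groups must not both have zero variance.
    """
    mean_c, mean_t = mean(control), mean(treatment)
    # Standard error of the difference in means (unequal variances allowed).
    std_err = sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    t_stat = (mean_t - mean_c) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {"lift": mean_t - mean_c, "t": t_stat, "p": p_value}


# Hypothetical historical metric values, already grouped by treatment.
summary = compare_treatments(
    control=[1.0, 2.0, 3.0, 4.0, 5.0],
    treatment=[2.0, 3.0, 4.0, 5.0, 6.0],
)
print(summary)
```

Because the inputs are just stored values grouped by treatment, the same comparison can be rerun for a metric that was defined after the experiment started, provided the underlying events were retained.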

As you review your experiment results in Split, your next question will inevitably be why specific metrics changed due to a given feature flag treatment. To help answer this question, Split will also send feature flag data to S3 early next year.* Once in S3, feature flag data can be combined with all other relevant data for deeper analyses in your preferred data warehouse or BI tool. By experimenting with new features in Split and validating results in a customer or product analytics platform, your product and engineering teams can align on where features can improve, what to build next, and ultimately how to drive business goals with every release.

*To participate in the beta for our outbound S3 integration, please contact us at hello@split.io.

Learn More About Feature Flags and Your Data

As always, if you’re looking for more great content like this, we’d love to have you follow us on Twitter @splitsoftware and subscribe to our YouTube channel.
