Caito_200_OK

Data Driven Development for Complex Systems, Part 1

Part 1 of 4: Introduction + Foundational Concepts

For most of us, if something is generic and common sense, we’re less likely to implement it in our lives with much intention. This could be a vague, personal health habit we’re trying to adopt like going on more walks, or it could be certain best practices in our software development process.

We all know it’s a lot easier to start going on those walks if your life isn’t too hectic. The more complex your life gets, particularly if that complexity involves integrating with other people’s schedules, the more that daily walk may need to be formalized. Just as you eventually need to start blocking off “walk” on your calendar, the same goes for your software: the more complex your integration points with other people or teams, the more structured your development process needs to be.

You’re now probably thinking “yes this is very obvious, so why would you write a whole blog series on this??”

… which brings us to…

This series is the result of the combined experience, my own and that of others, that I’ve been collecting over the last several years.

This experience comes from the successes and sometimes hilariously bad failures in our relationships with data- and metrics-driven development.

These posts, and the examples in them, revolve specifically around advanced stream processing applications. However, the concepts and best practices that I will cover also apply to other software development scenarios, whether they also have highly complex integration points or are much simpler.

So, this series is for you if you want:

  • A review of some of the best data- and metrics-driven development tactics for stream processing, and how to efficiently leverage them.
  • To use these tactics to streamline and automate a lot of the non-coding overhead that tends to come with these sorts of development scenarios.
  • Some quantitative returns on investment to show your boss if you need to convince them to let your team implement these practices.

Firstly, data- and metrics-driven development can mean a lot of different things to different people, so I’ll start with some of the foundational concepts and terms that I’ll be using. Next, I’ll cover the two specific principles I’ll be focusing on for stream processing.

Foundational Concepts + Terms

If you search for data driven development, you’ll mostly find information about data-driven design and business metrics. There’s a big overlap between these two, as essentially the main focus here is to use data to drive the design of a product that customers actually want to buy and use.

It’s important to note the differences though, because they change how you use your metrics and who uses them internally.

  • Business metrics refer to anything related to internally measured success (like team velocity, sprint burndown, lead time, etc.). But they can also include external information, like what many people refer to as “vanity metrics.” Vanity metrics are often things like “our application is processing 3 million messages a day” and are typically intended for press releases and blog posts.
  • Data driven design relates specifically to the end user experience. As the name implies, the main focus here is to use data — typically data involving usage patterns and user feedback — in order to better understand, and design for the needs of that user.
  • Data driven development is often lumped together with one of these, or used as an umbrella term. I do think it can still be appropriate as an umbrella term, but my focus in this series is where it represents observability data about your system that is used to alter the development process or roadmap.

Data-Driven, Data-Informed, Data-Aware

Secondly, there is a set of principles that come more directly from data-driven design, but are applicable to data-driven development as well, and will come up later in this series.

This is the idea of a three-tiered approach that defines the types of tactics and metrics you use, and how they guide your product. These approaches are categorized by their relationship with quantitative or qualitative data.

  • Data-driven design: outside of its use as an umbrella term, this is a specific approach centered purely around quantitative data. This process is applicable to aspects of the software design like performance improvements.
  • Data-informed design: weighs and utilizes qualitative and quantitative data equally, and usually together, to make a design decision.
  • Data-aware design: typically focused more directly on qualitative information. However, quantitative analysis is still often used as a base, or paired with customer use stories, detailed feedback, or discussion-based risk/reward analysis. This category is also often thought of as the one that takes the larger picture into account, which is why diagrams of these three approaches are often drawn as a circle.

Next Level Principles

There are three data-driven principles that will be at the core of this series. The first of these is a generic concept that I’ll be reviewing in the context of how it can best be leveraged for these use cases. The other two principles are a fusion and restructuring of other concepts and ideas that exist within the wider data-driven development/design group. One of the big goals with each of these tactics is leveraging them as sustainable, low-effort cycles.

  • Knowing your normal: this is one of those practices that in my experience has a high, measurable return on investment but is frequently neglected. This is essentially the practice of keeping your metrics meaningful, iterative, and accessible enough that abnormalities are as clear to your developers as they are to your alerting system. I won’t have a whole post on this, but it will be an important aspect of the following articles.
  • Metrics-driven-metrics cycle: I coined this term as a joke but it stuck, as it’s really the most accurate description of this concept. This refers to the practice of continuously using current observations about your application’s behavior to improve how you observe your application, and consequently creating a more sustainable relationship between these metrics and your development priorities.
  • Metrics as a shared language: this refers to setting up a particular process around your metrics cycle. This process is aimed at leveraging these modular, meaningful metrics in a way that can help streamline many of the people-related challenges inherent in systems that have complex integration points and dependencies.
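
To make “knowing your normal” a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the approach used later in this series: the metric (messages processed per minute), the window size, the threshold, and the function name `find_abnormal` are all assumptions made for the example. The idea is simply that “normal” is derived from recent observations, and a new sample is flagged when it falls far outside that range.

```python
from statistics import mean, stdev


def find_abnormal(samples, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate from the trailing baseline.

    samples:   per-minute message counts (a hypothetical metric).
    window:    number of preceding samples that define "normal".
    threshold: how many standard deviations count as abnormal.
    """
    abnormal = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is a deviation.
            if samples[i] != mu:
                abnormal.append((i, samples[i]))
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            abnormal.append((i, samples[i]))
    return abnormal


# Made-up throughput readings with one obvious dip at index 7.
throughput = [2080, 2100, 2090, 2110, 2095, 2105, 2085, 900, 2100, 2090]
print(find_abnormal(throughput))  # [(7, 900)]
```

A real system would likely need something more robust (for example, accounting for daily traffic patterns, or keeping a past outlier from skewing the baseline), but even a simple baseline like this makes “abnormal” something both a developer and an alerting rule can point to.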

The following posts will cover the “Metrics Driven Metrics” cycle and “Metrics as a Shared Language” (respectively) in more depth with real world examples. The last post will be a more technical dive into the examples.

Next in this series

  • Part 2: The Metrics-Driven-Metrics Cycle (arriving on March 11, 2021)
  • Part 3: Metrics as a Shared Language (arriving on March 18, 2021)
  • Part 4: Hands-On Monitoring for Stream Processing (arriving in April 2021)

Find me

Twitter: @caito_200_ok
Web: http://caito-200-ok.com/
