
Team RudderStack for RudderStack

Originally published at rudderstack.com

Why Your Data Warehouse Should Be the Foundation of Your CDP

The Customer Data Platform has reached a tipping point fueled by increasing demand for useful customer data across the stack and the rise of major innovations in data tooling to meet that demand. The traditional CDP was primarily built for marketing activation use cases and is technologically incapable of integrating across the modern data stack. No longer can these systems deliver on promises like the single customer view or real-time activation of customer data.

But we're in a new era of customer data management ushered in by the modern data stack. In this era, the warehouse is at the center, data teams build and manage the data layer on their own infrastructure, and valuable data is made available across the stack, no matter which tools downstream teams are using (including marketing CDPs).

A brief history: defining the customer data platform

In 2013, David Raab, Founder of the CDP Institute, recognized confusion building around an emerging technology - tools that promised the coveted single customer view. These tools built customer profiles by stitching data from various sources together and enabled predictive modeling on the resulting dataset. They fueled marketing with comprehensive customer information faster than ever before, but there was a lot of variance in the features of each tool, and no one really knew what to call them. Excitement around the new technology was high, but clarity was low.

So, David decided to put a stake in the ground. He published a blog post recognizing the new category and eventually launched the CDP Institute. The CDP Institute created a definition for the CDP based upon a set of common consumer expectations driven, notably, by marketing use cases. In his original blog post, David hit the nail on the head, noting that "'customer' shows the scope extends to all customer-related functions, not just marketing". But because marketing is the tip of the spear when it comes to customer data, and because the 2013-2020 period was defined by the explosion of the "mar-tech" landscape, CDPs have always been inextricably linked to marketing use cases.

Today, the CDP Institute defines the CDP as "packaged software that creates a persistent, unified customer database that is accessible to other systems."

The modern, warehouse-first data stack delivers all the value suggested by this definition but differs on a key point: the location and ownership of the "persistent customer database."

The original intent, but a new approach

The ultimate goal of the traditional CDP was to provide a function-agnostic platform that created value across the organization. But these CDPs failed to live up to that promise because they didn't really unify data. Instead, they created a data silo, a problem that became increasingly painful as the complexity of the data stack increased. They also fell short when it came to sharing customer profiles "with any system that need[ed] it". In other words, most traditional CDPs, while good at marketing messaging, are closed systems that confine the value of data rather than sharing it across the stack.

That's not necessarily the fault of the CDPs. The market wanted to drive customer engagement for marketing use cases, so the CDPs focused on building for that demand instead of making data integration a first-class citizen.

So, while many CDPs are great for engagement, the need for centralization and integration has become increasingly acute. This means data teams must explore new architectures to liberate data and provide integration flexibility for constantly changing toolsets.

Fortunately, the tool of choice to enable that flexibility already exists: the cloud data warehouse. It makes total sense. The warehouse has the most complete picture of data, and the tooling around it enables fully customizable data flows.

We spoke with David Raab as we did our research for this post, and he made a salient point: "modern warehouses, such as Snowflake, use more flexible data stores and can do more things, including much of what would typically be done in a CDP."

Warehouse technology advancements mean the limitations of the traditional CDP can be overcome, but this requires a different approach: building the entire customer data stack around the warehouse, not a third-party marketing CDP. And that's our mission at RudderStack: to enable data engineers to easily move data across the entire stack while maintaining and enriching a complete set of customer data in their cloud data warehouse.

With a warehouse-first architecture, you get the features required to build and activate real-time, unified customer profiles without needing to store any data with a third-party vendor or subject your stack to their technological limitations. This bring-your-own-warehouse approach allows you to build a CDP on top of infrastructure you're already familiar with and invested in. Here's why companies are building CDPs with this new architecture:

  • Flexibility & accessibility - when all of your raw customer data lives in your own data warehouse, it's easily accessible to everyone, and usage isn't subject to vendor-specific limitations
  • More complete customer data sets - because you own your warehouse, you can combine event data with internal customer data (like transactional data) that you wouldn't send to a third-party system
  • More advanced use cases - your warehouse is connected to your other customer data infrastructure that runs functions like data science, so you can enable more advanced use cases (as opposed to relying on vendor-provided models, etc.)
  • Enhanced data privacy and governance - utilizing your existing data warehouse as your customer data store means one less tool where you have to deal with security concerns
  • Cost savings - it's cheaper to store your data in the warehouse you're already paying for than it is to pay your CDP to store it for you (again)
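
As a rough illustration of the "more complete customer data sets" point, the sketch below joins behavioral events with internal order data inside a single store. It is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for a cloud data warehouse, and the `events` and `orders` tables, their columns, and the helper names are all hypothetical, not a RudderStack schema.

```python
import sqlite3


def make_demo_warehouse():
    """Create an in-memory SQLite database as a stand-in for a warehouse.

    Table and column names here are hypothetical, chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, event TEXT)")
    conn.execute("CREATE TABLE orders (user_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("u1", "page_view"), ("u1", "add_to_cart"), ("u2", "page_view")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("u1", 40.0), ("u1", 10.0)],
    )
    return conn


def build_customer_summary(conn):
    """Combine event counts with internal transactional totals per user."""
    query = """
        SELECT
            e.user_id,
            e.event_count,
            COALESCE(o.order_count, 0)   AS order_count,
            COALESCE(o.total_spend, 0.0) AS total_spend
        FROM (
            SELECT user_id, COUNT(*) AS event_count
            FROM events
            GROUP BY user_id
        ) AS e
        LEFT JOIN (
            SELECT user_id, COUNT(*) AS order_count, SUM(amount) AS total_spend
            FROM orders
            GROUP BY user_id
        ) AS o ON o.user_id = e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in build_customer_summary(make_demo_warehouse()):
        print(row)
```

Each source is aggregated before the join so that a user with several events and several orders is not double counted. The same pattern would apply in a real warehouse's SQL dialect.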

Built for engineering vs. built for marketing

While observing that modern data warehouses are capable of doing many of the things typically done in a CDP, David made another excellent point in our conversation. He said, "it's not just a matter of dumping your data into Snowflake. There's still plenty of data cleaning, transformation, and unification needed to make the data truly useful." In other words, the warehouse isn't a CDP on its own. Enter the data engineer, the key to making all the customer data useful.
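
To make the "cleaning, transformation, and unification" work a little more concrete, here is a minimal sketch of one common unification technique: stitching records that share an identifier into a single profile with a union-find structure. The record fields (`email`, `anonymous_id`) and the function names are assumptions made for illustration; real identity resolution involves many more rules and edge cases.

```python
from collections import defaultdict


def normalize_email(raw):
    """Cleaning step: trim whitespace and lowercase; None if missing or empty."""
    if raw is None:
        return None
    cleaned = raw.strip().lower()
    return cleaned or None


def unify_profiles(records):
    """Group identifiers that co-occur on any record into one profile.

    Returns a sorted list of profiles, each a sorted list of
    (identifier_type, value) tuples.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    keyed = []
    for rec in records:
        ids = []
        email = normalize_email(rec.get("email"))
        if email:
            ids.append(("email", email))
        if rec.get("anonymous_id"):
            ids.append(("anonymous_id", rec["anonymous_id"]))
        if not ids:
            continue  # nothing to stitch on
        find(ids[0])  # make sure single-identifier records are registered
        for other in ids[1:]:
            union(ids[0], other)
        keyed.append(ids)

    profiles = defaultdict(set)
    for ids in keyed:
        profiles[find(ids[0])].update(ids)
    return sorted(sorted(p) for p in profiles.values())


if __name__ == "__main__":
    sample = [
        {"source": "web", "anonymous_id": "a1"},
        {"source": "crm", "email": " Ada@Example.com "},
        {"source": "app", "anonymous_id": "a1", "email": "ada@example.com"},
        {"source": "web", "anonymous_id": "b2"},
    ]
    for profile in unify_profiles(sample):
        print(profile)
```

In the sample, the third record carries both identifiers, so it links the anonymous web visitor to the CRM contact. Without the email normalization step, the CRM record would not match.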

The marketing CDP was designed to make it easy for non-technical teams to gather and use customer data. The CDP Institute articulates this: "The packaged nature of the [CDP] makes it much easier to deploy and change as new needs arise. Corporate IT must cooperate to set up and maintain the CDP but most technical resources are usually provided by the vendor or an agency hired by marketing."

That approach made sense for marketing use cases when limited technology made moving data painful and expensive. CDPs provided value by abstracting away the technical reality of data movement and unification, which made data accessible to non-technical teams. But this approach leads to a view where engineering and IT are seen as necessary obstacles that non-technical teams must persuade to cooperate in order to get what they want.

But, in reality, engineering is best suited to own the implementation and management of the CDP. This allows them to actually supercharge the efforts of other teams because engineering:

  • Controls core technical infrastructure that's directly related to the customer experience (and subsequently, customer data)
  • Understands the technical requirements for data security and is ultimately held responsible for keeping customer data secure
  • Can centralize all customer data in a modern data warehouse or data lake
  • Can integrate every tool in the organization's data stack and enable real-time use cases

So, at RudderStack we're building a CDP for developers because our mission is to make data engineers, scientists, and developers the heroes of their companies by providing every team with rich customer data.

The future is warehouse-first and best of both worlds

With modern data warehouses and data lakes paving the way, we believe we're at the beginning of a seismic shift in customer data management. The next five years will be full of innovation in the space and the modern data warehouse will sit at the center.

In this new era, companies won't have to ditch their marketing CDP for a warehouse-first approach. The most advanced companies will leverage the warehouse-first approach to get more value out of their marketing CDP by driving it with more complete, enriched data from the warehouse.

And with IT/engineering positioned as a strategic partner, teams organization-wide will solve harder problems and unlock new opportunities. Data engineers will no longer be seen as an obstacle; they'll become the heroes of their organizations.

Try the warehouse-first approach today

Test out our event stream, ELT and reverse-ETL pipelines. Use our HTTP source to send data in less than 5 minutes, or install one of our 12 SDKs in your website or app. Get started.
