Igor Lukanin for Cube

Posted on Sep 14, 2022 • Originally published at cube.dev

What are data apps?


Previously, we’ve described the parts of headless BI, taken an in-depth look at the data modeling layer, and explored one use case for headless BI: embedded analytics. This week, let’s take a step back and look at the category of data applications.

But first…

What are data applications?

“Data apps” is an umbrella term for a category of interactive tools that use data to deliver insight or automatically take action. When we talk about data apps, we frequently cite the examples of recommendation engines, data visualization built into applications, and customized internal reporting tools for business teams.

Isn’t this just embedded analytics?

Embedded analytics takes the kind of exploration that used to happen in dashboards and legacy BI tools, and injects it directly into the applications that internal teams and external customers already use. Headless BI facilitates building embedded analytics more quickly. But embedded analytics is just the beginning.

Despite being more accessible and customized than traditional dashboards, embedded analytics is still primarily a tool for data exploration. By contrast, data applications are capable of data explanation: highlighting trends, surfacing insights, making recommendations. This type of application entails a dynamic, purpose-built user experience, and it is typically developed by software and data engineers, not business analysts.

What are some use cases of data applications?

The first type of data application is the embedded data app. Think of this as the evolution of embedded analytics: where embedded analytics offers static dashboards, embedded data features tend to be highly customized, dynamic, and purpose-built. These applications surface insight within the native user experience of another application.

A business’s internal data products and portals are a second kind of data application. Unlike traditional or embedded exploration dashboards, this type of data application is purpose-built for a specific business unit and incorporates the relevant business context. These applications’ custom interactivity allows business users to receive insights without mastering data analysts’ workflows.

The third type of data application is the end-consumer-facing application. These may be built for customers, partners, or shareholders. They are similar to internal applications, but they tend to require a finer level of design polish and customization. Additionally, this type of app must be built for higher performance, reflecting consumers’ expectations of speed.

How are data applications built?

By their nature, data apps depend on access to large quantities of data. This has been made possible by the rise of the cloud data warehouse and an ever-growing ecosystem of data ingestion, governance, transformation, and orchestration tools.

But given their complexity and power, data apps generally are built by engineering teams, and they require integration with modern engineering workflows, including version control, testing, and continuous integration and deployment practices.

Building from scratch

Embedding data app functionality into a larger application generally requires building from scratch. What is the architecture of such a solution?

Data store

Naturally, a data application starts with the data, and the basis of the modern data stack is the cloud data warehouse. This can be a general-purpose data warehouse like Snowflake or a real-time tool like Firebolt, ClickHouse, or Materialize.

Headless BI layer

A crucial component of a data app is the headless BI layer. One major piece of this is access control integrated with the warehouse’s security controls, because embedded analytics always requires multitenancy. A second piece is advanced caching: the data warehouse is a great candidate for a backend, but on its own it does not support the highly concurrent queries with sub-second latency that modern data consumers expect.
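To make the access control and caching pieces concrete, here is a minimal sketch in Python. It is not how any particular headless BI product implements them; `TenantScopedCache`, `fake_warehouse`, and the `orders` rows are all hypothetical names made up for illustration. The idea it shows is that the tenant is part of the cache key and the tenant filter is applied by the layer itself, so a cached result is never shared across tenants.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


class TenantScopedCache:
    """Caches query results per tenant (hypothetical sketch, in-memory only)."""

    def __init__(self, run_query: Callable[[str, str], List[Row]], ttl_seconds: float = 60.0):
        # run_query is an assumed warehouse call: (sql, tenant_id) -> rows
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._entries: Dict[Tuple[str, str], Tuple[float, List[Row]]] = {}
        self.warehouse_calls = 0  # counts cache misses that reached the warehouse

    def query(self, tenant_id: str, sql: str) -> List[Row]:
        key = (tenant_id, sql)  # the tenant is part of the key, so entries are never shared
        now = time.monotonic()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh entry: skip the warehouse entirely
        self.warehouse_calls += 1
        rows = self._run_query(sql, tenant_id)
        self._entries[key] = (now, rows)
        return rows


# Stand-in for a warehouse table; a real layer would send SQL to the warehouse.
ORDERS = [
    {"tenant_id": "acme", "amount": 120},
    {"tenant_id": "acme", "amount": 80},
    {"tenant_id": "globex", "amount": 50},
]


def fake_warehouse(sql: str, tenant_id: str) -> List[Row]:
    # The tenant filter is enforced here, not trusted to the caller's SQL.
    return [row for row in ORDERS if row["tenant_id"] == tenant_id]


cache = TenantScopedCache(fake_warehouse)
acme_first = cache.query("acme", "SELECT * FROM orders")
acme_again = cache.query("acme", "SELECT * FROM orders")    # served from cache
globex_rows = cache.query("globex", "SELECT * FROM orders")  # separate cache entry
```

In this sketch the second `acme` call never reaches the warehouse, while the `globex` call does, even though the SQL text is identical. A production layer would also need invalidation, pre-aggregation, and shared storage, none of which is shown here.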

The BI layer is also where data modeling is handled, to ensure that a data app’s users consume the same data definitions as users of other internal or external applications. Data modeling and metrics definition should be handled once, upstream of every application or dashboard.
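As a rough illustration of "define once, consume everywhere," the sketch below declares a single metric and compiles it into SQL for two different consumers. The `Metric` class, the `compile_query` helper, and the `orders` table with its `amount` and `country` columns are assumptions made for this example, not the data model syntax of any specific tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """A single, shared definition of a business metric (hypothetical shape)."""
    name: str
    sql_expression: str
    table: str


# Defined once; every dashboard and app refers to this object.
REVENUE = Metric(name="revenue", sql_expression="SUM(amount)", table="orders")


def compile_query(metric: Metric, group_by: Optional[str] = None) -> str:
    """Turns a metric plus an optional dimension into a SQL string."""
    select = f"{metric.sql_expression} AS {metric.name}"
    if group_by is None:
        return f"SELECT {select} FROM {metric.table}"
    return f"SELECT {group_by}, {select} FROM {metric.table} GROUP BY {group_by}"


# Two consumers, one definition: the aggregation logic cannot drift apart.
dashboard_sql = compile_query(REVENUE, group_by="country")
app_sql = compile_query(REVENUE)
```

Because both queries are generated from `REVENUE`, changing the definition (say, to exclude refunds) would change every consumer at once.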

Data is then made available via diverse APIs—e.g., SQL, GraphQL, and REST—to be consumed by…

A hybrid presentation layer

For the high degree of customization expected of an embedded data application, and when frontend teams are involved, different charting libraries can be used, ranging from D3 to Chart.js and Highcharts. These will most likely be integrated natively with frontend application frameworks like React or Angular.
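Whichever charting library is chosen, the rows coming back from the API usually need to be reshaped into the structure that library expects. The sketch below, written in Python for consistency with the other examples here, converts a list of result rows into the `labels` plus `datasets` shape that Chart.js uses for its chart data. The row fields `month` and `revenue` and the helper name `to_chart_payload` are hypothetical.

```python
import json
from typing import Any, Dict, List


def to_chart_payload(rows: List[Dict[str, Any]], label_key: str,
                     value_key: str, series_name: str) -> Dict[str, Any]:
    """Reshapes API result rows into a labels/datasets structure for a chart."""
    labels = [str(row[label_key]) for row in rows]
    values = [row[value_key] for row in rows]
    return {"labels": labels, "datasets": [{"label": series_name, "data": values}]}


# Rows as they might come back from a REST or SQL API (made-up values).
rows = [
    {"month": "2022-07", "revenue": 1200},
    {"month": "2022-08", "revenue": 1350},
    {"month": "2022-09", "revenue": 1100},
]

payload = to_chart_payload(rows, "month", "revenue", "Revenue")
payload_json = json.dumps(payload)  # what a backend could hand to the frontend
```

Other libraries expect different shapes (D3, for instance, typically works with the raw rows directly), so this step is where library-specific code tends to live.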

Working with a framework

For the second and third types of data applications, the initial layers of the data stack are the same—i.e., the base layer is a data warehouse, followed by a headless BI layer for data modeling, access control, caching, and application APIs.

For the user interface, however, there’s typically less customization required. This creates the opportunity to take advantage of the new category of no-code and low-code tools like Appsmith and Retool, which can be used to quickly build analytics interfaces.

There are also data application frameworks that are helpful here: tools like Plotly Dash and Streamlit make it possible to turn data scripts into shareable web applications without the need for frontend development.

What's next?

As it gets easier to build customized experiences, the number and types of data apps will proliferate, but the use case for a basic dashboard-centric experience won’t go away. There will always be cases where needs are best met with traditional charts, or where a quick turnaround requires making something available without tapping engineering resources for help. For these, embedded analytics is and will remain the best choice.

What’s exciting, though, is all of the new opportunities that the modern data application stack makes available. Opportunities for working with ever greater quantities of data, with ever greater complexity, will only grow.
