DEV Community

Nikola Popov
Building a Metadata-Driven Runtime API Platform for Analytics Systems

1. Business Problem

When I was offered this project as a consultant, I almost said no. The client had already gone through several earlier attempts, and the scope looked too complex to be practical. But I sat down, mapped out the risks, aligned expectations, and took it. That decision turned into one of the most technically interesting projects of my consulting career.

The company was building analytics solutions for different customers. Data flowed through ETL pipelines, machine learning models, and custom analytics processes. The final result was a massive amount of processed data that had to be exposed via web applications.

Since every customer had different requirements, database schemas, and ways to display information, we couldn't just build one backend and reuse it. The frontend was a Single Page Application that required a huge number of APIs. As a result, development teams repeatedly had to implement data models, APIs, filtering, sorting, pagination, and CRUD operations from scratch. Every new project meant supporting new schema changes and analytics requirements, so development and maintenance costs just kept growing.

It's worth noting that this project started around 2020. Today, there are mature platforms such as Hasura, PostGraphile, Supabase GraphQL, and other metadata-driven API solutions that solve parts of this problem. At the time, however, these options were either unavailable in our technology stack, lacked some of the analytical capabilities we needed, or were not mature enough for our use case. As a result, building a custom platform was the most practical option.

2. Why Existing Solutions Failed

Before looking at GraphQL, the client tried a custom SQL-over-HTTP approach. The idea was to let clients send SQL-like requests while the backend validated table and column names, data types, and query structure. The system parsed the incoming requests and rebuilt the SQL internally.

Technically it worked, but we were worried about security, long-term maintenance, and how the platform would evolve in the future.
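
To make the validate-then-rebuild idea concrete, here is a minimal sketch. It is written in Python rather than the project's .NET stack, and it is not the client's code: the whitelist, the `SelectRequest` shape, and `build_sql` are all hypothetical names chosen for illustration. It also shows why the approach is fragile: every new SQL feature needs its own validation and rebuild logic.

```python
from dataclasses import dataclass, field

# Hypothetical whitelist: table name -> allowed columns (illustrative only).
ALLOWED = {
    "sales": {"id", "region", "amount"},
    "customers": {"id", "name"},
}


@dataclass
class SelectRequest:
    """Assumed shape of an SQL-like request sent by a client."""
    table: str
    columns: list[str]
    filters: dict[str, object] = field(default_factory=dict)  # equality filters only


def build_sql(req: SelectRequest) -> tuple[str, list[object]]:
    """Validate identifiers against the whitelist, then rebuild the SQL from scratch."""
    if req.table not in ALLOWED:
        raise ValueError(f"unknown table: {req.table}")
    if not req.columns:
        raise ValueError("at least one column is required")
    allowed = ALLOWED[req.table]
    for col in [*req.columns, *req.filters]:
        if col not in allowed:
            raise ValueError(f"unknown column: {col}")
    # Only whitelisted identifiers reach the SQL text; values travel as parameters.
    sql = f"SELECT {', '.join(req.columns)} FROM {req.table}"
    if req.filters:
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in req.filters)
    return sql, list(req.filters.values())


sql, params = build_sql(SelectRequest("sales", ["region", "amount"], {"region": "EU"}))
```

Client text is never copied into the statement: identifiers are matched against known names and values are bound as parameters.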

We also evaluated GraphQL as an API layer. The concept was a good fit, but the available .NET GraphQL frameworks still required too much manual schema and resolver code. It just added another abstraction layer without really saving development time.

So, the goal changed: we needed to create a platform that could generate APIs automatically and minimize manual backend work.

3. Architecture

Why a Framework Instead of a Library

One of our earliest decisions was to build a framework, not a library. The difference is simple: a library is passive (the app decides when to call it), but a framework is active (it controls the execution flow and gives you extension points for custom logic). This is classic Inversion of Control (IoC). Since we needed a consistent way to build APIs across multiple apps, a framework was the right choice.
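
The difference is easier to see in code. The following Python sketch is only an illustration of Inversion of Control, not the platform's API. `MiniFramework`, `before_query`, and `restrict_to_tenant` are hypothetical names: the framework owns the request flow and the application only registers a hook.

```python
from typing import Callable

Hook = Callable[[dict], dict]


class MiniFramework:
    """The framework owns the request pipeline; applications only register hooks."""

    def __init__(self) -> None:
        self._before_query: list[Hook] = []

    def before_query(self, hook: Hook) -> Hook:
        """Extension point: register custom logic that runs before the query step."""
        self._before_query.append(hook)
        return hook

    def handle(self, request: dict) -> dict:
        # The framework, not the application, decides when the hooks run.
        for hook in self._before_query:
            request = hook(request)
        return {"status": "ok", "query": request}


app = MiniFramework()


@app.before_query
def restrict_to_tenant(request: dict) -> dict:
    # Application-specific rule plugged into the framework's flow.
    return {**request, "tenant": "acme"}


result = app.handle({"table": "sales"})
```

With a library, the application would call a query helper itself. Here the application never calls `restrict_to_tenant` directly; the framework does.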

Modular Architecture

I wanted to avoid a tightly coupled platform core from day one. We split the platform into independent modules: Metadata Engine, GraphQL Schema Generator, Query Engine, Filter Engine, Mutation Engine, Statistics Engine, and Configuration Engine.

Each module had one clear responsibility and could evolve independently. The rule was simple: solve a problem in exactly one place. Logic should never be duplicated between modules.

Evolution of the Platform

We took a bottom-up approach to building the platform. We tackled the riskiest part first, the piece that could make or break the whole project: automatic GraphQL schema generation. Doing this early not only proved the core concept but also let us demo a working prototype to the client right away.

After that, we added the Query Engine, filtering, and sorting. Later came aliases, field hiding, database-driven configuration, and full GraphQL generation directly from the DB schemas. Finally, we wrapped up with usage statistics and caching.

This incremental approach let the business get real value after every stage, rather than waiting for a massive final release.
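
As a rough illustration of that first step, schema generation can be reduced to walking table metadata and emitting GraphQL type definitions. The sketch below is Python and deliberately tiny. The metadata layout, the type mapping in `SCALARS`, and the naming rule are assumptions made for the example, not the platform's actual rules.

```python
# Hypothetical metadata: table -> {column: (sql_type, nullable)}.
METADATA = {
    "sales": {
        "id": ("int", False),
        "region": ("varchar", True),
        "amount": ("decimal", False),
    },
}

# Assumed mapping from SQL types to GraphQL built-in scalars.
SCALARS = {"int": "Int", "varchar": "String", "decimal": "Float", "bit": "Boolean"}


def to_type_name(table: str) -> str:
    """sales_order -> SalesOrder (an illustrative naming convention)."""
    return "".join(part.capitalize() for part in table.split("_"))


def generate_sdl(metadata: dict) -> str:
    """Emit one GraphQL object type per table plus a root Query type."""
    blocks = []
    for table, columns in metadata.items():
        lines = [f"type {to_type_name(table)} {{"]
        for name, (sql_type, nullable) in columns.items():
            scalar = SCALARS.get(sql_type, "String")
            suffix = "" if nullable else "!"  # non-nullable columns become required fields
            lines.append(f"  {name}: {scalar}{suffix}")
        lines.append("}")
        blocks.append("\n".join(lines))
    query_fields = [f"  {t}: [{to_type_name(t)}!]!" for t in metadata]
    blocks.append("type Query {\n" + "\n".join(query_fields) + "\n}")
    return "\n\n".join(blocks)


sdl = generate_sdl(METADATA)
```

For the sample metadata this yields a `Sales` type with `id: Int!`, `region: String`, and `amount: Float!`, plus a `Query` type exposing `sales: [Sales!]!`.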

4. Metadata Model

A key architectural decision was creating our own internal metadata model. All platform components used it. Metadata could come from Entity Framework models, database schemas, custom classes, or configuration tables.
The platform supported two operating modes. It could generate metadata from existing Entity Framework models, or it could work directly from database schemas without any application code. In the latter case, users only needed to provide one or more connection strings, and the platform automatically discovered tables, relationships, and data types.
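
The database-first mode can be sketched as reading the database catalog and turning it into a plain data structure. The example below uses Python and an in-memory SQLite database purely as a stand-in. The original platform ran on .NET against other databases, and the `discover` function and its output layout are hypothetical.

```python
import sqlite3


def discover(conn: sqlite3.Connection) -> dict:
    """Read tables, columns, and foreign keys from the database catalog."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = {
            row[1]: {"type": row[2], "primary_key": bool(row[5])}
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        }
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        relations = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        model[table] = {"columns": columns, "relations": relations}
    return model


# Stand-in for "just provide a connection string".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        region_id INTEGER REFERENCES region(id),
        amount REAL
    );
""")
model = discover(conn)
```

Everything downstream (schema generation, querying, mutations) can then work from `model` without any hand-written entity classes.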

People often ask: why not just use Entity Framework classes directly? The answer is performance and flexibility.

We measured performance and found that using Reflection at runtime was too slow. So, we only used Reflection to extract the metadata initially. After that, everything worked with our optimized internal model stored in dictionaries and lookup tables. This gave us much better performance, made the platform independent of where the metadata came from, and made the system way easier to maintain.

5. Query Engine

The Query Engine is the core of the platform.

We used Reflection heavily to analyze EF models and database metadata, but we intentionally kept it out of the runtime execution paths. For dynamic query generation, we used Expression Trees to build LINQ expressions at runtime without generating actual source code.
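
For readers who have not worked with Expression Trees, the core idea is to represent a predicate as data, build it from the request, and compile it into something executable without generating source code. The Python sketch below imitates that idea with a few node classes. It is an analogy only: the real platform used .NET's expression tree API, and the node types and filter spec format here are invented for the example.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Field:
    name: str


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Binary:
    op: str
    left: Any
    right: Any


_OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "and": lambda a, b: a and b,
}


def compile_expr(node: Any) -> Callable[[dict], Any]:
    """Turn an expression tree into a plain callable, once, before execution."""
    if isinstance(node, Field):
        return lambda row: row[node.name]
    if isinstance(node, Const):
        return lambda row: node.value
    if isinstance(node, Binary):
        left, right = compile_expr(node.left), compile_expr(node.right)
        fn = _OPS[node.op]
        return lambda row: fn(left(row), right(row))
    raise TypeError(f"unsupported node: {node!r}")


def build_filter(spec: dict) -> Binary:
    """Build a tree from a hypothetical filter spec: {field: (op, value)}."""
    nodes = [Binary(op, Field(name), Const(value)) for name, (op, value) in spec.items()]
    if not nodes:
        raise ValueError("empty filter")
    tree = nodes[0]
    for node in nodes[1:]:
        tree = Binary("and", tree, node)  # combine conditions with AND
    return tree


predicate = compile_expr(build_filter({"region": ("eq", "EU"), "amount": ("gt", 100)}))
rows = [
    {"region": "EU", "amount": 250},
    {"region": "EU", "amount": 50},
    {"region": "US", "amount": 900},
]
matches = [r for r in rows if predicate(r)]
```

In the .NET version the tree is handed to the LINQ provider, which translates it to SQL instead of evaluating it in memory as this sketch does.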

The engine produced IQueryable objects, which other modules could easily extend with filters, sorting, projections, or aggregations. This allowed different modules to participate in query generation while keeping the code composable. The same metadata was also used for mutation generation. CRUD operations could be generated automatically when a table exposed a primary key. For legacy databases where primary keys were not properly defined, keys could be configured through metadata. GUID-based keys could even be generated automatically during insert operations.
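
The composition idea behind IQueryable can be illustrated with an immutable query description that each module extends and that only becomes SQL at the very end. This is a Python sketch with hypothetical names (`Query`, `filter_module`, and so on), not the platform's code, and it assumes identifiers were already validated against metadata.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Query:
    """Immutable query description; nothing is rendered until to_sql() is called."""
    table: str
    columns: tuple[str, ...] = ("*",)
    where: tuple[tuple[str, object], ...] = ()
    order_by: tuple[str, ...] = ()

    def select(self, *columns: str) -> "Query":
        return replace(self, columns=columns)

    def filter(self, column: str, value: object) -> "Query":
        return replace(self, where=self.where + ((column, value),))

    def sort(self, *columns: str) -> "Query":
        return replace(self, order_by=self.order_by + columns)

    def to_sql(self) -> tuple[str, list[object]]:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.where:
            sql += " WHERE " + " AND ".join(f"{c} = ?" for c, _ in self.where)
        if self.order_by:
            sql += " ORDER BY " + ", ".join(self.order_by)
        return sql, [v for _, v in self.where]


# Each "module" knows only its own concern and returns a new query.
def filter_module(q: Query) -> Query:
    return q.filter("region", "EU")


def sort_module(q: Query) -> Query:
    return q.sort("amount")


def projection_module(q: Query) -> Query:
    return q.select("region", "amount")


query = Query("sales")
for module in (filter_module, sort_module, projection_module):
    query = module(query)
sql, params = query.to_sql()
```

The modules can run in any order, because each one touches only its own part of the query and rendering happens once at the end.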

Since the platform was built for analytical workloads, we supported standard analytical models. The most common was the Star Schema (a large fact table connected to first-level dimension tables). We also supported the Snowflake Schema with more complex hierarchies, which was frequently used for configuration systems, self-referencing entities, hierarchical data, and system management screens.

6. Dynamic Result Schemas

Later, we faced a new challenge: analytical users needed aggregations like GROUP BY, SUM, AVG, MIN, MAX, COUNT, and nested aggregations.

Traditional GraphQL assumes that the output schema is known in advance. However, analytical queries often challenge this assumption because the result shape depends on the requested grouping, aggregation, and selected metrics.

The schema still had to be described to GraphQL, but instead of being hardcoded at compile time, it was generated from metadata at runtime. Because of this, we separated filtering and output models. Filtering continued to rely on EF and database metadata, while output models became dynamic and were generated based on the requested analytical operation.

Interestingly, I didn't know about the JSON Table Schema specification when we started the project. Our internal design just naturally evolved into something very similar. Later, we tweaked a few field names to match the official spec.

Aggregation support was built directly into the Query Engine. The system could generate dynamic output structures based on the requested grouping and aggregation operations. This meant analytical queries could return structures that weren't even known at compile time.
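
A simplified way to picture this: the output schema is a function of the request. The Python sketch below derives a result description, in a shape loosely modelled on Table Schema field descriptors (`name` and `type`), from the requested grouping and aggregations. The column metadata, the `sum_amount` naming convention, and the type rules are assumptions made for illustration.

```python
# Hypothetical column metadata for a fact table: column -> Table Schema type.
COLUMN_TYPES = {"region": "string", "year": "integer", "amount": "number"}

SUPPORTED = {"sum", "avg", "min", "max", "count"}


def result_type(func: str, column_type: str) -> str:
    """Assumed typing rules: COUNT is an integer, AVG a number, others keep the column type."""
    if func == "count":
        return "integer"
    if func == "avg":
        return "number"
    return column_type


def output_schema(group_by: list[str], aggregates: list[tuple[str, str]]) -> dict:
    """Describe the result shape for a GROUP BY request before running it."""
    fields = [{"name": col, "type": COLUMN_TYPES[col]} for col in group_by]
    for func, col in aggregates:
        if func not in SUPPORTED:
            raise ValueError(f"unsupported aggregate: {func}")
        fields.append(
            {"name": f"{func}_{col}", "type": result_type(func, COLUMN_TYPES[col])}
        )
    return {"fields": fields}


schema = output_schema(["region"], [("sum", "amount"), ("count", "amount")])
```

Two different requests against the same table produce two different schemas, which is exactly what a compile-time GraphQL type cannot express.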

7. Performance

We thought about performance from day one.

Query Projection: The platform analyzed GraphQL requests and generated projections containing only the requested fields. If a column wasn't in the query, it wasn't in the SQL. This saved a lot of database load, and we successfully tested it on tables with over a million rows.
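
As a hedged illustration of the projection step, assume the field names have already been extracted from the GraphQL selection set. The Python sketch below maps them to columns through a hypothetical `FIELD_TO_COLUMN` table and builds a SELECT with nothing else in it.

```python
# Hypothetical mapping from GraphQL field names to database columns.
FIELD_TO_COLUMN = {
    "id": "id",
    "regionName": "region_name",
    "amount": "amount",
    "notes": "notes",  # e.g. a wide text column that is rarely needed
}


def project(table: str, requested_fields: list[str]) -> str:
    """Build a SELECT that contains only the columns behind the requested fields."""
    if not requested_fields:
        raise ValueError("no fields requested")
    unknown = [name for name in requested_fields if name not in FIELD_TO_COLUMN]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    columns = [f'{FIELD_TO_COLUMN[name]} AS "{name}"' for name in requested_fields]
    return f"SELECT {', '.join(columns)} FROM {table}"


# Corresponds to a GraphQL query such as: { sales { regionName amount } }
sql = project("sales", ["regionName", "amount"])
```

The `notes` column never appears in the generated statement, so the database does not read or transfer it.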

Statistics Collection: We tracked which fields, filters, and query patterns were used most often to help optimize indexes. Since our main users were frontend and full-stack devs (not DBAs), we made sure the performance data was presented in a way application developers could easily understand.
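
A minimal sketch of this kind of usage tracking, in Python and with invented names (`UsageStats`, `index_candidates`), could look like the following. The wording of the hint is only an example of developer-friendly output, not the platform's actual report.

```python
from collections import Counter


class UsageStats:
    """Counts how often fields and filters appear in incoming queries."""

    def __init__(self) -> None:
        self.fields: Counter = Counter()
        self.filters: Counter = Counter()

    def record(self, table: str, fields: list[str], filters: list[str]) -> None:
        self.fields.update(f"{table}.{name}" for name in fields)
        self.filters.update(f"{table}.{name}" for name in filters)

    def index_candidates(self, top: int = 3) -> list[str]:
        """Most frequently filtered columns, phrased for application developers."""
        return [
            f"{column} was filtered {count} times; consider an index"
            for column, count in self.filters.most_common(top)
        ]


stats = UsageStats()
stats.record("sales", fields=["region", "amount"], filters=["region"])
stats.record("sales", fields=["amount"], filters=["region", "year"])
hints = stats.index_candidates(top=1)
```

Counting selected fields separately from filtered fields matters because only the filter and sort columns are natural index candidates.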

Caching: For paginated datasets, the platform could cache current and next pages. This made navigation much snappier when users moved between nearby pages.
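
The current-plus-next-page idea can be sketched as a small cache that prefetches one page ahead. This Python example is an assumption about how such a cache might be shaped, not the platform's implementation; `PageCache` and `fetch_page` are hypothetical, and invalidation on data changes is left out.

```python
from typing import Callable


class PageCache:
    """Caches the requested page and prefetches the next one."""

    def __init__(self, fetch_page: Callable[[int, int], list], page_size: int) -> None:
        self._fetch = fetch_page  # callable(offset, limit) -> list of rows
        self._size = page_size
        self._pages: dict[int, list] = {}
        self.backend_calls = 0  # exposed only to make the effect visible

    def _load(self, page: int) -> list:
        if page not in self._pages:
            self.backend_calls += 1
            self._pages[page] = self._fetch(page * self._size, self._size)
        return self._pages[page]

    def get(self, page: int) -> list:
        rows = self._load(page)
        if len(rows) == self._size:  # a full page suggests there may be a next one
            self._load(page + 1)
        # Keep only the current and next page to bound memory use.
        self._pages = {p: r for p, r in self._pages.items() if p in (page, page + 1)}
        return rows


DATA = list(range(25))  # stand-in for a database table
cache = PageCache(lambda offset, limit: DATA[offset:offset + limit], page_size=10)
first = cache.get(0)   # loads page 0 and prefetches page 1
second = cache.get(1)  # page 1 comes from the cache; page 2 is prefetched
```

Moving forward one page never waits on the database for the page being displayed, at the cost of one speculative query per navigation step.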

8. Results

In the end, we built much more than just a GraphQL generator. It became a full Metadata-Driven Runtime API Platform.

It handled automatic GraphQL and CRUD generation, database-first and EF support, complex filtering, Star/Snowflake schemas, dynamic outputs, self-hosted configuration APIs, and caching. The platform went into production across multiple systems and drastically cut down backend development effort.

A few years later, the client asked me for a small update. It was really satisfying to see that the platform was still actively used in both old and new projects.

In practice, the platform could expose entirely new databases through GraphQL with little or no backend code. In many cases, providing a connection string and configuration metadata was enough to generate a functional API.

9. Architectural Lessons Learned

Commercial vs Custom Framework: At the time, no commercial framework did what we needed. Even if one existed, it would probably be bloated with features we didn't need. For internal systems with clear requirements, a focused custom framework is often simpler and more efficient.

Reusability: My rule of thumb: if you write the same logic for the third time, it's a signal that you need an abstraction. Reusable modules should come from real patterns, not theoretical assumptions.

Interface Design: Designing interfaces is hard. They should be easy to use right and hard to use wrong. Finding the right abstraction level is often harder than writing the code itself.

Documentation: Frameworks outlive applications. Good documentation is part of the architecture; without it, you just pass the complexity directly to the next developer.

Cost of Flexibility: We were honest with the client about the complexity from the start. The hardest part wasn't GraphQL itself. The challenge was combining Reflection, Expression Trees, IQueryable, and the internals of both EF and GraphQL.NET. This kind of system requires senior developers, not just typical CRUD coders.

GraphQL.NET 6 → 7 Migration: A good example of this complexity was the migration from GraphQL.NET 6 to 7. It looked like a routine version bump, but it took about a third of the time it took to build the whole platform originally. At the same time, Entity Framework was also evolving. That's the reality of building frameworks: the hardest part isn't building it, it's keeping it in sync with the ecosystem around it.
