The Algorithm That Killed 10,000 Lines of API Boilerplate (Building a Dynamic Query Compiler)

#algorithms #architecture #opensource #typescript

(This is Part 3 of my series on building scalable infrastructure. If you missed them, check out Part 1: Bridging Drizzle & TanStack and Part 2: The Engine-Adapter Pattern).

Most backend engineers spend their entire careers writing "Switchboard APIs". You know the type: an endpoint that receives ?include=posts, checks an if statement, and manually adds a SQL JOIN.

It is tedious, it is brittle, and frankly, it is boring.

If you are a tool creator, you shouldn't be writing switchboards. You should be writing compilers.

When I built the engine for TableCraft, I didn't want to write endpoints. I wanted to build an HTTP-to-SQL compiler that could dynamically resolve infinitely nested database relations and construct complex B-Tree optimized cursor paginations on the fly.

Here are the exact algorithms and architectural patterns I used to do it. If you build infrastructure, SDKs, or internal developer platforms, read this carefully.

🧠 Algorithm 1: Recursive Relation Graph Traversal (The AST Compiler)

The hardest problem in dynamic APIs is nested relational filtering.

Imagine a URL query like this:
?filter[posts.author.company.name][eq]=Acme

To execute this safely in Drizzle ORM, you have to traverse from the Users table, to Posts, to Authors, to Companies.

You cannot hardcode this. You need a Depth-First Search (DFS) algorithm that walks the ORM's relational graph.

Here is the architectural pattern to solve this:

1. The Tokenizer

First, we tokenize the string path posts.author.company.name into an array: ["posts", "author", "company", "name"].

2. The Recursive Resolver

Next, we write a recursive compiler. It takes the token array and checks the Drizzle Schema metadata at every node to ensure the path actually exists.

import { getTableRelations } from "drizzle-orm";

function resolveRelationPath(
  currentTable: AnyPgTable, 
  pathTokens: string[], 
  schema: Record<string, any>
): { targetTable: AnyPgTable, columnName: string } {

  // Base Case: We reached the final token (the column name)
  if (pathTokens.length === 1) {
    return { targetTable: currentTable, columnName: pathTokens[0] };
  }

  // Recursive Step: Extract the next relation
  const [relationName, ...remainingTokens] = pathTokens;

  // Introspect Drizzle's hidden relation metadata
  const tableRelations = getTableRelations(currentTable);
  const relationDef = tableRelations?.[relationName];

  if (!relationDef) {
    throw new Error(`Security Exception: Invalid relation path '${relationName}'`);
  }

  // Find the next table in the schema dictionary
  const nextTable = schema[relationDef.referencedTableName];

  // Recurse deeper into the graph
  return resolveRelationPath(nextTable, remainingTokens, schema);
}

The "Wow" Factor: This algorithm makes your API infinitely scalable. Whether the user joins 1 table or 10 tables, the compiler resolves the abstract syntax tree safely. If a hacker tries ?filter[posts.passwords.hash], the graph traversal halts because the relation definition isn't exposed in the safe schema.

🔥 Algorithm 2: Dynamic Keyset Pagination (The Math)

Offset pagination (OFFSET 10000 LIMIT 50) is a database killer. It requires Postgres to scan and discard 10,000 rows.

The enterprise solution is Keyset Pagination (Cursor Pagination). But generating dynamic keyset SQL for multi-column sorting is a mathematical nightmare.

If a user sorts by [-createdAt, id], the SQL requires tuple comparison:
WHERE (created_at, id) < ('2023-10-01', 504)

How do you generate this dynamically from a URL string?

The Vector Comparison Algorithm

In TableCraft, the engine parses the sort array and the cursor object, then builds a vector array for Drizzle.

import { sql } from "drizzle-orm";

function buildDynamicCursor(
  sortFields: { column: AnyColumn, direction: 'asc' | 'desc' }[],
  cursorData: Record<string, any>
) {
  // 1. Extract the column objects and cursor values
  const columns = sortFields.map(f => f.column);
  const values = sortFields.map(f => cursorData[f.column.name]);

  // 2. Generate the Tuple SQL dynamically
  // If direction is 'desc', we use '<', if 'asc', we use '>'
  const operator = sortFields[0].direction === 'desc' ? '<' : '>';

  // 3. Inject raw SQL into the Drizzle AST
  return sql`(
    ${sql.join(columns, sql`, `)}
  ) ${sql.raw(operator)} (
    ${sql.join(values.map(v => sql`${v}`), sql`, `)}
  )`;
}

By abstracting this into the engine, the frontend engineer just passes a base64 encoded cursor string. The backend parses it, feeds it to the Vector Comparison Algorithm, and generates a perfectly optimized B-Tree index scan.

Zero sequential scans. Zero manual SQL.

🛑 The Paradigm Shift for Creators

If you are building an open-source tool, you have to stop thinking about endpoints and start thinking about state machines and compilers.

When you build a recursive AST compiler like the one inside TableCraft, you aren't just writing a library. You are giving developers a primitive engine they can wrap in any framework they want (Hono, Express, Next.js).

This is what separates basic wrappers from god-tier infrastructure.

I open-sourced the entire engine. If you want to see how the recursive relation builder and dynamic pagination vector math is actually wired up to the Drizzle ORM core, dig into the repository:

👉 Deep Dive the Compiler Architecture in TableCraft (GitHub) (jacksonkasi1/TableCraft)

🔄 What's Next in the Series?

We’ve covered the Engine-Adapter pattern and the Compiler Algorithms. But what happens when the user types a global search query and expects instant results across 5 deeply nested, joined tables?

In the next article, I’m going to break down The Cross-Relational Search Engine—how to dynamically build generic full-text search SQL vectors without hardcoding a single WHERE clause.

If you are serious about backend architecture, hit the Follow button so you don't miss it.

To the other infrastructure creators out there: What is the most complex algorithmic challenge you've had to solve in your core engine? Drop it in the comments below, and let's debate the best approaches.