hellowwworld

Posted on Jan 19

Prisma ORM at Raw SQL Speed? Just Convert JSON to String

#javascript #database #webdev #discuss

SQL Access Performance: A 50-Year Trade-off

When IBM introduced SQL in 1974, database access was straightforward. Submit a query string, receive results. No intermediate layers existed.

This directness created problems. Changing column names required manual searching across codebases. User input needed manual escaping to prevent SQL injection. Database migrations between systems required query rewrites due to dialect variations.

These issues led to Object-Relational Mappers in the mid-1990s. The concept: represent tables as programming language objects. Write code in your language instead of SQL strings. Let the framework translate.

Early ORM Adoption (1996-2005)

Hibernate released for Java in 2001. Rails incorporated ActiveRecord in 2004. Code transformed from scattered SQL strings to object-oriented patterns:

customer = Customer.find_by_email("customer@example.com")
orders = customer.orders
line_items = order.line_items

This provided type safety and refactorability.

Performance issues appeared immediately.

N+1 Query Problem

A standard pattern emerged:

orders = Order.limit(10)  # 1 query
orders.each do |order|
  puts order.customer.name # 10 more queries
end

ORMs loaded each related record separately. 2 queries became 11. This degraded application performance on larger datasets.

Solution: eager loading via Order.includes(:customer).limit(10)

Developers now needed to understand both their programming language and ORM-specific optimization techniques. The abstraction leaked implementation details.

Stack Overflow accumulated questions about ORM query optimization. The promised SQL abstraction added complexity instead of removing it.

Query Builders Appear (2010-2015)

Knex.js launched in 2012 with different goals: make SQL composable without hiding it.

knex('products')
  .where('category', 'electronics')
  .andWhere('price', '>=', 100)
  .join('inventory', 'products.id', 'inventory.product_id')
  .select('products.*', 'inventory.quantity')

Cleaner than string concatenation. More flexible than full ORMs. Still had method chaining runtime overhead and no type safety - column name typos caused runtime errors.

MongoDB gained adoption with a different pitch: store JSON directly, avoid ORM impedance mismatch entirely.

Companies migrated to NoSQL. SQL appeared outdated.

Production incidents accumulated: data inconsistencies, orphaned records without foreign keys, corrupted state without transactions, full collection scans without query planning.

Teams migrated back to PostgreSQL. Data had schema structure - they'd just been enforcing it poorly in application code.

Observation: SQL databases were slow due to layers between application and database, not SQL itself.

GraphQL Attempted Solution (2015-2020)

Facebook released GraphQL in 2015. Clients specify exact data requirements in one request.

query {
  company(id: 1) {
    name
    employees(limit: 10) {
      fullName
      department { name }
    }
  }
}

Resolvers needed to translate GraphQL into database queries. This created the same N+1 problems:

Company: {
  employees: (parent) => db.query('SELECT * FROM employees WHERE company_id = ?', parent.id)
}
Employee: {
  department: (parent) => db.query('SELECT * FROM departments WHERE id = ?', parent.department_id)
}

DataLoader was created for query batching. Query complexity analysis prevented expensive queries. Persisted queries avoided parsing overhead. The ecosystem built layers to solve layer-created problems.

GraphQL's flexibility required rate-limiting because users could request excessive data.

TypeScript-Based ORMs (2018-2024)

TypeORM (2016) and MikroORM (2018) added decorators and type safety:

@Entity()
class Product {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  sku: string;

  @ManyToOne(() => Category, category => category.products)
  category: Category;
}

Compile-time type safety. Traditional ORM model with classes, active records, eager/lazy loading.

Drizzle ORM (2022) used SQL-like syntax with TypeScript inference:

const result = await db
  .select()
  .from(products)
  .where(eq(products.inStock, true))
  .leftJoin(categories, eq(products.id, categories.product_id))

Closer to SQL. Better performance than traditional ORMs. Still had runtime query building overhead from method chaining.

Prisma appeared.

Prisma's Architecture (2018-2024)

Prisma 1.0 released in 2018. Prisma 2.0 in 2020 introduced a different model. No classes, decorators, or method chaining. Schema file generates a client:

model Order {
  id          Int      @id
  orderNumber String   @unique
  items       Item[]
}

Declarative query API:

prisma.order.findMany({
  where: { status: 'PENDING', total: { gte: 100 } },
  include: { items: { where: { inStock: true } } },
  orderBy: { createdAt: 'desc' },
  take: 10
})

TypeScript knows all fields, relations, and filters. Column renames produce TypeScript errors everywhere. No runtime surprises.

Developers adopted it. Type safety without codegen scripts. Migrations without manual SQL. Functional IntelliSense.

Performance questions appeared in GitHub issues. Pattern: queries 2-5x slower through Prisma versus raw postgres.js or pg.

Prisma team improved performance across versions. Version 7 rewrote the engine in TypeScript for better characteristics.

Gap remained. Not due to poor engineering. The architecture required a translation layer.

Alternative Approach

Prisma's query format resembles SQL:

{
  where: { status: 'PENDING', total: { gte: 100 } },
  include: { items: { where: { inStock: true } } },
  orderBy: { createdAt: 'desc' },
  take: 10
}

Maps 1:1 to SQL:

where → WHERE clause
include → JOIN or subquery
orderBy → ORDER BY
take → LIMIT

The declarative JSON alone is insufficient. Prisma provides DMMF (Data Model Meta Format) - complete schema metadata:

{
  models: [{
    name: "Order",
    fields: [
      { name: "id", type: "Int", isRequired: true },
      { name: "orderNumber", type: "String", isRequired: true },
      { 
        name: "items", 
        type: "Item",
        isRelation: true,
        relationName: "OrderItems",
        foreignKey: ["orderId"],
        references: ["id"]
      }
    ]
  }]
}

DMMF specifies:

Field existence and types
Relation structure and connections
Foreign key mappings
Required/optional fields
Array/scalar types

Without DMMF, SQL generation requires guessing. With DMMF, it's deterministic. Query JSON + schema metadata = complete SQL generation instructions.

Direct SQL generation from this combination eliminates the query engine and protocol. Query JSON + DMMF → SQL string.

Not novel - Drizzle already builds SQL directly. But Drizzle requires code-based schema. Prisma's DMMF generates from schema file, maintaining single source of truth.

Hypothesis: if query JSON + DMMF contains everything for SQL, the translation layer is overhead.

Implementation

Pattern matching implementation:

WHERE clauses: Recursive tree walking. { AND: [...] } → (...) AND (...). Operators map: gte → >=, contains → LIKE '%value%', in → IN (...).

Relations: Each include → JSON aggregation subquery. DMMF provides foreign key mappings. Postgres: json_agg(), SQLite: json_group_array(). Database handles the work instead of application-side loading and joining.

Pagination: Convert cursor + orderBy into WHERE conditions positioning after cursor. More complex than OFFSET, stable when data changes.

Dialects: Postgres/SQLite differ in arrays, case-insensitive search, JSON. Abstract to functions: arrayContains(), caseInsensitiveLike(). SQL builder calls functions without dialect knowledge.

Result: ~3000 lines of pure functions. Same input → same output. Enables prebaking queries at build time.

Implementation Steps

Step 1: Install

npm install prisma-sql postgres

Step 2: Add three lines

import { PrismaClient, Prisma } from '@prisma/client'
import { speedExtension, convertDMMFToModels } from 'prisma-sql'
import postgres from 'postgres'

const models = convertDMMFToModels(Prisma.dmmf.datamodel)
const sql = postgres(process.env.DATABASE_URL)
const prisma = new PrismaClient().$extends(speedExtension({ postgres: sql, models }))

// Existing code unchanged
const orders = await prisma.order.findMany({
  where: { status: 'PENDING' },
  include: { items: true }
})

No code changes needed.

Behavior:

findMany, findFirst, findUnique, count, aggregate, groupBy → Direct SQL
create, update, delete, transactions → Prisma engine (unchanged)

Existing code works:

// Now 2-7x faster
await prisma.order.findMany({ where: { status: 'PENDING' } })

// Unchanged (uses Prisma engine)
await prisma.order.create({ data: { orderNumber: 'ORD-12345' } })

// Complex includes work, faster
await prisma.order.findMany({
  where: { status: 'PENDING' },
  include: {
    items: {
      where: { inStock: true },
      include: { product: true }
    }
  }
})

Optional debugging:

const prisma = new PrismaClient().$extends(
  speedExtension({ 
    postgres: sql, 
    models,
    debug: true,
    onQuery: (info) => {
      console.log(`${info.model}.${info.method}: ${info.duration}ms`)
    }
  })
)

Optional generator mode:

Schema annotation for hot queries:

/// @optimize {
///   "method": "findMany",
///   "query": { "where": { "status": "" }, "take": "$take" }
/// }
model Order {
  id     Int    @id
  status String
}

Run prisma generate. Prebaked queries have ~0.03ms overhead instead of ~0.2ms.

Three lines to add. Zero lines to change.

Performance Results

Testing 137 queries comparing Prisma v7, Drizzle ORM, direct SQL generation:

PostgreSQL (selected):

Query Type	Prisma v7	Drizzle	Direct SQL	vs Prisma	vs Drizzle
Simple WHERE	0.34ms	0.24ms	0.17ms	2.0x	1.4x
Complex conditions	6.90ms	5.58ms	2.37ms	2.9x	2.4x
With relations	0.72ms	N/A*	0.41ms	1.8x	-
Nested relations	14.34ms	N/A*	4.81ms	3.0x	-
Multi-field ORDER BY	2.38ms	1.54ms	1.09ms	2.2x	1.4x

SQLite (selected):

Query Type	Prisma v7	Drizzle	Direct SQL	vs Prisma	vs Drizzle
Simple WHERE	0.23ms	0.10ms	0.03ms	7.7x	3.3x
Complex conditions	3.87ms	1.85ms	0.93ms	4.2x	2.0x
Relation filters	128.44ms	N/A*	2.40ms	53.5x	-
Multi-field ORDER BY	0.59ms	0.43ms	0.37ms	1.6x	1.2x

* Drizzle lacks equivalent patterns - requires manual subqueries

vs Prisma: 2-7x speedup from bypassing query engine.

vs Drizzle: Drizzle builds SQL directly but has method chaining overhead. Direct approach is 1.4-3.3x faster for comparable queries. Drizzle can't easily express nested relation loading - manual subqueries or multiple roundtrips required.

Relation handling: Largest gap. Prisma's include syntax is powerful but slow through engine. Drizzle lacks equivalent - manual joins required. Direct SQL generation with JSON aggregation subqueries maintains developer experience at raw SQL performance.

Speedup from execution path length:

Prisma: JavaScript → Query Engine Protocol → Engine Translation → SQL Generation → postgres.js → Database

Drizzle: JavaScript → Method Chain Resolution → SQL Building → postgres.js → Database

Direct (runtime): JavaScript → SQL Generation (~0.2ms) → postgres.js → Database

Direct (generator): JavaScript → Map lookup (~0.03ms) → postgres.js → Database

Two fewer layers than Prisma. One fewer than Drizzle. Generator mode eliminates SQL building.

Generator Mode Details

Runtime SQL generation: ~0.2ms per query. For frequent queries, this is eliminable.

The @optimize directive specifies which query shapes to prebake:

/// @optimize {
///   "method": "findMany",
///   "query": {
///     "where": { "status": "" },
///     "orderBy": { "createdAt": "desc" },
///     "take": "$take",
///     "skip": "$skip"
///   }
/// }
model Order {
  id        Int      @id
  status    String
  createdAt DateTime
}

Key point: Actual values in directive are irrelevant. They're placeholders describing query structure:

"status": "" - Any value works. Indicates "filter by status exists"
"take": "$take" - $ prefix is convention. "take": 10 works identically
"orderBy": { "createdAt": "desc" } - Literal defining structure

What matters:

Which fields (status, createdAt)
Which operators (equality for status, ordering for createdAt)
Which parameters from runtime (take, skip, status value)

Becomes parameterized query. Generator creates:

const QUERIES = {
  Order: {
    findMany: {
      '{"where":{"status":""},"orderBy":{"createdAt":"desc"},"take":"$take","skip":"$skip"}': {
        sql: 'SELECT * FROM orders WHERE status = $1 ORDER BY created_at DESC LIMIT $2 OFFSET $3',
        params: [],
        dynamicKeys: ['status', 'take', 'skip']
      }
    }
  }
}

SQL has three parameters ($1, $2, $3). All values from runtime:

const args = { 
  where: { status: 'PENDING' },  // $1
  orderBy: { createdAt: 'desc' },
  take: 10,    // $2
  skip: 20     // $3
}

// Normalize to match prebaked key
const key = normalizeQuery(args) 

// O(1) Map lookup
const prebaked = QUERIES.Order.findMany[key]

if (prebaked) {
  const params = extractDynamicParams(args, prebaked.dynamicKeys)
  // → ['PENDING', 10, 20]

  return execute(prebaked.sql, params)
}

Security: All values parameterized. SQL injection impossible - values never enter SQL string, passed as separate parameters to database driver.

Overhead: Query normalization + Map lookup + parameter extraction. ~0.03ms.

Execution paths:

Prisma: JavaScript → Engine Protocol → Translation → SQL Generation → Parameter Binding → postgres.js → Database

Drizzle: JavaScript → Method Chain → SQL Building → Parameter Binding → postgres.js → Database

Runtime mode: JavaScript → SQL Generation (0.2ms) → Parameter Binding → postgres.js → Database

Generator mode: JavaScript → Map lookup (0.03ms) → Parameter Binding → postgres.js → Database

For endpoints with 50 queries:

Prisma: 50 × engine overhead
Drizzle: 50 × method resolution overhead
Runtime mode: 50 × 0.2ms = 10ms overhead
Generator mode: 50 × 0.03ms = 1.5ms overhead

Queries execute at database speed. Preparation overhead becomes negligible.

Limitations

Read-only optimization. Writes (create, update, delete) use Prisma. Query engine handles transactions, optimistic locking, cascading deletes. That complexity is necessary.

Requires postgres.js or better-sqlite3. Built specifically for these drivers due to speed and predictability.

Doesn't optimize queries. Missing indexes or inefficient query logic remain slow. This removes overhead from well-structured queries only.

Generator mode needs prebaking. Hot queries must be known ahead of time. Dynamic queries use runtime generation.

Additional maintenance. Prisma's query engine is battle-tested with full team support. This is a side project with associated risks.

Pattern Recognition

This isn't breakthrough work. It's pattern recognition.

Prisma's DMMF + query JSON format contains everything for SQL generation. Translation layer between them adds overhead without adding capability. Removing that layer is efficient.

Same pattern exists elsewhere:

Drizzle: SQL-like syntax compiles to SQL directly. No ORM abstraction.

Kysely: TypeScript query builder mirroring SQL structure. Minimal translation overhead.

Slonik: Tagged template literals that are SQL with parameter interpolation.

Common thread: when query format mirrors SQL structure, heavy abstraction is unnecessary overhead.

Prisma's contribution: DMMF - complete, structured database schema from single source of truth. Makes deterministic SQL generation possible without parallel schema definitions.

Generator mode extends this: if SQL generation is deterministic, it happens at build time. Query structure is prebaked, values parameterized, overhead negligible.

Use Cases

This is a performance optimization for specific scenarios, not a Prisma replacement:

High-traffic read queries
APIs with many database calls per request
Serverless functions where cold start matters
Applications measuring query performance

For standard CRUD apps with modest traffic, Prisma's ease of use likely outweighs 2x query performance. Convenience justifies the milliseconds.

For applications at database limits, milliseconds compound to seconds. Optimization becomes worthwhile.

Goal isn't proving Prisma is slow. Goal is offering an option when performance becomes critical without sacrificing type safety.

Observations

DMMF is critical. Structured schema metadata enables deterministic SQL generation. This works because Prisma solved schema representation.

Pure functions scale. Entire SQL builder uses pure functions. Makes testing trivial, enables build-time query prebaking.

Query structure ≠ query values. @optimize directive proves query shape (fields, operators) determines SQL structure. Actual values are always parameterized, never baked.

Abstractions should match reality. When abstraction format mirrors target format, translation layers add overhead without value.

Build-time optimization compounds. Generator mode proves deterministic transformations can move to build time, eliminating runtime overhead while maintaining security through parameterization.

ORMs aren't inherently slow. Slowness from translation complexity, not concept. Simplifying translation brings ORM convenience near raw SQL performance.

Prisma team built quality tooling. This explores whether read query execution could be more direct while keeping everything else Prisma provides.

Install: npm install prisma-sql postgres

Source: https://github.com/multipliedtwice/prisma-to-sql

Removing translation overhead while keeping Prisma's type safety and DMMF. Three lines to add, zero lines to change, 2-7x faster reads.

DEV Community