DEV Community

ahmet gedik

Bun.js and Node.js Runtime Benchmark for a Video Metadata API

Our video metadata endpoint serves the same JSON shape a few hundred times a second: take a video ID, look up title, channel, duration, thumbnail set, and a handful of related IDs, then return it. On DailyWatch the canonical version of that endpoint is PHP 8.4 reading from SQLite with an FTS5 index, fronted by LiteSpeed and Cloudflare. It is fast and boring, which is exactly what I want from a hot path. But every quarter someone on the team asks whether we should move the read API to a JavaScript runtime so the frontend and the API share one language, and every quarter the question quietly turns into "Bun or Node?"

Instead of arguing about it from blog posts, I built the smallest honest version of our metadata endpoint twice — once tuned for Bun, once for Node — pointed them at the same SQLite file, and hammered both with an identical load generator. This is what I found, including the parts that made me keep PHP for the canonical path.

What the endpoint actually does

The real handler is not a hello world. It does three things that dominate latency:

  • A primary key lookup for the video row.
  • A small IN (...) query for related videos (usually 8 IDs).
  • JSON serialization of a nested object with about 30 fields.

No network calls, no upstream APIs. That matters, because a lot of "Bun is 3x faster" benchmarks measure an empty HTTP handler, where the only thing being timed is the runtime's socket accept loop. Real APIs spend their time in the database driver and the JSON encoder, not in accept(). So the benchmark had to include a real SQLite read against a real file — the same videos.db our cron jobs populate, copied to a tmpfs so disk seek noise did not pollute the numbers.

Here is the canonical PHP version, lightly trimmed, so the JS ports have something faithful to copy:

<?php
// metadata.php - the shape both JS runtimes have to match
declare(strict_types=1);

function fetch_metadata(PDO $db, string $videoId): ?array {
    $stmt = $db->prepare(
        'SELECT id, title, channel, duration_s, published_at, thumb_json
         FROM videos WHERE id = :id LIMIT 1'
    );
    $stmt->execute([':id' => $videoId]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null;
    }

    $rel = $db->prepare(
        'SELECT id FROM video_related WHERE source_id = :id LIMIT 8'
    );
    $rel->execute([':id' => $videoId]);
    $related = $rel->fetchAll(PDO::FETCH_COLUMN);

    return [
        'id'        => $row['id'],
        'title'     => $row['title'],
        'channel'   => $row['channel'],
        'duration'  => (int) $row['duration_s'],
        'published' => $row['published_at'],
        'thumbs'    => json_decode($row['thumb_json'], true),
        'related'   => $related,
    ];
}

$db = new PDO('sqlite:/dev/shm/videos.db');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA query_only = ON');
$db->exec('PRAGMA mmap_size = 268435456');

header('Content-Type: application/json');
echo json_encode(fetch_metadata($db, $_GET['id'] ?? ''));

Under LiteSpeed with OPcache warm, this handler returns in about 0.4 ms of server time for a cache miss, and Cloudflare absorbs the rest. The goal for the JS ports was simply: can either of them beat that without giving up correctness?

The Bun version

Bun ships its own SQLite driver, bun:sqlite, implemented natively and bound directly into the runtime. There is no node-gyp build step and no third-party native module to keep current. It also exposes prepared statements that you reuse across requests, which is the single most important optimization for this workload: preparing a statement per request roughly doubled my latency in early runs.

// server.bun.js - run with: bun run server.bun.js
import { Database } from "bun:sqlite";

const db = new Database("/dev/shm/videos.db", { readonly: true });
db.exec("PRAGMA mmap_size = 268435456");

// Prepare ONCE, reuse across every request.
const getVideo = db.query(
  `SELECT id, title, channel, duration_s, published_at, thumb_json
   FROM videos WHERE id = ?1 LIMIT 1`
);
const getRelated = db.query(
  `SELECT id FROM video_related WHERE source_id = ?1 LIMIT 8`
);

function metadata(id) {
  const row = getVideo.get(id);
  if (!row) return null;
  const related = getRelated.all(id).map((r) => r.id);
  return {
    id: row.id,
    title: row.title,
    channel: row.channel,
    duration: row.duration_s,
    published: row.published_at,
    thumbs: JSON.parse(row.thumb_json),
    related,
  };
}

Bun.serve({
  port: 3001,
  fetch(req) {
    const id = new URL(req.url).searchParams.get("id") ?? "";
    const data = metadata(id);
    if (!data) return new Response("null", { status: 404 });
    return Response.json(data);
  },
});

console.log("bun listening on :3001");

Two things stand out here. Bun.serve is the native HTTP server, not an Express-style abstraction, so there is no middleware stack between the socket and your handler. And bun:sqlite returns plain objects you can hand almost directly to Response.json. The whole file is one process, no clustering, no flags.

The Node version

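Node 22 finally has a built-in SQLite module (node:sqlite), still marked experimental, plus the mature better-sqlite3 native addon. I tested both; the listing below is the node:sqlite variant. The --experimental-sqlite flag in its run comment is needed on the 22.x releases that first shipped the module, while newer releases load node:sqlite without it. To keep the comparison fair I used the same synchronous prepared-statement pattern and Node's built-in http server rather than Express, because adding Express would benchmark Express, not Node.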

// server.node.js - run with: node --experimental-sqlite server.node.js
import { createServer } from "node:http";
import { DatabaseSync } from "node:sqlite";

const db = new DatabaseSync("/dev/shm/videos.db", { readOnly: true });
db.exec("PRAGMA mmap_size = 268435456");

const getVideo = db.prepare(
  `SELECT id, title, channel, duration_s, published_at, thumb_json
   FROM videos WHERE id = ? LIMIT 1`
);
const getRelated = db.prepare(
  `SELECT id FROM video_related WHERE source_id = ? LIMIT 8`
);

function metadata(id) {
  const row = getVideo.get(id);
  if (!row) return null;
  const related = getRelated.all(id).map((r) => r.id);
  return {
    id: row.id,
    title: row.title,
    channel: row.channel,
    duration: row.duration_s,
    published: row.published_at,
    thumbs: JSON.parse(row.thumb_json),
    related,
  };
}

createServer((req, res) => {
  const url = new URL(req.url, "http://localhost");
  const data = metadata(url.searchParams.get("id") ?? "");
  if (!data) {
    res.writeHead(404).end("null");
    return;
  }
  res.writeHead(200, { "content-type": "application/json" });
  res.end(JSON.stringify(data));
}).listen(3002, () => console.log("node listening on :3002"));

The code is almost identical, which is the point. Any difference in throughput now comes from the runtime and its driver, not from how I wrote the handler.

The load generator

I did not want to benchmark with a Node-based tool, because then the client's runtime becomes a variable. I wrote a tiny Go load generator instead — Go's goroutine model makes it trivial to saturate a local server, and its scheduler is completely independent of whatever I am testing. It fires a fixed number of concurrent workers, replays a list of real video IDs, and records per-request latency.

// loadgen.go - go run loadgen.go http://localhost:3001 60 30000
package main

import (
    "bufio"
    "fmt"
    "io"
    "net/http"
    "os"
    "sort"
    "strconv"
    "sync"
    "sync/atomic"
    "time"
)

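func main() {
    if len(os.Args) != 4 {
        fmt.Fprintln(os.Stderr, "usage: loadgen <base-url> <workers> <total>")
        os.Exit(2)
    }
    base := os.Args[1]
    workers, errW := strconv.Atoi(os.Args[2])
    total, errT := strconv.Atoi(os.Args[3])
    if errW != nil || errT != nil || workers < 1 || total < 1 {
        fmt.Fprintln(os.Stderr, "workers and total must be positive integers")
        os.Exit(2)
    }

    ids := loadIDs("ids.txt")
    if len(ids) == 0 {
        fmt.Fprintln(os.Stderr, "ids.txt is missing or empty")
        os.Exit(1)
    }

    // Go's default transport keeps only 2 idle connections per host, so with
    // more workers than that, many requests would open a fresh TCP
    // connection. Let every worker keep its own keep-alive connection.
    client := &http.Client{
        Timeout:   5 * time.Second,
        Transport: &http.Transport{MaxIdleConnsPerHost: workers},
    }
    var done, failed int64
    lat := make([]time.Duration, total)
    var wg sync.WaitGroup

    start := time.Now()
    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                // Claim the next request index; stop once all are claimed.
                i := atomic.AddInt64(&done, 1) - 1
                if i >= int64(total) {
                    return
                }
                id := ids[int(i)%len(ids)]
                t0 := time.Now()
                resp, err := client.Get(base + "/?id=" + id)
                if err != nil {
                    atomic.AddInt64(&failed, 1)
                } else {
                    // Drain the body so the connection can be reused.
                    io.Copy(io.Discard, resp.Body)
                    resp.Body.Close()
                    if resp.StatusCode != http.StatusOK {
                        atomic.AddInt64(&failed, 1)
                    }
                }
                lat[i] = time.Since(t0)
            }
        }()
    }
    wg.Wait()
    elapsed := time.Since(start)

    // Report failures too: a run with errors is not a valid measurement.
    sort.Slice(lat, func(i, j int) bool { return lat[i] < lat[j] })
    fmt.Printf("rps=%.0f p50=%v p99=%v failed=%d\n",
        float64(total)/elapsed.Seconds(),
        lat[total*50/100], lat[total*99/100], failed)
}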

func loadIDs(path string) []string {
    f, err := os.Open(path)
    if err != nil {
        fmt.Fprintln(os.Stderr, "loadgen:", err)
        os.Exit(1)
    }
    defer f.Close()
    var out []string
    s := bufio.NewScanner(f)
    for s.Scan() {
        out = append(out, s.Text())
    }
    return out
}

Running the same binary against both servers means the only thing that changes between runs is the port. I used 60 concurrent workers and 30,000 requests per run, repeated five times per runtime, discarding the first run as warmup, on a 4-core cloud box with the database on tmpfs.
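
The run protocol (several runs per runtime, first one discarded, median of the rest) can be sketched in a few lines of Python. This is a minimal sketch, not the script used for this post: the function name median_warm_rps and the sample log lines are hypothetical, and the only thing it assumes about the load generator is that each run prints a line containing rps=<number>.

```python
import re
import statistics

# Matches the integer rps value in a loadgen output line.
RPS_RE = re.compile(r"rps=(\d+)")

def median_warm_rps(lines, warmup_runs=1):
    """Return the median rps of the runs left after dropping warmup runs."""
    rps = []
    for line in lines:
        m = RPS_RE.search(line)
        if m:
            rps.append(int(m.group(1)))
    warm = rps[warmup_runs:]
    if not warm:
        raise ValueError("no warm runs left after discarding warmup")
    return statistics.median(warm)

# Hypothetical log lines in the loadgen output format (made-up values).
sample = [
    "rps=900 p50=1.1ms p99=3ms",   # first run, treated as warmup
    "rps=1000 p50=1ms p99=2.9ms",
    "rps=1020 p50=1ms p99=2.8ms",
    "rps=980 p50=1ms p99=3.1ms",
    "rps=1010 p50=1ms p99=2.9ms",
]
print(median_warm_rps(sample))
```

Taking the median rather than the mean keeps one unlucky run from dragging the headline number around.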

The numbers

Here is the median of the warm runs. Latency is wall-clock from the Go client, so it includes loopback overhead that is identical for both.

  • Bun 1.1 + bun:sqlite: ~58,000 rps, p50 0.92 ms, p99 2.7 ms
  • Node 22 + better-sqlite3: ~41,000 rps, p50 1.34 ms, p99 4.1 ms
  • Node 22 + node:sqlite: ~36,000 rps, p50 1.51 ms, p99 5.3 ms
  • PHP 8.4 + LiteSpeed (reference): ~33,000 rps, p50 1.6 ms, p99 6.0 ms

Bun was roughly 40% ahead of Node-with-better-sqlite3 on throughput and noticeably tighter at the tail. The built-in node:sqlite module trailed the mature better-sqlite3 addon, which surprised me less after reading that it is still stabilizing. The gap between Bun and Node here is real but it is not the 3-4x figure you see in empty-handler benchmarks — once a real SQLite read and JSON encode dominate the request, the runtime's HTTP layer is a smaller slice of the pie.
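
To make the "roughly 40%" concrete, the relative leads can be recomputed from the rounded medians in the list above. A minimal sketch: the dictionary keys and the helper name lead_pct are mine, and the inputs are the approximate figures quoted here, not the raw run data.

```python
# Approximate median throughput quoted in the list above (requests/second).
rps = {
    "bun + bun:sqlite": 58_000,
    "node + better-sqlite3": 41_000,
    "node + node:sqlite": 36_000,
    "php + litespeed": 33_000,
}

def lead_pct(fast, slow):
    """Throughput lead of `fast` over `slow`, as a percentage."""
    return (fast / slow - 1) * 100

bun = rps["bun + bun:sqlite"]
for name, value in rps.items():
    if name != "bun + bun:sqlite":
        print(f"bun vs {name:22} +{lead_pct(bun, value):.0f}%")
```

On these rounded inputs the lead over better-sqlite3 works out to about 41%, and every ratio stays well under 2x.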

A few things I want to be honest about:

  • The win shrinks as work grows. When I added a deliberately heavier query (FTS5 search instead of a primary-key lookup), Bun's lead fell to about 12%, because both runtimes were now waiting on the same SQLite engine.
  • Warmup matters for Node. Node's first 2-3 seconds under load were visibly slower while the JIT warmed up. Bun reached steady state faster. For a long-lived server this is irrelevant; for short-lived serverless invocations it is not.
  • Memory was close. Bun's RSS sat around 95 MB, Node around 110 MB under the same load. Neither is a problem at this scale.
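
The first point has a simple shape: if each request costs some runtime-specific overhead plus a shared amount of SQLite work, the relative lead shrinks as the shared part grows. The sketch below is only an illustration of that reasoning. The cost model and every number in it are hypothetical, not measurements from these runs.

```python
def lead_pct(overhead_fast, overhead_slow, shared):
    """Throughput lead when each request costs runtime overhead + shared work.

    Throughput is taken as 1 / (time per request), so the lead is the
    ratio of per-request times minus one.
    """
    return ((overhead_slow + shared) / (overhead_fast + shared) - 1) * 100

# Hypothetical per-request costs in microseconds, not measurements.
FAST, SLOW = 10.0, 20.0
for shared in (5.0, 20.0, 80.0):
    print(f"shared={shared:5.1f}us lead={lead_pct(FAST, SLOW, shared):5.1f}%")
```

With the overheads fixed, growing only the shared cost is enough to pull a large lead down toward single digits, which is the same direction the FTS5 run moved.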

A small Python script to keep the runs honest

Benchmarks lie when you eyeball one run. I logged each run's output and used a short Python script to check that the variance between runs was small enough to trust the medians. If the runs disagree by more than a few percent, the box is noisy and the comparison is meaningless.

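# variance_check.py - feed it the rps from each run
import statistics

def summarize(label, samples):
    """Print mean and coefficient of variation (CV) for one runtime's runs."""
    mean = statistics.mean(samples)
    # Sample standard deviation: a handful of runs is a sample, not the
    # whole population, so stdev is the more conservative choice than pstdev.
    stdev = statistics.stdev(samples)
    cv = (stdev / mean) * 100
    flag = "OK" if cv < 5 else "NOISY"
    print(f"{label:12} mean={mean:8.0f} cv={cv:4.1f}% [{flag}]")
    return cv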

bun  = [57900, 58400, 57600, 58800, 58100]
node = [40800, 41200, 41500, 40600, 41100]

summarize("bun", bun)
summarize("node", node)

if min(bun) > max(node):
    print("verdict: bun strictly faster across all runs")
else:
    print("verdict: runs overlap, gap is within noise")

With a coefficient of variation under 5% on both, and Bun's slowest run still faster than Node's fastest, I trusted the result. This step is the difference between "Bun is faster" and "Bun was faster on my box on Tuesday."

So why is the canonical API still PHP

Bun won the benchmark cleanly, and if I were starting a new JavaScript service today I would reach for it over Node without hesitation for this kind of read-heavy JSON workload. But raw runtime throughput was never the bottleneck for our metadata path, and the benchmark made that obvious in a way I should have predicted:

  • Cloudflare already eats most reads. With sane cache headers, the vast majority of metadata requests never touch the origin at all. A 40% origin speedup on the 5% of traffic that misses cache is a rounding error on the user-facing number.
  • LiteSpeed plus OPcache is not the slow part. The PHP reference came within striking distance of Node despite zero tuning, because the request is dominated by SQLite and serialization, which are the same regardless of language.
  • Operational surface area has a cost. Our cron jobs, sitemap generation, IndexNow pings, and admin panel are all PHP against the same SQLite file. Introducing a second runtime to shave sub-millisecond origin latency means two deploy pipelines, two dependency trees, and two places a SQLite write lock can bite.
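
The cache argument is easy to put a number on. A rough sketch, assuming the 5% miss rate mentioned above and using the client-side p50 figures from the results (1.6 ms for PHP, 0.92 ms for Bun) as a stand-in for average origin latency, which is a simplification. The helper name mean_saving_ms is mine.

```python
def mean_saving_ms(miss_rate, origin_before_ms, origin_after_ms):
    """Average latency saved per request when only cache misses hit the origin."""
    return miss_rate * (origin_before_ms - origin_after_ms)

# 5% miss rate and the p50 figures come from this post; treating p50 as
# the mean origin latency is a simplifying assumption.
saving = mean_saving_ms(0.05, 1.6, 0.92)
print(f"{saving * 1000:.0f} microseconds saved per request on average")
```

Under those assumptions the average request gets faster by a few tens of microseconds, far below anything a user would notice behind a CDN.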

Where I would use Bun is a genuinely new, latency-sensitive service that does not share the PHP codebase — a websocket layer for live view counts, or an edge-side metadata aggregator that fans out to several internal endpoints. There, Bun's startup speed, native SQLite driver, and tighter tail latency are real advantages, and there is no incumbent PHP code to duplicate.

Conclusion

Bun beat Node on this workload by about 40% on throughput and held a tighter p99, and its built-in SQLite driver removed a whole class of native-module maintenance pain. The built-in node:sqlite module is promising but not yet a reason to switch off better-sqlite3. If you are choosing a JavaScript runtime for a read-heavy JSON API in 2026, Bun is the easy pick.

But benchmark the thing you actually ship, not an empty handler. Once a real database read and a real JSON encode are in the loop, the runtime gap narrows, and the bigger wins live in your cache layer and your query plan. For us, the metadata path stays PHP plus SQLite behind LiteSpeed and Cloudflare, because the runtime was never the bottleneck — and the fastest request is still the one that never reaches your origin at all.
