Aditya Srivastava

Node.js vs Bun vs Go - A Multi-Layer HTTP Benchmark

🎯 Premise

I came across a video discussing Bun's new native image support claiming "blazing fast" performance. As a software developer with 4+ years building Node services, I wanted to quantify Bun's actual performance characteristics compared to Node.js in realistic scenarios.

I included Go as a baseline representing compiled, systems-level performance to contextualize the JavaScript runtime results.

⚡ What this benchmark tests: Raw HTTP throughput serving static JSON responses

❌ What this benchmark does NOT test: Database I/O, JSON parsing, real application logic, memory pressure, error handling, or long-tail latency under sustained load


🔬 Methodology: From Localhost to Cloud

Localhost benchmarks often produce misleading results because loopback traffic never leaves the machine: they measure CPU and in-memory copy speed rather than real-world network and system constraints. To uncover performance across different bottleneck layers, I tested three environments (the cloud environment was run twice, with 1 core and with 4 cores, giving four phases in total):

📊 Test Phases

| Phase | Environment | What It Reveals |
|---|---|---|
| 1️⃣ Localhost | Loopback interface | Pure event loop overhead |
| 2️⃣ LAN over Tailscale | WiFi network | Network I/O constraints |
| 3️⃣ Cloud Datacenter | DigitalOcean droplets | Removes local hardware limits |

⚙️ Test Configuration

Workload: GET /json → {"message": "Hello from [runtime]"}

Hardware:

  • 💻 Local: Windows dev machine, Tenda U10 USB WiFi adapter (WiFi 3)
  • ☁️ Cloud: DigitalOcean shared droplets (burstable vCPUs), 10 Gbps datacenter network

Load generator: wrk -t2 -c200 -d30s (2 threads, 200 open connections, 30 seconds)

⚠️ Important: Each test was run once. For production decisions, you'd want 5+ runs with statistical analysis (median, stddev, confidence intervals).
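As a rough sketch of what that statistical analysis could look like, the helper below summarizes the RPS figures from repeated runs of one configuration. The function name `summarizeRuns` and the sample values are hypothetical, and the confidence interval uses a normal approximation, which is optimistic for only a handful of runs.

```javascript
// Summarize repeated benchmark runs (requests/sec) for one configuration.
function summarizeRuns(rpsSamples) {
  if (rpsSamples.length < 2) {
    throw new Error("need at least two runs to estimate spread");
  }
  const sorted = [...rpsSamples].sort((a, b) => a - b);
  const n = sorted.length;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  // Sample variance (n - 1 in the denominator).
  const variance = sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
  const stddev = Math.sqrt(variance);
  // Rough 95% interval for the mean using the normal approximation (z = 1.96).
  // With only ~5 runs, a Student's t critical value would give a wider interval.
  const halfWidth = (1.96 * stddev) / Math.sqrt(n);
  return { n, median, mean, stddev, ci95: [mean - halfWidth, mean + halfWidth] };
}

// Hypothetical RPS values from five runs, for illustration only.
console.log(summarizeRuns([11200, 11850, 11705, 12010, 11490]));
```

If two runtimes' intervals overlap heavily, a single-run difference between them is probably not worth acting on.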


🏁 Phase 1: Localhost Baseline

Testing over the loopback interface to measure pure event loop and syscall overhead without network constraints.

📈 Results: Local Memory Performance

| Configuration | Node.js | Bun | Go |
|---|---|---|---|
| 1 CPU Core | ~14,000 RPS | ~28,000 RPS | ~29,000 RPS |
| 4 CPU Cores (Single Process) | ~16,000 RPS | ~30,000 RPS | ~115,000 RPS 🚀 |
| 4 CPU Cores (Multi-Process) | ~110,000 RPS | ~170,000 RPS 🏆 | N/A |

💡 Analysis

🧵 Single-threaded bottleneck: Node and Bun execute JavaScript on a single thread. Without process clustering, they max out one CPU core even with --cpus="4", leaving 3 cores idle. Go's M:N scheduler automatically spreads goroutines across all available cores in a single process.

📊 Multi-process scaling: Once clustered (Node's cluster module, Bun with reusePort: true spawned 4 times), both JavaScript runtimes showed strong scaling. Bun's lighter-weight native HTTP server showed ~55% higher throughput than Node's http module, which runs on the libuv event loop (V8 only executes the JavaScript).

✅ What this tells us: For CPU-bound workloads on the loopback interface, Bun has measurably lower per-request overhead than Node. Go's native multi-core scheduling eliminates the need for manual clustering.
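The ~55% figure and the scaling behaviour can be checked directly from the rounded numbers in the results table. This is a small arithmetic sketch; the helper names `percentHigher` and `scalingFactor` are mine, not part of any benchmark tooling.

```javascript
// Approximate RPS values copied from the Phase 1 results table.
const phase1 = {
  node: { oneCore: 14000, fourCoreMulti: 110000 },
  bun: { oneCore: 28000, fourCoreMulti: 170000 },
};

// Percentage by which `candidate` exceeds `baseline`.
function percentHigher(candidate, baseline) {
  return ((candidate - baseline) / baseline) * 100;
}

// How many times throughput grew from one configuration to another.
function scalingFactor(after, before) {
  return after / before;
}

// Bun vs Node, both multi-process on 4 cores.
console.log(percentHigher(phase1.bun.fourCoreMulti, phase1.node.fourCoreMulti).toFixed(1)); // "54.5"
// 1 core -> 4 cores (multi-process) for each runtime.
console.log(scalingFactor(phase1.node.fourCoreMulti, phase1.node.oneCore).toFixed(2)); // "7.86"
console.log(scalingFactor(phase1.bun.fourCoreMulti, phase1.bun.oneCore).toFixed(2)); // "6.07"
```

Both scaling factors come out above 4x on four cores, which rounded single-run numbers cannot explain on their own; that is one more reason to repeat the runs before reading much into the exact ratios.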


🌐 Phase 2: Network-Constrained Reality (Tailscale over WiFi)

Requests now traverse a physical network: MacBook → USB WiFi adapter → Tailscale WireGuard VPN → Windows dev machine.

📊 Results: Network-Bound I/O (4 Cores, Clustered)

| Metric | Node.js | Bun | Go |
|---|---|---|---|
| Throughput | 7,954 RPS | 12,519 RPS | 12,873 RPS 🏆 |
| Avg Latency | 26.79 ms | 16.49 ms | 15.69 ms ⚡ |
| Max Latency | 864.53 ms ⚠️ | 163.24 ms | 152.21 ms ✅ |
| Bandwidth | 1.62 MB/s | 1.76 MB/s | 1.67 MB/s |

💡 Analysis

🔌 Network becomes the bottleneck: All three runtimes collapsed from 110k-170k RPS on loopback to 8-13k RPS. The WiFi 3 USB adapter (theoretical max ~54 Mbps, real-world much lower) became the limiting factor.

⚠️ Node's outlier spike: The 864 ms max latency could come from either:

  • A garbage collection pause under network pressure
  • IPC coordination delays between the cluster primary and its workers when packets arrive in bursts

A single run cannot tell these apart. This should have been investigated with proper GC tuning flags and p99 analysis.

📌 What this tells us: This phase primarily measured my WiFi adapter's limitations, not runtime performance. However, it does show that once network I/O becomes the constraint, runtime choice matters less than network hardware quality.
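One way to sanity-check the "network is the constraint" reading is Little's Law: requests in flight equals throughput times average latency. The sketch below applies it to the Phase 2 table; `impliedConcurrency` is a name I made up for the illustration.

```javascript
// Little's Law: average requests in flight = throughput * average latency.
function impliedConcurrency(rps, avgLatencyMs) {
  return rps * (avgLatencyMs / 1000);
}

// Throughput and average latency copied from the Phase 2 results table.
const phase2 = [
  { runtime: "Node.js", rps: 7954, avgLatencyMs: 26.79 },
  { runtime: "Bun", rps: 12519, avgLatencyMs: 16.49 },
  { runtime: "Go", rps: 12873, avgLatencyMs: 15.69 },
];

for (const { runtime, rps, avgLatencyMs } of phase2) {
  console.log(runtime, impliedConcurrency(rps, avgLatencyMs).toFixed(0));
}
// Node.js 213
// Bun 206
// Go 202
```

All three land close to the 200 connections that wrk -c200 keeps open, so in this phase throughput is roughly 200 divided by round-trip latency, and the round trip is dominated by the WiFi link rather than by the runtime.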


☁️ Phase 3: Cloud Infrastructure (1 Core)

Moved to DigitalOcean droplets to remove local hardware constraints. The target and the load generator ran in the same datacenter.

🐳 Docker Configuration

docker run --rm --cpus="1" -m="512m" -p 3000:3000 [image]

⚠️ Note: --cpus="1" uses CFS CPU quotas, not core pinning. The container can still migrate between physical cores, which costs cache locality. I should have used --cpuset-cpus="0" for true single-core isolation.

📊 Results: Cloud Single-Core

| Metric | Node.js | Go | Bun |
|---|---|---|---|
| Throughput | 11,705 RPS | 13,935 RPS | 25,444 RPS 🚀 |
| Avg Latency | 29.13 ms | 22.83 ms | 7.89 ms ⚡ |
| Max Latency | 2,000.00 ms ⚠️ | 135.97 ms | 93.27 ms ✅ |
| Failed Requests | 60 (timeout) 🔴 | 0 ✅ | 0 ✅ |

💡 Analysis

🔴 Node's timeout failures: The 2,000 ms maximum matches wrk's default 2-second timeout, so the slowest requests hit the timeout ceiling and their true latency is unknown. GC pauses are one plausible cause of the 60 failed requests, but a single run does not prove it. This test should have been re-run with Node tuning flags (--max-old-space-size, --optimize-for-size) to determine whether this is fundamental or tunable.

🏆 Bun's single-core dominance: Roughly 1.8x Go's throughput on a single core is impressive, but remember this is for a trivial 40-byte JSON response. Real applications doing actual work may show different patterns.
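To test the GC-pause hypothesis rather than guess, each Node worker could record GC durations and event-loop delay during a run. The following is a minimal sketch using Node's built-in perf_hooks module; `startStallMonitor` and the shape of its summary object are my own naming, not something the benchmark used.

```javascript
const { PerformanceObserver, monitorEventLoopDelay } = require("node:perf_hooks");

// Start recording GC pauses and event-loop delay; call stop() for a summary.
function startStallMonitor() {
  let gcCount = 0;
  let maxGcMs = 0;
  const gcObserver = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      gcCount += 1;
      maxGcMs = Math.max(maxGcMs, entry.duration); // duration is in milliseconds
    }
  });
  gcObserver.observe({ entryTypes: ["gc"] });

  // Samples how late the event loop wakes up, at a 10 ms resolution.
  const loopDelay = monitorEventLoopDelay({ resolution: 10 });
  loopDelay.enable();

  return {
    stop() {
      gcObserver.disconnect();
      loopDelay.disable();
      return {
        gcCount,
        maxGcMs,
        // Histogram values are reported in nanoseconds.
        maxLoopDelayMs: loopDelay.max / 1e6,
        p99LoopDelayMs: loopDelay.percentile(99) / 1e6,
      };
    },
  };
}
```

A worker would call startStallMonitor() at startup and log the result of stop() when the run ends. If neither the longest GC pause nor the event-loop delay comes anywhere near 2 seconds, the stalls more likely originate outside the process, for example from CPU quota throttling on the shared droplet.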


🚀 Phase 4: Cloud Multi-Core (4 Cores)

Full resource allocation with clustering enabled for JavaScript runtimes.

🐳 Configuration

docker run --rm --cpus="4" -m="512m" -p 3000:3000 [image]

⚔️ The Load Generator

# Attacker VM - wrk in Alpine container
docker run --rm alpine sh -c "apk add --no-cache wrk && \
  wrk -t2 -c200 -d30s http://159.65.6.89:3000/json"

📊 Results: Cloud 4-Core Maximum Throughput

| Metric | Node.js | Go | Bun |
|---|---|---|---|
| Throughput | 31,025 RPS | 37,617 RPS | 53,446 RPS 🏆 |
| Total Requests (30s) | 933,074 | 1,130,171 | 1,605,818 🎯 |
| Avg Latency | 8.62 ms | 5.79 ms | 4.04 ms ⚡ |
| Max Latency | 641.25 ms ⚠️ | 218.91 ms | 76.54 ms ✅ |
| CPU Usage | 400%+ | 340% | 383% |

💡 CPU Efficiency Deep Dive

The raw CPU % numbers are misleading. What matters is CPU cost relative to throughput:

📊 Efficiency Ranking:
┌─────────────────────────────────────┐
│ Bun:  0.0072% CPU per RPS  🥇       │
│ Go:   0.0090% CPU per RPS  🥈       │
│ Node: 0.0129% CPU per RPS  🥉       │
└─────────────────────────────────────┘

Calculation (CPU % divided by requests per second; 100% = one core):

  • Node: 400% / 31,025 RPS = 0.0129% per RPS, or about 129 µs of CPU time per request (a lower bound, since Node reported 400%+)
  • Go: 340% / 37,617 RPS = 0.0090% per RPS, or about 90 µs of CPU time per request
  • Bun: 383% / 53,446 RPS = 0.0072% per RPS, or about 72 µs of CPU time per request
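The same ratio is easier to interpret as CPU time per request. The sketch below assumes the CPU column uses the usual container convention where 100% means one fully busy core; `cpuMicrosPerRequest` is a hypothetical helper name.

```javascript
// Convert CPU percent (100% = one core) and throughput into CPU time
// spent per request, in microseconds.
function cpuMicrosPerRequest(cpuPercent, rps) {
  const coreSecondsPerSecond = cpuPercent / 100;
  return (coreSecondsPerSecond / rps) * 1e6;
}

// CPU usage and throughput copied from the Phase 4 results table.
// Node's CPU figure was reported as "400%+", so its result is a lower bound.
const phase4 = [
  { runtime: "Node.js", cpuPercent: 400, rps: 31025 },
  { runtime: "Go", cpuPercent: 340, rps: 37617 },
  { runtime: "Bun", cpuPercent: 383, rps: 53446 },
];

for (const { runtime, cpuPercent, rps } of phase4) {
  console.log(runtime, cpuMicrosPerRequest(cpuPercent, rps).toFixed(1), "µs/request");
}
// Node.js 128.9 µs/request
// Go 90.4 µs/request
// Bun 71.7 µs/request
```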

💡 Key Insight: Bun spent the least CPU per request in this test, but Node's higher absolute CPU usage just means all workers are busy, which is what you want under load.

🤔 Go's "idle" CPU: The 340% (leaving 60% unused) might indicate:

  • GOMAXPROCS not set correctly
  • Network socket polling leaving CPU headroom
  • Or simply more efficient syscall handling

⚖️ Code Architecture Comparison

🚨 Critical Difference: The Go Code is Heavily Optimized

The Node and Bun code use default patterns, but the Go implementation uses production micro-optimizations:

Go optimizations applied:

var jsonResponse = []byte(`{"message":"Hello from Go!"}`)  // Pre-rendered
w.Write(jsonResponse)  // Direct byte write, no JSON encoding

A fair, equivalent comparison would be:

// Fair comparison - same as Node/Bun pattern
json.NewEncoder(w).Encode(map[string]string{"message": "Hello from Go!"})

⚠️ Impact: This would make Go roughly 15-20% slower but tests equivalent functionality. The current benchmark favors Go's implementation.
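The comparison can also be levelled in the other direction by giving Node the same pre-rendered body that the Go snippet uses. This is a sketch of that idea, not code that was benchmarked here; `JSON_BODY`, `JSON_HEADERS`, and `handleRequest` are illustrative names.

```javascript
// Serialize once at startup instead of on every request.
const JSON_BODY = Buffer.from(JSON.stringify({ message: "Hello from Node!" }));
const JSON_HEADERS = {
  "Content-Type": "application/json",
  "Content-Length": JSON_BODY.length,
};

// Same routing as the clustered Node listing, minus the per-request stringify.
function handleRequest(req, res) {
  if (req.method === "GET" && req.url === "/json") {
    res.writeHead(200, JSON_HEADERS);
    res.end(JSON_BODY);
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Usage inside a worker:
//   require("node:http").createServer(handleRequest).listen(3000);
```

Whichever direction is chosen, the point is that all three servers should do the same amount of work per request.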


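🔍 Bun "Clustering" Isn't Actually Clustering

The Bun code sets reusePort: true in a single server script. That option only sets SO_REUSEPORT on the listening socket, which lets several processes bind the same port so the kernel can distribute incoming connections among them. On its own it doesn't spawn multiple processes the way Node's cluster module does.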

For true architectural equivalence:

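// This is what would match Node's architecture:
// a launcher that starts 4 copies of server.js, each binding port 3000
// with reusePort: true so the kernel spreads connections across them.
import { spawn } from "bun";
for (let i = 0; i < 4; i++) {
  spawn(["bun", "server.js"], { stdout: "inherit", stderr: "inherit" });
}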

💡 Note: The current test compares single-process Bun vs multi-process Node, which actually makes Bun's performance even more impressive, but it should be disclosed.


🎯 What This Benchmark Actually Tells Us

✅ Valid Conclusions

1. ⚡ Bun's event loop has lower per-request overhead than Node for simple HTTP responses

2. 📈 Bun scales efficiently to multiple cores via kernel-level socket distribution

3. 🎯 Go provides predictable performance with excellent CPU efficiency

4. 🌐 Network hardware matters more than runtime choice once you hit I/O limits

5. 🔄 Node's cluster architecture has measurable IPC overhead under high load

❌ Invalid Conclusions

"Bun is production-ready"

Speed ≠ ecosystem maturity, debugger support, APM tooling

"Node is slow"

This tests static JSON echo; database-heavy apps show different patterns

"Go is always better"

Developer productivity, ecosystem, and deployment complexity matter

"These numbers apply to my app"

Real apps do parsing, validation, DB queries, business logic


🔍 Limitations & What's Missing

❌ Not Tested

  • Realistic payloads: 10KB+ JSON parsing and validation
  • Database I/O: Connection pooling, query performance
  • Memory pressure: Behavior at 80%+ RAM utilization
  • Sustained load: 24-hour endurance, memory leaks
  • Error handling: Behavior under packet loss, slow clients
  • Cold starts: Container spin-up time (critical for serverless)
  • Long-tail latency: p95, p99, p99.9 percentiles over hours
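For the long-tail item, percentiles can be computed from raw per-request latencies if the load tool exports them (wrk's --latency flag prints a summary distribution). Below is a minimal nearest-rank sketch; the `percentile` helper and the sample latencies are hypothetical.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical latencies in milliseconds: mostly fast, with two slow outliers.
const latenciesMs = [4, 5, 4, 6, 5, 4, 7, 5, 120, 640];
console.log(percentile(latenciesMs, 50)); // 5
console.log(percentile(latenciesMs, 90)); // 120
console.log(percentile(latenciesMs, 99)); // 640
```

The median barely notices the outliers while p90 and p99 are dominated by them, which is why an average or a single max value says little about tail behaviour.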

🔧 Methodological Improvements Needed

📊 Statistical rigor

5+ runs per config with statistical significance testing

🎯 Proper CPU pinning

Use --cpuset-cpus instead of --cpus

⚙️ GC tuning for Node

Test with optimized V8 flags

⚖️ Fair code comparison

Either optimize all three or use stock patterns for all

🔄 Proper clustering for Bun

Multi-process architecture to match Node


🎯 Production Recommendations

Choose based on your actual constraints:

🐰 Use Bun if:

✅ You have existing Node.js code and want drop-in performance gains
✅ Your workload is I/O-heavy API routing/proxying
✅ You're comfortable with a newer ecosystem (risk tolerance)
⚠️ You can handle potential edge cases in package compatibility

🔷 Use Go if:

✅ You need predictable resource consumption for Kubernetes limits
✅ Your team values type safety and compile-time checks
✅ You're building infrastructure/platform services
✅ You need maximum efficiency per CPU core
✅ Long-term stability and tooling maturity matter

🟢 Use Node.js if:

✅ You have existing Node infrastructure and expertise
✅ Your bottleneck is database/external services (not event loop)
✅ Ecosystem maturity and package availability are critical
✅ You need battle-tested observability/APM tooling

💎 The Real Takeaway

For a 40-byte JSON echo server, Bun is measurably faster than Node.js.

But real applications aren't JSON echo servers. Your actual bottlenecks are probably:

🗄️ Database query time
🌐 External API latency
🧮 Business logic complexity
📡 Network infrastructure

Profile your real workload before choosing a runtime based on microbenchmarks.

That said, Bun's performance characteristics are impressive and worth evaluating for I/O-heavy services where event loop overhead matters.


📝 Full Code Listings

Node.js (Clustered)

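const cluster = require('cluster');
const http = require('http');
const numCPUs = 4; // one worker per allocated core

// cluster.isPrimary replaces the deprecated cluster.isMaster (Node 16+)
if (cluster.isPrimary) {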
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/json') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ message: "Hello from Clustered Node!" }));
    } else {
      res.writeHead(404);
      res.end();
    }
  }).listen(3000);
}

Bun (Single Process with reusePort)

Bun.serve({
  port: 3000,
  reusePort: true, // Sets SO_REUSEPORT so several processes can share port 3000
  fetch(request) {
    const url = new URL(request.url);
    if (request.method === 'GET' && url.pathname === '/json') {
      return new Response(
        JSON.stringify({ message: "Hello from Bun!" }), 
        { headers: { 'Content-Type': 'application/json' } }
      );
    }
    return new Response("Not Found", { status: 404 });
  },
});

Go (Optimized - Not Fair Comparison)

⚠️ This version pre-renders the response and skips JSON encoding

package main

import (
    "fmt"
    "log"
    "net/http"
)

// Pre-rendered response eliminates JSON encoding overhead
var jsonResponse = []byte(`{"message":"Hello from Go!"}`)

// jsonHandler writes the static body for GET requests.
func jsonHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != http.MethodGet {
        w.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    w.Write(jsonResponse) // Note: Ignoring error - not production-safe
}

func main() {
    server := &http.Server{
        Addr:    ":3000",
        Handler: http.HandlerFunc(jsonHandler), // no router: every path gets the JSON body
    }
    fmt.Println("Go server running on port 3000")
    // ListenAndServe always returns a non-nil error; report it instead of dropping it
    log.Fatal(server.ListenAndServe())
}

Go (Fair Comparison - Uses JSON Encoding)

✅ This version matches Node/Bun's approach

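package main

import (
    "encoding/json"
    "log"
    "net/http"
)

// Response is the JSON payload returned by the /json endpoint.
type Response struct {
    Message string `json:"message"`
}

func jsonHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != http.MethodGet {
        w.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    // Encodes on every request, like JSON.stringify in the Node and Bun versions.
    // Encode appends a trailing newline; the error is ignored here for brevity.
    json.NewEncoder(w).Encode(Response{Message: "Hello from Go!"})
}

func main() {
    http.HandleFunc("/json", jsonHandler)
    // ListenAndServe always returns a non-nil error; report it instead of dropping it
    log.Fatal(http.ListenAndServe(":3000", nil))
}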

🙏 Acknowledgments

Thanks to the readers who will inevitably point out additional issues I missed. Benchmarking is hard, and there's always room for improvement.

If you want to reproduce these tests or improve the methodology, feel free to reach out!


Found this useful? Drop a ❤️ and let me know what you'd like to see benchmarked next!