Node.js vs Bun vs Go: A Multi-Layer HTTP Benchmark
Premise
I came across a video discussing Bun's new native image support claiming "blazing fast" performance. As a software developer with 4+ years building Node services, I wanted to quantify Bun's actual performance characteristics compared to Node.js in realistic scenarios.
I included Go as a baseline representing compiled, systems-level performance to contextualize the JavaScript runtime results.
What this benchmark tests: Raw HTTP throughput serving static JSON responses
What this benchmark does NOT test: Database I/O, JSON parsing, real application logic, memory pressure, error handling, or long-tail latency under sustained load
Methodology: From Localhost to Cloud
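Localhost benchmarks often produce misleading results because traffic never leaves the machine: they measure loopback and in-memory copy speed rather than real-world network and system constraints. To expose the bottleneck at each layer, I tested three environments, with the cloud environment split into a single-core and a 4-core run (four phases in total):
Test Phases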
| Phase | Environment | What It Reveals |
|---|---|---|
| 1: Localhost | Loopback interface | Pure event loop overhead |
| 2: LAN over Tailscale | WiFi network | Network I/O constraints |
| 3 and 4: Cloud Datacenter | DigitalOcean droplets | Removes local hardware limits |
Test Configuration
Workload: GET /json → {"message": "Hello from [runtime]"}
Hardware:
- Local: Windows dev machine, Tenda U10 USB WiFi adapter (WiFi 3)
- Cloud: DigitalOcean shared droplets (burstable vCPUs), 10 Gbps datacenter network
Load generator: wrk -t2 -c200 -d30s
Important: Each test was run once. For production decisions, you'd want 5+ runs with statistical analysis (median, stddev, confidence intervals).
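As a rough sketch of what that statistical pass could look like, the helper below summarizes the RPS figures from repeated runs. The function name and the sample values are hypothetical (the article only has one run per configuration), and the confidence interval uses a normal approximation, which is optimistic for only five runs.

```javascript
// Summarize requests-per-second figures collected from repeated wrk runs.
function summarizeRuns(rpsSamples) {
  if (rpsSamples.length < 2) {
    throw new Error("need at least two runs to estimate spread");
  }
  const sorted = [...rpsSamples].sort((a, b) => a - b);
  const n = sorted.length;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  // Sample standard deviation (divides by n - 1).
  const variance = sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
  const stddev = Math.sqrt(variance);
  // Approximate 95% interval for the mean (z = 1.96, normal approximation).
  // With very few runs, a Student's t critical value would give a wider interval.
  const halfWidth = (1.96 * stddev) / Math.sqrt(n);
  return { n, median, mean, stddev, ci95: [mean - halfWidth, mean + halfWidth] };
}

// Made-up values for illustration only; these are not measurements.
const exampleRuns = [100, 102, 98, 101, 99];
console.log(summarizeRuns(exampleRuns));
```

If the intervals of two runtimes overlap heavily, a single-run difference between them should not be read as a real gap.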
Phase 1: Localhost Baseline
Testing over the loopback interface to measure pure event loop and syscall overhead without network constraints.
Results: Local Memory Performance
| Configuration | Node.js | Bun | Go |
|---|---|---|---|
| 1 CPU Core | ~14,000 RPS | ~28,000 RPS | ~29,000 RPS |
| 4 CPU Cores (Single Process) | ~16,000 RPS | ~30,000 RPS | ~115,000 RPS |
| 4 CPU Cores (Multi-Process) | ~110,000 RPS | ~170,000 RPS | N/A |
Analysis
Single-threaded bottleneck: Node and Bun run JavaScript on a single thread. Without process clustering, each maxes out one CPU core even with --cpus="4", leaving three cores idle. Go's M:N scheduler spreads goroutines across all available cores within a single process.
Multi-process scaling: Once clustered (Node's cluster module; Bun with reusePort: true, spawned 4 times), both JavaScript runtimes scaled strongly. Bun (written largely in Zig, running JavaScriptCore) delivered ~55% higher throughput than Node (V8 with the libuv event loop): ~170,000 vs ~110,000 RPS.
What this tells us: For this CPU-bound workload on the loopback interface, Bun has measurably lower per-request overhead than Node. Go's native multi-core scheduling removes the need for manual clustering.
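The percentage quoted above can be checked directly from the Phase 1 table. This is only the arithmetic; the function name is made up for the example, and the inputs are the approximate RPS values from the table.

```javascript
// Relative throughput gain of `candidateRps` over `baselineRps`, in percent.
function relativeGainPercent(candidateRps, baselineRps) {
  return (candidateRps / baselineRps - 1) * 100;
}

// Approximate RPS values from the Phase 1 results table.
const singleCore = { node: 14000, bun: 28000 };
const clustered = { node: 110000, bun: 170000 };

console.log(relativeGainPercent(singleCore.bun, singleCore.node).toFixed(1)); // "100.0"
console.log(relativeGainPercent(clustered.bun, clustered.node).toFixed(1)); // "54.5"
```

The gap narrows from about 100% on one core to about 55% once both runtimes are clustered, so the headline figure depends heavily on which row is being quoted.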
Phase 2: Network-Constrained Reality (Tailscale over WiFi)
Requests now traverse a physical network: MacBook → USB WiFi adapter → Tailscale WireGuard VPN → Windows dev machine.
Results: Network-Bound I/O (4 Cores, Clustered)
| Metric | Node.js | Bun | Go |
|---|---|---|---|
| Throughput | 7,954 RPS | 12,519 RPS | 12,873 RPS |
| Avg Latency | 26.79 ms | 16.49 ms | 15.69 ms |
| Max Latency | 864.53 ms | 163.24 ms | 152.21 ms |
| Bandwidth | 1.62 MB/s | 1.76 MB/s | 1.67 MB/s |
Analysis
Network becomes the bottleneck: All three runtimes dropped from 110k-170k RPS (multi-core localhost) to 8-13k RPS. The WiFi 3 USB adapter (theoretical max ~54 Mbps, real-world much lower) became the limiting factor.
Node's outlier spike: The 864 ms max latency suggests either:
- A garbage collection pause under network pressure
- IPC coordination delays between the cluster primary and its workers when packets arrive in bursts
This should have been investigated with proper GC tuning flags and p99 analysis.
What this tells us: This phase primarily measured my WiFi adapter's limitations, not runtime performance. However, it does show that once network I/O becomes the constraint, runtime choice matters less than network hardware quality.
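A quick sanity check on the Phase 2 table is to turn the bandwidth row into bytes per response and megabits per second. The sketch below assumes wrk's "MB" means 1,048,576 bytes (an assumption about its output formatting) and that the reported transfer rate counts response headers as well as the body.

```javascript
// Assumption: wrk reports transfer rate in binary units (1 MB = 1,048,576 bytes).
const BYTES_PER_MB = 1024 * 1024;

// Average bytes received per response, headers included.
function bytesPerResponse(transferMBps, rps) {
  return (transferMBps * BYTES_PER_MB) / rps;
}

// Transfer rate converted to megabits per second (decimal megabits).
function megabitsPerSecond(transferMBps) {
  return (transferMBps * BYTES_PER_MB * 8) / 1e6;
}

// Throughput and bandwidth rows from the Phase 2 table.
const phase2 = {
  node: { rps: 7954, transferMBps: 1.62 },
  bun: { rps: 12519, transferMBps: 1.76 },
  go: { rps: 12873, transferMBps: 1.67 },
};

for (const [runtime, r] of Object.entries(phase2)) {
  const perResponse = bytesPerResponse(r.transferMBps, r.rps).toFixed(0);
  const mbit = megabitsPerSecond(r.transferMBps).toFixed(1);
  console.log(`${runtime}: ~${perResponse} bytes/response, ~${mbit} Mbit/s`);
}
```

Under those assumptions each response is roughly 135 to 215 bytes on the wire, so HTTP headers outweigh the JSON body, and the link carries only about 14 Mbit/s of response data. That is well under the adapter's theoretical 54 Mbps, which hints that per-packet overhead rather than raw bandwidth may be the tighter limit; the test did not measure this directly.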
Phase 3: Cloud Infrastructure (1 Core)
I moved to DigitalOcean droplets to remove local hardware constraints. The target and the load generator ran in the same datacenter.
Docker Configuration
docker run --rm --cpus="1" -m="512m" -p 3000:3000 [image]
Note: --cpus="1" uses CFS CPU quotas, not core pinning. The container can still migrate between physical cores, which costs cache locality. I should have used --cpuset-cpus="0" for true single-core isolation.
Results: Cloud Single-Core
| Metric | Node.js | Go | Bun |
|---|---|---|---|
| Throughput | 11,705 RPS | 13,935 RPS | 25,444 RPS |
| Avg Latency | 29.13 ms | 22.83 ms | 7.89 ms |
| Max Latency | 2,000.00 ms | 135.97 ms | 93.27 ms |
| Failed Requests | 60 (timeout) | 0 | 0 |
Analysis
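Node's timeout failures: The 60 failed requests and the 2,000 ms max latency are consistent with GC pauses, but that was not verified. The 2,000 ms figure also matches wrk's default 2-second timeout, so the true worst case may be higher. This test should have been re-run with V8 tuning flags (--max-old-space-size, --optimize-for-size) to determine whether this is fundamental or tunable.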
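One way to test the GC hypothesis rather than guess is to record GC pause durations inside the Node process during a run. The sketch below relies on the PerformanceObserver global and its "gc" entry type, which recent Node.js versions provide; the helper names are hypothetical and nothing like this was part of the original benchmark.

```javascript
// Running totals for observed GC pauses.
function createGcStats() {
  return { count: 0, totalMs: 0, maxMs: 0 };
}

// Fold one pause duration (milliseconds) into the stats object.
function recordPause(stats, durationMs) {
  stats.count += 1;
  stats.totalMs += durationMs;
  stats.maxMs = Math.max(stats.maxMs, durationMs);
  return stats;
}

// Start observing GC events; returns a function that stops observing.
function watchGcPauses(stats) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      recordPause(stats, entry.duration); // entry.duration is in milliseconds
    }
  });
  observer.observe({ entryTypes: ["gc"] });
  return () => observer.disconnect();
}

// Usage inside each worker (not executed here):
//   const stats = createGcStats();
//   const stop = watchGcPauses(stats);
//   setInterval(() => console.log(process.pid, stats), 5000).unref();
```

If the longest recorded pause is a few milliseconds while requests time out at two seconds, GC is unlikely to be the cause and the investigation should move to connection handling or CPU throttling.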
Bun's single-core dominance: Nearly double Go's throughput on a single core (25,444 vs 13,935 RPS) is impressive, but remember this is for a trivial ~40-byte JSON response. Real applications doing actual work may show different patterns.
Phase 4: Cloud Multi-Core (4 Cores)
Full resource allocation with clustering enabled for JavaScript runtimes.
Configuration
docker run --rm --cpus="4" -m="512m" -p 3000:3000 [image]
The Load Generator
# Attacker VM - wrk in Alpine container
docker run --rm alpine sh -c "apk add --no-cache wrk && \
wrk -t2 -c200 -d30s http://159.65.6.89:3000/json"
Results: Cloud 4-Core Maximum Throughput
| Metric | Node.js | Go | Bun |
|---|---|---|---|
| Throughput | 31,025 RPS | 37,617 RPS | 53,446 RPS |
| Total Requests (30s) | 933,074 | 1,130,171 | 1,605,818 |
| Avg Latency | 8.62 ms | 5.79 ms | 4.04 ms |
| Max Latency | 641.25 ms | 218.91 ms | 76.54 ms |
| CPU Usage | 400%+ | 340% | 383% |
CPU Efficiency Deep Dive
The raw CPU % numbers are misleading on their own. What matters is CPU cost relative to throughput, expressed here as CPU % per unit of RPS (lower is better):
Efficiency Ranking:
1. Bun: 0.0072% CPU per RPS
2. Go: 0.0090% CPU per RPS
3. Node: 0.0129% CPU per RPS
Calculation (Node's reading was 400%+, so its figure is a lower bound):
- Node: 400% / 31,025 RPS = 0.0129% per RPS
- Go: 340% / 37,617 RPS = 0.0090% per RPS
- Bun: 383% / 53,446 RPS = 0.0072% per RPS
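The same ratio is easier to reason about as CPU time per request. A CPU reading of 400% means four core-seconds are consumed every second, so dividing by RPS gives core-seconds per request. The sketch below does that conversion with the Phase 4 numbers; the function name is made up for the example.

```javascript
// CPU time spent per request, in microseconds.
// cpuPercent / 100 = core-seconds consumed per wall-clock second.
function cpuMicrosPerRequest(cpuPercent, rps) {
  return (cpuPercent / 100 / rps) * 1e6;
}

// CPU usage and throughput rows from the Phase 4 table.
const phase4 = {
  node: { cpuPercent: 400, rps: 31025 }, // reported as "400%+", so a lower bound
  go: { cpuPercent: 340, rps: 37617 },
  bun: { cpuPercent: 383, rps: 53446 },
};

for (const [runtime, r] of Object.entries(phase4)) {
  const micros = cpuMicrosPerRequest(r.cpuPercent, r.rps).toFixed(1);
  console.log(`${runtime}: ~${micros} µs of CPU per request`);
}
```

That works out to roughly 129 µs for Node, 90 µs for Go, and 72 µs for Bun, which is the same ranking in units that can be compared against other per-request costs.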
Key Insight: Bun is the most efficient of the three in this test. Node's higher absolute CPU usage simply means all of its workers are busy, which is what you want under load.
Go's "idle" CPU: The 340% reading (60% of the four-core allocation unused) might indicate:
- GOMAXPROCS not set correctly
- Network socket polling leaving CPU headroom
- Or simply more efficient syscall handling
Code Architecture Comparison
Critical Difference: The Go Code is Heavily Optimized
The Node and Bun code use default patterns, but the Go implementation uses a production micro-optimization: it pre-renders the response and skips JSON encoding on every request.
Go optimizations applied:
var jsonResponse = []byte(`{"message":"Hello from Go!"}`) // Pre-rendered
w.Write(jsonResponse) // Direct byte write, no JSON encoding
An equivalent, fair comparison would be:
// Fair comparison - same as Node/Bun pattern
json.NewEncoder(w).Encode(map[string]string{"message": "Hello from Go!"})
Impact: The fair version makes Go ~15-20% slower but tests equivalent functionality. The benchmark as run favors Go's implementation.
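The comparison can also be leveled in the other direction, by giving the JavaScript servers the same pre-rendering trick. Below is a minimal sketch for Node, assuming the same /json route as the benchmark; the message text and the names jsonBody, jsonHeaders and handleRequest are illustrative, and this variant was not part of the measured runs.

```javascript
// Serialize once at startup, mirroring Go's pre-rendered []byte.
const jsonBody = Buffer.from(JSON.stringify({ message: "Hello from Node!" }));
const jsonHeaders = {
  "Content-Type": "application/json",
  "Content-Length": jsonBody.length,
};

// Same routing as the benchmark's Node server, minus the per-request stringify.
function handleRequest(req, res) {
  if (req.method === "GET" && req.url === "/json") {
    res.writeHead(200, jsonHeaders);
    res.end(jsonBody); // no JSON.stringify on the hot path
  } else {
    res.writeHead(404);
    res.end();
  }
}

// To serve it (not executed here):
//   require("http").createServer(handleRequest).listen(3000);
```

Whichever direction is chosen, the point is that all three servers should do the same amount of work per request before their numbers are compared.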
Bun "Clustering" Isn't Actually Clustering
The Bun code sets reusePort: true in a single process. This option (SO_REUSEPORT) lets the kernel load-balance connections across processes bound to the same port, but it doesn't spawn multiple processes the way Node's cluster module does.
For true architectural equivalence:
// This is what would match Node's architecture: four processes sharing port 3000
import { spawn } from "bun";
for (let i = 0; i < 4; i++) {
  // Each child runs its own Bun.serve with reusePort: true
  spawn(["bun", "server.js"], { stdout: "inherit", stderr: "inherit" });
}
Note: The current test compares single-process Bun with multi-process Node, which makes Bun's performance even more impressive, but this should be disclosed.
What This Benchmark Actually Tells Us
Valid Conclusions
1. Bun's event loop has lower per-request overhead than Node for simple HTTP responses
2. Bun scales efficiently to multiple cores via kernel-level socket distribution
3. Go provides predictable performance with excellent CPU efficiency
4. Network hardware matters more than runtime choice once you hit I/O limits
5. Node showed the largest latency spikes in every networked phase; cluster IPC overhead under high load is one possible cause, though it was not isolated here
Invalid Conclusions
- "Bun is production-ready": Speed ≠ ecosystem maturity, debugger support, APM tooling
- "Node is slow": This tests static JSON echo; database-heavy apps show different patterns
- "Go is always better": Developer productivity, ecosystem, and deployment complexity matter
- "These numbers apply to my app": Real apps do parsing, validation, DB queries, business logic
Limitations & What's Missing
Not Tested
- Realistic payloads: 10KB+ JSON parsing and validation
- Database I/O: Connection pooling, query performance
- Memory pressure: Behavior at 80%+ RAM utilization
- Sustained load: 24-hour endurance, memory leaks
- Error handling: Behavior under packet loss, slow clients
- Cold starts: Container spin-up time (critical for serverless)
- Long-tail latency: p95, p99, p99.9 percentiles over hours
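For the long-tail item in particular, the summary statistics used in this article (average and max) hide most of the story. wrk can print a latency distribution with its --latency flag; if raw per-request latencies are collected some other way, percentiles can be computed as in the sketch below. It uses the nearest-rank definition, which is one of several common conventions, and the function name is made up for the example.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(latenciesMs, p) {
  if (latenciesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  // The small epsilon guards against floating-point error pushing an exact
  // rank (such as 999 for p = 99.9, n = 1000) up to the next integer.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100 - 1e-9));
  return sorted[rank - 1];
}

// Synthetic latencies 1..100 ms, for illustration only.
const samples = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(samples, 50), percentile(samples, 95), percentile(samples, 99));
```

Reporting p95 and p99 alongside the average would show whether a spike like Node's 864 ms maximum was a single outlier or part of a fat tail.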
Methodological Improvements Needed
- Statistical rigor: 5+ runs per config with statistical significance testing
- Proper CPU pinning: Use --cpuset-cpus instead of --cpus
- GC tuning for Node: Test with optimized V8 flags
- Fair code comparison: Either optimize all three or use stock patterns for all
- Proper clustering for Bun: Multi-process architecture to match Node
Production Recommendations
Choose based on your actual constraints:
Use Bun if:
- You have existing Node.js code and want drop-in performance gains
- Your workload is I/O-heavy API routing/proxying
- You're comfortable with a newer ecosystem (risk tolerance)
- Caveat: you can handle potential edge cases in package compatibility
Use Go if:
- You need predictable resource consumption for Kubernetes limits
- Your team values type safety and compile-time checks
- You're building infrastructure/platform services
- You need maximum efficiency per CPU core
- Long-term stability and tooling maturity matter
Use Node.js if:
- You have existing Node infrastructure and expertise
- Your bottleneck is database/external services (not event loop)
- Ecosystem maturity and package availability are critical
- You need battle-tested observability/APM tooling
The Real Takeaway
For a 40-byte JSON echo server, Bun is measurably faster than Node.js.
But real applications aren't JSON echo servers. Your actual bottlenecks are probably:
- Database query time
- External API latency
- Business logic complexity
- Network infrastructure
Profile your real workload before choosing a runtime based on microbenchmarks.
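To put a number on that advice, the sketch below estimates how much of a request's total time is runtime CPU cost once an external dependency is involved. The CPU cost per request is derived from the Phase 4 table (cores used divided by RPS); the 5 ms database round trip is a hypothetical value chosen for illustration, not something this benchmark measured, and treating CPU time as serial request time is a simplification.

```javascript
// Fraction of total request time spent on runtime CPU work when each request
// also waits on an external dependency.
function runtimeShare(cpuMicros, externalWaitMs) {
  const cpuMs = cpuMicros / 1000;
  return cpuMs / (cpuMs + externalWaitMs);
}

// CPU cost per request from the Phase 4 table: cores used / RPS.
const cpuMicros = {
  node: (4.0 / 31025) * 1e6,
  bun: (3.83 / 53446) * 1e6,
};

// Hypothetical database round trip; not measured in this benchmark.
const dbWaitMs = 5;

for (const [runtime, micros] of Object.entries(cpuMicros)) {
  const totalMs = micros / 1000 + dbWaitMs;
  const share = (runtimeShare(micros, dbWaitMs) * 100).toFixed(1);
  console.log(`${runtime}: ~${totalMs.toFixed(3)} ms per request, ${share}% runtime`);
}
```

Under these assumptions the end-to-end difference between Node and Bun shrinks to about 1%, which is the practical meaning of "your bottleneck is probably elsewhere".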
That said, Bun's performance characteristics are impressive and worth evaluating for I/O-heavy services where event loop overhead matters.
Full Code Listings
Node.js (Clustered)
const cluster = require('cluster');
const http = require('http');

const numCPUs = 4; // one worker per allocated core

// cluster.isPrimary replaces the deprecated cluster.isMaster (Node 16+)
if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  // Replace any worker that exits so the pool stays at numCPUs
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/json') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ message: "Hello from Clustered Node!" }));
    } else {
      res.writeHead(404);
      res.end();
    }
  }).listen(3000);
}
Bun (Single Process with reusePort)
Bun.serve({
  port: 3000,
  reusePort: true, // Kernel-level socket load balancing
  fetch(request) {
    const url = new URL(request.url);
    if (request.method === 'GET' && url.pathname === '/json') {
      return new Response(
        JSON.stringify({ message: "Hello from Bun!" }),
        { headers: { 'Content-Type': 'application/json' } }
      );
    }
    return new Response("Not Found", { status: 404 });
  },
});
Go (Optimized - Not Fair Comparison)
This version pre-renders the response and skips JSON encoding
package main

import (
	"fmt"
	"log"
	"net/http"
)

// Pre-rendered response eliminates JSON encoding overhead
var jsonResponse = []byte(`{"message":"Hello from Go!"}`)

// Note: this handler answers every path, not only /json
func jsonHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		w.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(jsonResponse) // Note: Ignoring error - not production-safe
}

func main() {
	server := &http.Server{
		Addr:    ":3000",
		Handler: http.HandlerFunc(jsonHandler),
	}
	fmt.Println("Go server running on port 3000")
	// ListenAndServe always returns a non-nil error; report it instead of dropping it
	log.Fatal(server.ListenAndServe())
}
Go (Fair Comparison - Uses JSON Encoding)
This version matches Node/Bun's approach
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type Response struct {
	Message string `json:"message"`
}

func jsonHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		w.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	// Encode on every request, as the Node and Bun servers do
	json.NewEncoder(w).Encode(Response{Message: "Hello from Go!"})
}

func main() {
	http.HandleFunc("/json", jsonHandler)
	// ListenAndServe always returns a non-nil error; report it instead of dropping it
	log.Fatal(http.ListenAndServe(":3000", nil))
}
Acknowledgments
Thanks to the readers who will inevitably point out additional issues I missed. Benchmarking is hard, and there's always room for improvement.
If you want to reproduce these tests or improve the methodology, feel free to reach out!
Found this useful? Drop a ❤️ and let me know what you'd like to see benchmarked next!