Aissam Irhir

AI Image Preprocessing in JavaScript: The Missing Piece for ML Engineers

Stop wasting hours on Python preprocessing pipelines. Here's how to prepare ML-ready tensors directly in JavaScript — 80 images/second with SIMD acceleration.


The Problem Every ML Engineer Knows

You've built an amazing ML model. PyTorch, TensorFlow, ONNX — doesn't matter. But before inference, you need to:

  1. Load images
  2. Resize to exact dimensions (224×224, 640×640...)
  3. Normalize pixel values (ImageNet, CLIP...)
  4. Convert to tensor format (CHW vs HWC)
  5. Handle batching

In Python? Easy. In JavaScript? Until now, it has meant chaining several libraries together and writing the normalization and layout loops by hand.

// ❌ The old way: Multiple libraries, slow, no SIMD
import sharp from 'sharp';
// Resize... but no tensor output
// Manual Float32Array creation... no normalization
// No SIMD... CPU bottleneck

What if I told you there's now a JavaScript package that does ALL of this natively, with SIMD acceleration?


🚀 Introducing Native ML Tensor Conversion for JavaScript

bun-image-turbo v1.7.0 is the first JavaScript package with native SIMD-accelerated image-to-tensor conversion.

bun add bun-image-turbo
# or: npm install bun-image-turbo

One Function. ML-Ready Output.

import { toTensor } from 'bun-image-turbo';

const buffer = Buffer.from(await Bun.file('photo.jpg').arrayBuffer());

// PyTorch/ONNX ready in ONE line
const tensor = await toTensor(buffer, {
  width: 224,
  height: 224,
  normalization: 'Imagenet',  // Built-in ImageNet normalization!
  layout: 'Chw',              // Channel-first for PyTorch
  batch: true                 // Add batch dimension
});

// Shape: [1, 3, 224, 224] — Ready for inference!
const float32Data = tensor.toFloat32Array();

Output:

Shape: [1, 3, 224, 224]
Dtype: Float32
Ready for: PyTorch, ONNX Runtime (TensorFlow expects layout: 'Hwc')
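
If you want to sanity-check a shape like the one above before allocating buffers, the arithmetic is simple. Here is a minimal sketch; `elementCount` and `tensorByteSize` are hypothetical helpers written for this post, not part of bun-image-turbo.

```typescript
// Bytes per element for the two output dtypes mentioned in this post.
const BYTES_PER_ELEMENT = { float32: 4, uint8: 1 } as const;
type Dtype = keyof typeof BYTES_PER_ELEMENT;

// Number of elements implied by a shape such as [1, 3, 224, 224].
function elementCount(shape: number[]): number {
  return shape.reduce((total, dim) => total * dim, 1);
}

// Size in bytes of a dense tensor with the given shape and dtype.
function tensorByteSize(shape: number[], dtype: Dtype): number {
  return elementCount(shape) * BYTES_PER_ELEMENT[dtype];
}

console.log(elementCount([1, 3, 224, 224]));              // 150528
console.log(tensorByteSize([1, 3, 224, 224], "float32")); // 602112
```

So one 224×224 Float32 input is roughly 0.6 MB, which is worth keeping in mind when you size batches.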

🎯 Why This Matters for AI/ML

Before: The JavaScript ML Preprocessing Nightmare

// ❌ OLD: Multiple steps, no SIMD, slow
import sharp from 'sharp';

const pixels = await sharp(buffer)
  .resize(224, 224)
  .removeAlpha()  // force 3 channels so the i % 3 indexing below stays valid
  .raw()
  .toBuffer();

// Manual normalization (slow, error-prone)
const float32 = new Float32Array(224 * 224 * 3);
const mean = [0.485, 0.456, 0.406];
const std = [0.229, 0.224, 0.225];

for (let i = 0; i < pixels.length; i++) {
  const channel = i % 3;
  float32[i] = (pixels[i] / 255 - mean[channel]) / std[channel];
}

// Manual CHW conversion (even slower)
// ... 50 more lines of code
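
For reference, the manual CHW step elided above comes down to index arithmetic. The sketch below is a plain-TypeScript version of that hand-written approach, not how bun-image-turbo implements it; `hwcToChw` is a hypothetical helper name.

```typescript
// Convert interleaved HWC pixels (RGBRGB...) into planar CHW (RR...GG...BB...),
// applying per-channel mean/std normalization along the way.
function hwcToChw(
  pixels: Uint8Array,
  width: number,
  height: number,
  mean: number[],
  std: number[],
): Float32Array {
  const channels = mean.length;
  const planeSize = width * height;
  if (pixels.length !== planeSize * channels) {
    throw new Error(`Expected ${planeSize * channels} values, got ${pixels.length}`);
  }
  const out = new Float32Array(planeSize * channels);
  for (let p = 0; p < planeSize; p++) {
    for (let c = 0; c < channels; c++) {
      const value = pixels[p * channels + c] / 255;         // HWC read index
      out[c * planeSize + p] = (value - mean[c]) / std[c];  // CHW write index
    }
  }
  return out;
}

// 2x1 image: one red pixel, one blue pixel, with identity normalization.
const chw = hwcToChw(new Uint8Array([255, 0, 0, 0, 0, 255]), 2, 1, [0, 0, 0], [1, 1, 1]);
console.log(Array.from(chw)); // [1, 0, 0, 0, 0, 1]
```

The strided writes in the inner loop are where a scalar JavaScript version spends its time on larger images.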

After: One Line, SIMD-Accelerated

// ✅ NEW: Native SIMD, built-in normalization
const tensor = await toTensor(buffer, {
  width: 224, height: 224,
  normalization: 'Imagenet',
  layout: 'Chw', batch: true
});

⚡ Performance That Changes Everything

Benchmarked on Apple M1 Pro:

| Operation | bun-image-turbo | Manual JS | Speedup |
|---|---|---|---|
| 224×224 ImageNet | 12.5ms | ~45ms | 3.6x |
| 224×224 Uint8 | 5.2ms | ~20ms | 3.8x |
| 1920×1080 → 224×224 | 25.8ms | ~80ms | 3.1x |
| Throughput | 80 img/s | ~22 img/s | 3.6x |
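
The throughput and speedup figures follow directly from the per-image latencies, assuming images are processed one after another. A quick sketch of that arithmetic (both helper names are hypothetical):

```typescript
// Images per second for a given per-image latency, assuming sequential processing.
function imagesPerSecond(latencyMs: number): number {
  return 1000 / latencyMs;
}

// How many times faster the optimized path is than the baseline.
function speedup(baselineMs: number, optimizedMs: number): number {
  return baselineMs / optimizedMs;
}

console.log(imagesPerSecond(12.5));        // 80
console.log(speedup(45, 12.5).toFixed(1)); // "3.6"
```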

Why so fast?

  • Native Rust implementation
  • SIMD acceleration (SSE2/AVX2/NEON)
  • Rayon parallel processing
  • Zero-copy buffer handling

🔧 Built-in Normalizations for Every Model

No more googling "ImageNet mean std values":

// ResNet, VGG, EfficientNet
await toTensor(buffer, { normalization: 'Imagenet' });
// mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]

// CLIP, OpenAI models
await toTensor(buffer, { normalization: 'Clip' });
// mean: [0.48145466, 0.4578275, 0.40821073]
// std: [0.26862954, 0.26130258, 0.27577711]

// Simple [0, 1] range
await toTensor(buffer, { normalization: 'ZeroOne' });

// [-1, 1] range (GANs, diffusion models)
await toTensor(buffer, { normalization: 'NegOneOne' });
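
To make the four presets concrete, here is roughly what each one computes for a single 8-bit channel value. This is a sketch of the conventional formulas using the constants listed above. I am assuming 'ZeroOne' is `value / 255` and 'NegOneOne' is `value / 255 * 2 - 1`; the library's internals may differ, and `normalizeValue` is a hypothetical helper.

```typescript
type Normalization = "Imagenet" | "Clip" | "ZeroOne" | "NegOneOne";

// Per-channel statistics quoted in this post.
const STATS: Record<"Imagenet" | "Clip", { mean: number[]; std: number[] }> = {
  Imagenet: { mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225] },
  Clip: {
    mean: [0.48145466, 0.4578275, 0.40821073],
    std: [0.26862954, 0.26130258, 0.27577711],
  },
};

// Normalize one 0-255 value for the given channel (0 = R, 1 = G, 2 = B).
function normalizeValue(value: number, channel: number, mode: Normalization): number {
  const scaled = value / 255; // always start from the [0, 1] range
  switch (mode) {
    case "ZeroOne":
      return scaled;
    case "NegOneOne":
      return scaled * 2 - 1; // assumed mapping of [0, 1] onto [-1, 1]
    default: {
      const { mean, std } = STATS[mode];
      return (scaled - mean[channel]) / std[channel];
    }
  }
}

console.log(normalizeValue(255, 0, "ZeroOne"));   // 1
console.log(normalizeValue(0, 0, "NegOneOne"));   // -1
console.log(normalizeValue(255, 0, "Imagenet"));  // about 2.25 for the red channel
```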

🖼️ AI Training Data: Smart Cropping

Preparing training datasets? Crop first, then convert:

import { crop, toTensor } from 'bun-image-turbo';

// Step 1: Crop to square (perfect for classification)
const squared = await crop(buffer, {
  aspectRatio: "1:1",
  gravity: "center"  // Smart center crop
});

// Step 2: Convert to tensor
const tensor = await toTensor(squared, {
  width: 224, height: 224,
  normalization: 'Imagenet',
  layout: 'Chw'
});

Crop Presets for Common AI Tasks

// Face detection training (square crops)
const face = await crop(buffer, { aspectRatio: "1:1" });

// Object detection (YOLO uses square input)
const yolo = await crop(buffer, { aspectRatio: "1:1" });

// Video analysis (16:9 frames)
const video = await crop(buffer, { aspectRatio: "16:9" });

// Document OCR (A4 ratio)
const doc = await crop(buffer, { aspectRatio: "210:297" });
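
If you are curious what a centered aspect-ratio crop works out to, the geometry is easy to sketch. The helper below is a hypothetical illustration of the usual "largest centered rectangle" approach; I have not checked that `crop` rounds the same way.

```typescript
interface CropRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Largest centered rectangle with the requested "W:H" ratio that fits in the source.
function centerCropRect(srcWidth: number, srcHeight: number, aspectRatio: string): CropRect {
  const [w, h] = aspectRatio.split(":").map(Number);
  if (!(w > 0) || !(h > 0)) {
    throw new Error(`Invalid aspect ratio: ${aspectRatio}`);
  }
  const target = w / h;
  let width = srcWidth;
  let height = srcHeight;
  if (srcWidth / srcHeight > target) {
    width = Math.round(srcHeight * target);   // source too wide: trim left and right
  } else {
    height = Math.round(srcWidth / target);   // source too tall: trim top and bottom
  }
  return {
    x: Math.floor((srcWidth - width) / 2),
    y: Math.floor((srcHeight - height) / 2),
    width,
    height,
  };
}

console.log(centerCropRect(1920, 1080, "1:1"));
// { x: 420, y: 0, width: 1080, height: 1080 }
```

Note how much of a wide frame a square crop discards: 840 of 1920 columns here, which matters if the subject is off-center.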

🔌 Framework Integration Examples

ONNX Runtime — Image Classification

import { toTensor } from 'bun-image-turbo';
import * as ort from 'onnxruntime-node';

async function classifyImage(imagePath: string) {
  const session = await ort.InferenceSession.create('resnet50.onnx');
  const buffer = Buffer.from(await Bun.file(imagePath).arrayBuffer());

  const tensor = await toTensor(buffer, {
    width: 224, height: 224,
    normalization: 'Imagenet',
    layout: 'Chw',
    batch: true
  });

  const ortTensor = new ort.Tensor(
    'float32',
    tensor.toFloat32Array(),
    tensor.shape
  );

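  // Input/output names depend on how the model was exported, so read them from the session
  const results = await session.run({ [session.inputNames[0]]: ortTensor });
  return results[session.outputNames[0]].data;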
}
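
What comes back from a classifier like this is usually a vector of raw scores. Assuming your model outputs logits (common for exported ResNet-style classifiers, but check your own export), a small post-processing sketch might look like this. `softmax` and `argmax` are hypothetical helpers, not part of either library.

```typescript
// Numerically stable softmax: subtract the max before exponentiating.
function softmax(logits: ArrayLike<number>): number[] {
  let max = -Infinity;
  for (let i = 0; i < logits.length; i++) max = Math.max(max, logits[i]);
  const exps: number[] = [];
  let sum = 0;
  for (let i = 0; i < logits.length; i++) {
    const e = Math.exp(logits[i] - max);
    exps.push(e);
    sum += e;
  }
  return exps.map((e) => e / sum);
}

// Index of the largest value, i.e. the predicted class id.
function argmax(values: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

const probs = softmax(new Float32Array([1, 3, 2]));
console.log(argmax(probs)); // 1
```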

TensorFlow.js — Feature Extraction

import { toTensor } from 'bun-image-turbo';
import * as tf from '@tensorflow/tfjs-node';

const tensor = await toTensor(buffer, {
  width: 224, height: 224,
  normalization: 'Imagenet',
  layout: 'Hwc',  // TensorFlow uses HWC!
  batch: true
});

const tfTensor = tf.tensor4d(
  tensor.toFloat32Array(),
  tensor.shape as [number, number, number, number]
);

CLIP — Image Embeddings for Search

import { toTensor } from 'bun-image-turbo';

const tensor = await toTensor(buffer, {
  width: 224, height: 224,
  normalization: 'Clip',  // CLIP-specific normalization
  layout: 'Chw',
  batch: true
});

// Feed to CLIP image encoder for semantic search
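
Once the encoder gives you embeddings, semantic search typically ranks candidates by cosine similarity. A minimal sketch, assuming you already have two embedding vectors of equal length (`cosineSimilarity` is a hypothetical helper, and the encoder itself is out of scope here):

```typescript
// Cosine similarity between two embedding vectors: 1 = same direction, 0 = orthogonal.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as "no similarity"
}

console.log(cosineSimilarity(new Float32Array([3, 4]), new Float32Array([3, 4]))); // 1
console.log(cosineSimilarity(new Float32Array([1, 0]), new Float32Array([0, 1]))); // 0
```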

🚀 Real-World AI Use Cases

1. Image Classification API

import { Hono } from 'hono';
import { toTensor } from 'bun-image-turbo';
import * as ort from 'onnxruntime-node';

const app = new Hono();
const session = await ort.InferenceSession.create('model.onnx');

app.post('/classify', async (c) => {
  const formData = await c.req.formData();
  const file = formData.get('image') as File;
  const buffer = Buffer.from(await file.arrayBuffer());

  const tensor = await toTensor(buffer, {
    width: 224, height: 224,
    normalization: 'Imagenet',
    layout: 'Chw', batch: true
  });

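  // Input/output names depend on the exported model, so read them from the session
  const results = await session.run({
    [session.inputNames[0]]: new ort.Tensor('float32', tensor.toFloat32Array(), tensor.shape)
  });

  // Rank classes by score; slicing the raw output would only return classes 0-4
  const scores = Array.from(results[session.outputNames[0]].data as Float32Array);
  const predictions = scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 5);

  return c.json({ predictions });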
});

export default app;

2. Batch Training Data Preparation

import { toTensorSync, crop } from 'bun-image-turbo';
import { Glob } from 'bun';

async function prepareTrainingBatch(folder: string) {
  const glob = new Glob('**/*.{jpg,png,webp}');
  const tensors: Float32Array[] = [];

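  // scan() yields paths relative to cwd by default; request absolute paths so Bun.file() can open them
  for await (const path of glob.scan({ cwd: folder, absolute: true })) {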
    const buffer = Buffer.from(await Bun.file(path).arrayBuffer());

    // Crop to square, then convert
    const cropped = await crop(buffer, { aspectRatio: "1:1" });
    const tensor = toTensorSync(cropped, {
      width: 224, height: 224,
      normalization: 'Imagenet',
      layout: 'Chw'
    });

    tensors.push(tensor.toFloat32Array());
  }

  console.log(`Prepared ${tensors.length} training samples`);
  return tensors;
}
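
The function above returns one Float32Array per image. Most training and inference code wants a single contiguous batch instead, so here is a sketch of stacking them into an [N, C, H, W] buffer. `stackBatch` is a hypothetical helper, and it assumes every sample was produced without the batch dimension.

```typescript
interface Batch {
  data: Float32Array;
  shape: number[];
}

// Concatenate equally sized samples into one contiguous batch tensor.
function stackBatch(samples: Float32Array[], sampleShape: number[]): Batch {
  const sampleSize = sampleShape.reduce((a, b) => a * b, 1);
  const data = new Float32Array(samples.length * sampleSize);
  samples.forEach((sample, i) => {
    if (sample.length !== sampleSize) {
      throw new Error(`Sample ${i} has ${sample.length} values, expected ${sampleSize}`);
    }
    data.set(sample, i * sampleSize); // copy sample i into its slot
  });
  return { data, shape: [samples.length, ...sampleShape] };
}

// Two tiny 3x2x2 "images" stand in for real 3x224x224 tensors.
const batch = stackBatch(
  [new Float32Array(12).fill(0), new Float32Array(12).fill(1)],
  [3, 2, 2],
);
console.log(batch.shape);       // [2, 3, 2, 2]
console.log(batch.data.length); // 24
```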

3. Real-Time Object Detection Preprocessing

import { toTensor } from 'bun-image-turbo';

async function preprocessForYOLO(buffer: Buffer) {
  return await toTensor(buffer, {
    width: 640, height: 640,     // YOLO input size
    normalization: 'ZeroOne',   // [0, 1] range
    layout: 'Chw',
    batch: true
  });
}
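
One caveat: many YOLO pipelines expect a letterboxed input (aspect ratio preserved, remainder padded) rather than a plain resize, and this post does not say how `toTensor` treats non-square images. If your model was trained that way, you will need the scale and padding to map boxes back to the original image. A sketch of that arithmetic, with `letterboxParams` as a hypothetical helper:

```typescript
interface Letterbox {
  scale: number;
  resizedWidth: number;
  resizedHeight: number;
  padX: number;
  padY: number;
}

// Scale and padding needed to fit a source image inside a square target.
function letterboxParams(srcWidth: number, srcHeight: number, target = 640): Letterbox {
  const scale = Math.min(target / srcWidth, target / srcHeight);
  const resizedWidth = Math.round(srcWidth * scale);
  const resizedHeight = Math.round(srcHeight * scale);
  return {
    scale,
    resizedWidth,
    resizedHeight,
    padX: Math.floor((target - resizedWidth) / 2),
    padY: Math.floor((target - resizedHeight) / 2),
  };
}

// Map a coordinate from model space back to the original image.
function toSourceCoords(x: number, y: number, lb: Letterbox): [number, number] {
  return [(x - lb.padX) / lb.scale, (y - lb.padY) / lb.scale];
}

const lb = letterboxParams(1920, 1080);
console.log(lb.resizedWidth, lb.resizedHeight, lb.padX, lb.padY); // 640 360 0 140
```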

📊 Complete Benchmark Results

| Scenario | Time | Throughput |
|---|---|---|
| Single 224×224 (Float32) | 12.5ms | 80/s |
| Single 224×224 (Uint8) | 5.2ms | 192/s |
| Resize 1920×1080 → 224×224 | 25.8ms | 39/s |
| Batch 32 images | ~400ms | 80/s |
| CLIP preprocessing | 12.8ms | 78/s |

🎁 What's New in v1.7.0

| Feature | Description |
|---|---|
| toTensor() | Native SIMD-accelerated tensor conversion |
| toTensorSync() | Synchronous variant for workers |
| 4 Normalizations | ImageNet, CLIP, ZeroOne, NegOneOne |
| 2 Layouts | CHW (PyTorch) & HWC (TensorFlow) |
| Batch Support | Optional batch dimension |
| Float32 & Uint8 | Two output dtypes |

🔗 Get Started

# Install
bun add bun-image-turbo

# Or with npm/yarn/pnpm
npm install bun-image-turbo


💬 What's Next?

Building something cool with bun-image-turbo? I'd love to hear about it!

Previous in series: Master Image Metadata: EXIF for AI Images