DEV Community

monkeymore studio


Building a Browser-Based AI Image Upscaler

Introduction

In this article, we'll explore how to implement a powerful browser-based AI image upscaler that uses deep learning models to enlarge images while maintaining quality. This tool supports 2x and 4x upscaling using Real-ESRGAN and Real-CUGAN models, running entirely in the browser with WebGPU or WebGL acceleration.

Why Browser-Based Upscaling?

1. Privacy Protection

When users upscale images in the browser, their photos never leave their device.

2. Zero Server Costs

Running the AI model in the browser eliminates the need for:

  • GPU servers for deep learning inference
  • Bandwidth for uploading/downloading images
  • API costs for third-party upscaling services

3. Offline Capability

Once models are cached, users can upscale images without an internet connection.

Technical Architecture

Core Implementation

1. Data Structures

interface ImageItem {
  id: string;
  file: File;
  originalUrl: string;
  processedUrl: string | null;
  status: 'pending' | 'processing' | 'done' | 'error';
  progress: number;
  info: string;
  name: string;
}

interface ImgInstance {
  width: number;
  height: number;
  data: Uint8Array;
  getImageCrop(x: number, y: number, image: ImgInstance, x1: number, y1: number, x2: number, y2: number): void;
  padToTileSize(tileSize: number): void;
  cropToOriginalSize(width: number, height: number): void;
}
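When a user drops files into the queue, each one becomes an ImageItem in its initial state. A minimal factory sketch (the helper name and id scheme are illustrative assumptions, not part of the original code; the real ImageItem also stores the File object itself, omitted here to keep the sketch portable):

```typescript
// Hypothetical helper: build a queue entry in its initial "pending" state.
// The id scheme (name + timestamp) is an assumption for illustration.
function makeImageItem(file: { name: string }, originalUrl: string) {
  return {
    id: `${file.name}-${Date.now()}`,
    originalUrl,
    processedUrl: null as string | null, // filled in once upscaling finishes
    status: "pending" as "pending" | "processing" | "done" | "error",
    progress: 0,
    info: "",
    name: file.name,
  };
}
```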

2. Backend Detection

Automatically detect and use the best available backend:

const [backend, setBackend] = useState<"webgl" | "webgpu">("webgpu");
const [detectedBackend, setDetectedBackend] = useState<"webgl" | "webgpu" | null>(null);

useEffect(() => {
  const detectBackend = async () => {
    try {
      // Check if WebGPU is available
      if (navigator.gpu) {
        const adapter = await navigator.gpu.requestAdapter();
        if (adapter) {
          setDetectedBackend("webgpu");
          setBackend("webgpu");
          return;
        }
      }
    } catch (e) {
      console.log("WebGPU not available:", e);
    }

    // Fall back to WebGL
    setDetectedBackend("webgl");
    setBackend("webgl");
  };

  detectBackend();
}, []);

3. Processing Configuration

const [modelType, setModelType] = useState<"realesrgan" | "realcugan">("realcugan");
const [model, setModel] = useState("anime_plus");
const [factor, setFactor] = useState(4);
const [denoise, setDenoise] = useState("conservative");
const [tileSize, setTileSize] = useState(64);
const [minLap, setMinLap] = useState(12);
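Not every combination of these settings is valid: this article's option tables limit the scale factor to 2x/4x, denoise to five named levels, and tile size to the 32-256 range. A small guard (a hypothetical helper, not part of the original code) can reject a bad configuration before it ever reaches the worker:

```typescript
const DENOISE_LEVELS = ["conservative", "no-denoise", "denoise1x", "denoise2x", "denoise3x"];

// Hypothetical validation helper; the accepted ranges follow the option
// tables in this article (scale 2x/4x, tile size 32-256).
function validateConfig(factor: number, denoise: string, tileSize: number): string[] {
  const errors: string[] = [];
  if (factor !== 2 && factor !== 4) errors.push(`unsupported scale: ${factor}x`);
  if (!DENOISE_LEVELS.includes(denoise)) errors.push(`unknown denoise level: ${denoise}`);
  if (tileSize < 32 || tileSize > 256) errors.push(`tile size out of range: ${tileSize}`);
  return errors;
}
```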

Model Options:

  • Real-CUGAN: scales 2x and 4x; denoise levels conservative, no-denoise, denoise1x, denoise2x, denoise3x
  • Real-ESRGAN: models anime_fast, anime_plus, general_fast, general_plus
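These options map directly onto model file paths. Here is a sketch of the URL scheme as a standalone function, mirroring the layout the worker requests from GitHub (assuming that repository layout stays stable):

```typescript
const GITHUB_BASE = "https://raw.githubusercontent.com/linmingren/openmodels/main/models";

// Real-ESRGAN paths encode model name + tile size;
// Real-CUGAN paths encode scale + denoise level + tile size.
function buildModelUrl(opts: {
  modelType: "realesrgan" | "realcugan";
  model?: string;   // Real-ESRGAN only
  factor?: number;  // Real-CUGAN only
  denoise?: string; // Real-CUGAN only
  tileSize: number;
}): string {
  if (opts.modelType === "realesrgan") {
    return `${GITHUB_BASE}/realesrgan/${opts.model}-${opts.tileSize}/model.json`;
  }
  return `${GITHUB_BASE}/realcugan/${opts.factor}x-${opts.denoise}-${opts.tileSize}/model.json`;
}
```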

4. Processing a Single Image

const processImage = async (imageItem: ImageItem): Promise<void> => {
  return new Promise((resolve, reject) => {
    const img = new Image();
    img.onerror = () => reject(new Error(`Failed to load ${imageItem.name}`));
    img.src = imageItem.originalUrl;

    img.onload = () => {
      const canvas = imgCanvasRef.current;
      canvas.width = img.width;
      canvas.height = img.height;
      const ctx = canvas.getContext("2d")!;
      ctx.drawImage(img, 0, 0);

      // Extract raw RGBA pixel data
      const data = ctx.getImageData(0, 0, img.width, img.height).data;
      const input = new ImgClass(img.width, img.height, new Uint8Array(data));

      // Create worker for processing
      const worker = new Worker("/upscale-worker.js");

      worker.onerror = (err) => {
        worker.terminate();
        reject(err);
      };

      worker.onmessage = (e) => {
        const { progress, done, output, info } = e.data;

        if (info) {
          setImages(prev => prev.map(item =>
            item.id === imageItem.id ? { ...item, info } : item
          ));
        }

        if (progress !== undefined) {
          setImages(prev => prev.map(item =>
            item.id === imageItem.id ? { ...item, progress } : item
          ));
        }

        if (done && output) {
          // Draw the upscaled RGBA buffer onto an offscreen canvas
          const outCanvas = document.createElement('canvas');
          outCanvas.width = input.width * factor;
          outCanvas.height = input.height * factor;
          const outCtx = outCanvas.getContext("2d")!;

          const imgData = outCtx.createImageData(outCanvas.width, outCanvas.height);
          imgData.data.set(new Uint8Array(output));
          outCtx.putImageData(imgData, 0, 0);

          outCanvas.toBlob((blob) => {
            if (!blob) {
              worker.terminate();
              reject(new Error("Failed to encode output image"));
              return;
            }
            const url = URL.createObjectURL(blob);
            setImages(prev => prev.map(item =>
              item.id === imageItem.id ? { ...item, processedUrl: url, status: 'done' } : item
            ));
            worker.terminate();
            resolve();
          }, "image/jpeg", 0.92);
        }
      };

      // Transfer the pixel buffer to the worker (zero-copy)
      worker.postMessage({
        input: input.data.buffer,
        factor: factor,
        denoise: denoise,
        tile_size: tileSize,
        min_lap: minLap,
        model_type: modelType,
        width: input.width,
        height: input.height,
        model: model,
        backend: backend,
        baseUrl: window.location.origin,
      }, [input.data.buffer]);
    };
  });
};

Web Worker Implementation

The heavy lifting happens in a Web Worker to keep the UI responsive:

1. Loading TensorFlow.js

importScripts("https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@4.22.0/dist/tf.min.js");

// Load WebGPU backend (optional)
try {
  importScripts("https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgpu@4.22.0/dist/tf-backend-webgpu.min.js");
} catch (e) {
  console.log("WebGPU backend not available, will use WebGL");
}

2. Image Class for Processing

class Img {
  constructor(width, height, data = new Uint8Array(width * height * 4)) {
    this.width = width;
    this.height = height;
    this.data = data;
  }

  getImageCrop(x, y, image, x1, y1, x2, y2) {
    const width = x2 - x1;
    for (let j = 0; j < y2 - y1; j++) {
      const destIndex = (y + j) * this.width * 4 + x * 4;
      const srcIndex = (y1 + j) * image.width * 4 + x1 * 4;
      this.data.set(image.data.subarray(srcIndex, srcIndex + width * 4), destIndex);
    }
  }

  padToTileSize(tileSize) {
    // Pad image to tile size for processing
    // ... implementation
  }

  cropToOriginalSize(width, height) {
    // Crop back to original size after upscaling
    // ... implementation
  }
}
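To see what getImageCrop does, here is the class exercised on a tiny RGBA buffer, typed up in plain TypeScript with no browser APIs involved:

```typescript
class Img {
  width: number;
  height: number;
  data: Uint8Array;

  constructor(width: number, height: number, data = new Uint8Array(width * height * 4)) {
    this.width = width;
    this.height = height;
    this.data = data;
  }

  // Copy the (x1,y1)-(x2,y2) region of `image` into this image at (x, y)
  getImageCrop(x: number, y: number, image: Img, x1: number, y1: number, x2: number, y2: number) {
    const width = x2 - x1;
    for (let j = 0; j < y2 - y1; j++) {
      const destIndex = (y + j) * this.width * 4 + x * 4;
      const srcIndex = (y1 + j) * image.width * 4 + x1 * 4;
      this.data.set(image.data.subarray(srcIndex, srcIndex + width * 4), destIndex);
    }
  }
}

// A 4x4 source image whose byte values equal their offsets, so copies are easy to check.
const src = new Img(4, 4, Uint8Array.from({ length: 64 }, (_, i) => i));

// Crop the 2x2 block whose top-left pixel is (1, 1).
const crop = new Img(2, 2);
crop.getImageCrop(0, 0, src, 1, 1, 3, 3);
// The first crop row starts at source byte offset (1*4 + 1)*4 = 20,
// the second row at (2*4 + 1)*4 = 36.
```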

3. Model Loading

self.addEventListener("message", async (e) => {
  const { data } = e;

  // Model files are hosted on GitHub
  const githubBaseUrl = "https://raw.githubusercontent.com/linmingren/openmodels/main/models";
  let modelPath;

  if (data?.model_type === "realesrgan") {
    modelPath = `realesrgan/${data?.model}-${data?.tile_size}`;
  } else {
    modelPath = `realcugan/${data?.factor}x-${data?.denoise}-${data?.tile_size}`;
  }

  const modelUrl = `${githubBaseUrl}/${modelPath}/model.json`;
  // Use the path as a stable cache key
  const modelName = modelPath.replace("/", "-");

  let model;
  // Try to load from the IndexedDB cache first
  try {
    model = await tf.loadGraphModel(`indexeddb://${modelName}`);
    self.postMessage({ progress: 10, info: "Loaded from cache" });
  } catch (error) {
    // Cache miss: download from GitHub and save for next time
    model = await tf.loadGraphModel(modelUrl);
    await model.save(`indexeddb://${modelName}`);
  }
});

4. Tile-Based Processing

async function enlargeImageWithFixedInput(model, inputImg, factor, inputSize, minLap) {
  const width = inputImg.width;
  const height = inputImg.height;
  const output = new Img(width * factor, height * factor);

  // Smallest tile counts that still leave at least minLap pixels of overlap
  // between neighboring tiles (overlap hides seams at tile borders)
  let numX = 1;
  for (; numX * inputSize < width + (numX - 1) * minLap; numX++);
  let numY = 1;
  for (; numY * inputSize < height + (numY - 1) * minLap; numY++);

  // Process each tile
  for (let i = 0; i < numX; i++) {
    for (let j = 0; j < numY; j++) {
      // Distribute tile origins evenly so the last tile ends at the image edge
      const x1 = numX === 1 ? 0 : Math.round((i * (width - inputSize)) / (numX - 1));
      const y1 = numY === 1 ? 0 : Math.round((j * (height - inputSize)) / (numY - 1));

      const tile = new Img(inputSize, inputSize);
      tile.getImageCrop(0, 0, inputImg, x1, y1, x1 + inputSize, y1 + inputSize);

      // Upscale tile using TensorFlow.js
      const scaled = await upscale(tile, model);

      // Copy to output (blending across the overlap region elided)
      output.getImageCrop(x1 * factor, y1 * factor, scaled, 0, 0, inputSize * factor, inputSize * factor);
    }
  }

  return output;
}
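The tile-count loop above is compact but easy to misread. Written as a standalone function, it finds the smallest number of tiles whose neighbors still overlap by at least minLap pixels:

```typescript
// Smallest n such that n tiles of `tileSize` pixels cover `extent` pixels
// while neighboring tiles overlap by at least `minLap` pixels.
function tileCount(extent: number, tileSize: number, minLap: number): number {
  let n = 1;
  while (n * tileSize < extent + (n - 1) * minLap) n++;
  return n;
}
```

For example, a 200-pixel edge with 64-pixel tiles and a 12-pixel minimum overlap needs four tiles: three tiles cover only 192 pixels before overlap is even considered.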

5. Upscale Function

async function upscale(image, model, alpha = false) {
  const result = tf.tidy(() => {
    const tensor = img2tensor(image);
    let result = model.predict(tensor);
    if (alpha) {
      result = tf.greater(result, 0.5);
    }
    return result;
  });

  const resultImage = await tensor2img(result);
  tf.dispose(result);
  return resultImage;
}

function img2tensor(image) {
  const imgdata = new ImageData(image.width, image.height);
  imgdata.data.set(image.data);
  const tensor = tf.browser.fromPixels(imgdata).div(255).toFloat().expandDims();
  return tensor;
}

async function tensor2img(tensor) {
  const [, height, width] = tensor.shape;
  const clipped = tf.tidy(() =>
    tensor.reshape([height, width, 3]).mul(255).cast("int32").clipByValue(0, 255)
  );
  const data = await tf.browser.toPixels(clipped);
  tf.dispose(clipped); // avoid leaking the intermediate tensor
  return new Img(width, height, new Uint8Array(data));
}

Processing Flow

  1. Load the image into a canvas and extract its RGBA pixel data
  2. Transfer the pixels to a Web Worker
  3. The worker loads the model from the IndexedDB cache, or downloads it from GitHub
  4. The worker upscales the image tile by tile, posting progress updates
  5. The main thread assembles the result into a blob URL for preview and download

Key Features

1. Multiple Model Support

  • Real-CUGAN: best for anime illustrations; scales 2x and 4x
  • Real-ESRGAN: best for general photos; scales 2x and 4x

2. Denoise Levels

  • Conservative: Minimal denoising, preserves details
  • No Denoise: Pure upscaling
  • Denoise 1x/2x/3x: Progressive noise reduction

3. Tile-Based Processing

Large images are processed in tiles to avoid memory issues:

  • Adjustable tile size (32-256)
  • Automatic overlap calculation
  • Seamless blending

4. Model Caching

Models are cached in IndexedDB for instant subsequent use:

  • First load: Download from GitHub
  • Subsequent: Load from local cache
  • Cache persists across sessions

Performance Characteristics

  1. First Run: 30-60 seconds (download + cache)
  2. Subsequent Runs: 5-15 seconds
  3. Tile Processing: Depends on image size and tile count
  4. Memory: ~500MB during processing
  5. WebGPU: 2-3x faster than WebGL

Browser Support

WebGPU requires:

  • Chrome 113+
  • Edge 113+
  • Safari 17+ (limited)

WebGL fallback:

  • All modern browsers
  • Slower but works everywhere

Use Cases

  1. Anime/Game Art: Upscale pixel art and sprites
  2. Photo Enhancement: Enlarge photos without blur
  3. Print Preparation: Increase resolution for printing
  4. Social Media: Create higher quality images
  5. Archival: Restore old low-resolution photos

Conclusion

Browser-based AI image upscaling brings professional-grade image enlargement to the web. The implementation uses:

  • TensorFlow.js for running deep learning models
  • WebGPU/WebGL for GPU-accelerated inference
  • Web Workers for non-blocking processing
  • IndexedDB for model caching
  • Tile-based processing for memory efficiency

Users can upscale images by 2x or 4x using Real-ESRGAN or Real-CUGAN models, all while their images never leave their device.


Try it yourself at Free Image Tools

Experience the power of browser-based AI image upscaling. No upload required - your images stay on your device!
