Abdul Jabbar

Posted on Apr 5

I built a 100% local file converter — your files never leave your browser. Here's how.

#javascript #webdev #showdev #typescript

I got tired of online file converters that want me to upload sensitive documents to their servers. Tax returns, medical images, client spreadsheets. All going to some server I have no control over. Even the ones that call themselves "secure" still move your files off your machine before doing anything with them.

So I built ConvertSafe. It converts 50+ format pairs across images, data formats, and documents, and the entire conversion engine runs inside your browser tab. There are no API routes, no upload endpoints, no server functions. The deployed output is literally a folder of static HTML, CSS, and JS files sitting on Cloudflare's CDN.

I want to walk through how the whole thing actually works because the implementation had some surprisingly interesting problems.

Quick overview of the stack

The app is built with Next.js 15 using the App Router. Every page is statically generated at build time and exported to a plain out/ directory. Cloudflare Pages serves it from the edge.

All file processing happens through a combination of the Canvas API, the FileReader API, and a handful of specialized JavaScript libraries that get loaded on demand. There are no lambdas, no serverless functions, no background workers on a remote machine. When you convert a file, the work happens in the same browser tab you're looking at.

The conversion engine

Everything funnels through a single file called client-converter.ts. Think of it as a big routing table. It takes a source format and a target format, picks the right browser API or library for the job, runs the conversion, and returns a Blob that becomes a download link.

const SUPPORTED_PAIRS = new Set([
  // Data formats
  "CSV-JSON", "JSON-CSV", "JSON-XML", "XML-JSON",
  "YAML-JSON", "JSON-YAML", "CSV-YAML", "YAML-CSV",
  // Documents
  "PDF-TXT", "MD-HTML", "TXT-PDF", "DOCX-PDF",
  // Images via Canvas API
  "JPG-PNG", "PNG-JPG", "JPG-WEBP", "WEBP-JPG",
  "PNG-WEBP", "WEBP-PNG", "BMP-JPG", "SVG-PNG",
  // HEIC via heic2any
  "HEIC-JPG", "HEIC-PNG",
  // ... 50 pairs total
]);

export function canConvertClientSide(from: string, to: string): boolean {
  return SUPPORTED_PAIRS.has(`${from.toUpperCase()}-${to.toUpperCase()}`);
}

The main convertClientSide() function reads the file into memory, dispatches to the right conversion logic, and always returns { blob, fileName }. That result gets turned into a download link through URL.createObjectURL(). Simple interface, complicated internals.
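To make the routing-table idea concrete, here is a rough sketch of how a pair key could dispatch to a converter. This is not the actual client-converter.ts: the `CONVERTERS` map, the `Converter` type, `convertText`, and the naive CSV parser are all assumptions for illustration, and only one text-based pair is shown.

```typescript
// Hypothetical sketch of a pair-keyed routing table; all names are illustrative.
type Converter = (input: string) => { text: string; mime: string };

// Naive CSV parser: assumes no quoted fields and no embedded commas.
function csvToJson(csv: string): string {
  const [headerLine, ...rows] = csv.trim().split(/\r?\n/);
  const headers = headerLine.split(",");
  const records = rows.map((row) => {
    const cells = row.split(",");
    const record: Record<string, string> = {};
    headers.forEach((h, i) => {
      record[h] = cells[i] ?? "";
    });
    return record;
  });
  return JSON.stringify(records, null, 2);
}

const CONVERTERS: Record<string, Converter> = {
  "CSV-JSON": (input) => ({ text: csvToJson(input), mime: "application/json" }),
};

function convertText(input: string, name: string, from: string, to: string) {
  const key = `${from.toUpperCase()}-${to.toUpperCase()}`;
  const converter = CONVERTERS[key];
  if (!converter) throw new Error(`Unsupported conversion: ${key}`);
  const { text, mime } = converter(input);
  // Swap the extension: "people.csv" becomes "people.json"
  const fileName = name.replace(/\.[^.]+$/, "") + "." + to.toLowerCase();
  return { blob: new Blob([text], { type: mime }), fileName };
}
```

In the browser, the returned blob would then be passed to URL.createObjectURL() and the resulting URL set as the href of a link with a download attribute.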

Loading libraries only when you need them

This was one of the first problems I ran into. Libraries like heic2any (iPhone photo decoding), pdfjs-dist (PDF rendering), xlsx (spreadsheet parsing), and jspdf (PDF generation) are all pretty heavy. Bundling them all upfront would mean shipping hundreds of kilobytes to someone who just wants to convert a CSV file.

The solution was lazy loading with a simple caching pattern:

let _heic2any: any = null;
async function getHeic2any() {
  if (!_heic2any) {
    const mod = await import("heic2any");
    _heic2any = mod.default ?? mod;
  }
  return _heic2any;
}

Every heavy dependency gets this treatment. The first conversion using a particular library has a brief loading pause while the module downloads. Every conversion after that is instant because the reference stays cached in memory. The initial page load stays lean because none of these show up in the main bundle.

I ended up writing this wrapper for 11 libraries: js-yaml, marked, jsPDF, heic2any, utif (for TIFF files), mammoth (for DOCX), turndown (HTML to Markdown), html2canvas, imagetracerjs (raster to vector), xlsx from SheetJS, and pdfjs-dist from Mozilla.
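With that many wrappers, the pattern could plausibly be folded into one generic helper. The sketch below is an assumption, not code from ConvertSafe: `lazyModule` is a hypothetical name, and it caches the pending promise rather than the resolved module so that two conversions started at the same time share one download.

```typescript
// Hypothetical generic version of the per-library wrapper.
function lazyModule<T>(
  loader: () => Promise<{ default?: T } | T>
): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached) return cached;
    const pending = loader()
      .then((mod) => {
        // Handle both default-export and plain module shapes
        const maybe = mod as { default?: T };
        return (maybe.default ?? mod) as T;
      })
      .catch((err) => {
        cached = null; // allow a retry after a failed download
        throw err;
      });
    cached = pending;
    return pending;
  };
}

// Usage in the app would look something like:
//   const getHeic2any = lazyModule(() => import("heic2any"));
//   const heic2any = await getHeic2any();
```

Resetting the cache on failure matters on flaky connections: without it, one failed chunk download would break that converter until the page is reloaded.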

Catching bad files before they break things

People drop all kinds of stuff into file converters. Renamed files, corrupted files, things with completely wrong extensions. Trusting the file extension alone is a recipe for confusing error messages or silent failures.

Instead, the converter reads the first 16 bytes of every file you hand it and checks for known magic byte signatures:

const FILE_SIGNATURES: Record<string, { bytes: number[]; offset?: number }[]> = {
  JPG:  [{ bytes: [0xFF, 0xD8, 0xFF] }],
  PNG:  [{ bytes: [0x89, 0x50, 0x4E, 0x47] }],
  WEBP: [
    { bytes: [0x52, 0x49, 0x46, 0x46], offset: 0 },
    { bytes: [0x57, 0x45, 0x42, 0x50], offset: 8 }
  ],
  PDF:  [{ bytes: [0x25, 0x50, 0x44, 0x46] }],
  HEIC: [{ bytes: [0x66, 0x74, 0x79, 0x70], offset: 4 }],
};

WebP is a fun one. It lives inside a RIFF container, so you look for the bytes spelling "RIFF" at the beginning and then "WEBP" starting at byte 8. HEIC and AVIF both sit inside ISO BMFF containers, which means they share the same "ftyp" marker at offset 4. For validation that's fine since the browser APIs handle the actual decoding once we've confirmed the file is legit.

Text based formats like CSV, JSON, XML, and YAML don't have magic bytes. There's no reliable binary signature for a plain text file, so those skip validation entirely.
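A check against that table might look roughly like this. It is a sketch under assumptions: `matchesSignature` and `hasValidSignature` are hypothetical names, the table is a two-format subset repeated so the example is self-contained, and it assumes every entry listed for a format has to match (which is what WebP needs).

```typescript
type Signature = { bytes: number[]; offset?: number };

// Subset of the signature table, repeated so the sketch is self-contained.
const SIGNATURES: Record<string, Signature[]> = {
  PNG: [{ bytes: [0x89, 0x50, 0x4e, 0x47] }],
  WEBP: [
    { bytes: [0x52, 0x49, 0x46, 0x46], offset: 0 }, // "RIFF"
    { bytes: [0x57, 0x45, 0x42, 0x50], offset: 8 }, // "WEBP"
  ],
};

// Assumption: all entries for a format must match.
function matchesSignature(header: Uint8Array, format: string): boolean {
  const entries = SIGNATURES[format.toUpperCase()];
  if (!entries) return true; // text formats have no signature to check
  return entries.every(({ bytes, offset = 0 }) =>
    bytes.every((b, i) => header[offset + i] === b)
  );
}

// Slices off the first 16 bytes instead of reading the whole file.
async function hasValidSignature(file: Blob, format: string): Promise<boolean> {
  const header = new Uint8Array(await file.slice(0, 16).arrayBuffer());
  return matchesSignature(header, format);
}
```

A header shorter than the signature simply fails the comparison, so truncated files are rejected rather than throwing.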

Image conversions with the Canvas API

The Canvas API does most of the heavy lifting for image format conversions. The core pattern is simple: load the source image, paint it onto a hidden canvas element, then export the canvas in whatever target format you need.

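const img = new Image();
const objectUrl = URL.createObjectURL(file);
// Attach handlers before setting src, and reject on decode failure
// so a corrupt file can't leave the promise pending forever.
await new Promise((resolve, reject) => {
  img.onload = resolve;
  img.onerror = () => reject(new Error("Could not decode image"));
  img.src = objectUrl;
});
URL.revokeObjectURL(objectUrl); // the decoded image no longer needs it

const canvas = document.createElement("canvas");
canvas.width = img.naturalWidth;
canvas.height = img.naturalHeight;

const ctx = canvas.getContext("2d")!;
ctx.drawImage(img, 0, 0);

const blob = await new Promise<Blob>((resolve, reject) => {
  // toBlob passes null if the canvas could not be encoded
  canvas.toBlob(
    (b) => (b ? resolve(b) : reject(new Error("Encoding failed"))),
    "image/webp",
    0.92
  );
});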

This covers JPG, PNG, WebP, BMP, AVIF, and GIF. The browser natively decodes the source format, and canvas.toBlob() handles encoding to the target.

One thing that tripped me up early on: AVIF support in toBlob() varies by browser. Chrome and Edge handle it well. Firefox landed support more recently. Safari is still catching up. The converter detects this at runtime and shows a clear message if someone's browser can't handle a particular conversion.
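One way to do that kind of runtime detection relies on a documented fallback: when toBlob() is asked for a type the browser can't encode, it produces a PNG instead, so the blob's type won't match the request. The sketch below is an assumption about how such a check could work, not ConvertSafe's actual code; `canEncode` and the `BlobSource` interface are hypothetical names.

```typescript
// Minimal shape the check needs; a real HTMLCanvasElement satisfies it.
interface BlobSource {
  toBlob(
    callback: (blob: Blob | null) => void,
    type?: string,
    quality?: number
  ): void;
}

// Resolves true only if the encoder honored the requested MIME type.
function canEncode(canvas: BlobSource, mime: string): Promise<boolean> {
  return new Promise((resolve) => {
    canvas.toBlob((blob) => resolve(blob !== null && blob.type === mime), mime);
  });
}

// In the browser, a 1x1 probe canvas is enough:
//   const probe = document.createElement("canvas");
//   probe.width = probe.height = 1;
//   const avifOk = await canEncode(probe, "image/avif");
```

Running the probe once per target format and caching the answer avoids re-encoding on every conversion.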

HEIC is the exception to the Canvas approach. Apart from recent versions of Safari, no major browser natively decodes HEIC images yet, so the heic2any library handles the decoding step first. Once it produces a browser friendly format, Canvas takes over for any further conversion.

How the pages are structured

Each converter page has a split personality. It needs to be a static, content rich page for search engines and simultaneously an interactive application for actual users.

Next.js 15's App Router makes this straightforward with the server component and client component boundary:

app/[slug]/page.tsx          ← Server Component (ships zero JS)
  ├── Breadcrumbs            ← Server Component
  ├── SEO content sections   ← Server Component
  ├── 5 JSON-LD schemas      ← Server Component
  └── <ConverterApp />       ← Client Component ("use client")

The server component renders all the static content at build time through generateStaticParams(). All 50 converter pages come out as pre-rendered HTML. The ConverterApp client component hydrates separately as an interactive island. Crawlers get fully rendered content with structured data. Users get a working file converter. Neither side compromises for the other.
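For readers who haven't used it, generateStaticParams() returns one params object per page to pre-render. A minimal sketch, assuming a hypothetical `PAIRS` list and a "csv-to-json" slug format (the real site's list and slugs may differ):

```typescript
// Hypothetical pair list and slug format; illustrative only.
const PAIRS: [string, string][] = [
  ["CSV", "JSON"],
  ["HEIC", "JPG"],
  ["JPG", "WEBP"],
];

function toSlug(from: string, to: string): string {
  return `${from.toLowerCase()}-to-${to.toLowerCase()}`;
}

// In app/[slug]/page.tsx this function would be exported. Next.js calls it
// at build time and pre-renders one HTML page per returned object.
function generateStaticParams(): { slug: string }[] {
  return PAIRS.map(([from, to]) => ({ slug: toSlug(from, to) }));
}
```

Deriving the slugs from the same list that drives the conversion engine would keep the set of pages and the set of supported pairs from drifting apart.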

Every page embeds five JSON-LD schemas: HowTo, FAQPage, BreadcrumbList, SoftwareApplication, and WebPage. Google can use structured data like this for rich results, although support differs by type and HowTo and FAQ rich results were scaled back in 2023, so the payoff varies per schema.
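As an illustration of what one of those schemas involves, here is a small sketch that builds a BreadcrumbList. The `breadcrumbJsonLd` helper and the `Crumb` type are hypothetical, and the URLs in the usage comment are placeholders rather than real ConvertSafe paths.

```typescript
type Crumb = { name: string; url: string };

// Builds the JSON string for a schema.org BreadcrumbList.
function breadcrumbJsonLd(crumbs: Crumb[]): string {
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  });
}

// Example call with placeholder URLs:
//   breadcrumbJsonLd([
//     { name: "Home", url: "https://example.com/" },
//     { name: "CSV to JSON", url: "https://example.com/csv-to-json" },
//   ]);
```

The resulting string goes inside a script tag with type "application/ld+json", which a server component can render without shipping any client-side JavaScript.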

Dealing with office formats

Data formats like CSV and JSON are straightforward string manipulation. The formats that pushed me toward heavier libraries were DOCX, XLSX, and PDF.

DOCX to PDF uses mammoth to pull HTML out of the DOCX file (which is really just a ZIP archive full of XML), then feeds that HTML into jsPDF for rendering. Mammoth intentionally flattens complex Word formatting into clean semantic HTML, so you lose things like fancy headers and custom styles. For the vast majority of documents though, the output looks right.

XLSX handling relies on SheetJS. It parses Excel workbooks in the browser, extracts cell data, and can write back to CSV, JSON, or new XLSX files. The library weighs around 300KB, which is exactly why the lazy loading pattern matters so much.

PDF to image uses Mozilla's pdfjs-dist to render PDF pages onto a canvas, which can then be exported as JPG or PNG. Right now it handles the first page. Multi-page export is coming.

TXT to PDF sounds trivial but has edge cases that will bite you. Long lines need word wrapping. Long documents need automatic page breaks. The converter handles this with jsPDF, wrapping text at 180mm width and inserting page breaks when the content overflows.
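The wrapping and page-break logic can be sketched independently of jsPDF. Everything here is an assumption for illustration: the A4 page height, 15 mm margins, 7 mm line height, and a fixed character budget standing in for real text measurement. `wrapLine` and `paginate` are hypothetical names.

```typescript
// Assumed layout: A4 page (297 mm tall), 15 mm margins, 7 mm line height.
const PAGE_HEIGHT_MM = 297;
const MARGIN_MM = 15;
const LINE_HEIGHT_MM = 7;

// Greedy word wrap using a character budget instead of measured width.
function wrapLine(line: string, maxChars: number): string[] {
  if (line.length <= maxChars) return [line];
  const out: string[] = [];
  let current = "";
  for (const word of line.split(" ")) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length <= maxChars) {
      current = candidate;
      continue;
    }
    if (current) out.push(current);
    // Hard-split a single word that is longer than a whole line
    let rest = word;
    while (rest.length > maxChars) {
      out.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = rest;
  }
  if (current) out.push(current);
  return out;
}

// Groups wrapped lines into pages; each inner array is one page.
function paginate(text: string, maxChars: number): string[][] {
  const linesPerPage = Math.floor(
    (PAGE_HEIGHT_MM - 2 * MARGIN_MM) / LINE_HEIGHT_MM
  );
  const lines: string[] = [];
  for (const raw of text.split(/\r?\n/)) {
    lines.push(...wrapLine(raw, maxChars));
  }
  const pages: string[][] = [];
  for (let i = 0; i < lines.length; i += linesPerPage) {
    pages.push(lines.slice(i, i + linesPerPage));
  }
  return pages;
}
```

With jsPDF itself, doc.splitTextToSize(text, 180) would replace the character-count wrapping with real font measurement, and each page after the first would correspond to one doc.addPage() call.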

What I noticed about other "privacy first" converters

Before building ConvertSafe I spent a good amount of time studying the competition. There are over 40 tools in this space now, and a few patterns jumped out.

A surprising number of tools that market themselves as "100% client side" and "your files never leave your device" are running Google Analytics, Microsoft Clarity, or similar tracking scripts in the background. If someone is converting medical records or financial documents, having session replay software recording their screen undermines the entire privacy promise. ConvertSafe uses Umami for analytics, which is cookieless and doesn't track individual users.

The other thing I noticed is that nobody searches for "privacy first file converter." People search for "csv to json" or "heic to jpg." The tools that actually get traffic are the ones with dedicated, well written pages for each specific conversion pair. That's why ConvertSafe has 50 individual landing pages instead of one generic page with a format dropdown.

Where things stand today

The site has been live for about six weeks. Still early days.

50 converter pairs across images, data, and documents
61 pages in the sitemap
11 pages indexed in Google so far (new domains are slow)
Lighthouse: 95+ Performance, 100 SEO, 100 Accessibility
Deployed as a fully static site on Cloudflare Pages

Google has discovered all the pages but is taking its time actually indexing them. That's normal for a fresh domain with no backlink history. Building authority is the current priority.

Verify it yourself

This is the part that matters most to me. Open any converter page on ConvertSafe, open your browser's DevTools, switch to the Network tab, and convert a file. You will see zero outbound requests carrying file data. Everything happens in your browser's memory.

Check the Sources tab too. The conversion code is all right there. No obfuscation, no hidden server calls behind abstraction layers.

Here are a few converters worth trying:

CSV to JSON supports pasting text directly or selecting a file
HEIC to JPG is great if you have iPhone photos you need to share
JPG to WebP lets you see how much smaller WebP files actually are
DOCX to PDF converts Word documents without touching a Microsoft server
PNG to SVG does raster to vector tracing entirely in the browser

What's coming next

Multi-language support is the biggest thing on the roadmap. Every converter page in 8+ languages means 400+ indexable pages, which is exactly how tools like iLovePDF scaled to hundreds of millions of visits. New formats like TOML and TSV are planned too.

I'm also working toward full PWA support with a service worker, which would make the whole tool work offline. The privacy story gets even more compelling when the converter works without an internet connection at all.

Happy to answer any questions about the architecture, the conversion logic, or anything else. The client side file processing space has some genuinely interesting engineering problems and I've enjoyed working through them.

Give it a try at convertsafe.app and let me know what you think.
