## The problem nobody warns you about
Last quarter I migrated a side project from AWS S3 to Cloudflare R2. Should've been a two-hour job. It took an entire Saturday.
The migration itself wasn't the issue. R2 is S3-compatible, so the protocol mostly worked. The pain came from everywhere else in my codebase — the helper that uploaded user avatars to S3, the worker that streamed CSV exports, the cron job that backed up Postgres dumps. Each one had been written against a slightly different abstraction. Some used the AWS SDK v3 directly. Some used a wrapper I'd built in 2022 and forgotten about. One particularly cursed module used presigned URLs through a fetch call I'd copy-pasted from Stack Overflow.
If you've ever tried to swap object storage providers in a real codebase, you know this feeling. The provider is rarely the problem. Your own coupling is.
## Why this keeps happening
The root cause is that every storage provider ships an SDK that reflects its own internal model rather than a shared standard. The AWS SDK leans heavily on commands and middleware. Google Cloud Storage uses a more object-oriented `Bucket.file()` style. Azure has its own client hierarchy. Even the "compatible" providers diverge once you touch anything beyond `PutObject` and `GetObject` — multipart uploads, signed URLs, metadata handling, streaming reads.
So when you write `s3Client.send(new PutObjectCommand(...))` directly in your route handler, you're not really writing storage code. You're writing AWS-flavored storage code. Multiply that across a few hundred files and you have a codebase that is technically using "object storage" but is, in practice, married to one vendor.
The second contributor is the SDKs' inconsistent treatment of I/O primitives. Some accept Node `Buffer`. Some want `Readable` streams. Some accept `Uint8Array`. Newer ones lean toward web standards like `ReadableStream` and `Blob`. If you've ever stared at a TypeScript error that says something like `Type 'ReadableStream<Uint8Array>' is not assignable to type 'StreamingBlobPayloadInputTypes'`, you know exactly what I mean.
## Step one: pick a boundary and defend it
The fix isn't "pick the right SDK." The fix is to put a thin abstraction in front of whatever SDK you use, and treat that abstraction as the only thing your application code is allowed to import.
Here's the minimum viable shape I now reach for in every project:
```ts
// storage/types.ts
export interface FileStore {
  put(key: string, data: Blob | ReadableStream): Promise<void>;
  get(key: string): Promise<Blob>;
  delete(key: string): Promise<void>;
  list(prefix?: string): Promise<string[]>;
  url(key: string, opts?: { expiresIn?: number }): Promise<string>;
}
```
That's it. Five methods. No provider-specific options leak through. The input types are `Blob` and `ReadableStream` because those are web standards — they work in Node 18+, in Deno, in Bun, in Workers, and in the browser. No `Buffer`. No `Readable` from `node:stream`. If you stay disciplined about this, you can run the same storage code in a Cloudflare Worker that you run in a Node API.
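To make the boundary concrete, here's roughly what application code looks like when it only knows about the interface (the avatar helper and key scheme below are made up for illustration):

```ts
// The FileStore interface from above, repeated so this sketch is self-contained
interface FileStore {
  put(key: string, data: Blob | ReadableStream): Promise<void>;
  get(key: string): Promise<Blob>;
  delete(key: string): Promise<void>;
  list(prefix?: string): Promise<string[]>;
  url(key: string, opts?: { expiresIn?: number }): Promise<string>;
}

// The helper never names a provider; swap the store, keep the code.
async function saveAvatar(store: FileStore, userId: string, image: Blob): Promise<string> {
  const key = `avatars/${userId}.png`; // key scheme is illustrative
  await store.put(key, image);
  // Hand back a time-limited URL the client can render immediately
  return store.url(key, { expiresIn: 3600 });
}
```

Swapping providers is now a one-line change at the call site that constructs the store, not a hunt through every handler.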
## Step two: write the adapters once
Now each provider gets a small adapter that implements the interface. The first time you do this it feels like make-work. By the third provider, you'll wonder why you ever did it any other way.
```ts
// storage/s3.ts
import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3';
import type { FileStore } from './types';

export function createS3Store(bucket: string, client: S3Client): FileStore {
  return {
    async put(key, data) {
      // S3 accepts streams directly; Blob needs to be converted
      const body = data instanceof Blob ? new Uint8Array(await data.arrayBuffer()) : data;
      await client.send(new PutObjectCommand({ Bucket: bucket, Key: key, Body: body }));
    },
    async get(key) {
      const res = await client.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
      // In v3, Body is a web ReadableStream in browsers/Workers but a Node
      // Readable (with transform mixins) in Node. transformToByteArray()
      // handles both, and the caller gets a plain Blob either way.
      return new Blob([await res.Body!.transformToByteArray()]);
    },
    // ...rest omitted for brevity
  } as FileStore;
}
```
Notice what the adapter does: it absorbs the SDK's idiosyncrasies and returns plain web types. The caller has no idea whether the bytes came from S3, a local disk, or a goat carrying a USB stick.
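Because R2 speaks the S3 protocol, that same adapter covers the migration from the opening story; only the client construction changes. A configuration sketch (the account id, bucket name, and env var names are placeholders to fill in):

```ts
import { S3Client } from '@aws-sdk/client-s3';

// Same adapter, different wire target. R2 exposes an S3-compatible
// endpoint per account; region is always 'auto' for R2.
const r2Client = new S3Client({
  region: 'auto',
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// createS3Store is the adapter from storage/s3.ts above
const store = createS3Store('my-bucket', r2Client);
```

That's what the Saturday migration should have been: one client constructor, not a grep through every upload path.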
A recently published open-source project, files-sdk, takes essentially this approach and bundles adapters for multiple object and blob backends behind a single small API surface built on web-standard I/O. I haven't shipped it to production yet, but the design lines up well with what I've ended up building by hand on three separate projects. Worth a look if you want to skip writing the adapters yourself — and if not, the source is a decent reference for what a sane shape looks like.
## Step three: make local development not painful
One of the quiet wins of this pattern is that you can ship an in-memory adapter for tests and a local-filesystem adapter for dev. No more MinIO Docker container just so your integration tests can run.
```ts
// storage/memory.ts
import type { FileStore } from './types';

export function createMemoryStore(): FileStore {
  const map = new Map<string, Blob>();
  return {
    async put(key, data) {
      // Normalise both inputs to Blob so get() can always return one
      map.set(key, data instanceof Blob ? data : await new Response(data).blob());
    },
    async get(key) {
      const blob = map.get(key);
      if (!blob) throw new Error(`No such key: ${key}`);
      return blob;
    },
    async delete(key) { map.delete(key); },
    async list(prefix = '') {
      return [...map.keys()].filter(k => k.startsWith(prefix));
    },
    async url(key) { return `memory://${key}`; },
  };
}
```
Use this in tests. Your test suite stops needing the network. Your CI runs in 30 seconds instead of three minutes. Worth every awkward `as FileStore` cast.
## Prevention: the rules I now follow
A few habits that have saved me from repeating the Saturday-migration mistake:
- **Never import a storage SDK outside `storage/`.** Add an ESLint rule if you have to. I use a `no-restricted-imports` rule to ban `@aws-sdk/*` everywhere except the adapter.
- **Inputs and outputs are web standards.** `Blob`, `ReadableStream`, `Uint8Array`. No `Buffer` in the public API. The minute `Buffer` leaks out, your code stops running in Workers.
- **Resist the urge to expose provider-specific options.** When someone inevitably asks for S3 server-side encryption headers, add it as an opaque `metadata` parameter, not an `s3Options` escape hatch. Escape hatches become load-bearing.
- **Wrap presigned URLs too.** Don't expose them as a raw S3 call. The `url()` method on the interface should be the only way anyone gets a signed URL.
- **Keep the interface small.** Five methods is enough for 95% of apps. If you really need multipart uploads, add a separate `MultipartStore` interface — don't bloat the main one.
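The import ban from the first rule can be sketched as an ESLint flat-config entry (the globs and the extra package names are examples; adjust to your layout and providers):

```js
// eslint.config.js: ban storage SDK imports everywhere except the adapters
export default [
  {
    files: ['**/*.ts'],
    ignores: ['storage/**'], // adapters are the one place SDKs may appear
    rules: {
      'no-restricted-imports': ['error', {
        patterns: [{
          group: ['@aws-sdk/*', '@google-cloud/storage', '@azure/storage-blob'],
          message: 'Import from storage/ instead of a provider SDK.',
        }],
      }],
    },
  },
];
```

Once this is in CI, the boundary defends itself; nobody has to remember the rule during review.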
The surprising thing about this pattern isn't that it makes migrations easier. It's that it makes the initial code easier to write. You stop thinking about which SDK method to call and start thinking about what you actually want to do with the bytes. Which, honestly, is what storage code should have been about all along.