The client asked for an estimate to migrate our platform from GCP to AWS. We ran the numbers: three months, minimum.
They said they'd think about it.
So while they were thinking, we refactored. Not the migration — just the groundwork. We consolidated every scattered GCP integration into a clean adapter layer, deployed it to production on GCP, and waited.
The client came back a few weeks later: they'd decided not to proceed.
But here's the thing — when they asked us to re-estimate, just in case, we came back with one month. The refactor we'd done while waiting had cut the migration cost by two thirds, for a migration that never happened.
This post is about what we built during that waiting period, why it was worth shipping regardless of the client's decision, and why you should probably do the same thing right now — before anyone asks you to.
## The Problem: GCP Was Everywhere
Our platform started as most do — move fast, ship features, sort out abstractions later. GCP integrations for storage, pub/sub, and secrets management were called directly wherever they were needed. No abstraction layer. No separation of concerns. Just `@google-cloud/storage` imported at the top of whatever file needed it.
```
src/
  services/
    invoice.service.ts       ← uses GCS directly
    document.service.ts      ← uses GCS directly
    notification.service.ts  ← uses Pub/Sub directly
  jobs/
    report-generator.ts      ← uses GCS directly
    cleanup.ts               ← uses GCS + Secret Manager
  api/
    upload.controller.ts     ← uses GCS directly
```
Six different entry points. One cloud provider baked into all of them.
### What "Direct Integration" Actually Looked Like
Here's a simplified version of what we had — a document upload function inside a service:
```typescript
// document.service.ts — BEFORE
import { Storage } from '@google-cloud/storage';
import { SecretManagerServiceClient } from '@google-cloud/secret-manager';

const storage = new Storage({ projectId: process.env.GCP_PROJECT_ID });
const bucket = storage.bucket(process.env.GCS_BUCKET_NAME!);

export async function uploadDocument(
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise<string> {
  const blob = bucket.file(`documents/${fileName}`);

  // GCP-specific stream upload — not transferable to S3 without a rewrite
  await blob.save(fileBuffer, {
    metadata: { contentType: mimeType },
    resumable: false,
  });

  // GCP-specific public URL format
  return `https://storage.googleapis.com/${process.env.GCS_BUCKET_NAME}/documents/${fileName}`;
}
```
Clean enough in isolation. The problem was that this exact pattern — import the SDK, instantiate a client, call an SDK method, format a vendor-specific URL — was duplicated across the entire codebase. When the client asked for a migration estimate, the honest answer was three months: touch every service, rewrite the logic, fix the URL formats, update the tests, and don't break production.
Then the client said they needed time to decide. And we had a choice: wait, or use that time well.
We used it well.
## The Solution: Consolidate Before You Swap
While the client was deliberating, the strategy was simple: don't write a single line of AWS code yet. Instead, make the codebase ready to migrate — without actually migrating.
That meant two steps, with a hard gate between them:
1. **Refactor first.** Move all GCP calls behind a storage adapter interface without changing any behavior. Ship it. Run it in production on GCP.
2. **Swap second.** Write an AWS implementation of the same interface. Flip a config flag. Done — if and when the client says go.
Step 1 was safe to do unconditionally. It made the code better regardless of the client's decision. Step 2 would only happen if they greenlit the migration.
This also decoupled the refactoring risk from the migration risk. Step 1 could be validated in production before anyone touched AWS. Step 2 would be purely additive — no existing service code would need to change.
### Designing the Interface
The interface is the contract. It has to be cloud-agnostic — no GCS-isms, no S3-isms. Just the operations the system actually needs.
```typescript
// storage.adapter.ts
export interface StorageAdapter {
  /**
   * Uploads a file buffer. Returns a publicly accessible URL for public
   * files, or the storage path for private ones. Callers should not know
   * or care which cloud backs this.
   */
  upload(params: UploadParams): Promise<string>;

  /**
   * Generates a short-lived signed URL for private file access.
   */
  getSignedUrl(filePath: string, expiresInSeconds: number): Promise<string>;

  /**
   * Permanently deletes a file. Resolves silently if the file doesn't exist.
   */
  delete(filePath: string): Promise<void>;
}

export interface UploadParams {
  buffer: Buffer;
  destinationPath: string; // e.g. "documents/invoice-2024-03.pdf"
  mimeType: string;
  isPublic?: boolean;
}
```
Three methods. No vendor types leaking out. The callers only ever see StorageAdapter — never Storage from @google-cloud or S3Client from @aws-sdk.
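Once the contract is pinned down, new implementations become cheap. As an illustration, here's a hypothetical in-memory adapter of the same interface — the kind of test double this pattern enables. The class name, fake URL formats, and `has` helper are assumptions for this sketch, not code from our repo (the interface is repeated inline so the sketch stands alone):

```typescript
// Hypothetical in-memory implementation of the same contract.
// Useful in tests: no network, no credentials, no cloud.
interface UploadParams {
  buffer: Buffer;
  destinationPath: string;
  mimeType: string;
  isPublic?: boolean;
}

interface StorageAdapter {
  upload(params: UploadParams): Promise<string>;
  getSignedUrl(filePath: string, expiresInSeconds: number): Promise<string>;
  delete(filePath: string): Promise<void>;
}

export class InMemoryAdapter implements StorageAdapter {
  private readonly files = new Map<string, Buffer>();

  async upload({ buffer, destinationPath, isPublic }: UploadParams): Promise<string> {
    this.files.set(destinationPath, buffer);
    // Mirror the real adapters: public files get a URL, private ones a path
    return isPublic ? `memory://public/${destinationPath}` : destinationPath;
  }

  async getSignedUrl(filePath: string, expiresInSeconds: number): Promise<string> {
    // Fake signed URL; embeds the expiry so tests can assert on it
    return `memory://signed/${filePath}?expires=${expiresInSeconds}`;
  }

  async delete(filePath: string): Promise<void> {
    // Map.delete is already a no-op for missing keys — same contract as the clouds
    this.files.delete(filePath);
  }

  // Test-only helper, not part of StorageAdapter
  has(filePath: string): boolean {
    return this.files.has(filePath);
  }
}
```

Any service that talks to `StorageAdapter` can run against this in a unit test without touching a cloud SDK.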
### The GCP Implementation
Next, we wrote a class that implemented the interface using the existing GCP SDK. The SDK code didn't change — it just moved into one place.
```typescript
// gcs.adapter.ts
import { Bucket, Storage } from '@google-cloud/storage';
import { StorageAdapter, UploadParams } from './storage.adapter';

export class GCSAdapter implements StorageAdapter {
  private readonly bucket: Bucket;

  constructor(storage: Storage, bucketName: string) {
    this.bucket = storage.bucket(bucketName);
  }

  async upload({ buffer, destinationPath, mimeType, isPublic }: UploadParams): Promise<string> {
    const file = this.bucket.file(destinationPath);
    await file.save(buffer, {
      metadata: { contentType: mimeType },
      resumable: false,
      public: isPublic ?? false,
    });

    // URL construction is now contained here — not scattered across services
    return isPublic
      ? `https://storage.googleapis.com/${this.bucket.name}/${destinationPath}`
      : destinationPath;
  }

  async getSignedUrl(filePath: string, expiresInSeconds: number): Promise<string> {
    const [url] = await this.bucket.file(filePath).getSignedUrl({
      action: 'read',
      expires: Date.now() + expiresInSeconds * 1000,
    });
    return url;
  }

  async delete(filePath: string): Promise<void> {
    try {
      await this.bucket.file(filePath).delete();
    } catch (err: any) {
      // GCS throws a 404 error object — we normalize "not found" to a no-op
      if (err?.code === 404) return;
      throw err;
    }
  }
}
```
### The AWS Implementation
With the GCS adapter live in production, we wrote the S3 adapter as a separate, preparatory task — ready to deploy the moment the client said yes. It didn't touch a single service file.
```typescript
// s3.adapter.ts
import { S3Client, PutObjectCommand, DeleteObjectCommand, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { StorageAdapter, UploadParams } from './storage.adapter';

export class S3Adapter implements StorageAdapter {
  constructor(
    private readonly client: S3Client,
    private readonly bucketName: string,
    private readonly publicBaseUrl: string // e.g. CloudFront URL for public files
  ) {}

  async upload({ buffer, destinationPath, mimeType, isPublic }: UploadParams): Promise<string> {
    await this.client.send(
      new PutObjectCommand({
        Bucket: this.bucketName,
        Key: destinationPath,
        Body: buffer,
        ContentType: mimeType,
        // S3 uses ACL strings; GCS used a boolean — the adapter normalizes this
        ACL: isPublic ? 'public-read' : 'private',
      })
    );

    return isPublic
      ? `${this.publicBaseUrl}/${destinationPath}`
      : destinationPath;
  }

  async getSignedUrl(filePath: string, expiresInSeconds: number): Promise<string> {
    const command = new GetObjectCommand({ Bucket: this.bucketName, Key: filePath });
    return getSignedUrl(this.client, command, { expiresIn: expiresInSeconds });
  }

  async delete(filePath: string): Promise<void> {
    try {
      await this.client.send(
        new DeleteObjectCommand({ Bucket: this.bucketName, Key: filePath })
      );
    } catch (err: any) {
      // S3's DeleteObject already succeeds for missing keys; we keep the
      // 'NoSuchKey' guard so the adapter matches the GCS no-op contract exactly
      if (err?.name === 'NoSuchKey') return;
      throw err;
    }
  }
}
```
### Wiring It Up
Both adapters get registered in a factory. One environment variable drives the whole switch.
```typescript
// storage.factory.ts
import { Storage } from '@google-cloud/storage';
import { S3Client } from '@aws-sdk/client-s3';
import { StorageAdapter } from './storage.adapter';
import { GCSAdapter } from './gcs.adapter';
import { S3Adapter } from './s3.adapter';

export function createStorageAdapter(): StorageAdapter {
  const provider = process.env.STORAGE_PROVIDER; // 'gcs' | 's3'

  if (provider === 's3') {
    const client = new S3Client({ region: process.env.AWS_REGION });
    return new S3Adapter(client, process.env.S3_BUCKET_NAME!, process.env.S3_PUBLIC_URL!);
  }

  // Default to GCS so existing deployments are unaffected during rollout
  const storage = new Storage({ projectId: process.env.GCP_PROJECT_ID });
  return new GCSAdapter(storage, process.env.GCS_BUCKET_NAME!);
}
```
And every service that used to import `@google-cloud/storage` now imports the adapter:
```typescript
// document.service.ts — AFTER
import { createStorageAdapter } from '../adapters/storage.factory';

const storageAdapter = createStorageAdapter();

export async function uploadDocument(
  fileBuffer: Buffer,
  fileName: string,
  mimeType: string
): Promise<string> {
  return storageAdapter.upload({
    buffer: fileBuffer,
    destinationPath: `documents/${fileName}`,
    mimeType,
    isPublic: false,
  });
}
```
The function body went from knowing about GCS internals to knowing nothing about storage at all. That's the point.
## The Results
The client came back: they weren't moving forward with the migration.
That was fine. The refactor we shipped while waiting was already in production, already making the codebase better. And when they asked us to re-estimate — just to have a number on file — we came back with one month instead of three.
| Metric | Original Estimate | Post-Refactor Estimate |
|---|---|---|
| Migration timeline | 3 months | 1 month |
| Files requiring changes during swap | ~14 | 2 (factory + env config) |
| Test changes required | Full rewrite per service | Swap mock in one place |
| Rollback plan | Revert ~14 files | Flip one env var |
The refactoring PR — consolidating all GCP calls into the adapter — was 580 lines. That's the PR that moved the needle on the estimate. The actual migration PR (the S3 swap) would have been around 300 lines and a config change. The expensive, risky work was already done.
The migration never happened. The codebase is better. The estimate dropped by two months. All from work done in a waiting period that could have been spent doing nothing.
## Lessons Learned
**Idle time is architectural time.** When a decision is pending, you often have a window where the pressure is off. That's the best time to do the structural work that's always "too risky" during a sprint. We refactored with no deadline and no production risk. It's the calmest way to do the most impactful work.

**The adapter pays for itself whether or not the migration happens.** We consolidated error handling, URL construction, and SDK initialization into one place. The codebase is less coupled now. We added an in-memory adapter for tests at zero cost to any service. That's value independent of any cloud provider.

**Refactor before you migrate, always.** If the client had said yes immediately, the right call would still have been to consolidate first, then swap. You can't safely migrate a dependency that's load-bearing in fourteen places at once. The adapter step isn't optional — it's just a question of whether you do it before or during the migration.

**Interface design is the hard part.** Getting `StorageAdapter` right took longer than writing either implementation. A leaky interface — one that lets vendor-specific types or options escape into calling code — defeats the whole pattern. Spend the time here. The implementations are mechanical once the contract is solid.

**Error normalization belongs at the boundary.** GCS and S3 throw different error shapes for the same logical failures. If you don't handle that inside the adapter, vendor-specific error handling bleeds into your services and you've only moved the problem, not solved it.
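That boundary-normalization idea can be distilled into a small shared helper. This is a sketch rather than code from the project — the helper name is an assumption — but the two predicates mirror the checks the GCS and S3 adapters actually perform:

```typescript
// Vendor-specific "not found" shapes, detected behind the boundary.
const isGcsNotFound = (err: unknown): boolean =>
  (err as { code?: number })?.code === 404;

const isS3NotFound = (err: unknown): boolean =>
  (err as { name?: string })?.name === 'NoSuchKey';

// Hypothetical shared helper: run an operation, swallow "not found",
// rethrow everything else. Both adapters' delete() could be written with it.
export async function ignoringNotFound(
  op: () => Promise<void>,
  isNotFound: (err: unknown) => boolean
): Promise<void> {
  try {
    await op();
  } catch (err) {
    if (isNotFound(err)) return; // normalize "not found" to a no-op
    throw err;
  }
}
```

The point isn't the helper itself — it's that the vendor-specific predicates live next to the vendor SDK calls, so service code never sees either shape.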
The migration never happened — and the refactor was still worth every line.
That's the real argument for adapters: they're not just migration tooling. They're a permanent improvement to how your system relates to the things it depends on. The swap is just one possible payoff. The cleaner architecture is guaranteed.