Kevin Andrew

Posted on Jun 28

I Built a Tool That Gives Every Pull Request Its Own Isolated PostgreSQL Database in 28 Seconds

#aws #postgres #devtools #hackathon

I created this piece for the purposes of entering the H0 Hackathon. #H0Hackathon

The Problem Nobody Talks About

Your CI pipeline is green. Tests pass. Preview URL works. But there's one thing your team quietly accepts as a fact of life: everyone shares the same staging database.

This creates a class of bugs that are maddening to debug:

PR #14 adds a migration that drops a column. PR #15 (written before #14 merged) now fails — not because of bad code, but because the shared DB state changed.
Your QA team loads test data. Another engineer's automated test wipes it. Both were "doing the right thing."
A contractor needed read access to test something. Now your users' real emails, phone numbers, and home addresses are sitting in their local Postgres dump. The third one isn't just annoying. It's a compliance incident waiting to happen.

The Solution: Copy-on-Write Database Clones Per PR

I built STEM — a GitHub App + AWS backend that automatically provisions an isolated Aurora PostgreSQL clone for every pull request your team opens, anonymizes the PII in it, and injects the connection string directly into your Vercel preview environment.

The flow looks like this:

Developer opens PR
  → GitHub webhook fires
  → STEM clones your Aurora cluster (copy-on-write, ~28 seconds, zero data copied)
  → Writer instance spins up
  → PII columns detected and anonymized (emails, names, phones, SSNs, IPs)
  → DATABASE_URL injected into Vercel preview env
  → PR comment posted with connection details
  → Developer's preview hits an isolated, anonymized DB
Developer closes PR
  → Clone destroyed, env var removed, zero waste

Every PR gets a full production-shaped database — real schema, real query patterns, fake people.

How the Aurora Cloning Works

The magic is RestoreDBClusterToPointInTime with RestoreType: copy-on-write. Aurora's storage layer is log-structured — a clone shares the same underlying storage pages as the source and only diverges on writes. You get a full logical copy of your database in under 30 seconds regardless of how large it is.

await rds.send(new RestoreDBClusterToPointInTimeCommand({
  DBClusterIdentifier: `stem-pr-${prNumber}-${Date.now()}`,
  SourceDBClusterIdentifier: process.env.AURORA_SOURCE_CLUSTER_ID,
  RestoreType: 'copy-on-write',
  UseLatestRestorableTime: true,
  DBSubnetGroupName: process.env.AURORA_SUBNET_GROUP,
  VpcSecurityGroupIds: [process.env.AURORA_SECURITY_GROUP_ID],
}))

The copy-on-write restore type is the key. It does not copy data — it creates a new cluster that references the same storage. Your 500GB production database? 28-second clone. 5TB? Still 28 seconds.

PII Anonymization — Automatic, Zero Config

After the clone is ready, STEM connects and runs a detection + masking pass:

Schema inspection — scans column names for patterns (email, phone, ssn, address, dob, ip_address, credit_card, etc.)
Sample verification — pulls 10 rows per candidate column, runs regex validation to confirm actual PII values are present
Mass UPDATE — replaces every matched row with deterministic fake values using a seeded generator (same fake value for the same real value, so foreign key joins still work)
Audit log — writes the list of anonymized columns to the metadata store Developers never see real user data. QA runs against production-realistic schemas with fake values. Compliance teams can point to the audit log.

The Metadata Layer: Aurora DSQL

STEM uses Aurora DSQL as its metadata store — tracking branch states, clone cluster IDs, Vercel env var IDs, and anonymized column lists across the pipeline.

DSQL is serverless, scales to zero, and handles the concurrent writes from multiple simultaneous PR webhooks without contention. It also gives us a distributed transaction model that means the metadata state always stays consistent even when the pipeline is processing multiple PRs in parallel.

The branch state machine looks like:

CLONING → PROVISIONING → ACTIVE → DESTROYED

Each transition is written to DSQL atomically. The dashboard polls this table to show real-time pipeline progress.

Cross-Account AWS Security Model

STEM uses the standard SaaS cross-account IAM pattern — no long-lived customer credentials ever leave the customer's account.

When a customer connects their AWS:

They deploy a CloudFormation stack (one command, auto-generated from STEM's UI)
The stack creates an IAM role that trusts STEM's control-plane account
The role is scoped to exactly the RDS actions STEM needs — nothing else
STEM stores the role ARN and uses sts:AssumeRole to operate in the customer's account

{
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::STEM_ACCOUNT:root" },
  "Action": "sts:AssumeRole",
  "Condition": { "StringEquals": { "sts:ExternalId": "stem-<per-user-hmac>" } }
}

The ExternalId is HMAC-derived per user — prevents confused deputy attacks even if the role ARN is leaked.

The Stack

Backend: Next.js 15 App Router, TypeScript, deployed to Vercel
Metadata: Aurora DSQL (serverless, distributed)
Database cloning: Aurora PostgreSQL Serverless v2 via RestoreDBClusterToPointInTime
Auth: GitHub App OAuth, AES-256-GCM sealed sessions (Edge-compatible)
Frontend: Next.js 15, Tailwind, Recharts, SWR for live polling

- Infra security: Cross-account IAM with HMAC ExternalId, least-privilege RDS policy

The Numbers

From a real PR opened against a staging cluster:

Metric	Value
Clone ready	28s
PII columns masked	5
Daily cost (2 active branches)	$0.22
PR comment posted	within 35s of open

The cost efficiency comes from Aurora Serverless v2's ACU-based billing — idle clones scale to near-zero. A branch that no one's querying costs fractions of a cent per hour.

What's Next

Multi-tenant AWS fan-out: wire the per-user connected role ARN into the webhook handler so each customer's clones provision in their own account
Org-level pooling: cap concurrent branches per org with a lease queue

- Anonymization policies: let teams define custom column patterns and masking strategies via a config file in their repo

Try It

STEM is live at stem-frontend-six.vercel.app. Connect your GitHub, hook up your Aurora cluster, and your next PR will have its own database.

Built for H0: Hack the Zero Stack — Vercel v0 + AWS Databases hackathon.

Source: github.com/Andrew-Kevin-007/stem-app

If you've ever spent 45 minutes debugging a test failure that turned out to be someone else's migration running against your shared staging DB, this one's for you.

DEV Community