FetchSandbox

Posted on Jun 25

Our partners kept breaking on staging. So we gave them production access. (Don't do this.)

#api #webdev #devops #testing

We whitelisted partners on production with read-only test accounts. That was attempt four — after docs-only onboarding, shared UAT, and IP-whitelisted staging all failed in different ways.

I work on an embed platform. Partners integrate our APIs for payments, identity verification, and onboarding flows. We burned six months on one question: how do partners test their integration before go-live?

Attempt 1: "Just use our docs"

We pointed partners at the endpoints, auth, and request/response examples. "You're good to go."

They'd hit something we hadn't documented, open a support ticket, wait two days, lose momentum. A few never came back. One went with a competitor because they "couldn't get a working test setup in a week."

The docs described what should happen. Partners had no way to see what actually happens until they wrote code, deployed, and hoped.

Attempt 2: Internal UAT box

Next: give partners our internal UAT environment. IP-whitelisted, shared credentials. We told them "please don't break anything" without irony.

Worked for three weeks.

QA pushed a bad config on a Tuesday. The partner's integration broke. They spent two days debugging their own code before opening a ticket. By the time we figured it out, their engineering lead was cc'ing our VP.

Another team ran load tests on UAT the same week. Partner API calls timed out. Their CTO asked if our platform was "production-ready."

UAT was built for us to break things. We invited external partners into that mess.

Attempt 3: Staging with IP whitelisting

We carved out "partner staging": same codebase, separate deployment, IP-whitelisted per partner. Felt like the grown-up solution.

It wasn't.

IP whitelists became a full-time job. Every new partner meant firewall rules. Home IPs differed from office IPs. One CTO traveling couldn't hit the sandbox from his hotel. We debugged VPN configs at 11pm on a Thursday.
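To make the failure mode concrete, here's a minimal sketch of the check a per-partner allowlist boils down to. The partner names and CIDR ranges are hypothetical (they're documentation-reserved ranges, not our real rules), and a real firewall does this at the network layer rather than in Python.

```python
import ipaddress

# Hypothetical per-partner allowlist: the office networks each partner
# registered during onboarding (documentation-only address ranges).
ALLOWLIST = {
    "partner-a": ["203.0.113.0/24"],
    "partner-b": ["198.51.100.16/28"],
}

def is_allowed(partner: str, client_ip: str) -> bool:
    """Return True if client_ip falls inside any network registered for partner."""
    addr = ipaddress.ip_address(client_ip)
    networks = (ipaddress.ip_network(cidr) for cidr in ALLOWLIST.get(partner, []))
    return any(addr in net for net in networks)

# Office traffic passes; the same engineer on hotel Wi-Fi does not.
print(is_allowed("partner-a", "203.0.113.42"))  # True
print(is_allowed("partner-a", "192.0.2.7"))     # False
```

Every home connection, hotel, and VPN exit adds another entry to that dictionary, and somebody has to keep it current.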

Test data went stale. Partners created orders for products that existed in production but not staging. "Your API returns 404 for product_id XYZ." Correct: nobody had seeded staging in three weeks.
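A drift check along these lines would have caught it before a partner did. This is only a sketch: the product IDs are made up, and in practice both sets would come from queries against each environment.

```python
def missing_in_staging(production_ids: set[str], staging_ids: set[str]) -> set[str]:
    """IDs a partner can see in production but that would 404 in staging."""
    return production_ids - staging_ids

# Hypothetical product IDs for illustration.
production = {"prod_001", "prod_002", "prod_003"}
staging = {"prod_001"}

print(sorted(missing_in_staging(production, staging)))  # ['prod_002', 'prod_003']
```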

Deploys collided with demos. Engineers shipped to staging without checking if partners were testing. A deploy during a partner's live demo call gets brought up in quarterly business reviews.

Cost. We ran a full production mirror for partner testing (database, compute, monitoring, on-call) that was used maybe 10 hours a week.

Attempt 4: Production access

I wish I were kidding.

Reasoning: staging is flaky and UAT is a disaster, so whitelist partners on production with read-only test accounts.

The API was stable. Partners were happy. For about a month.

One partner's integration bug created 400 orphaned records in production. The data team spent a weekend cleaning up.

Compliance asked why test payloads with fake PII hit the production database.

Maintenance meant coordinating downtime with partners who were "just running a few test calls."

We'd built the world's most expensive sandbox: production with IP whitelisting.

What we actually needed

Partners didn't need access to our infrastructure.

They needed an API that looked and acted like ours — same endpoints, schemas, auth patterns — but was completely separate. POST creates a resource. GET retrieves it. State transitions work. Nobody else's deploy breaks it.

Not a mock server — those are stateless; step 2 doesn't know about step 1. Not a shared staging box — that's everybody's problem. A dedicated sandbox generated from the same OpenAPI spec the real API uses.
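Here's a rough sketch of the difference between a stateless mock and a stateful sandbox. It's an in-memory toy, not how FetchSandbox or our real API is implemented; the class names, order fields, status names, and transition rules are all assumptions for illustration.

```python
import itertools

class StatelessMock:
    """Returns canned responses; nothing persists between calls."""

    def post_order(self, body: dict) -> tuple[int, dict]:
        return 201, {"id": "ord_1", "status": "draft"}

    def get_order(self, order_id: str) -> tuple[int, dict]:
        # Same canned payload no matter what happened before.
        return 200, {"id": "ord_1", "status": "draft"}


class StatefulSandbox:
    """Keeps resources in memory and enforces a tiny order state machine."""

    # Hypothetical transitions, for illustration only.
    TRANSITIONS = {"draft": {"submitted"}, "submitted": {"fulfilled"}, "fulfilled": set()}

    def __init__(self) -> None:
        self._orders: dict[str, dict] = {}
        self._ids = itertools.count(1)
        self.events: list[str] = []  # stand-in for webhook deliveries

    def post_order(self, body: dict) -> tuple[int, dict]:
        # Server-assigned fields go last so the request body cannot override them.
        order = {**body, "id": f"ord_{next(self._ids)}", "status": "draft"}
        self._orders[order["id"]] = order
        return 201, dict(order)

    def get_order(self, order_id: str) -> tuple[int, dict]:
        order = self._orders.get(order_id)
        return (200, dict(order)) if order else (404, {"error": "not_found"})

    def transition(self, order_id: str, new_status: str) -> tuple[int, dict]:
        order = self._orders.get(order_id)
        if order is None:
            return 404, {"error": "not_found"}
        if new_status not in self.TRANSITIONS[order["status"]]:
            return 409, {"error": "invalid_transition"}
        order["status"] = new_status
        self.events.append(f"order.{new_status}")
        return 200, dict(order)


sandbox = StatefulSandbox()
_, created = sandbox.post_order({"sku": "XYZ"})
print(sandbox.transition(created["id"], "fulfilled")[0])  # 409, must be submitted first
sandbox.transition(created["id"], "submitted")
sandbox.transition(created["id"], "fulfilled")
print(sandbox.get_order(created["id"])[1]["status"])      # fulfilled
print(sandbox.events)                                     # ['order.submitted', 'order.fulfilled']
```

The mock answers every GET with the same draft order. The sandbox remembers what step 1 did, rejects an out-of-order step, and records the events a partner would expect as webhooks.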

This is half the reason I started working on FetchSandbox. Give it an OpenAPI spec, get a stateful sandbox with CRUD, state machines, webhook events, and seed data. No infrastructure to maintain.

# Partner gets a sandbox from your spec
npx fetchsandbox generate ./your-api-openapi.yaml

curl https://your-api.fetchsandbox.com/v1/orders \
  -H "api-key: sandbox_abc123"
# → 200 OK, realistic seed data

fetchsandbox run your-api create-and-fulfill-order
# → ✓ Create order — 201
# → ✓ Add line items — 200
# → ✓ Submit for fulfillment — 200
# → ✓ Webhook: order.fulfilled fired

No staging to maintain. No IP whitelists. No production risk. Partners get their own URL, data, and credentials. Your team doesn't touch it.

When a partner asks "does this workflow work?" — they prove it themselves.

The time sink nobody tracks

If you run a partner API, count hours per month on:

Provisioning and maintaining test environments
Debugging "is your sandbox down?" tickets
Managing IP whitelists and VPN access
Re-seeding stale test data
Coordinating deploys around partner testing schedules

At my company: 20–30 hours a month across engineering and DevOps. For a problem that shouldn't exist.
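If you want to annualize your own number, the arithmetic is short. The 40-hour engineer-week is my assumption; swap in whatever your team uses.

```python
HOURS_PER_WEEK = 40  # assumption: one engineer-week

def yearly_cost(hours_per_month: float) -> tuple[float, float]:
    """Return (hours per year, engineer-weeks per year)."""
    hours = hours_per_month * 12
    return hours, hours / HOURS_PER_WEEK

for monthly in (20, 30):
    hours, weeks = yearly_cost(monthly)
    print(f"{monthly} h/month -> {hours:.0f} h/year, about {weeks:.0f} engineer-weeks")
# 20 h/month -> 240 h/year, about 6 engineer-weeks
# 30 h/month -> 360 h/year, about 9 engineer-weeks
```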

Open the Stripe sandbox and run a workflow end-to-end — no signup: fetchsandbox.com/playground

DEV Community