Rohit Tiwari

Posted on Mar 4 • Originally published at rohittiwari.me

I Stopped Using REST After This Production Incident

#api #webdev #typescript #architecture

It started with a production incident

A customer was charged twice. The client received a network timeout, retried the request, and our order service created two identical orders seconds apart. Same card. Two charges. Support was already on the phone before I knew anything was wrong.

I found the root cause in twenty minutes. But I spent three more hours in the logs trying to reconstruct exactly what had happened — because nothing in the system agreed on how to report things. Different error shapes. Different HTTP codes. Some endpoints returned nothing at all on failure.

We fixed it. But I spent the following week thinking about the actual problem — which wasn't the bug. It was that our API had no shared contract for anything.

The naming problem REST never solved

Here is the specific thing that kept bothering me:

DELETE /orders/123

Looks clean. But in our system, cancelling an order actually:

Validates the order can be cancelled given its current state
Issues a refund through Stripe
Releases reserved inventory
Sends a cancellation email to the customer
Writes a compliance log entry

The URL says DELETE. The operation does five things. The URL is not just incomplete; it is actively misleading. Every developer who touches this endpoint has to read the implementation to understand what they are actually calling. There is no other way to know.

Multiply that by fifty endpoints. A hundred. Every URL is technically accurate and practically useless.

The fix: name the operation, not the noun

I started naming operations after what they actually do — not what noun they touch, but the whole operation.

# REST — says what thing, not what happens
DELETE /orders/123

# PACT — says exactly what happens
POST /api/v1/private/cancel-order-and-refund-and-notify

That second URL is longer. It is also completely honest. You know what will happen before you call it. You know what to test. You know what to put in the incident report.

I called this convention PACT — Protocol for Action-based Coordinated Transport.

The resource segment of the URL is not a noun. It is a description of the full operation — whatever that involves, however many systems it touches.

The URL structure

POST /api/:version/:scheme/:resource

Three segments after the prefix. That is all.

| Segment | What it is | Examples |
| --- | --- | --- |
| `:version` | API contract version (team picks the format) | `v1`, `v2`, `2024-01`, `stable` |
| `:scheme` | Access level, enforced before your code runs | `public`, `private`, `internal` |
| `:resource` | The full operation name | `cancel-order-and-refund-and-notify` |
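To make the three segments concrete, here is a minimal sketch of a path parser. The names `parsePactPath`, `PactRoute` and `isPactScheme` are my own for illustration, and the kebab-case check is an assumption drawn from the examples in this post rather than a stated rule.

```typescript
// Sketch only: these names are illustrative, not part of any published PACT code.
type PactScheme = "public" | "private" | "internal";

interface PactRoute {
  version: string;
  scheme: PactScheme;
  resource: string;
}

function isPactScheme(value: string): value is PactScheme {
  return value === "public" || value === "private" || value === "internal";
}

function parsePactPath(path: string): PactRoute | null {
  // "/api/v1/private/x".split("/") gives ["", "api", "v1", "private", "x"].
  const parts = path.split("/");
  if (parts.length !== 5 || parts[0] !== "" || parts[1] !== "api") return null;

  const version = parts[2];
  const scheme = parts[3];
  const resource = parts[4];
  if (!version || !scheme || !resource) return null;
  if (!isPactScheme(scheme)) return null;

  // Assumption: operation names are lowercase kebab-case, as in every example here.
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(resource)) return null;

  return { version, scheme, resource };
}
```

Anything that does not match the three-segment shape comes back as `null`, so a router built on this could reject unknown schemes before any operation code runs.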

Simple operations

POST /api/v1/public/fetch-product-catalogue
POST /api/v1/private/update-billing-address
POST /api/v1/private/delete-draft-post

Compound operations — where this shines

POST /api/v1/private/register-user-and-send-welcome
POST /api/v1/private/cancel-order-and-refund-and-notify
POST /api/v1/private/downgrade-plan-and-prorate-and-email
POST /api/v1/internal/expire-stale-sessions-and-audit-log

Read those compound operations. You know exactly what they do. No docs needed. No implementation dive. No guesswork.

When you ship v2, old and new run in parallel — clients migrate on their own timeline:

POST /api/v1/private/cancel-order-and-refund   # still running
POST /api/v2/private/cancel-order-and-refund   # new contract alongside it

Seven URL prefixes, one naming pattern

The same idea extends across every transport in your system:

POST   /api/:version/:scheme/:resource    # application operations
POST   /webhook/:provider/:event          # Stripe, GitHub, SendGrid
WS     /ws/:version/:resource             # real-time bidirectional
GET    /stream/:version/:resource         # SSE — server pushes to client
POST   /events/:version/:resource         # internal event bus
POST   /rpc/:version/:resource            # service-to-service calls
GET    /health                            # liveness — no version needed

One naming convention. Every transport. If you know what something does, you know what it is called — regardless of how it travels.

Responses: always HTTP 200, always the same shape

This is the other half of what the incident exposed. Every endpoint had invented its own error format.

PACT fixes it with one rule: always HTTP 200. Success or failure lives inside the body.

// Works
{
  "success": true,
  "requestId": "req_a3f9b2c1",
  "data": { "orderId": "ord_789", "refunded": true },
  "meta": { "duration": 42, "cached": false }
}

// Fails — still HTTP 200
{
  "success": false,
  "requestId": "req_a3f9b2c1",
  "error": {
    "category": "CONFLICT",
    "code": "ORDER_ALREADY_CANCELLED",
    "message": "This order cannot be cancelled",
    "retryable": false,
    "source": "runner"
  },
  "meta": { "duration": 8 }
}

requestId is present on every response — success or failure. Write your error handling once. It works on every endpoint, every version, forever.
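Here is roughly what "write your error handling once" could look like on the client. This is a sketch: the type names and the `unwrap` helper are hypothetical, and only the field names are taken from the two JSON examples above.

```typescript
// Field names mirror the JSON examples; the type names are illustrative.
interface PactMeta {
  duration: number;
  cached?: boolean;
}

interface PactErrorDetail {
  category: string;
  code: string;
  message: string;
  retryable: boolean;
  source: string;
}

interface PactSuccess<T> {
  success: true;
  requestId: string;
  data: T;
  meta: PactMeta;
}

interface PactFailure {
  success: false;
  requestId: string;
  error: PactErrorDetail;
  meta: PactMeta;
}

type PactEnvelope<T> = PactSuccess<T> | PactFailure;

// Carries the requestId so it can go straight into logs or a support ticket.
class PactCallError extends Error {
  requestId: string;
  detail: PactErrorDetail;

  constructor(requestId: string, detail: PactErrorDetail) {
    super(`${detail.code}: ${detail.message} (requestId ${requestId})`);
    this.name = "PactCallError";
    this.requestId = requestId;
    this.detail = detail;
  }
}

// Returns the data on success, throws a single error type on failure.
function unwrap<T>(envelope: PactEnvelope<T>): T {
  if (envelope.success) return envelope.data;
  throw new PactCallError(envelope.requestId, envelope.error);
}
```

Because every endpoint shares the envelope, one `unwrap` call after each request is the only branching the client needs on `success`.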

Eight error categories

No more arguing about 422 vs 400 vs 409:

| Category | Retry? | When |
| --- | --- | --- |
| `VALIDATION` | No | Bad input: wrong types, missing fields |
| `AUTH` | No | Token missing or invalid |
| `FORBIDDEN` | No | Valid token, wrong permissions |
| `NOT_FOUND` | No | The thing does not exist |
| `CONFLICT` | No | Wrong state: duplicate, already done |
| `RATE_LIMITED` | ✅ Yes | Too many requests, wait and retry |
| `INTERNAL` | ✅ Yes | Server error |
| `UNAVAILABLE` | ✅ Yes | Dependency is down |

retryable: true = try again. retryable: false = fix the request first. That is all your client needs to decide what to do next.
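A small sketch of how a client might act on that flag. The category table is copied from above; the `retryDelayMs` helper, the three-attempt cap and the 200 ms exponential backoff are assumptions of mine, not part of the convention.

```typescript
type PactCategory =
  | "VALIDATION" | "AUTH" | "FORBIDDEN" | "NOT_FOUND"
  | "CONFLICT" | "RATE_LIMITED" | "INTERNAL" | "UNAVAILABLE";

// Default retry behaviour per category, as listed in the table.
const RETRYABLE_BY_CATEGORY: Record<PactCategory, boolean> = {
  VALIDATION: false,
  AUTH: false,
  FORBIDDEN: false,
  NOT_FOUND: false,
  CONFLICT: false,
  RATE_LIMITED: true,
  INTERNAL: true,
  UNAVAILABLE: true,
};

// `attempt` is the number of attempts already made (1 after the first failure).
// Returns how long to wait before the next try, or null to give up.
// Assumed policy: at most 3 attempts, doubling the delay from 200 ms.
function retryDelayMs(
  retryable: boolean,
  attempt: number,
  maxAttempts: number = 3,
  baseMs: number = 200,
): number | null {
  if (!retryable || attempt >= maxAttempts) return null;
  return baseMs * Math.pow(2, attempt - 1);
}
```

The client passes in the `retryable` value from the response body, so the server stays the source of truth and the category map is only a fallback or a sanity check.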

Per-operation files — contract.ts and runner.ts

Each operation is a folder:

/transports/api/v1/cancel-order-and-refund/
  contract.ts    ← rules — declared here, enforced automatically
  runner.ts      ← business logic only
  test.spec.ts

contract.ts — you declare, the framework enforces:

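import { z } from 'zod' // assumed: `z` here is the Zod schema builder

export const contract = {
  scheme: 'private',

  // Shape of the request payload, validated before the runner is called
  schema: z.object({
    orderId: z.string(),
    reason:  z.string().optional(),
  }),

  idempotency: { required: true }, // ← prevents the double-charge incident
  cache:       { enabled: false },
  rateLimit:   { rpm: 30 },        // requests per minute
}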
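As one illustration of "you declare, the framework enforces", here is a sketch of how a `rateLimit: { rpm: 30 }` declaration might be enforced. The `FixedWindowLimiter` class is hypothetical, it uses a simple fixed one-minute window per caller, and it keeps state in memory, which would not work across multiple server instances.

```typescript
interface RateLimitRule {
  rpm: number; // allowed requests per minute, per caller
}

// Hypothetical enforcement sketch: fixed 60-second windows, in-memory state.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private rule: RateLimitRule;

  constructor(rule: RateLimitRule) {
    this.rule = rule;
  }

  // `nowMs` is passed in so the behaviour is deterministic and easy to test.
  allow(caller: string, nowMs: number): boolean {
    if (this.rule.rpm <= 0) return false;

    const current = this.windows.get(caller);
    if (!current || nowMs - current.start >= 60000) {
      // First request from this caller, or the previous window has expired.
      this.windows.set(caller, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.rule.rpm) return false;
    current.count += 1;
    return true;
  }
}
```

A request that gets `false` here would be answered with a `RATE_LIMITED` error envelope before the runner is ever invoked.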

runner.ts — zero protocol knowledge. No HTTP. No auth checks. No rate limiting. All of that is already handled before your code runs:

export default async function run(ctx: PactContext) {
  // ctx.payload   — validated, matches schema exactly
  // ctx.identity  — logged-in user (null if public)
  // ctx.tenant    — resolved tenant (null if single-tenant)
  // ctx.version   — 'v1', 'v2' — read-only, for logging only

  const order = await db.orders.find(ctx.payload.orderId)
  if (!order) throw new NotFoundError('ORDER_NOT_FOUND')

  await stripe.refund(order.paymentId)
  await email.send('cancellation', { order })
  await db.orders.cancel(order.id)

  return { cancelled: true, refunded: true }
  // PACT wraps it: { success: true, data: { cancelled: true, refunded: true } }
}
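For the wrapping step mentioned in that last comment, here is a minimal sketch of what the layer around a runner could do. Everything here is an assumption for illustration: the `executeRunner` name, the `PactError` base class, the error message text, and the `UNEXPECTED_ERROR` placeholder code. The one thing taken from the runner above is that `NotFoundError` accepts a single code argument.

```typescript
interface RunnerSuccess<T> {
  success: true;
  requestId: string;
  data: T;
  meta: { duration: number };
}

interface RunnerFailure {
  success: false;
  requestId: string;
  error: { category: string; code: string; message: string; retryable: boolean; source: string };
  meta: { duration: number };
}

// Hypothetical base class for errors a runner throws on purpose.
class PactError extends Error {
  category: string;
  code: string;
  retryable: boolean;

  constructor(category: string, code: string, message: string, retryable: boolean) {
    super(message);
    this.name = "PactError";
    this.category = category;
    this.code = code;
    this.retryable = retryable;
  }
}

class NotFoundError extends PactError {
  constructor(code: string) {
    super("NOT_FOUND", code, "The requested item does not exist", false);
    this.name = "NotFoundError";
  }
}

// Runs the business logic and always resolves to an envelope, never rejects.
async function executeRunner<T>(
  requestId: string,
  runner: () => Promise<T>,
): Promise<RunnerSuccess<T> | RunnerFailure> {
  const started = Date.now();
  try {
    const data = await runner();
    return { success: true, requestId, data, meta: { duration: Date.now() - started } };
  } catch (err) {
    // Anything that is not a deliberate PactError is reported as INTERNAL.
    const known = err instanceof PactError ? err : null;
    return {
      success: false,
      requestId,
      error: {
        category: known ? known.category : "INTERNAL",
        code: known ? known.code : "UNEXPECTED_ERROR",
        message: known ? known.message : "Unexpected server error",
        retryable: known ? known.retryable : true,
        source: "runner",
      },
      meta: { duration: Date.now() - started },
    };
  }
}
```

With a wrapper like this the runner only returns data or throws, and the envelope shape stays identical across every operation.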

The double-charge from my incident? Solved by idempotency: { required: true }. Client sends X-Idempotency-Key. Same key seen again — stored result returned immediately, runner never executes twice. Declared at the contract level. Visible. Auditable. Automatic.
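The behaviour described above can be sketched in a few lines. This is an illustration under assumptions, not the real enforcement code: `runOnce` is a hypothetical name, the `Map` stands in for a shared store such as Redis or a database table, and keys never expire here.

```typescript
// In-memory stand-in for a shared idempotency store.
// A real one would need expiry and would scope keys by operation and caller.
const idempotencyResults = new Map<string, Promise<unknown>>();

// Runs `runner` at most once per key. Later calls with the same key get the
// stored promise, so concurrent duplicates and late retries share one result.
function runOnce<T>(key: string, runner: () => Promise<T>): Promise<T> {
  const existing = idempotencyResults.get(key);
  if (existing) return existing as Promise<T>;

  const pending = runner();
  idempotencyResults.set(key, pending);
  return pending;
}
```

Storing the promise rather than the resolved value is the design choice that matters: a retry that arrives while the first request is still in flight waits for the same result instead of starting a second charge. How failed attempts should be treated is left open in this sketch.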

Versioning and deprecation

When v1 is being retired in favour of v2, you set one field:

// /transports/api/v1/cancel-order-and-refund/contract.ts
deprecated: '2026-06-01'  // 90-day minimum window required

From that point, every response includes:

"meta": {
  "duration": 24,
  "cached": false,
  "deprecation": "2026-06-01"
}

Plus a standard Sunset HTTP header that infrastructure tools read automatically. After the date, the route returns VERSION_RETIRED under UNAVAILABLE — not a silent 404. Clients get a clear, machine-readable signal that they need to upgrade.
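A sketch of how that single field could drive both the `meta.deprecation` value and the `Sunset` header. The `deprecationState` helper is hypothetical, and it assumes retirement takes effect at midnight UTC on the declared date, which the convention as described here does not pin down.

```typescript
interface DeprecationState {
  retired: boolean;                 // true once the route should answer VERSION_RETIRED
  meta: { deprecation?: string };   // merged into the response "meta"
  headers: { Sunset?: string };     // merged into the HTTP response headers
}

// `deprecated` is the YYYY-MM-DD string from contract.ts, if any.
function deprecationState(deprecated: string | undefined, now: Date): DeprecationState {
  if (!deprecated) return { retired: false, meta: {}, headers: {} };

  // Assumption: the date means 00:00:00 UTC on that day.
  const sunset = new Date(`${deprecated}T00:00:00Z`);
  if (isNaN(sunset.getTime())) {
    throw new Error(`Invalid deprecation date: ${deprecated}`);
  }

  return {
    retired: now.getTime() >= sunset.getTime(),
    meta: { deprecation: deprecated },
    // The Sunset header carries an HTTP-date, which toUTCString() produces.
    headers: { Sunset: sunset.toUTCString() },
  };
}
```

Passing `now` in as an argument keeps the function pure, so the cutover moment can be tested without waiting for the date to arrive.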

Tenant context — not in the URL

Multi-tenancy lives in context, not path segments:

JWT token claim     →  wins always
X-Tenant-ID header  →  fallback when token has no tenant claim
Neither present     →  single-tenant mode — not an error
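That precedence fits in one small function. This is a sketch: `resolveTenant` is a hypothetical name, the claim is assumed to be called `tenant`, and the token is assumed to have been verified already by the time this runs.

```typescript
// `claims` are the already-verified JWT claims, or null for anonymous calls.
// `headers` uses lowercase keys, as Node's HTTP server provides them.
function resolveTenant(
  claims: { tenant?: string } | null,
  headers: Record<string, string | undefined>,
): string | null {
  // 1. A tenant claim in the token always wins.
  if (claims && claims.tenant) return claims.tenant;

  // 2. Fall back to the header only when the token has no tenant claim.
  const fromHeader = headers["x-tenant-id"];
  if (fromHeader) return fromHeader;

  // 3. Neither present: single-tenant mode, which is not an error.
  return null;
}
```

The result is what would end up in `ctx.tenant`, so runners never parse tokens or headers themselves.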

No /api/v1/acme/private/cancel-order in your routes. The URL says what happens. The context carries who is asking.

The folder structure

/transports             ← all communication channels
  /api
    /v1
      /cancel-order-and-refund
        contract.ts  runner.ts  test.spec.ts
      /register-user-and-send-welcome
        contract.ts  runner.ts  test.spec.ts
    /v2
      /cancel-order-and-refund    ← new contract, runs in parallel
        contract.ts  runner.ts  test.spec.ts
  /webhook
    /stripe
      /payment-succeeded
  /ws/v1/order-tracking-updates
  /stream/v1/activity-feed
  /events/v1/order-placed
  /rpc/v1/check-permissions

/core                   ← framework — never edit for features
/adapters               ← outbound: stripe, sendgrid, openai, anthropic
/queue
  publisher.ts
  consumer.ts
  /jobs                 ← background jobs are queue consumers
    /send-invoice-reminders
    /expire-stale-sessions
/libs                   ← all shared internal code
  /utils  /types  /constants  /validators  /errors  /helpers  /middleware
/config
server.ts

A new developer gets a bug report on POST /api/v1/private/cancel-order-and-refund. They open /transports/api/v1/cancel-order-and-refund/runner.ts. One path. No searching.
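The URL-to-file mapping is mechanical enough to write down. A minimal sketch, with `runnerPathFor` as a hypothetical helper; note that the scheme segment is dropped because the folder layout above does not include it.

```typescript
// Maps an API URL to the runner file that implements it, following the
// /transports/api/:version/:resource/ layout shown in the folder tree.
function runnerPathFor(url: string): string | null {
  const match = /^\/api\/([^/]+)\/([^/]+)\/([^/]+)$/.exec(url);
  if (!match) return null;

  const version = match[1];
  const resource = match[3]; // match[2] is the scheme, which is not part of the path
  return `/transports/api/${version}/${resource}/runner.ts`;
}
```

The same function could back a dev tool or an error page that links a failing route straight to its source file.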

What I actually learned

The most valuable thing about operation-based naming was something I did not expect: it forced clearer thinking at design time.

When you have to write cancel-order-and-refund-and-notify as a URL, you commit to that operation as a unit. You think about what complete success means. You think about partial failure. You think about idempotency. You think about the audit trail.

When you write DELETE /orders/123, you are thinking about a resource state change — not the five downstream effects. That difference in framing is where a lot of production incidents are born.

The URL is a contract with everyone who reads it. It should be honest.

What PACT is — and is not

Not a framework you install. Not a package. Not a spec with a committee.

A convention — a set of rules a team agrees to follow. You implement it in whatever language and stack you already use. The value is consistency: every operation behaves the same way, reports errors the same way, handles retries the same way, deprecates the same way.

I built a full specification covering all seven transports, the complete request/response envelope, eight error categories, a 20-step middleware chain, versioning and deprecation contract, folder structure, and a compliance checklist. Drop a comment if you want it.

Does operation-based naming resonate with you? Where does it break down? What would you change?

PACT — Protocol for Action-based Coordinated Transport

DEV Community