Read Stripe's API reference for an hour and you'll notice every endpoint has a complete enumerated list of error codes with example payloads. Then look at your own API.
The contrast is hard to ignore.
Stripe's documentation treats errors as first-class citizens: it covers not only the happy path but also the expected failures, with structured error codes, descriptions, HTTP status codes, and example responses.
Now compare that to many APIs in production.
You might find a generic list of HTTP status codes somewhere in the documentation, but business-specific errors are often buried inside controller logic, scattered across wiki pages, or simply undocumented. The test suite isn't much better—there are dozens of happy-path tests, but only a handful of negative scenarios.
That imbalance creates problems for everyone involved:
- Developers don't know which errors are expected.
- Frontend teams can't reliably handle failures.
- QA engineers miss important negative cases.
- Refactoring accidentally changes error responses without anyone noticing.
A few years ago, I borrowed a simple idea from Stripe's documentation and turned it into a testing strategy.
Instead of treating error responses as an afterthought, we created an error-code catalog and made it the foundation of our negative test suite.
The result wasn't just better API error code testing—it also improved documentation, simplified maintenance, and made API contracts far more consistent.
Here's how the pattern works.
Why Error Responses Are Part of the API Contract
When people think about API testing, they naturally focus on successful responses.
Typical assertions include:
- HTTP 200 OK
- HTTP 201 Created
- Correct JSON payload
- Required fields
- Business calculations
Negative testing often gets much less attention.
Maybe there are a few tests for:
- Invalid authentication
- Missing required fields
- Unknown resources
Beyond that, many APIs rely on manual testing or hope that the framework handles everything correctly.
The problem is that real users hit failure responses routinely, as part of normal use.
Examples include:
- Customer account locked
- Payment declined
- Coupon expired
- Inventory unavailable
- Duplicate registration
- Subscription canceled
- Rate limit exceeded
These aren't exceptional scenarios.
They're expected business outcomes.
Treating them as first-class API contracts changes how you design both documentation and tests.
The Error-Code Catalog as a Test Input
The first step is creating a centralized catalog of every business error the API can intentionally return.
A simplified example might look like this:
errors:
  USER_NOT_FOUND:
    httpStatus: 404
    message: User not found
  EMAIL_ALREADY_EXISTS:
    httpStatus: 409
    message: Email already exists
  INVALID_TOKEN:
    httpStatus: 401
    message: Invalid authentication token
  PAYMENT_DECLINED:
    httpStatus: 402
    message: Payment declined
  ORDER_ALREADY_SHIPPED:
    httpStatus: 409
    message: Order cannot be modified
This catalog becomes far more than documentation.
It becomes an executable specification.
Instead of asking:
"What errors should this endpoint return?"
the answer already exists in one authoritative location.
Every new business error must be added here before it reaches production.
That single requirement dramatically improves consistency.
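One way to make that requirement enforceable, rather than a convention people have to remember, is to mirror the catalog in a typed structure so that any reference to an uncataloged code fails at compile time. Here is a minimal TypeScript sketch under that assumption. The names `CATALOG`, `ErrorCode`, and `buildErrorResponse` are hypothetical and exist only for illustration.

```typescript
// Hypothetical typed mirror of the catalog. `as const` keeps the keys
// as literal types, so ErrorCode is a union of the five code names.
const CATALOG = {
  USER_NOT_FOUND: { httpStatus: 404, message: "User not found" },
  EMAIL_ALREADY_EXISTS: { httpStatus: 409, message: "Email already exists" },
  INVALID_TOKEN: { httpStatus: 401, message: "Invalid authentication token" },
  PAYMENT_DECLINED: { httpStatus: 402, message: "Payment declined" },
  ORDER_ALREADY_SHIPPED: { httpStatus: 409, message: "Order cannot be modified" },
} as const;

type ErrorCode = keyof typeof CATALOG;

interface ErrorResponse {
  status: number;
  body: { code: ErrorCode; message: string };
}

// The only way to build an error response is through a cataloged code.
function buildErrorResponse(code: ErrorCode): ErrorResponse {
  const entry = CATALOG[code];
  return { status: entry.httpStatus, body: { code, message: entry.message } };
}
```

With this in place, a call such as `buildErrorResponse("COUPON_EXPIRED")` is rejected by the compiler until someone adds `COUPON_EXPIRED` to the catalog.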
Why a Catalog Helps
Without a catalog:
- Documentation drifts.
- Tests become incomplete.
- Frontend teams discover errors by accident.
- Reviewers overlook breaking changes.
With a catalog:
- Every error is documented.
- Every error becomes testable.
- Every API consumer sees the same contract.
The catalog becomes the foundation for automation.
One Test per Error Code, Generated from the Catalog
Once the catalog exists, generating negative tests becomes surprisingly straightforward.
Rather than manually writing dozens of repetitive tests, a generator simply iterates through every defined error.
Conceptually:
// catalog.errors maps each error code to its definition
for (const [code, definition] of Object.entries(catalog.errors)) {
  generateNegativeTest(code, definition);
}
Each generated test validates four things:
- The expected HTTP status
- The error code
- The error message
- The response schema
Consider EMAIL_ALREADY_EXISTS.
The generated scenario might:
- Create a user.
- Attempt to create the same user again.
- Verify the response:
{
  "code": "EMAIL_ALREADY_EXISTS",
  "message": "Email already exists"
}
The implementation differs depending on the framework, but the testing philosophy remains the same:
Every documented error deserves exactly one corresponding test.
As new error codes are introduced, new tests appear automatically.
No engineer has to remember to write them.
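To make the loop concrete, here is a framework-free sketch of what such a generator could look like. It is not our production code: `FakeUserApi` is a hypothetical in-memory stand-in for the real service (a real suite would send HTTP requests), and the `scenarios` map, which records how to provoke each error, is an assumption about how the trigger steps might be stored.

```typescript
interface CatalogEntry {
  httpStatus: number;
  message: string;
}

interface ApiResponse {
  status: number;
  body: Record<string, unknown>;
}

const catalog: Record<string, CatalogEntry> = {
  EMAIL_ALREADY_EXISTS: { httpStatus: 409, message: "Email already exists" },
  USER_NOT_FOUND: { httpStatus: 404, message: "User not found" },
};

// Hypothetical in-memory fake of the API under test.
class FakeUserApi {
  private emails = new Set<string>();

  createUser(email: string): ApiResponse {
    if (this.emails.has(email)) {
      return { status: 409, body: { code: "EMAIL_ALREADY_EXISTS", message: "Email already exists" } };
    }
    this.emails.add(email);
    return { status: 201, body: { email } };
  }

  getUser(email: string): ApiResponse {
    if (!this.emails.has(email)) {
      return { status: 404, body: { code: "USER_NOT_FOUND", message: "User not found" } };
    }
    return { status: 200, body: { email } };
  }
}

// How to provoke each error. Keyed by error code (hypothetical registry).
const scenarios: Record<string, (api: FakeUserApi) => ApiResponse> = {
  EMAIL_ALREADY_EXISTS: (api) => {
    api.createUser("a@example.com");
    return api.createUser("a@example.com"); // second attempt must fail
  },
  USER_NOT_FOUND: (api) => api.getUser("missing@example.com"),
};

// Runs one negative test per catalog entry and returns failure descriptions.
function runGeneratedNegativeTests(entries: Record<string, CatalogEntry>): string[] {
  const failures: string[] = [];
  for (const code of Object.keys(entries)) {
    const expected = entries[code];
    const trigger = scenarios[code];
    if (!trigger) {
      failures.push(`${code}: no scenario registered`);
      continue;
    }
    const res = trigger(new FakeUserApi()); // fresh state per test
    if (res.status !== expected.httpStatus) {
      failures.push(`${code}: expected status ${expected.httpStatus}, got ${res.status}`);
    }
    if (res.body.code !== code) failures.push(`${code}: wrong error code in body`);
    if (res.body.message !== expected.message) failures.push(`${code}: wrong message`);
    // Schema check: this sketch expects exactly the fields `code` and `message`.
    if (Object.keys(res.body).sort().join(",") !== "code,message") {
      failures.push(`${code}: unexpected response fields`);
    }
  }
  return failures;
}
```

In this sketch, a catalog entry with no registered scenario is reported as a failure instead of being skipped, so a new error code cannot slip in without someone describing how to trigger it.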
Why This Scales Better
Imagine your API exposes:
- 150 endpoints
- 90 business error codes
Maintaining those manually quickly becomes tedious.
Generation solves two maintenance problems simultaneously:
- Missing tests
- Duplicate effort
Instead of asking developers to remember every negative case, the catalog guarantees baseline coverage.
Engineers can then focus on more complex business workflows rather than repetitive validation tests.
The Shape Assertion That Prevents Silent Error Drift
One lesson we learned very early was this:
Checking only the HTTP status is almost useless.
Imagine an endpoint originally returns:
{
  "code": "USER_NOT_FOUND",
  "message": "User not found",
  "requestId": "abc123"
}
Months later, someone refactors the global exception handler.
The response becomes:
{
  "error": "User not found"
}
The HTTP status is still:
404
Many tests still pass.
But every client expecting the original response contract is now broken.
We call this silent error drift.
Nothing appears wrong until consumers start failing.
The Solution: Shape Assertions
Every negative test also validates the response structure.
Example:
expect(response.body).toEqual({
  code: expect.any(String),
  message: expect.any(String),
  requestId: expect.any(String)
});
Notice that this assertion doesn't pin down specific values.
It validates the shape itself: exactly which fields are present and what type each one has.
That single assertion protects every API consumer from accidental response changes.
Why This Matters
Consumers often depend on:
- Error codes
- Localization keys
- Correlation IDs
- Documentation URLs
Removing any of these fields can become a breaking API change even though the HTTP status remains correct.
Schema validation catches those problems immediately.
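If you aren't using a test framework with built-in matchers, the same check is easy to write by hand. The sketch below is one possible version. It assumes the three-field error envelope (`code`, `message`, `requestId`) used in the example earlier, and the name `findShapeViolations` is hypothetical.

```typescript
// Fields every error body must contain, each as a string (assumed envelope).
const REQUIRED_ERROR_FIELDS: readonly string[] = ["code", "message", "requestId"];

// Returns a list of human-readable violations; an empty list means the shape is intact.
function findShapeViolations(body: unknown): string[] {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return ["body is not a JSON object"];
  }
  const record = body as Record<string, unknown>;
  const violations: string[] = [];

  // Every required field must be present and be a string.
  for (const field of REQUIRED_ERROR_FIELDS) {
    if (typeof record[field] !== "string") {
      violations.push(`missing or non-string field: ${field}`);
    }
  }

  // Unknown fields are flagged too, so additions are a deliberate decision.
  for (const key of Object.keys(record)) {
    if (REQUIRED_ERROR_FIELDS.indexOf(key) === -1) {
      violations.push(`unexpected field: ${key}`);
    }
  }
  return violations;
}
```

Run against the drifted response from the refactoring example, `{ "error": "User not found" }`, this reports four violations: three missing fields and one unexpected one. Whether unknown fields should fail the check is a design choice. Flagging them is stricter, but it forces additive changes through the catalog as well.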
Keeping the Catalog in Sync with the Code (Code Generation)
The obvious concern is maintenance.
If engineers must manually update both:
- Source code
- Error catalog
the catalog eventually becomes outdated.
The solution is code generation.
Most applications already define errors centrally.
For example:
export enum ErrorCode {
  USER_NOT_FOUND = 'USER_NOT_FOUND',
  INVALID_TOKEN = 'INVALID_TOKEN',
  PAYMENT_DECLINED = 'PAYMENT_DECLINED',
  EMAIL_ALREADY_EXISTS = 'EMAIL_ALREADY_EXISTS'
}
A simple generation step can produce:
- API documentation
- OpenAPI components
- Markdown reference tables
- Test inputs
- SDK constants
All from the same source.
Now there's only one place where error definitions live.
Everything else is generated automatically.
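As an illustration of one such generation step, here is a small sketch that renders a Markdown reference table. It assumes the single source is an object keyed by error code that also carries the HTTP status and message, since a bare list of codes would not be enough to fill the table. `renderMarkdownTable` is a hypothetical name, not part of any tool.

```typescript
interface CatalogEntry {
  httpStatus: number;
  message: string;
}

// Renders the catalog as a Markdown table, sorted by code so that
// regenerating the docs produces stable, reviewable diffs.
function renderMarkdownTable(catalog: Record<string, CatalogEntry>): string {
  const header = ["| Code | HTTP status | Message |", "| --- | --- | --- |"];
  const rows = Object.keys(catalog)
    .sort()
    .map((code) => {
      const entry = catalog[code];
      return `| ${code} | ${entry.httpStatus} | ${entry.message} |`;
    });
  return header.concat(rows).join("\n");
}
```

The same loop, with a different template, could emit OpenAPI components or SDK constants. Running it in CI and failing the build when the committed output differs from the generated output is one way to keep the generated files from going stale.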
Benefits of Codegen
This approach creates several advantages:
Documentation Never Falls Behind
As soon as a new error appears in code, documentation updates automatically.
Generated Tests Stay Current
No manual synchronization required.
API Consumers Stay Aligned
Client SDKs can reference the same constants used by the server.
Code Reviews Become Easier
Adding a new business error becomes highly visible because it affects generated documentation and tests.
The Two Error Codes We Deliberately Don't Test (And Why)
Although our negative suite covers nearly every business error, there are two categories we intentionally exclude.
1. Generic Internal Server Errors
Example:
500 Internal Server Error
These represent unexpected failures.
They're not part of normal business behavior.
Rather than intentionally triggering every possible internal exception, we verify:
- Sensitive details aren't exposed
- Generic messages are returned
- Correlation IDs exist
- Logging occurs correctly
Testing every possible server failure adds little value.
Testing the response contract provides much greater return.
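A contract check for the generic 500 response might look like the sketch below. The generic message text and the leak patterns are assumptions that you would tune for your own stack, and `checkInternalErrorContract` is a hypothetical name. Logging is not covered here because it cannot be observed from the response alone.

```typescript
interface ApiResponse {
  status: number;
  body: Record<string, unknown>;
}

// Assumed generic message; replace with whatever your API returns.
const GENERIC_MESSAGE = "Internal server error";

// Hypothetical patterns hinting at leaked internals: stack frames,
// exception class names, SQL fragments, filesystem paths.
const LEAK_PATTERNS: RegExp[] = [
  /\bat .+:\d+:\d+/,
  /Exception/,
  /SELECT\s.+\sFROM/i,
  /\/(usr|home|var)\//,
];

// Returns a list of contract problems; an empty list means the response is safe.
function checkInternalErrorContract(res: ApiResponse): string[] {
  const problems: string[] = [];
  if (res.status !== 500) {
    problems.push(`expected status 500, got ${res.status}`);
  }
  if (res.body.message !== GENERIC_MESSAGE) {
    problems.push("message is not the generic message");
  }
  const requestId = res.body.requestId;
  if (typeof requestId !== "string" || requestId.length === 0) {
    problems.push("missing correlation ID (requestId)");
  }
  // Scan the whole serialized body, not just the message field.
  const serialized = JSON.stringify(res.body);
  if (LEAK_PATTERNS.some((pattern) => pattern.test(serialized))) {
    problems.push("body appears to leak internal details");
  }
  return problems;
}
```

One test that forces a single unexpected failure (for example, through a test-only route that throws) and runs this check covers the contract without enumerating every possible exception.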
2. Infrastructure Failures
Examples include:
- Database unavailable
- Network partition
- DNS outage
- Message broker failure
- Cloud storage unavailable
These failures belong to resilience testing rather than standard API automation.
They are better validated using:
- Chaos engineering
- Fault injection
- Infrastructure testing
- Disaster recovery exercises
Mixing infrastructure scenarios into routine API negative tests usually creates unstable pipelines.
Keeping them separate results in cleaner and more reliable automation.
Additional Benefits We Didn't Expect
Once the catalog became part of our development process, several unexpected improvements appeared.
More Consistent APIs
Every endpoint used the same response format.
Better Frontend Development
Frontend teams no longer guessed which errors could occur.
Simpler Documentation
Error references stayed synchronized automatically.
Cleaner Pull Requests
Adding a new error became an explicit design decision rather than an implementation detail.
Better QA Coverage
Negative scenarios became just as visible as successful ones.
Final Thoughts
Most engineering teams invest heavily in testing successful requests while treating failures as secondary concerns.
Stripe demonstrates a different philosophy.
Errors are documented, standardized, and treated as an integral part of the public API contract.
Building an error-code catalog allowed us to adopt that same mindset.
Instead of manually maintaining dozens of repetitive error response testing scenarios, we generated them from a single source of truth.
Combined with response schema validation and code generation, the approach dramatically reduced maintenance while increasing confidence that every documented failure behaved exactly as expected.
If your API already has a growing collection of business errors, consider creating a centralized catalog before the list becomes unmanageable.
The investment is relatively small, but the payoff in documentation quality, test coverage, and long-term maintainability is substantial.
If you'd like to explore how automated API testing can support this approach, you can spin up a free trial to try this catalog pattern and see how generated negative tests, schema validation, and API contracts work together in practice.