APIs are an unbreakable contract between your service and its customers. While humans can (reluctantly) adjust to drastic UI redesigns, API clients will just stop working with even the slightest change. This is why it's important for API providers and API consumers to agree on how to interact.
Because this contract can't change unilaterally, being precise and strict is extremely important. Undefined behavior is de facto defined behavior, and if you are too lenient with what you accept, you will forever be stuck supporting all the unpredictable ways people use your APIs.
When talking about APIs I often hear people quoting the robustness principle: "be conservative in what you send, be liberal in what you accept". I strongly disagree with this principle, as it leads to fragile, insecure, and inefficient systems.
For example, consider an API endpoint that expects a query parameter called sort, which according to the docs accepts the values asc for ascending and desc for descending.
Here is an example pseudo-code implementation:
function list_items(sort: string) {
    if (sort === "asc") {
        return fetch_sorted_asc(...);
    } else {
        return fetch_sorted_desc(...);
    }
}
This function, as you can see, accepts sort as a string and then checks the value to sort the data accordingly. The problem is that the input is not validated to be one of the two documented values, so in practice Desc, descending, and dscnding will all lead to the results being sorted in descending order. This in turn means that anyone who uses the API incorrectly like this will see it working and will start relying on this behavior.
This means we are now stuck supporting endless variations of desc until the end of time, without even realizing it. It most likely also means we are going to break some implementations by accident when we change this code.
Additionally, it's easy to become more lenient later, but it's impossible to become more strict without breaking someone. So if you start strict you keep your options open; if you start lenient you don't.
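A stricter take on the sort example could look roughly like the sketch below. It is written in TypeScript rather than pseudo-code, and parseSort, SortOrder, and listItems are hypothetical names used only for illustration. The idea is simply that anything other than the two documented values is rejected up front instead of being silently treated as desc.

```typescript
// Hypothetical names throughout; a sketch, not a real implementation.
type SortOrder = "asc" | "desc";

// Accept exactly the two documented values and reject everything else.
function parseSort(raw: string): SortOrder {
  if (raw === "asc" || raw === "desc") {
    return raw;
  }
  throw new Error(
    `invalid sort value ${JSON.stringify(raw)}: expected "asc" or "desc"`
  );
}

// Stand-in for the real data fetch: sorts an in-memory list of numbers.
function listItems(items: number[], sort: string): number[] {
  const order = parseSort(sort); // throws before any data is touched
  const ascending = items.slice().sort((a, b) => a - b);
  return order === "asc" ? ascending : ascending.reverse();
}
```

With this in place, a caller passing Desc or dscnding gets an error on the first request, so nobody can come to depend on the typo.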
Formulating the contract
As I said above, APIs are a contract, which is why it's important to formulate that contract as well as humanly (computerly?) possible. Having an OpenAPI spec for your API therefore goes a long way. Preferably, you should also validate that the actual implementation is compatible with the spec.
You should also be as specific as you can with both the definition and, therefore, the validation of the fields. For example, the Svix user-defined IDs (uids) follow a specific pattern: they have to be between 1 and 256 characters long and match this regex: ^[a-zA-Z0-9\-_.]+$. This is both documented in the spec and validated at runtime.
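To make the uid rule concrete, here is a minimal sketch of such a runtime check. The length bounds and the regex are the ones quoted above; validateUid, UID_PATTERN, and UID_MAX_LENGTH are hypothetical names, and this is not how Svix actually implements it.

```typescript
// Bounds and pattern are taken from the text; all names are hypothetical.
const UID_PATTERN = /^[a-zA-Z0-9\-_.]+$/;
const UID_MAX_LENGTH = 256;

// Returns a human-readable error message, or null if the uid is valid.
function validateUid(uid: string): string | null {
  if (uid.length < 1 || uid.length > UID_MAX_LENGTH) {
    return `uid must be between 1 and ${UID_MAX_LENGTH} characters long, got ${uid.length}`;
  }
  if (!UID_PATTERN.test(uid)) {
    return "uid may only contain letters, digits, '-', '_' and '.'";
  }
  return null;
}
```

Returning a message rather than a boolean keeps the error specific, so the caller learns which part of the rule was violated.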
The advantage of having it documented in the spec is that you can generate rich and specific documentation for your customers. Your customers don't need to guess what values are allowed for a uid; they can see it right there.
Validating requests against the spec at runtime, when API calls are made, means that even if someone makes a mistake and accidentally passes the wrong value, it is caught immediately and they can rectify it.
Another advantage of having an OpenAPI spec is that your API consumers can also generate validation on their end. Their requests are then type checked against the schemas at compile time, so these issues are caught before they ever hit production.
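As a rough illustration of what that buys a consumer, here is a hand-written stand-in for the kind of types a code generator could derive from a spec. SortParam, ListItemsParams, buildListItemsPath, and the /items path are all hypothetical; real generators produce their own shapes.

```typescript
// Hypothetical, hand-written stand-in for generated client types.
type SortParam = "asc" | "desc";

interface ListItemsParams {
  sort: SortParam;
}

// Builds the request path; only the two documented values type check.
function buildListItemsPath(params: ListItemsParams): string {
  return "/items?sort=" + encodeURIComponent(params.sort);
}

const examplePath = buildListItemsPath({ sort: "desc" }); // "/items?sort=desc"

// The call below would be rejected by the compiler, so the typo is caught
// before the code ever runs:
// buildListItemsPath({ sort: "Desc" });
```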
Tagging IDs with their type
Many APIs use uuids internally as their ID representation, which they then expose in their APIs. Those uuids usually look something like this: 12455207-ab83-5602-8d93-67999e204cff. First of all, I think the base62 representation (not a typo, it's 62) is much nicer: 2bd6zcCxa8v3fx1nj9XVlQp3WR3. It's more concise and more aesthetically pleasing, but to each their own.
The problem with using uuids, whether in their standard string representation or the base62 one, is that they are just generic IDs. This means that you can accidentally use an ID in the wrong place (which will usually return a 404) and be very confused as to why it doesn't work.
A much better system is to tag IDs with their type, so the example above, when tagged, would look like this: app_2bd6zcCxa8v3fx1nj9XVlQp3WR3 for an application and msg_2bd6zcCxa8v3fx1nj9XVlQp3WR3 for a message.
Tagged IDs mean that you can have strict validation in your API and great examples in your docs that show exactly the kind of ID that's expected. Passing the wrong ID will no longer result in a 404, but rather in a validation error that explains exactly what's going on.
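Here is a small sketch of what that validation could look like. The app and msg prefixes mirror the examples above; checkTaggedId, IdKind, and the exact error wording are hypothetical.

```typescript
// Hypothetical names, modelled on the app_/msg_ examples above.
type IdKind = "app" | "msg";

// Returns null when the ID carries the expected tag, otherwise an error
// message stating what was expected and what was actually received.
function checkTaggedId(id: string, expected: IdKind): string | null {
  const match = /^([a-z]+)_([0-9A-Za-z]+)$/.exec(id);
  if (match === null) {
    return `expected an ID of the form ${expected}_<id>, got "${id}"`;
  }
  if (match[1] !== expected) {
    return `expected an ID with prefix ${expected}_, got prefix ${match[1]}_`;
  }
  return null;
}
```

Passing a message ID where an application ID belongs now produces "expected an ID with prefix app_, got prefix msg_" instead of an unexplained 404.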
This makes debugging much easier, and it makes support much easier too, because it's very obvious when someone is accidentally passing the wrong IDs.
It's a wrap
This post was cross-posted from the 5x9s newsletter. Subscribe for more content like this sent directly to your inbox.
For more content like this, make sure to follow us on Twitter, GitHub or RSS for the latest updates for the Svix webhook service, or join the discussion on our community Slack.