Recently at our semi-regular architecture discussion group, we've been looking at the Auth0 service: what it is, how it works and when it might be a good idea to use it.
I'd like to focus on the last aspect, using Auth0 as a case study for the more general problem of when to use external services in our microservices mix, and when to roll our own.
On one hand, it's entirely reasonable not to write yet another UserService handling user authentication and authorization: it's been done before, and we all like to reuse code. On the other, using an external service requires us to let go of the otherwise total control we have over our system.
It's a tradeoff. How do we decide which option to use?
Before diving into tradeoffs, let's make a very quick introduction to Auth0, which we'll use as our running example.
Auth0 is a service which manages identities of the system's users, providing both authentication and authorization.
Users can be authenticated in a variety of ways:
- using traditional username/password
- through various social identity providers (Google, Facebook, Twitter, … )
- passwordless, with login links sent by e-mail
- multi-factor (such as push, e-mail or voice notifications, one-time passwords)
For authorization, Auth0 offers role-based access control (again, with roles & permissions stored in Auth0's database, or externally), as well as extension points to provide your own mechanisms (through rules, hooks and extensions).
You have the option to use Auth0-hosted login screens (a component that is very important for the end-user experience), which can be themed so that they match your site; or you can roll your own, and just call Auth0 APIs.
Crucially, Auth0 implements the OAuth2 and OpenID Connect standards. OAuth2 defines a number of flows, to be chosen depending on the type of the client (whether or not the client's execution environment can be trusted), whether the client is an end user or a machine, and what kind of session we want to establish (one-off, temporary or long-running).
Auth0's endpoints behave as specified by OAuth2, so we know what to expect when interacting with them. Which OAuth2 flow to use depends on the concrete use case, such as:
- regular web app, with a server-side component
- single-page-app, with client-side sessions only
- mobile apps
- machine-to-machine (service-to-service)
- single-sign-on (SSO)
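As an illustration, the machine-to-machine case typically uses OAuth2's client credentials flow: the service exchanges its client ID and secret for an access token at the token endpoint. Below is a minimal Python sketch of what such a token request looks like; the domain, client ID, secret and audience are made-up placeholders, and the code only builds the request without sending it:

```python
import urllib.parse
import urllib.request

def build_token_request(domain, client_id, client_secret, audience):
    """Build (but don't send) an OAuth2 client-credentials token request."""
    url = f"https://{domain}/oauth/token"
    payload = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "audience": audience,
    }
    data = urllib.parse.urlencode(payload).encode()
    return urllib.request.Request(url, data=data, method="POST")

# Hypothetical tenant and client; a real call would POST this request
# and read the access token from the JSON response.
req = build_token_request("example.eu.auth0.com", "my-client-id",
                          "my-client-secret", "https://api.example.com")
print(req.full_url)  # https://example.eu.auth0.com/oauth/token
```

The response to such a request is a JSON document containing the access token, its type and its expiry time.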
OAuth2 relies on signed tokens being passed between the interested parties. The tokens that Auth0 issues are JWTs, and hence in an almost human-readable JSON format. That's yet another standard you'll encounter when implementing the security layer of your application.
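Since a JWT is just three base64url-encoded segments (header, payload, signature) joined with dots, you can peek inside one with a few lines of Python. A hedged sketch with invented claims; a real application must verify the token's signature (e.g. with a library such as PyJWT) before trusting any claim:

```python
import base64
import json

def decode_jwt_payload(token):
    """Decode the middle (payload) segment of a JWT.
    No signature check here - in production, always verify first!"""
    payload_b64 = token.split(".")[1]
    # JWTs use base64url without padding; re-add it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a toy token to demonstrate (header.payload.signature)
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=")
payload = base64.urlsafe_b64encode(
    b'{"sub":"auth0|123","scope":"read:users"}').rstrip(b"=")
token = b".".join([header, payload, b"fake-signature"]).decode()

print(decode_jwt_payload(token)["sub"])  # auth0|123
```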
Finally, what are the alternatives to Auth0? It's not the only service implementing such functionality. Other options that you might want to check out if you are surveying the authentication landscape are:
- as-a-service: Okta, Amazon Cognito, OneLogin, Ping Identity, FusionAuth
- self-hosted: Keycloak, Gluu, FreeIPA
Modern systems are typically composed of multiple services of various sizes. Deployments vary, from Kubernetes on one end, through Heroku, to bare metal on the other. However and wherever the services are deployed, they usually communicate using HTTP APIs, or at least offer the possibility to communicate with them this way.
Hence, replacing one of the (micro)services with an external API shouldn't require huge changes in the system architecture. Still, there are a number of factors to consider when deciding "if", and choosing "which", service to bet on.
The first thing to consider is how easy it would be to replace a potential external microservice. Nobody likes vendor lock-in: if we integrate too deeply with an external provider, we're forced to use their services for better or worse, as the cost of switching is too high or would cause too much disruption.
Hence, we need to check whether the APIs that the service exposes conform to open, or at least "emergent" standards — and if these open standards are really used throughout the industry. Does the competition support the same standards as well? That is, what would it take to switch to another vendor?
How does Auth0 score on replaceability? On the plus side, we've got the OAuth2 and OpenID Connect standards, which are not only codified specifications but are indeed used across the industry. Basing your authentication flows on OAuth2 ensures that you'll be able to switch to another identity service vendor, or roll your own: there are a lot of OAuth2-supporting open-source projects (and probably a couple of closed-source ones as well).
However, it's not all roses. Auth0 does extend or adjust the standard slightly. For example, to implement role-based access control, the permissions calculated from a user's roles are returned in the permissions claim of the access token, which is non-standard. While the standard does, of course, allow adding custom claims, in the event of a migration you'd need to replicate that functionality.
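For illustration, here's how service code might gate an operation on that non-standard permissions claim. This is a hypothetical sketch, not Auth0's SDK: it assumes the access token has already been signature-verified and decoded into a dict:

```python
def has_permission(decoded_access_token, required):
    """Check Auth0's non-standard `permissions` claim.
    Assumes the token has already been signature-verified and decoded."""
    return required in decoded_access_token.get("permissions", [])

# Example decoded access token, roughly as Auth0 might return it
# with RBAC enabled (all values invented)
token_claims = {
    "iss": "https://example.eu.auth0.com/",
    "sub": "auth0|123",
    "permissions": ["read:orders", "create:orders"],
}

print(has_permission(token_claims, "read:orders"))    # True
print(has_permission(token_claims, "delete:orders"))  # False
```

When migrating away from Auth0, it's exactly this kind of claim-reading code that would need to be adapted to whatever the replacement emits.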
Another area where you might encounter some vendor lock-in is the SDKs. Auth0 provides a rich set of libraries for a number of programming languages, which perform client- and server-side authentication, parse access tokens, and interact with the Auth0 management API. They are very useful when developing applications; however, you should be aware that using them might tie you to the Auth0 service itself. If you were to create your own user service, you'd have to amend the code to use a different or custom SDK.
Since we are delegating part of the system's data and functionality to an external provider, security should be one of our main concerns. Looking through the company's blog, documentation and regulatory compliance statements (GDPR/CCPA/CalOPPA/…) should provide some initial insight into how seriously a given service takes security.
Further, we might want to investigate previous incidents (if any). A past security incident doesn't by itself disqualify a service: we should check how the incident was handled, whether a post-mortem was published, how transparent the company was about what went wrong, and what conclusions were drawn for the future.
Even the presence of a simple status dashboard gives some insight into how the company communicates with their users.
When services are deployed together, they usually enjoy the benefits of communicating within a local or almost-local network. This changes when you use an external service. If it's being used a lot, what is the additional price per request that you have to pay for using the service?
And the price I'm talking about here is not measured in USD, but in milliseconds — every additional millisecond of latency might directly translate to user happiness (and we end up with USD again). Using an external service might mean an additional network hop or multiple hops — e.g. if the service looks up data in our database, at our datacenter, through an extension point.
Hence, does the external service provider offer endpoints in various regions? Quite often we have US and EU-based servers, but what about other continents — what if we want to deploy the rest of our services in Australia, Asia, Africa or non-US America?
You might also want to check if the external service is deployed on the infrastructure of a specific cloud provider. If it's e.g. AWS, you'll get great connectivity when hosting your services in the same AWS region, but probably worse when using a different region or GCP/Azure.
In the case of Auth0, all of its public infrastructure is hosted on AWS. It offers four regions: US, US-2, EU and AU. It's also worth noting that you can choose a private cloud deployment option (which, of course, is more expensive).
External services focus on making the most common use cases easy to configure and pleasant to use, so as to attract the widest possible audience (it's a business that has to make a profit, after all).
This might mean that, as with many web frameworks in popular programming languages, bootstrapping a project is rapid and the initial iterations are very effective. Even implementing 90% of the required functionality might be a breeze. However, the pain and hacking start when we're left with the remaining, non-standard 10%.
That's where the extensibility of the external service comes into play. What are the available extension points? Can you plug into each step of the lifecycle of a managed data entity, or into each step of a request? What are the constraints on these integrations?
Looking at Auth0, there are rules, which run after a user is authenticated. There are also hooks, which run at pre-defined extension points, some synchronously, some asynchronously. Finally, there are extensions, which allow integration with third-party applications such as logging services (an important aspect as well!).
We've successfully started using Auth0 at SoftwareMill, and the available extension points have been sufficient so far. But more projects and more production experience are needed to verify whether anything important is missing.
In the end, it all boils down to costs. For systems with low or medium traffic, using an external service will almost always be way cheaper than the cost of developing a custom implementation.
It's worth noting, however, that usage costs (which typically scale with the number of users or API requests) are not the only cost associated with using an external service. There are also development costs, which include:
- configuring the service
- integrating the system's codebase with the service
Depending on the quality of the documentation, the UX of the administration panel, and the automation options (see below), this might require a smaller or larger amount of training. The same goes for integration costs: a good SDK which matches the ecosystem (both language and libraries) you are using to develop the rest of the system will save a lot of development time. However, as already mentioned, it might also tighten vendor lock-in.
In the case of Auth0, pricing is something you'll have to calculate for your use case yourself; I'll refrain from giving any advice here. As for the development costs of using the service: OAuth2 and OpenID Connect are both non-trivial standards, and they take time to learn. However, this knowledge is largely vendor-independent. Getting to know Auth0's concepts still takes some time (the distinction between tenants, apps, etc., for example), but given a solid background in open authentication standards, this shouldn't be too challenging.
When initially setting up the service, we'll probably be happy with a nice-looking administration UI, clicking around exploring the various options and discovering the features.
However later, as we dig deeper and start developing our system, integrating with the given external service more seriously, clicking in the UI will stop being fun and become frustrating instead. Moreover, if we are configuring multiple environments, doing this manually will not only be a waste of time, but also very error-prone.
That's why it's crucial that a service exposes all of its features through an API. AWS got this right, and even went a step further: the UI sometimes exposes only part of the functionality that is available when using the API programmatically. And that's the right approach. Having a well-documented configuration API is a must; if there are ready-to-use SDKs in popular programming languages, even better.
But that's only one aspect of automation. Another is infrastructure-as-code tools, such as Terraform, Ansible or Chef. Using these, we can describe the target configuration of the external system, typically stored in one or more files. These files can then be kept under version control, and the configuration can be partially dynamic, depending e.g. on the target environment. A separate, tool-dependent process then applies the configuration to the external service, reconciling the "current" state with the "desired" state.
Auth0 exposes a comprehensive management API, so everything looks good on this front. It also allows connecting a git repository storing configuration, which is automatically applied on push. The configuration files in the repository must follow a pre-defined structure. Finally, there's a Terraform Auth0 provider, which can be used to fully manage Auth0 configuration.
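For example, listing a tenant's users programmatically goes through the Management API's /api/v2/users endpoint, authorized with a management API token. A minimal Python sketch using only the standard library; the tenant domain and token are placeholders, and since actually sending the request needs network access and real credentials, only the request builder is exercised here:

```python
import json
import urllib.request

MGMT_DOMAIN = "example.eu.auth0.com"  # hypothetical tenant domain

def build_list_users_request(mgmt_api_token, page=0, per_page=50):
    """Build a GET /api/v2/users request for the Auth0 Management API."""
    url = (f"https://{MGMT_DOMAIN}/api/v2/users"
           f"?page={page}&per_page={per_page}")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {mgmt_api_token}"})

def list_users(mgmt_api_token):
    """Send the request; requires network access and a real token."""
    req = build_list_users_request(mgmt_api_token)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

req = build_list_users_request("MGMT_TOKEN")
print(req.full_url)  # https://example.eu.auth0.com/api/v2/users?page=0&per_page=50
```

Everything you can click through in the dashboard (users, roles, applications, connections) has a corresponding endpoint of this shape, which is what makes the Terraform provider and the git-based deployment possible.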
Let's not forget about testing! Since we are externalising part of our system, we will no longer be able to run everything on our laptop. But that ship might have sailed long ago, if our system is truly composed of a non-trivial number of microservices. We'll need service stubs for local development anyway.
That said, we'll still need to configure the external system in a couple of copies, to set up production/staging/development environments. As mentioned above, automation is crucial here. The external service should make it easy to quickly and dynamically create a fresh copy.
As we move towards fine-grained single-responsibility microservices, it's increasingly feasible to use an external service for the "standard" parts of each system.
We're commonly doing this with logging, using services such as Loggly or DataDog. We're using managed databases, be it on AWS, Heroku or database-vendor-specific solutions. We're storing binaries on S3. Externalising user authentication and authorization might be a good candidate as well.
There still might be cases where developing your own UserService will be the better option: if you have very non-standard requirements, need to replace an existing API, or face legal/regulatory constraints. However, chances are high that for quite a lot of systems, using an external service will be the most cost-efficient and future-proof solution.