Monolith vs. microservices, monorepo vs. polyrepo: endless discussions have been held, industry trends and deeply rooted personal beliefs have been voiced, and dice have been cast (wouldn't be my first choice) to decide on the approach in various software projects.
In this article we will assume that, for whatever reason known to the leaders of the project, microservices were chosen. We will additionally assume that at least some of these microservices need to store state in a database (SQL or NoSQL), whether they are CRUD services representing some business flow or entity, or for any other reason.
How should the databases serving different microservices be treated? Should the tables of different microservices be created in the same database, with foreign keys connecting them? Should there be a strict separation prohibiting any cross-reference, or even any access from one microservice's code to another one's data?
As always - there are pros and cons to different options and we will try to examine them.
1⃣ Separate database for each microservice
This approach is probably the most widely used pattern for microservice databases. The main benefits of this approach are:
a) Guarantee that there is absolutely no way there could be cross-relations/dependencies between data of different microservices
b) Contain the "blast radius" if one of the microservices becomes compromised or is "stressing" the database.
This approach guarantees the strictest level of separation between the data elements managed by different microservices. It is also the easiest way to scale out microservice storage in case of significant growth (consider offloading certain database connections to completely different database clusters).
An additional benefit of this approach (although not an extremely strong one) is the ease of backup/restore and schema changes (where relevant) for the data of a specific microservice, without any impact on other microservices.
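To make the scale-out point a bit more concrete, here is a minimal sketch of a per-service connection map. Everything in it is an assumption made for illustration: the service names, the host names, the `_svc` role suffix and the `dsn_for` helper do not come from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseTarget:
    host: str      # database cluster that hosts this service's database
    database: str  # database owned exclusively by this service


# Hypothetical hosts and services, for illustration only.
DEFAULT_CLUSTER = "db-main.internal"
SERVICE_DATABASES = {
    "orders": DatabaseTarget(DEFAULT_CLUSTER, "orders"),
    "billing": DatabaseTarget(DEFAULT_CLUSTER, "billing"),
    # A service that outgrew the shared cluster is moved on its own;
    # no other service has to change, because nothing references its data.
    "analytics": DatabaseTarget("db-analytics.internal", "analytics"),
}


def dsn_for(service: str) -> str:
    """Return the connection string a given microservice should use."""
    target = SERVICE_DATABASES[service]
    # Assumed convention: each service connects with its own role.
    return f"postgresql://{service}_svc@{target.host}/{target.database}"
```

With such a map, moving one microservice to a different cluster is a one-line configuration change, which is roughly what makes this approach easy to scale out.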
One of the challenges of this approach is the overhead required to combine each microservice's data with cross-system elements (for example, tenants in a multi-tenant environment). Consider the effort of creating a new tenant (various strategies for multi-tenant databases can be found in my previous blog post, "Strategies for Using PostgreSQL as a Database for Multi-Tenant Services", Leonid Belkind, Jan 19 '20).
The combination of this approach with various multi-tenant approaches creates the following architecture:
On-boarding and off-boarding of new tenants will need to take place in each database separately. Additionally, creation of new microservices will require more databases that will need to be aligned to reflect the existing tenants.
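The on-boarding overhead described above can be sketched roughly as follows. This is only an illustration under stated assumptions: in-memory SQLite databases stand in for the real per-microservice databases, and the service names, the `tenants` table and all helper functions are hypothetical.

```python
import sqlite3

# Hypothetical microservices; each one owns a completely separate database.
SERVICES = ["orders", "billing", "notifications"]

TENANTS_DDL = "CREATE TABLE tenants (id TEXT PRIMARY KEY, name TEXT NOT NULL)"


def open_service_databases(services):
    """Open one isolated in-memory database per microservice."""
    databases = {}
    for name in services:
        conn = sqlite3.connect(":memory:")
        conn.execute(TENANTS_DDL)
        databases[name] = conn
    return databases


def onboard_tenant(databases, tenant_id, tenant_name):
    """On-boarding has to be repeated in every database separately."""
    for conn in databases.values():
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT OR IGNORE INTO tenants (id, name) VALUES (?, ?)",
                (tenant_id, tenant_name),
            )


def offboard_tenant(databases, tenant_id):
    """Off-boarding likewise touches every database."""
    for conn in databases.values():
        with conn:
            conn.execute("DELETE FROM tenants WHERE id = ?", (tenant_id,))


def add_service(databases, name, source_service):
    """A new microservice gets a new database aligned with existing tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(TENANTS_DDL)
    existing = databases[source_service].execute(
        "SELECT id, name FROM tenants"
    ).fetchall()
    with conn:
        conn.executemany("INSERT INTO tenants (id, name) VALUES (?, ?)", existing)
    databases[name] = conn
```

Note that there is no transaction spanning the databases here, so a failure halfway through the loop leaves the tenant present in some databases and missing in others. A real implementation would need retries or a reconciliation job, which is part of the overhead of this approach.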
2⃣ Single database with schemas (a-la PostgreSQL) for different microservices
Somewhat similar to the previous approach, albeit using a single logical database that supports "workspaces" for logical separation between various objects. PostgreSQL schemas or, to an extent, MySQL databases (in MySQL, "schema" is a synonym for "database") are such mechanisms.
This approach is also quite widely used. Its main advantage over the previous approach is the ability to create cross-schema references (foreign keys in SQL databases) where supported, for example in PostgreSQL. This can be leveraged for optimizing operations such as a cascading delete of all data related to cross-microservice entities, such as a user or an organization.
When using such an approach, though, one needs to exercise caution not to create logical references between the data of different microservices that would break their encapsulation. When access control separates the roles allocated to different microservices by schema/workspace, the "blast radius" can be controlled in a way similar to the previous approach of completely separate databases, and the benefit of allocating dedicated schemas/workspaces to cross-microservice data is then achieved almost without any trade-offs.
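As a rough sketch of how schema-per-microservice with dedicated roles could be set up in PostgreSQL, the helpers below only build the SQL statements and do not execute them. The `shared` schema, the `organizations` and `invoices` tables, the `_svc` role suffix and the helper names are all assumptions made for illustration.

```python
from typing import List

SHARED_SCHEMA = "shared"        # hypothetical schema for cross-microservice entities
SHARED_TABLE = "organizations"  # hypothetical cross-microservice table


def _check_identifier(name: str) -> str:
    # Coarse guard against injecting SQL through a name; a sketch, not a full validator.
    if not name.isidentifier():
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def schema_setup_statements(service: str) -> List[str]:
    """Build PostgreSQL statements giving one microservice its own schema and role."""
    service = _check_identifier(service)
    role = f"{service}_svc"
    return [
        f"CREATE ROLE {role} LOGIN",
        # The role owns only its own schema, which limits the "blast radius".
        f"CREATE SCHEMA {service} AUTHORIZATION {role}",
        # Minimal access to the shared schema: enough to reference it, nothing more.
        f"GRANT USAGE ON SCHEMA {SHARED_SCHEMA} TO {role}",
        f"GRANT REFERENCES ON {SHARED_SCHEMA}.{SHARED_TABLE} TO {role}",
    ]


def invoices_table_statement(service: str) -> str:
    """Build a table whose rows disappear when the referenced organization is deleted."""
    service = _check_identifier(service)
    return (
        f"CREATE TABLE {service}.invoices ("
        "id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, "
        f"organization_id BIGINT NOT NULL REFERENCES {SHARED_SCHEMA}.{SHARED_TABLE} (id) "
        "ON DELETE CASCADE, "
        "amount_cents BIGINT NOT NULL)"
    )
```

For a service named `billing`, this produces a `billing` schema owned by a `billing_svc` role, with a single cross-schema foreign key pointing at the shared organization data. Keeping such references limited to a small, deliberately shared schema is one way to get the cascading delete without coupling microservices to each other's tables.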
When layering the multi-tenant challenge on top of such a configuration, the options are more restricted: either additional namespacing/pseudo-namespacing (using object names) can be used or, alternatively, the data of multiple tenants can be interleaved.
3⃣ Single database with different tables for different microservices
With this approach, there is a single "logical" database for configuration/storage of all microservices. Each microservice will have its own tables, with an optional ability to implement references / foreign keys to other tables.
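A minimal sketch of this layout, using an in-memory SQLite database as a stand-in for the single logical database. The `organizations`, `orders` and `invoices` tables, and the idea that they belong to a shared area, an orders service and a billing service respectively, are assumptions made for illustration.

```python
import sqlite3


def create_shared_database():
    """One database; each (hypothetical) microservice owns its own tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Cross-microservice entity
        CREATE TABLE organizations (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- Owned by the orders microservice
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            organization_id INTEGER NOT NULL
                REFERENCES organizations (id) ON DELETE CASCADE,
            item TEXT NOT NULL
        );
        -- Owned by the billing microservice
        CREATE TABLE invoices (
            id INTEGER PRIMARY KEY,
            organization_id INTEGER NOT NULL
                REFERENCES organizations (id) ON DELETE CASCADE,
            amount_cents INTEGER NOT NULL
        );
        """
    )
    return conn


def delete_organization(conn, organization_id):
    """A single delete removes the organization's rows in every microservice's tables."""
    with conn:
        conn.execute("DELETE FROM organizations WHERE id = ?", (organization_id,))


def count_rows(conn, table):
    # Table names here come from the fixed set above, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

The optional foreign keys are what make the cascading delete a single statement here, and they are also exactly the kind of cross-microservice dependency that needs to be kept under control.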
While many purists may consider this approach an anti-pattern for a microservices environment, in fact, when implemented carefully, the only capability that is really more difficult here is the scale-out described in the first approach. Backup/restore of data related only to a specific microservice is a rare requirement but, if needed, it is also more difficult to implement using this approach.
In many databases, one can manage pseudo-namespaces for objects by using strong naming conventions, such as "microserviceX.tableY", and grant access roles accordingly, only for the tables/objects that relate to a specific microservice. When working this way, the real differences between this approach and the previous ones become less evident.
Layering multi-tenant data on top of this approach can be done either by further pseudo-namespacing the objects "microserviceX.tenantY.tableZ" or by interleaving data of different tenants.
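Both multi-tenant options can be sketched on top of such a naming convention. The snippet below is an illustration only, using in-memory SQLite; the `table_name` helper, the service, tenant and table names, and the choice of quoting dotted names as a single identifier are assumptions, not a recommendation for a specific database.

```python
import sqlite3


def table_name(service, table, tenant=None):
    """Build a quoted pseudo-namespaced name such as "billing.acme.invoices"."""
    parts = [service] + ([tenant] if tenant else []) + [table]
    for part in parts:
        # Coarse guard so a name cannot smuggle SQL into the statement.
        if not part.isidentifier():
            raise ValueError(f"not a plain identifier: {part!r}")
    # The dots are part of one quoted identifier, not a real schema separator.
    return '"' + ".".join(parts) + '"'


def create_layout(tenants):
    conn = sqlite3.connect(":memory:")
    # Option A: pseudo-namespacing, one table per tenant for the billing service.
    for tenant in tenants:
        conn.execute(
            f"CREATE TABLE {table_name('billing', 'invoices', tenant)} "
            "(id INTEGER PRIMARY KEY, amount_cents INTEGER NOT NULL)"
        )
    # Option B: interleaving, one table for all tenants with a tenant_id column.
    orders = table_name("orders", "orders")
    conn.execute(
        f"CREATE TABLE {orders} "
        "(id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, item TEXT NOT NULL)"
    )
    conn.execute(f"CREATE INDEX orders_by_tenant ON {orders} (tenant_id)")
    return conn


def orders_for_tenant(conn, tenant_id):
    """With interleaved data, every query must filter by tenant."""
    orders = table_name("orders", "orders")
    rows = conn.execute(
        f"SELECT item FROM {orders} WHERE tenant_id = ? ORDER BY id", (tenant_id,)
    )
    return [item for (item,) in rows]
```

The trade-off is visible even in a sketch this small: per-tenant tables multiply the number of objects to create and migrate, while interleaved tables rely on every query carrying the tenant filter.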
Summary
While the approaches presented above can be the architecture of choice for different use-cases, the most important considerations to have in mind when choosing the most suitable one are:
- Access Control / Blast Radius Control / Microservice Encapsulation
- Scale-Out Considerations
- Overhead in creating new Microservices
- Further multi-tenant considerations
Kudos to @kostyay and @eldadru for the research that led to this article.
Top comments (1)
DB per microservice wouldn't allow us to leverage the power of SQL since it is not possible to use join query across the microservices. The 3rd option seems the best option for my own project. Thanks for sharing your thought! :)