As an engineer, I always have "solutions" which I thoroughly understand the pros and cons of. However, after working long enough, some things come to you intuitively, and you may want to revisit your thought process again. So here is a bit of my self-reflection.
I have a MongoDB database that stores data with the date in UTC. I have a small aggregation pipeline that is used to quickly pull reports for customers, and the time the customers see is in Thai time (GMT+7).
One day, we decided to let customers generate this report themselves from the front-end. The question is, do we still want to format the date in the same place?
My first thought was to move the logic to the API server, but the question is, why did I come up with that decision? So, I tried to list other choices.
Front-end: we can format it for a specific display.
API Server: The front-end sends the time zone to the API server -> the API server fetches the data from the DB, loops through it to format it into Thai time, and then sends the response.
Database: The front-end sends the time zone to the API server -> the API server sends the time zone to the database to format it within the aggregation pipeline.
Formatting the Date on the Front-end
Frameworks like Angular have pipe operators that are easy to use for data transformation just for display without mutating the original data. This is an approach that we would have to use anyway if we didn't control the API server ourselves. Also, it might seem proper if we use a common BFF and our frontends use different date formats.
However, testing on the front-end is quite a bit harder, and if we set the date in a state before rendering, the rendering lifecycle can have an effect, making debugging more difficult.
Furthermore, if there's a bug, a new deployment is required. If the front-end is a mobile app, there might be the additional cost of waiting for Apple/Google to approve the update.
Formatting the Date on the API Server
Formatting at this point has several advantages, such as easier testing. It might require an additional loop to iterate and format the data, but if the data isn't very large or complex, this cost is insignificant.
In return, we get a clear separation of concerns. The front-end is only responsible for displaying the data, and the database is only for providing data. They don't have to concern about how we handle the date. If there is a problem, we know the business logic is right here.
Formatting the Date in the Database
If we choose this approach, we can use the same existing query/pipeline to pull the data. But testing the formatting logic at this level might only be possible by writing a unit test in the API server to verify that the projection pipeline is formatted correctly or by doing smoke/integration tests. We lose the separation of concerns by adding extra logic to the database.
The most important thing that makes this a bad approach, even though it seems, at first glance, to allow us to format the data directly from the query without an extra loop, is that while we might pay a lower performance cost when looking at it in isolation, we pay this cost at the database.
The crucial issue is that the system's bottleneck, especially in microservices, is usually the database, which is much harder to scale than the API server. We can't improve performance at a single point without considering the relationships across the entire system. Otherwise, improving performance in one area might increase the throughput and just shift the bottleneck to the next part.
However, there is one frightening advantage that I've encountered from using the third approach in a different context.
I once worked somewhere a long time ago that used MSSQL as a database. At that company, they favored using stored procedures, which were used to store complex queries. The backend just had to send arguments and the name of the stored procedure to use. The mentioned advantage was that it allowed us to modify the data retrieval logic without a new deployment, which meant no downtime on the production system.
Personally, I think there are many scary things about this. Even if we can test with SSDT, it scatters the business logic. Some parts are on the backend, and many data-related parts are in the database. A change to a stored procedure could also affect the backend tests, unless it's just unit tests. Additionally, versioning is hard to track. You wouldn't know who changed the query and when it started going wrong.
Thinking about it, every decision comes from prioritizing different things, whether it's convenience, availability, maintainability, performance, or integrity. Each has its pros and cons. The problem is that we often don't have a comprehensive enough view to realize the downsides of our choices, which we might not be ready to pay for when the time comes.
If anyone sees more disadvantages to the second option, please feel free to share it with me.
Top comments (0)