In the past, I often noticed a common approach where developers (myself included, of course) used the same API for both reads and writes in every case. Even worse, we frequently relied on the same data source, such as MySQL/PostgreSQL, to handle both operations.
This meant writing to the same columns and reading from them, which often led to struggles with optimizing indexes on heavily queried fields.
For instance, we'd find ourselves frequently tweaking indexes to accommodate new filters or improve query performance, and fields queried with operators like LIKE posed particular challenges because of their performance impact.
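To make the LIKE problem concrete, here is a minimal sketch using an in-memory SQLite database (the table, column, and index names are made up for illustration). Even with an index in place, a leading-wildcard LIKE forces a full table scan:

```python
import sqlite3

# Illustrative schema: a users table with an indexed email column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

def plan(query):
    # Ask SQLite how it would execute the query, as a single string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(str(row) for row in rows)

# An exact match can use the index...
print(plan("SELECT * FROM users WHERE email = 'a@b.c'"))
# ...but a leading-wildcard LIKE scans the whole table,
# no matter how many indexes we add.
print(plan("SELECT * FROM users WHERE email LIKE '%@b.c'"))
```

The same effect shows up in MySQL/PostgreSQL with `EXPLAIN`: substring searches defeat ordinary B-tree indexes, which is exactly why these fields kept demanding special treatment.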
These changes often led to further backend adjustments: modifying APIs to expose the updated functionality, degraded response times because of additional JOINs, and so on...
To address the challenge of adding new filters to the API, there were attempts to optimize the process with tools and standards like Apicalypse and, of course, GraphQL.
These solutions aimed to streamline API query generation and reduce the manual effort required to implement new filters and functionality, offering a more dynamic approach to data access - but they came with a high learning curve.
With the rise of CQRS (Command Query Responsibility Segregation), a new approach began to emerge. This mindset encouraged the use of separate sources for writes and reads: writes could emit events, and reads could build views from those events in dedicated places. Even when reads and writes were managed within the same database (but in different tables), this separation brought significant benefits - and, of course, it got rid of the second challenge, JOINs and search queries on domain models, since read models are commonly stored as denormalized JSON.
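The write/read split above can be sketched in a few lines. This is a toy illustration, not a real CQRS framework - the event names and the employee document shape are invented for the example:

```python
from dataclasses import dataclass

# Write side emits domain events; a projection folds them into a
# denormalized, JSON-like read model that needs no JOINs to query.

@dataclass
class EmployeeHired:
    employee_id: str
    email: str

@dataclass
class CreditsGranted:
    employee_id: str
    amount: int

event_log = []    # write side: append-only event store
read_models = {}  # read side: denormalized documents, keyed by id

def handle(event):
    event_log.append(event)
    project(event)

def project(event):
    # Keep the read model query-ready at all times.
    if isinstance(event, EmployeeHired):
        read_models[event.employee_id] = {"email": event.email, "credits": 0}
    elif isinstance(event, CreditsGranted):
        read_models[event.employee_id]["credits"] += event.amount

handle(EmployeeHired("e1", "jane@example.com"))
handle(CreditsGranted("e1", 50))
print(read_models["e1"])  # {'email': 'jane@example.com', 'credits': 50}
```

In a real system the projection would typically run asynchronously and write into a separate table or search index, which is also where eventual consistency (discussed in the cons below) comes from.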
However, this raised another problem: to serve reads, we still had to scale the write-side application along with them - the only reason we had to scale our application instances from X to Y was read traffic. This could be partially mitigated with caching, and in the world of microservices we could have dedicated microservices for reads.
But...
Still, this was not an ideal solution for other architectural styles like modular monoliths, where such separation might not align well with the system's design philosophy. Another issue: when the API was down, the whole product was down - and keeping in mind that most products rely on far more reads than writes, this could needlessly impact the business (apart from the down API itself, of course ;) )
So, what if we could query those "views," also known as read models, directly, without involving the API in handling the load? This is where solutions like Meilisearch, AppSearch and others come into play, leveraging a pattern called the "Valet Key." With this pattern, frontends can access read-optimized models directly, reducing the dependency on backend APIs. Of course, the frontend still has to "ask" the API for a "Valet Key," but it can cache keys, so even when the API is down, the frontend can still fetch and display content.
With this approach, we can focus on the read database and not worry about handling the traffic for reads in our API. The "Valet Key" provided to the frontend via our API is secured in a way that the frontend cannot alter it. It includes predefined filters and indexes.
If the frontend requires additional capabilities, it can request them through the API, which can validate whether to allow them. That's still far fewer calls overall.
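A hedged sketch of how such a tamper-proof key can work: the API signs the predefined filters with a server-side secret, so the frontend can hand the key to the read store but cannot alter it. (Real services like Meilisearch generate scoped keys for you; the secret, payload, and function names below are made up for illustration.)

```python
import hashlib
import hmac
import json

API_SECRET = b"server-side-secret"  # never shipped to the frontend

def issue_valet_key(filters):
    # The API embeds predefined filters and signs them.
    payload = json.dumps(filters, sort_keys=True)
    signature = hmac.new(API_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"filters": payload, "signature": signature}

def verify_valet_key(key):
    # The read store recomputes the signature before honoring the filters.
    expected = hmac.new(API_SECRET, key["filters"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, key["signature"])

key = issue_valet_key({"tenant": "acme", "status": "published"})
print(verify_valet_key(key))  # True: an untouched key is accepted

# If the frontend tampers with the filters, verification fails.
tampered = dict(key, filters=key["filters"].replace("acme", "other"))
print(verify_valet_key(tampered))  # False
```

A production setup would also put an expiry timestamp inside the signed payload, which is what makes key invalidation (mentioned in the pros below) possible.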
Some pros I can see are:
- Reduced API Load: Offloads the read traffic from the API, allowing it to focus on core operations.
- Scalability: Read databases or search services are better optimized to handle high traffic, reducing the need to scale the application backend.
- Flexibility: SaaS or self-hosted options allow teams to choose the best fit for their infrastructure.
- Security: Predefined filters and indexes ensure the frontend can only access allowed data, minimizing risks. Keys can be invalidated by the API.
- Developer Efficiency: Reduces the need for constant API updates for new filters or search capabilities.
- Improved Performance: Direct access to read-optimized models provides faster query responses for users.
But there are always cons:
- Eventual Consistency: Data may appear after some time due to the nature of eventual consistency in read models.
- Additional Maintenance: Introduces an extra component that requires monitoring and management.
- Schema Complexity: Schemas must be stored in code or a common place, as different teams from different contexts may need to populate the same document (e.g., employee with email, but also with available credits and coupons). While not directly tied to this pattern, it adds complexity.
- Cost: SaaS subscription fees, or the maintenance burden of self-hosting.
So, this approach is not a silver bullet and introduces its own set of challenges. But if you're okay with the cons, a small change on the frontend likely won't require involving the backend team, streamlining the development process and improving overall agility - and scaling should be easier too.