Kovid Rathee

Posted on Sep 30

Optimize Read Performance in Supabase with Postgres Materialized Views

#postgres #database #performance #supabase

A Postgres database typically consists of schemas that serve single-table CRUD (create, read, update, delete) operations. While CRUD operations handle basic transactional needs, applications also require more complex queries. These often involve multiple joins, nested CTEs, and intricate predicates for read-heavy workloads like embedded reporting and analytics. Such multitable queries can degrade overall database performance and block CRUD operations through locking. Postgres materialized views provide a solution by separating read-heavy workloads from transactional processing.

Materialized views are just like standard views but with one key difference—materialized views are precomputed and stored on disk. They save the query results in separate database objects, enabling you to run queries on top of a static snapshot instead of live underlying tables. The precomputation can help you so things like serving reports and enabling filtered data exports from multiple tables, where the queries can get quite complex.

This tutorial explains how I improved my application's read performance using Postgres materialized views in Supabase.

Understanding Materialized Views in Postgres

Unlike a standard view, a materialized view has an attached storage component; this means that materialized views can be precomputed and stored while a standard view cannot. Under the hood, both types of views follow Postgres's rule system. A table is designed to support all kinds of CRUD operations. In contrast, a materialized view is intended to be refreshed by a prewritten SQL query attached to its definition.

A materialized view is a point-in-time snapshot of a query result. This snapshot unlocks a wide range of benefits around query performance, with instant query execution as the results are precomputed. It also decouples the typical CRUD-type operations from the read-heavy operations using multiple tables and complex queries. It's also highly beneficial as it minimizes long-running lock contentions that can occur due to read-heavy operations on live tables.

Before using materialized views, you should consider that their data storage component will impact the database storage cost. Materialized views also don't provide the most up-to-date data. They're most suitable for use cases that can function with precomputed data. The article covers some other overheads associated with materialized views later, but first, let's look at why I chose to use materialized views to improve my query performance in Supabase.

Why Use Materialized Views

Supabase uses PostgREST instead of manual SQL-based CRUD programming and traditional ORMs (object-relational mappers). PostgREST converts your entire database into a RESTful API—the Supabase API—that allows you to interact with your database through standard API endpoints.
Read-heavy queries slow API response times because they compute results on the fly, requiring the API to wait for database processing. Materialized views address this issue by pre-computing expensive joins and aggregations, eliminating the need for real-time calculations in performance-critical applications.
Materialized views also have some security implications. You can create RLS (Row Level Security) policies on materialized views just as you would on tables. These policies translate directly into API endpoint security policies, providing granular control over your data at both the table and materialized view levels.

Creating and Using Materialized Views in Supabase

Let's look at how to create and use materialized views in Supabase using an example online storefront for an e-commerce business with the following database tables: users, products, orders, and order_payments.

Consider this scenario: The storefront owner logs into a portal that displays a month-on-month performance report. The summary should only include successful orders (orders that have been paid for) and exclude orders where a refund has been processed. Running this query every time the storefront owner logs in would be slow as it would involve filtering, aggregating, and joining data from at least three out of the five tables mentioned above. The query would look something like the following:

WITH successful_orders AS (
  SELECT o.user_id, 
         o.order_id,
         o.original_amount_usd, 
         o.discount,
         o.final_amount_usd, 
         op.payment_amount_usd,
         o.order_date,
         TO_CHAR(order_date, 'YYYY-MM') order_month
    FROM orders o
    INNER JOIN order_payments op ON o.order_id = op.order_id
   WHERE o.order_status = 'SUCCESSFUL'
     AND op.order_payment_status = 'SUCCESSFUL'
     AND ROUND(o.final_amount_usd,2) = ROUND(op.payment_amount_usd,2)
     AND op.refund_processed = FALSE
)

SELECT 
    order_month,
    COUNT(DISTINCT user_id) unq_customers,
    COUNT(order_id) unq_orders,
    SUM(original_amount_usd) total_original_amount,
    SUM(payment_amount_usd) total_payment_amount
FROM successful_orders
GROUP BY order_month
ORDER BY order_month DESC

Running this query on demand could be quite expensive and frustrating for the storefront owner if they log in to check the status several times daily. In this situation, having a materialized view based on this query that materializes every few hours will be quicker and cheaper as the materialized view's performance will be similar to that of a regular table. Here's how you can create a materialized view:

CREATE MATERIALIZED VIEW mv_successful_orders AS ... ;

Once this materialized view is created, it will hold the results of the underlying query until the next time it's refreshed. You can access a materialized view using the same SELECT statement as you do for a standard view or a table. Deleting a materialized view is similar to deleting any other database object, like a table or a standard view, for which you can use the DROP statement:

DROP MATERIALIZED VIEW mv_successful_orders;

Creating and dropping a materialized view works in a similar way to creating and dropping tables and regular views. If you want to reload data into a static table, you would need to TRUNCATE and INSERT the new records, or you can DROP and CREATE the table again. Regular views don't have this requirement as they don't store data on disk. Materialized views, on the other hand, need to be refreshed to provide a fresher copy of the data from the underlying query.

Refreshing Materialized Views in Supabase

I'll now explain how to keep the data fresh and up-to-date based on your application's needs in the materialized view.

Running the underlying query (for example, the one defined using a CTE (common table expression) in the previous section) brings the up-to-date data into the materialized view. This process is called a refresh. Refreshing a materialized view is simple:

REFRESH MATERIALIZED VIEW mv_successful_orders;

There are many ways to refresh materialized views. You can run an on-demand refresh from the command line, refresh based on a database trigger, or refresh based on logic built into the application code. Each of these methods might be suitable for separate roles, such as data engineers, frontend engineers, and so on.

Supabase Edge Functions

You can call Supabase Edge Functions directly from your application, and you can conditionally refresh the materialized view based on the conditions specified in the function code, preventing unnecessary refreshes. Here's a code snippet showing how to call the previously defined mv_successful_orders using a Supabase Edge Function:

  const response = await fetch(`${supabaseUrl}/functions/v1/refresh-materialized-view`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${supabaseAnonKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      mv_name: 'mv_successful_orders',
      use_concurrent: true
    })
  })

  const result = await response.json()

Manual Refreshes Using SQL Statements

You can run SQL statements directly or wrapped in Postgres functions using the Supabase dashboard or your client application using the Supabase SDK via RPCs (remote procedure calls). An RPC call would look something like the following:

  const result = await supabase.rpc('refresh_mv_successful_orders')

Automatic Refreshes Using Postgres Triggers

You can also trigger materialized view refreshes using database changes using Supabase Realtime Broadcast or Supabase Realtime Postgres changes. This method is more configured on the backend rather than the frontend as it is based on the database change log and database triggers.

Defining Your Refresh Strategy

No matter what method you use, you'll need a smart refresh strategy to benefit from the performance and cost gains of using materialized views. Running refreshes frequently can cause the same performance issues you might encounter with regular views or on-the-fly SQL queries. That's why materialized views are most effective when you strike the right balance between data freshness and query performance. Using materialized views can worsen the performance if you don't have the right refresh strategy. To decide on the correct strategy, you need to answer the following:

Data freshness: How frequently the application needs the data to be refreshed is a key factor that helps you decide the strategy for the data freshness requirement. For example, if a report in the application is seen by application users typically every hour, it would make sense to refresh the materialized view once an hour.
Cost and maintenance: You must understand the cost of running frequent refreshes, including query execution, index recreation, and potential blocking of CRUD operations during refreshes. This impact grows with large data sets. To reduce cost and contention, limit the data in the materialized view, such as by filtering by date ranges.
Time of refresh: It helps to refresh materialized views when the database usage is low in terms of compute and memory or when there are less critical workloads running on the database. This way, even if the refresh takes up a lot of resources, it doesn't impact other important database operations.

As mentioned earlier, the best way to determine the right strategy is to strike a balance among the three factors mentioned above. All in all, you want to come up with a strategy that fulfills the data freshness needs, doesn't refresh more frequently than required, and times the refreshes intelligently based on existing database workloads.

Best Practices for Materialized Views: When and How to Use Them

Materialized views can be a powerful tool for optimizing database performance, but they're not suitable for every scenario. Understanding when and how to use them effectively is crucial for maintaining both performance and system efficiency.

For instance, if your application needs fresh data all the time or if the resulting data sets are huge, materialized views aren't typically a suitable option. However, materialized views would be a good option for an application that requires a lightweight reporting and analytics layer, but where creating a dedicated data warehouse or data lake would involve excessive cost and maintenance overhead. They can serve predefined reports, dashboards, and data extracts without overburdening the database and the application with repetitive and unnecessary queries.

Use materialized views for the following cases:

Use materialized views for complex queries that deal with multiple tables, especially with large volumes of data.
Use materialized views for underlying tables that have a high read-to-write ratio. It means that the data doesn't change often; hence, the materialized view doesn't need to be refreshed frequently.
Use materialized views when your queries have complex logic with filters, aggregates, and window functions as these are operations that can slow down and potentially block crucial CRUD operations of your application.
Use materialized views for queries that are run frequently. The maximum positive impact of using materialized views occurs when your application uses the precomputed data in the materialized view frequently as it saves computation cost every time the query is executed.

Materialized views shouldn't be used for the following:

Do not use materialized views if you need the data to be refreshed frequently as, performance-wise, the experience will somewhat be equivalent to running live complex queries on tables, resulting in little or no benefits regarding cost or performance.
Do not use materialized views when the underlying data is too small and the queries on the tables or standard views are fast enough for your application's performance needs. Precomputation only helps if the queries are slowing down your application's or the database's performance.
Do not overuse materialized views. Materialized views come with their own complexities around index maintenance and vacuuming, among other things. Overusing materialized views might end up increasing the maintenance overhead while giving you some performance benefits.

If you're using materialized views, you should also follow some best practices to optimize them:

Use indexes to enable faster point lookups and aggregates. Just like Postgres tables, materialized views also support various types of indexes, which can significantly speed up point lookups and aggregate queries.
Implement a refresh strategy that doesn't burden the database with lock contentions—that is, don't refresh multiple heavy materialized views that use the same underlying tables or standard views underneath.

To reiterate, the decision to use materialized views, like many other database features such as triggers, user-defined functions, and stored procedures, depends on your specific business use case, cost considerations, and maintenance overhead tolerance. You need to find the right balance to determine if materialized views make sense for your particular scenarios.

Conclusion

This article covered some of the key benefits of materialized views, how to create, delete, and refresh materialized views in Supabase, and some of the best practices and recommendations for using them.

With materialized views, you can offload some of the read-heavy and compute-heavy queries to precompute the results that your application needs. You can schedule or trigger the materialized view refreshes at a time when the database utilization isn't high, essentially freeing up resources for the application to use and also turbocharging the performance with faster, table-like reads, which is especially beneficial for reporting, analytics, and other OLAP (online analytical processing) type of use cases.

DEV Community