View vs. Materialized View: What are the differences between these two, and let's try creating them on Redshift and Athena.
What is a View Table?
Normally, when querying data from a database or warehouse, we retrieve it directly from the table, right? However, if we need to repeatedly use the same query or if we need to use the query we wrote in other places, such as in the backend, ETL, or other ELT processes, we have to include the SQL we wrote. Now, imagine that the SQL we wrote is over 100 lines long. What happens? It's messy! Managing the script becomes difficult.
For example, as seen in the picture, this SQL is very long and complex. Copying it to different places or trying to write more code based on it becomes difficult and has a high chance of errors. This means that after copying, the syntax might get distorted.
Therefore, something called a View table was created to reduce the complexity of querying. As a result, the SQL we saw above will be reduced to what's shown in the example below.
-- Example usecase, which is not really a good usecase LOL
SELECT * FROM company_table
WHERE member_id IN (SELECT employee_id FROM view_table)
-- If we not using view table
SELECT * FROM company_table
WHERE member_id IN (
SELECT employee_id
FROM abc as a
LEFT JOIN efg as e
ON a.id = e.id
.
.
.
.
WHERE id IS NOT NULL);
It will be similar to querying a regular table, except that the data returned is the result of the query underlying that View table.
However! Modern data warehouses have both View Tables and Materialized View Tables. What are the differences between these two? The next section will explain each type of View.
View Table
View Table is a virtual representation of a table. In other words, when we create a View table, the physical data is not stored in the database. Instead, when we query it, the database executes the query that was used to create the View table. Therefore, no matter how many View tables we create, the storage space does not increase. Let's see how to create a View table.
-- syntax
CREATE [ OR REPLACE ] VIEW name [ ( column_name [, ...] ) ] AS query
[ WITH NO SCHEMA BINDING ] -- [] optional
-- example use
CREATE VIEW vw_myevents
AS
SELECT id FROM mockl;
After creating a View table and refreshing, we will find that the View table is located in the View section of both Redshift and Athena. Both services use similar syntax, which is CREATE VIEW.
In Redshift, we can see the query underlying the View by right-clicking and selecting 'Show view definitions'. We can also edit it directly from there.
In Athena, you can view and edit the query by right-clicking and selecting 'Show/edit Query'. You can also make additional edits directly from there.
Materialized View
Materialized View is a View that is similar to a regular View table, but with an added feature: Materialized View takes the data resulting from the query and creates a physical table that is stored in the database.
Querying data is faster with Materialized View compared to a regular View table. For example, if we have a join between table A and table B, with a regular View table, every time we query the view, the system has to process the join between table A and table B again. However, with Materialized View, the join processing is already done because the joined data is pre-stored.
Let's try creating a Materialized View in Redshift. (Athena cannot create Materialized Views because Athena is a serverless query engine and does not store any data. Therefore, it cannot create them. The workaround is to create a table from the query and store it in S3.🤓)
-- syntax
CREATE MATERIALIZED VIEW mv_name
[ BACKUP { YES | NO } ]
[ table_attributes ]
[ AUTO REFRESH { YES | NO } ]
AS query
-- example use
CREATE MATERIALIZED VIEW "public"."mvw_myevent_demo"
AUTO REFRESH YES
AS
SELECT
id, first_name, last_name
FROM
"public"."mockl"
After refreshing the Editor page, you'll find that the Materialized View is now in the View section. As mentioned, Materialized View stores data in the database. Therefore, when you query SVV_TABLE_INFO (a table that stores information about system tables and user-defined tables), you'll see only the Materialized View, not the View table, and it also shows the data size.
The limitations of Materialized View are quite significant. Here are some examples of what cannot be used to create Materialized Views:
- Standard views or system tables and views
- Temporary tables
- User-defined functions
- ORDER BY, LIMIT, or OFFSET clauses
Pros and Cons of View and Materialized View
View Pros:
- Simplifies queries, reducing lengthy queries to just one line.
- Provides up-to-date data every time it's queried, reflecting the underlying data in the View (e.g., if there's no WHERE clause, you get the latest data).
View Cons:
- Modifying a View requires deleting and recreating it.
- Changes in the source tables of the View's query can break the View.
- While it provides fresh data, it cannot be indexed or optimized directly;
- optimization must be done on the underlying tables.
Materialized View Pros:
- Faster query performance because it retrieves data from the pre-stored Materialized View, not the source tables.
- Reduces ETL/ELT steps and simplifies modifications compared to creating new tables via ETL/ELT.
- Redshift offers auto-refresh, updating data from source tables, eliminating the need for separate data update pipelines.
Materialized View Cons:
- Increases database storage consumption, similar to creating a new table.
- Not suitable for real-time data with constant updates; better for batch processing (even with manual refresh).
- Cannot be created on Athena.
Regarding the selection of which to use, I have these general guidelines:
View:
- If you want to avoid increasing storage costs.
- If you need to use it with real-time data.
- If the query is simple and not complex.
Materialized View:
- If you want to improve query performance for dashboard displays, especially when pre-processing is required.
- To reduce processing overhead from complex queries, as Materialized View pre-processes and stores data as a physical table.
Hope this help!
originated from my article in Thai: https://clouddatalabor.com/2025/03/24/view-vs-materialized-view-a-beginners-guide-with-aws-athena-redshift/
Top comments (0)