Written by Sarah Chima Atuonwu ✏️
PostgreSQL v14 was recently released with some major improvements. Some of these improvements include enhancements for heavy workload performance, improved support for distributed workload, and security enhancements.
One of the most exciting releases is enhanced support for JSON, which will be the focus of this blog post.
Before we delve into these improvements, let’s get a good understanding of how JSON has been used in PostgreSQL. This will help us to appreciate the improvements better.
A brief history of JSON in PostgreSQL
JSON was first introduced in Postgres with its v9.2 release. While it was a very exciting development, its implementation was not perfect. Postgres basically validated that the JSON to be stored is valid JSON and stored it as a text string.
A major improvement came with the JSONB type, which was released in v9.4. This is often referred to as the “better JSON” for good reasons. JSONB is stored in a decomposed binary format, which adds a little conversion overhead while storing it but is very efficient at manipulating and querying JSON. JSONB also supports the indexing of the data. Most people prefer to use JSONB instead of JSON in Postgres.
In Postgres v12, JSONPath was added to improve the efficiency of query JSON data.
That brings us to the present. Now let’s consider the improvements to JSON that came with v14.
JSON conveniences with v14
Postgres v14 allows you to access and manipulate JSON data in the conventional way. Let us use some examples to explain this.
Assume that we have a table for blog posts with a data column stored in JSON. In this post, we will use the JSONB type for all the examples:
CREATE TABLE blogs (
id serial,
data JSONB
)
We also insert some test values into it:
INSERT INTO
blogs (data)
VALUES
('{"title": "blog one", "author": {"first_name": "Ada", "last_name": "Love"}}' ),
('{"title": "blog two", "author": {"first_name": "Star", "last_name": "Work"}}' );
This will result in the following table:
SELECT *
FROM blogs;
id | data
----+------------------------------------------------------------------------------
1 | {"title": "blog one", "author": {"last_name": "Love", "first_name": "Ada"}}
2 | {"title": "blog two", "author": {"last_name": "Work", "first_name": "Star"}}
(2 rows)
Let’s see the v14 improvements.
Accessing JSON data using subscripts
In Postgres 13 and earlier, if you wanted to find the title of all blogs where the author’s first name was “Ada,” you would do the following:
SELECT data -> 'title' as title
FROM blogs
WHERE data -> 'author' ->> 'first_name' = 'Ada' ;
title
------------
"blog one"
(1 row)
Notice the operators we used to get this data:
-
->
is used to get the JSON array element by key indexed from zero or the JSON object field by key -
->>
is used to get the JSON array element or JSON object field as text
While this works, remembering this syntax is not the easiest. This is because the syntax is different from the conventional way of accessing JSON data. What if we could access stored JSON data in Postgres using subscripts like we are used to? This is what Postgres v14 brings to us.
Let’s try to refetch the data we got above, but this time the Postgres v14 way, using subscripts:
SELECT data['title'] as title
FROM blogs
WHERE data['author']['first_name'] = '"Ada"';
title
------------
"blog one"
(1 row)
Note that when doing a comparison with subscripting, you have to use a JSON string.
Updating JSON with subscripting
Updating JSON data stored in Postgres is also easier with subscripting. To update JSON in v13 and earlier, we needed to use the jsonb_set
function with the following signature:
jsonb_set (target jsonb, path text[], new_value jsonb [, create_if_missing boolean ])
In this code:
-
target
is the JSONB column to update -
path
indicates which JSON key you want to update -
new_value
is the new value of the item to be updated -
create_if_missing
is an option parameter that specifies if the key/value should be created if the key specified by the path does not exist
Now, let us use this function to update the data column in the example above. For example, if we want to update the last name of the author of the blog with id 1
, we do this:
UPDATE blogs
SET data = jsonb_set(data, '{author, last_name}', '"Sarah"', false)
WHERE id = 1;
This will result in:
SELECT * FROM blogs;
id | data
----+------------------------------------------------------------------------------
2 | {"title": "blog two", "author": {"last_name": "Work", "first_name": "Star"}}
1 | {"title": "blog one", "author": {"last_name": "Sarah", "first_name": "Ada"}}
(2 rows)
With Postgres v14, we do not need to use the jsonb_set
function to update JSONB data. We can do this instead:
UPDATE blogs SET data['author']['first_name'] = '"Sarah"' WHERE id = 2;
//id is different in this case it updates a different row
This will result in:
select * from blogs;
id | data
----+---------------------------------------------------------------------------------
1 | {"title": "blog one", "author": {"last_name": "Sarah", "first_name": "Ada"}}
2 | {"title": "blog two", "author": {"last_name": "Work", "first_name": "Sarah"}}
(2 rows)
The second row is updated.
Important things to note while updating JSON using subscripting
Using JSON assignment via subscripting handles some edges cases differently than jsonb_set
. Let’s consider some of them:
- If the value of the JSON key that is being updated is
null
, assignment via subscripting will act as if the value of the key is an empty object or array
So in our example above, if we try to update a row with tags, which does not exist on any of the rows like below:
UPDATE blogs SET
data['tags'] =' ["postgresql"] '
WHERE id = 1;
We get this result:
SELECT *
FROM blogs
WHERE id = 1;
id | data ----+-----------------------------------------------------------------------------------------------------
1 | {"tags": ["postgresql"], "title": "blog one", "author": {"last_name": "Sarah", "first_name": "Ada"}}
(1 row)
The tags
is always added to the row. There’s no option to prevent it from adding a nonexistent column like the jsonb_set create_optional
parameter.
- If an index is specified for an array, and the array contains too few elements,
null
is appended until the index is reached
So if we try to update the tags field we added in the previous example with an index that is more the current length of the array like this:
UPDATE blogs SET
data['tags'][4] =' "javascript" ' WHERE
id = 1;
We get this result:
SELECT *
FROM blogs
WHERE id = 1;
id | data
----+-------------------------------------------------------------------------------------------------------------------------------------
1 | {"tags": ["postgresql", null, null, null, "javascript"], "title": "blog one", "author": {"last_name": "Love", "first_name": "Ada"}}
(1 row)
Notice that null
is added until the specified index is reached.
- If a JSON value is assigned to a nonexistent subscript path, and the last existing element to be transversed is an object or array, the nested array or object will be created. However, like in the example above,
null
will be added until the index indicated is reached and the created object or array is placed
So in our example, if we do the following:
UPDATE blogs
SET data['otherdata'][3]['address'] =' "New York" '
WHERE id = 2;
We get the following result:
id | data ----+--------------------------------------------------------------------------------------------------------------------------------------
2 | {"title": "blog two", "author": {"last_name": "Work", "first_name": "Star"}, "otherdata": [null, null, null, {"address": "New York"}]}
(1 row)
You can see that the object is created. However, null
is appended until the index is reached.
Conclusion
It is really exciting to see how JSON support has tremendously improved in Postgres over the years. In this article, we have seen how JSON subscripting, which was added in Postgres v14, has made it easy to both access and update JSON. We have also considered some points to note while assigning values to JSON while using subscripting.
Does this mean that subscripting is fully replacing the old ways of accessing JSON data? Not necessarily. For instance, we saw that with updating JSON values, there’s no way of preventing the creating of the column if it does not exist. The jsonb_set
function gives us that option. Also, indexes are not always leveraged while accessing your JSON data with subscripting. However, subscripting is fine to use for noncomplex queries.
Who knows what improvements the newer versions of Postgres will bring? Better index leverage while using subscripts? Accessing JSON via dot notation? Only the future can tell.
Check out the official docs on JSON subscripting for more information.
You can also find out about other releases in Postgres v14 here:
LogRocket: Full visibility into your web apps
LogRocket is a frontend application monitoring solution that lets you replay problems as if they happened in your own browser. Instead of guessing why errors happen, or asking users for screenshots and log dumps, LogRocket lets you replay the session to quickly understand what went wrong. It works perfectly with any app, regardless of framework, and has plugins to log additional context from Redux, Vuex, and @ngrx/store.
In addition to logging Redux actions and state, LogRocket records console logs, JavaScript errors, stacktraces, network requests/responses with headers + bodies, browser metadata, and custom logs. It also instruments the DOM to record the HTML and CSS on the page, recreating pixel-perfect videos of even the most complex single-page apps.
Top comments (2)
Great post but the code in update json set is a lot of scrolling can you fix it please
Good post - thanks. Some of your SQL is mis-formatted.