Before we move on to manipulating array values stored in the database, it is useful to first learn the basics.
Representing a list
SELECT '{a,b,c}'::text[];
text
----------
{a,b,c}
(1 row)
We have to specify the type. Else postgres would not know the type since it looks like a string.
Or we could also use the ARRAY[]
style
# SELECT ARRAY['a','b','c'];
array
-------
{a,b,c}
SELECT
ing FROM
an array
So why doesn’t the following work?
# SELECT * FROM ARRAY['a','b','c']
ERROR: 42601: syntax error at or near "ARRAY"
LINE 1: SELECT * FROM ARRAY['a','b','c'];
^
LOCATION: scanner_yyerror, scan.l:1134
SELECT FROM
expects that the data source is:
- a set of rows
- OR points to a set of word (table)
ARRAY[]
returns an atomic value. Which is why it works with just SELECT
but fails when FROM
is used.
Convert arrays into rows using UNNEST
# SELECT UNNEST(ARRAY['a','b']);
unnest
--------
a
b
(2 rows)
Convert rows into arrays using ARRAY()
Watch out. We are not using ARRAY[]
. We are using array()
.
SELECT array(
SELECT UNNEST(ARRAY['a','b'])
);
array
-------
{a,b}
(1 row)
To summarize:
UNNEST
converts array into a series of rows.array()
converts a bunch of rows into an array.ARRAY[]
used to represent/define an array.
Concatenate arrays using array_cat()
# SELECT array_cat('{a,b}'::text[], '{b,c}'::text[]);
array_cat
-----------
{a,b,b,c}
(1 row)
We coerced the argument into text arrays to indicate to array_cat
what data types it was dealing with. If we used the ARRAY[]
initializer, this would not be required. Example below.
SELECT array_cat(ARRAY['a','b'], ARRAY['b', 'c']);
Notice that the final array contains b
twice. We would love to remove duplicates. There are two ways to do it and we'll explore both below.
Removing duplicates from an array using DISTINCT
This method works with integers, text, float, etc
DISTINCT
helps remove duplicates from all data types. But it requires rows as input. Not arrays.
- So first convert an array to rows using
UNNEST
- Pass it to DISTINCT
- And then if you still require an array as output, use
array()
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
);
unnest
--------
banana
cat
apple
(3 rows)
SELECT DISTINCT UNNEST(
'{1.245, 1.3, 1.5, 1.2}'::float[]
) ORDER BY 1;
And we have columns. But we need an array back as output. So convert using array()
SELECT array(
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
)
);
array
--------------------
{banana,cat,apple}
(1 row)
What if we want it sorted? Use ORDER BY 1
SELECT array(
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
) ORDER BY 1
);
ORDER BY 1
would sort by the first column (We could starting with1
here). In this case we are working with on-the-fly arrays, without giving them column names. So we use column number references.
Want to try this with a float array? Let's do it.
SELECT array(
SELECT DISTINCT UNNEST(
'{1.2, 1.245, 2.3, 1.245, 1.246}'::text[]
)
);
array
-----------------------
{1.2,1.246,2.3,1.245}
(1 row)
Removing duplicates from integer arrays using sort
and uniq
Caveat: This method only works for integer arrays (yeah so no floats too).
sort
and uniq
are functions available in the intarray
extension. Enable the extension in your database by running the following statement.
CREATE EXTENSION IF NOT EXISTS intarray;
Try using the uniq
function.
SELECT uniq(
'{1,2,1,3}'::int[]
);
uniq
-----------
{1,2,1,3}
(1 row)
Oops. Does not work. uniq
only works on a sorted list. So use the sort
function to sort it first.
SELECT uniq(
sort(
'{1,2,1,3}'::int[]
)
);
uniq
---------
{1,2,3}
(1 row)
I haven't yet benchmarked in detail which method works the fastest for sorting integers. I used both methods as a part of another benchmarking to update 100k rows (coming soon). For now, all I can say is, both aren't slow. I'll try a more detailed benchmarking another day.
What else?
These are essentials that I find useful to know. There are a lot more array functions that Postgres supports. Go check it out.
I'll be sharing more from my notes on PostgreSQL. Subscribe to get updates.
Top comments (2)
thanks for sharing. Good stuff
How would I go about creating a unique set out a query like this
the expected result would be {thoughts,energy,mind, postgres,"data base"}
lovely article