DEV Community

Francesco Tisiot
Francesco Tisiot

Posted on • Originally published at ftisiot.net

How to use PostgreSQL® SUBSTRING

Need to extract a specific substring out of a text in PostgreSQL®? Read here how!

PostgreSQL SUBSTRING allows you to extract a particular string out of a text column.

The text column can be of:

  • character
  • character varying
  • text

As per note in the documentation, all the string functions are declared to accept and return type text, all the other types will be converted to it.

If you need a FREE PostgreSQL database? 🦀 Check Aiven's FREE plans! 🦀
If you need to optimize your SQL query? 🐧 Check EverSQL! 🐧

Note: For most use cases, you can also use the SUBSTR function, the cases and related syntax are included in the blog

Use SUBSTRING to extract a string from a specific index

Let's say we have the following three values in the database:

  • I Love Pizza with Mushroom
  • Pizza is Home
  • I had Pizza yesterday

with substring I could:

  • Extract from a precise string index, e.g. retrieve all the substring starting from 5th character until the end of the string:
SELECT SUBSTRING(mystr from 5)
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

With output

       substring        
------------------------
 ve Pizza with Mushroom
 a is Home
 d Pizza yesterday
Enter fullscreen mode Exit fullscreen mode

You can use the equivalent SUBSTR(mystr, 5) or SUBSTRING(mystr, 5) statement instead of the SUBSTRING(mystr from 5)

  • Extract from a precise string index, e.g. retrieve all the substring starting from 5th characte of width 3 characters:
SELECT SUBSTRING(mystr FROM 5 FOR 3)
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

With result

 substring 
-----------
 ve 
 a i
 d P
Enter fullscreen mode Exit fullscreen mode

You can use the equivalent SUBSTR(mystr, 5, 3) or SUBSTRING(mystr, 5, 3) statement instead of the SUBSTRING(mystr from 5 for 3)

  • Extract a precise set of characters from the beginning of the string. E.g. extract the first 3 characters from the string:
SELECT SUBSTRING(mystr FOR 3)
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

With result

 I L
 Piz
 I h
Enter fullscreen mode Exit fullscreen mode

You can use the equivalent SUBSTR(mystr, 1, 3) or SUBSTRING(mystr, 1, 3) statement instead of the SUBSTRING(mystr for 3)

Use POSITION and SUBSTRING to extract a string from a specific word

The examples above assumed the start from the string was a static character number. What about retrieving all the strings after Pizza? To do so we need to first locate the Pizza substring using POSITION and then use the SUBSTRING function.

SELECT SUBSTRING(mystr FROM POSITION('Pizza' IN mystr))
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

Results

      substring      
---------------------
 Pizza with Mushroom
 Pizza is Home
 Pizza yesterday
Enter fullscreen mode Exit fullscreen mode

What if we want only what follows Pizza? We need to add the length of the Pizza string (plus maybe the space if necessary) to the FROM position:

SELECT SUBSTRING(mystr FROM POSITION('Pizza' IN mystr) + LENGTH('Pizza'))
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

Result (note the space at the beginning of the string)

   substring    
----------------
  with Mushroom
  is Home
  yesterday
Enter fullscreen mode Exit fullscreen mode

Use SUBSTRING to extract a string using regular expressions

Reusing the case before, we could rewrite the extraction using POSIX regular expressions

SELECT SUBSTRING(mystr FROM 'Pizza.*$')
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

The above, instead of using POSITION to find the index of the Pizza word, it defines it as regular expression. RegEx are are powerful yet dangerous way to define the pattern you want find in a specific text. In the example above the Pizza.*$ defined to retrieve everyting starting with Pizza and then followed by any character .* until the end of the string $.

E.g. if I want to find the last word in each sentence I could write

SELECT SUBSTRING(mystr FROM '[A-Za-z]*$')
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

The result is

 substring 
-----------
 Mushroom
 Home
 yesterday
Enter fullscreen mode Exit fullscreen mode

Use SUBSTRING to extract a string using SQL regular expressions

Similar to the above, instead of using POSIX, we could use SQL regular expressions.

To get the first word after Pizza:

SELECT SUBSTRING(mystr SIMILAR '%Pizza #"[A-Za-z]*#"%' ESCAPE '#')
FROM
    (VALUES('I Love Pizza with Mushroom'),('Pizza is Home'),('I had Pizza yesterday')) as tbl(mystr);
Enter fullscreen mode Exit fullscreen mode

In the above we are using the %Pizza #"[A-Za-z]*#"% expression which states:

  • %Pizza: Anything starting with Pizza
  • #"[A-Za-z]*#": Print anything that contains single characters in upper or lowercase
  • % any trailing characters doesn't need to be displayed

Results

 substring 
-----------
 with
 is
 yesterday
Enter fullscreen mode Exit fullscreen mode

Top comments (0)