DEV Community

wanglei

Pattern Matching Operators

The database provides three separate approaches to pattern matching: the traditional SQL LIKE operator, the SIMILAR TO operator, and POSIX-style regular expressions. Besides these basic operators, functions are available to extract or replace matching substrings and to split a string at matching locations.

LIKE
Description: Determines whether a string matches the pattern that follows LIKE. The LIKE expression returns true if the string matches the supplied pattern. (As expected, the NOT LIKE expression returns false if the LIKE expression returns true, and vice versa.)

Matching rules:

A match succeeds only when the pattern matches the entire string. To match a sequence anywhere within the string, the pattern must begin and end with a percent sign (%).

The underscore (_) matches any single character. The percent sign (%) matches any sequence of zero or more characters.

To match a literal underscore or percent sign, the respective character in the pattern must be preceded by an escape character. The default escape character is the backslash, but a different one can be selected by using the ESCAPE clause.

To match the escape character itself, write two escape characters. For example, to write a pattern constant that contains a backslash (\), you need to enter two backslashes in the SQL statement.

NOTE: When standard_conforming_strings is set to off, any backslash you write in a literal string constant must be doubled. Writing a pattern that matches a single backslash therefore takes four backslashes in the statement. You can avoid this by selecting a different escape character with the ESCAPE clause, so that the backslash is no longer a special character for LIKE. The backslash is still a special character for the string literal parser, however, so you still need two backslashes. In MySQL-compatible mode, it is also possible to select no escape character by writing ESCAPE ''. This effectively disables the escape mechanism, which makes it impossible to turn off the special meaning of underscores and percent signs in the pattern.

The ILIKE keyword can be used to replace LIKE to make the match case-insensitive.

Operator ~~ is equivalent to LIKE, and operator ~~* corresponds to ILIKE.
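The escape rules above can be tried with a few queries. This is a minimal sketch that assumes standard_conforming_strings is set to on (so a backslash in a string constant is an ordinary character) and that the database is not in MySQL-compatible mode. The # escape character is an arbitrary choice for illustration, and the results in the comments are what these rules predict under those assumptions.

```sql
-- Backslash is the default escape character: \% matches a literal percent sign.
SELECT '100%' LIKE '100\%' AS result;            -- t
SELECT '1000' LIKE '100\%' AS result;            -- f

-- A different escape character selected with the ESCAPE clause.
SELECT 'a_c' LIKE 'a#_c' ESCAPE '#' AS result;   -- t
SELECT 'abc' LIKE 'a#_c' ESCAPE '#' AS result;   -- f

-- Two escape characters in the pattern match one literal backslash.
SELECT 'a\b' LIKE 'a\\b' AS result;              -- t
```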

Example:

openGauss=# SELECT 'abc' LIKE 'abc' AS RESULT;
 result
-----------
 t
(1 row)
openGauss=# SELECT 'abc' LIKE 'a%' AS RESULT;
 result
-----------
 t
(1 row)
openGauss=# SELECT 'abc' LIKE '_b_' AS RESULT;
 result
-----------
 t
(1 row)
openGauss=# SELECT 'abc' LIKE 'c' AS RESULT;
 result
-----------
 f
(1 row)
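
A few further queries illustrate the substring rule, ILIKE, and the operator forms described in the matching rules. They are a sketch that assumes the default (case-sensitive) behavior of LIKE. The results in the comments follow from the rules above.

```sql
-- Match a sequence anywhere in the string: percent sign on both sides.
SELECT 'abc' LIKE '%b%' AS result;    -- t

-- LIKE is case-sensitive; ILIKE is not.
SELECT 'ABC' LIKE 'abc' AS result;    -- f
SELECT 'ABC' ILIKE 'abc' AS result;   -- t

-- Operator forms of LIKE and ILIKE.
SELECT 'abc' ~~ 'a%' AS result;       -- t
SELECT 'ABC' ~~* 'a%' AS result;      -- t
```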

SIMILAR TO
Description: Returns true or false depending on whether its pattern matches the given string. It is similar to LIKE, except that it interprets the pattern using the SQL standard's definition of a regular expression.

Matching rules:

As with LIKE, a match succeeds only when the pattern matches the entire string. To match a sequence anywhere within the string, the pattern must begin and end with a percent sign (%).

The underscore (_) matches any single character. The percent sign (%) matches any sequence of zero or more characters.

SIMILAR TO also supports the following pattern-matching metacharacters borrowed from POSIX regular expressions:

[Image: table of metacharacters supported by SIMILAR TO]

A leading escape character disables the special meaning of any of these metacharacters. The rules for using escape characters are the same as those for LIKE.

Example:

openGauss=# SELECT 'abc' SIMILAR TO 'abc' AS RESULT;
 result
-----------
 t
(1 row)

openGauss=# SELECT 'abc' SIMILAR TO 'a' AS RESULT;
 result
-----------
 f
(1 row)
openGauss=# SELECT 'abc' SIMILAR TO '%(b|d)%' AS RESULT;
 result
-----------
 t
(1 row)
openGauss=# SELECT 'abc' SIMILAR TO '(b|c)%' AS RESULT;
 result
-----------
 f
(1 row)
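
The following sketch combines LIKE-style wildcards with regular expression metacharacters. It assumes the metacharacter set follows the SQL-standard definition, which includes bracket expressions and the + quantifier, and it uses # as an arbitrary escape character. The results in the comments are what the rules above predict.

```sql
-- Bracket expression with a quantifier; the pattern must cover the whole string.
SELECT 'abc' SIMILAR TO '[a-c]+' AS result;            -- t
SELECT 'abcd' SIMILAR TO '[a-c]+' AS result;           -- f

-- LIKE-style wildcards still apply.
SELECT 'abc' SIMILAR TO 'a_c' AS result;               -- t

-- An escaped metacharacter is matched literally.
SELECT 'a|b' SIMILAR TO 'a#|b' ESCAPE '#' AS result;   -- t
SELECT 'a' SIMILAR TO 'a#|b' ESCAPE '#' AS result;     -- f
```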

POSIX Regular Expressions
Description: A regular expression is a character sequence that is an abbreviated definition of a set of strings (a regular set). A string matches a regular expression if it is a member of the regular set that the expression describes. POSIX regular expressions provide a more powerful means of pattern matching than the LIKE and SIMILAR TO operators. Table 1 lists all available operators for pattern matching using POSIX regular expressions.

Table 1 Regular expression matching operators

[Image: Table 1, regular expression matching operators]

Matching rules:

Unlike LIKE, a regular expression is allowed to match anywhere within a string, unless the regular expression is explicitly anchored to the beginning or end of the string.

Besides the metacharacters mentioned above, POSIX regular expressions also support the following pattern-matching metacharacters:

[Image: table of additional POSIX regular expression metacharacters]

Example:

openGauss=# SELECT 'abc' ~ 'Abc' AS RESULT;
 result 
--------
 f
(1 row)
openGauss=# SELECT 'abc' ~* 'Abc' AS RESULT;
 result 
--------
 t
(1 row)

openGauss=# SELECT 'abc' !~ 'Abc' AS RESULT;
 result 
--------
 t
(1 row)

openGauss=# SELECT 'abc' !~* 'Abc' AS RESULT;
 result 
--------
 f
(1 row)
openGauss=# SELECT 'abc' ~ '^a' AS RESULT;
 result 
--------
 t
(1 row)
openGauss=# SELECT 'abc' ~ '(b|d)' AS RESULT;
 result 
--------
 t
(1 row)
openGauss=# SELECT 'abc' ~ '^(b|c)' AS RESULT;
 result 
--------
 f
(1 row)
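
Because a regular expression may match anywhere in the string, anchoring is what restricts the match. The queries below are a small sketch of the $ anchor and of anchoring both ends, which gives the whole-string behavior of LIKE. The results in the comments follow from standard POSIX anchoring semantics.

```sql
-- Anchor the match to the end of the string with $.
SELECT 'abc' ~ 'c$' AS result;       -- t

-- Anchoring both ends forces the pattern to cover the whole string.
SELECT 'abc' ~ '^b$' AS result;      -- f
SELECT 'abc' ~ '^a.c$' AS result;    -- t
```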

Although most regular expression searches execute quickly, a regular expression can be contrived to take an arbitrary amount of time and memory to process. Avoid accepting regular expression search patterns from untrusted sources. If you must do so, set a statement timeout. Searches using SIMILAR TO carry the same security risks, because SIMILAR TO provides many of the same capabilities as POSIX-style regular expressions. LIKE searches are much simpler than the other two options and are therefore safer to use with patterns from untrusted sources.
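
One way to apply the timeout advice is sketched below. It assumes the statement_timeout parameter (value in milliseconds) is the limit the text refers to. The table messages, the column body, and the pattern literal are hypothetical names used only for illustration.

```sql
-- Cap the run time of statements in this session (value in milliseconds).
SET statement_timeout = 5000;

-- Hypothetical query: "messages" and "body" are illustrative names, and the
-- pattern literal stands in for a value supplied by an untrusted user.
SELECT count(*) FROM messages WHERE body ~ 'user-supplied pattern';

-- Restore the previous setting afterwards.
RESET statement_timeout;
```

A statement that exceeds the limit is cancelled with an error instead of running indefinitely, so the calling application should be prepared to handle that error.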
