DEV Community

Cong Li
Cong Li

Posted on

Detailed Explanation of the SUBSTRING_INDEX Function in GBase 8s

In database string processing, it is often necessary to extract a specific part of a string based on a particular delimiter. The SUBSTRING_INDEX function provided by the GBase 8s database is a powerful tool for this purpose. It allows developers to easily perform complex string extraction operations. This article will provide a detailed introduction to the SUBSTRING_INDEX function and demonstrate its application in data processing through practical examples.

1. Function Overview

The SUBSTRING_INDEX function is one of the string processing functions in the GBase 8s database. Its primary function is to extract a specific part of a string based on a specified delimiter. The return type of the function matches the data type of the input string. When processing strings that contain multiple delimiters, SUBSTRING_INDEX can extract a substring based on the number of occurrences and position of the delimiter.

Note: The examples in this article use the GBase8sV8.8_3.5.1 database version.

2. Function Syntax

The SUBSTRING_INDEX function searches for a string of specified delimiter characters and returns a substring of the leading or trailing characters based on the count of the delimiter specified as a parameter.

The standard syntax of the SUBSTRING_INDEX function is as follows:

SUBSTRING_INDEX(source_string, delimiter, n)
Enter fullscreen mode Exit fullscreen mode
  • source_string: The string to be processed.
  • delimiter: The delimiter.
  • n: Specifies the position (or count) of the delimiter, used to determine which part of the substring to return. If n is a positive number, the function returns the content to the left of the nth delimiter from the left. If n is negative, it returns the content to the right of the nth delimiter from the right.

3. Important Notes

1) If the input parameter source_string is NULL, or delimiter is NULL, the function will return NULL. When n is NULL, the entire source_string is returned.

Image description

2) If fewer than n delimiters are found in source_string, the return value is the entire source_string.

Image description

3) When n = 0, the function returns the entire content of source_string.

Image description

4) In a multibyte character set environment, the return value is based on the logical character sequence rather than simple byte positions.

4. Practical Examples

Suppose we have a string 'www.example.com', and we want to extract different parts of this string.

1) Extract the part before the domain name (including the first dot):

Image description

Result: 'www'

Explanation: Returns the entire content before the first . from the left.

2) Extract the top-level domain:

Image description

First, SUBSTRING_INDEX('www.example.com', '.', 2) returns 'www.example', and the second SUBSTRING_INDEX(..., '.', -1) returns the content after the first . from the right, which is 'example'. However, to get the top-level domain .com, the correct usage is shown below:

Image description

Result: 'com'

Explanation:

3) If n is 0 or a negative number but its absolute value is greater than the number of delimiters:

When n = 0, the SUBSTRING_INDEX function returns the entire string (this behavior is consistent with MySQL).

Image description

If the absolute value of n is greater than the number of delimiters, the function returns the string corresponding to the actual number of delimiters. For example:

Image description

Result: 'www.example.com'

Explanation: Since there is only one ., regardless of whether n is a positive number greater than 1, the entire string is returned.


The SUBSTRING_INDEX function plays a crucial role in string processing within the GBase database. With its flexible functionality, developers can efficiently handle various string extraction tasks. Through this detailed introduction and practical examples, we hope to help users gain a deeper understanding of this function and use it effectively to enhance data processing efficiency and accuracy.

Top comments (0)