MySQL REGEXP_SUBSTR() Function
The REGEXP_SUBSTR() function in MySQL is used for pattern matching. This function returns the substring from the input string that matches the given regular expression pattern. If there is no match found, it will return NULL. If the expression or pattern is NULL, the function will return NULL.
The REGEXP_SUBSTR() is the same as the SUBSTRING function, but instead of extracting only a given substring, this function allows us to search a string for a regular expression pattern also.
Syntax
The following is a basic syntax to use this function in MySQL:
Parameter Explanation
The explanation of the REGEXP_SUBSTR() function parameters are:
expression: It is an input string on which we will perform searching through regular expressions.
pattern: It represents the regular expression pattern for a substring.
The REGEXP_SUBSTR() function uses optional parameters also that are given below:
pos: It is used to specify the position in expression within the string to start the search. If we omit this parameter, it starts at position 1.
occurrence: It is used to specify for which occurrence of a match we are going to search. If we omit this parameter, the first occurrence is used.
match_type: It is a string that allows us to refine the regular expression. It uses the following possible characters to perform matching.
- c: It represents a case-sensitive matching.
- i: It represents a case-insensitive matching.
- E: It is used to extract a substring using a subexpression.
- m: It represents a multiple-line mode that recognizes line terminators within the string. By default, this function matches line terminators at the start and end of the string.
- n: It is used to modify the . (dot) character to match line terminators. By default, it will stop at the end of a line.
- u: It represents Unix-only line endings that recognize only the newline character by the ., ^, and $ match operators.
Let us understand it with various examples.
Example
The following statement explains the basic example of the REGEXP_SUBSTR function in MySQL.
This statement found the match and returns the below output:
Suppose there are multiple matches found in the input string. In that case, the first occurrence of the matched substring is returned by default. However, we can also specify another occurrence if needed. See the below statement:
This statement found the match and returns the first occurrence of the matched substring because we have not specified any particular occurrence. See the below output:
If the input string and pattern (substring) string have no matching, this function returns the NULL value. See the below example:
Here is the output:
If we want to return the substring by specifying a starting position, we can use the REGEX_SUBSTR function as follows:
In this statement, we have specified the starting position at 2. Executing this query, we will get the below output where we can see that the first position of the matched substring is not returned.
Let us see another example to understand it more clearly:
Here is the output:
If we want to specify the specific occurrence for returning the matched substring, we can use this function as follows:
In this example, we have specified the starting position of the matched substring is 1, and the occurrence of the matched substring is 3. Hence we will get the below output:
Here is another example where we have specified the starting position at 2 and occurrences as 1, 2, and 3.
This function will give the below output because the starting position came after the first occurrence had started. Therefore, this function assumes occurrence 2 as occurrence 1 and occurrence 3 as occurrence 2. And then there are no more occurrences found so that the result of occurrence 3 became NULL.
We can provide an additional parameter to refine the regular expression by using the match type arguments. For example, we can use it to verify whether the match is case-sensitive or include line terminators. See the below example where we are specifying a case-sensitive and case-insensitive match:
Here is the output: