A regular expression, or regex for short, is a string of text that defines a pattern for matching, locating, and managing text. It’s an important tool in a wide variety of computing applications, from programming languages like JavaScript, Java, and Perl to text-processing tools like grep, sed, and vim.
Here are a few helpers to refresh your memory when you need a ‘simple’ regex to do the job.
Characters
| Characters | Legend | Example | Sample Match |
|---|---|---|---|
| [abc], [a-c] | Match the given characters/range of characters | abc[abc] | abca, abcb, abcc |
| [^abc], [^a-c] | Negate and match the given characters/range of characters | abc[^abc] | abcd, abce, abc1 |
| . | Any character except line break | bc. | bca, bcd, bc1, bc. |
| \d | Any numeric character (equivalent to [0-9]) | c\d | c1, c2, c3 |
| \D | Any non-numeric character (equivalent to [^0-9]) | c\D | ca, c., c* |
| \w | Any word character: alphanumeric or underscore (equivalent to [A-Za-z0-9_]) | a\w | aa, a1, a_ |
| \W | Any non-word character (equivalent to [^A-Za-z0-9_]) | a\W | a), a$, a? |
| \s | Any whitespace character (space, tab, new line, etc.) | a\s | a (followed by a space) |
| \S | Any character that is not whitespace (not a space, tab, new line, etc.) | a\S | aa |
| \t | Matches a horizontal tab | T\tab | T ab |
| \r | Matches a carriage return | AB\r\nCD | AB and CD on separate lines |
| \n | Matches a linefeed | AB\r\nCD | AB and CD on separate lines |
| \ | Escapes special characters | \d | 0, 1 |
| x\|y | Matches either “x” or “y” | a\|b | a, b |
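
To see a few of these character classes in action, here is a minimal JavaScript sketch. The `characterSamples` list and the `checkCharacterSample` helper are hypothetical names made up for this illustration; the patterns themselves come from the table above.

```javascript
// Each entry pairs a pattern from the table with inputs that should
// match ("yes") and inputs that should not match ("no").
const characterSamples = [
  { pattern: /abc[a-c]/, yes: ["abca", "abcb"], no: ["abcd"] },
  { pattern: /abc[^a-c]/, yes: ["abcd", "abc1"], no: ["abca"] },
  { pattern: /c\d/, yes: ["c1"], no: ["ca"] },
  { pattern: /a\W/, yes: ["a$", "a?"], no: ["a_", "a1"] },
  { pattern: /a\s/, yes: ["a ", "a\t", "a\n"], no: ["ab"] },
  { pattern: /x|y/, yes: ["x", "y"], no: ["z"] },
];

// Returns true when every "yes" input matches and every "no" input does not.
function checkCharacterSample({ pattern, yes, no }) {
  return yes.every((s) => pattern.test(s)) && no.every((s) => !pattern.test(s));
}

const characterClassResults = characterSamples.map(checkCharacterSample);
console.log(characterClassResults);
```

None of the patterns use the g flag, so `test` can safely be called repeatedly on the same regex object.
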
Assertions
| Characters | Legend | Example | Sample Match |
|---|---|---|---|
| ^ | Start of string, or start of line in multiline mode | ^abc.* | abc, abcd, abcde |
| $ | End of string, or end of line in multiline mode | .*xyz$ | xyz, wxyz, abcdxyz |
| \b | Matches a word boundary: a position where a word character is not preceded or followed by another word character | My.*\bpie | My apple pie |
| \B | Matches a non-word boundary | c.*\Bcat | copycat |
| x(?=y) | Lookahead assertion: Matches “x” only if “x” is followed by “y” | \d+(?=€) | 98 in $1 = 0.98€ |
| x(?!y) | Negative lookahead assertion: Matches “x” only if “x” is not followed by “y” | \d+\b(?!€) | 1 and 0 in $1 = 0.98€ |
| (?<=y)x | Lookbehind assertion: Matches “x” only if “x” is preceded by “y” | (?<=\d)\d | 8 in $1 = 0.98€ |
| (?<!y)x | Negative lookbehind assertion: Matches “x” only if “x” is not preceded by “y” | (?<!\d)\d | 1, 0, and 9 in $1 = 0.98€ |
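
The lookaround rows are easier to follow when run against the sample string from the table. The following is a small JavaScript sketch; the variable names are chosen only for this illustration, and lookbehind is assumed to be supported by the engine (it is in current Node.js and modern browsers).

```javascript
const price = "$1 = 0.98€";

// Lookahead: digits that are immediately followed by "€".
const beforeEuro = price.match(/\d+(?=€)/g);

// Negative lookahead: whole numbers that are not followed by "€".
const notBeforeEuro = price.match(/\d+\b(?!€)/g);

// Lookbehind: a digit that is preceded by another digit.
const afterDigit = price.match(/(?<=\d)\d/g);

// Negative lookbehind: a digit that is not preceded by another digit.
const notAfterDigit = price.match(/(?<!\d)\d/g);

console.log(beforeEuro);    // [ '98' ]
console.log(notBeforeEuro); // [ '1', '0' ]
console.log(afterDigit);    // [ '8' ]
console.log(notAfterDigit); // [ '1', '0', '9' ]

// Word boundary (\b) and non-word boundary (\B) from the same table.
const hasPie = /My.*\bpie/.test("My apple pie"); // true
const hasCat = /c.*\Bcat/.test("copycat");       // true
```

The assertions themselves consume no characters, which is why “€” never appears in the matched text.
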
Groups
| Characters | Legend | Example | Sample Match |
|---|---|---|---|
| (x) | Capturing group: Matches x and remembers the match | A(nt\|pple) | Ant, Apple |
| (?<name>x) | Named capturing group: Matches x and stores it under the given name | A(?<name>nt\|pple) | Ant, Apple |
| (?:x) | Non-capturing group: Matches x and does not remember the match | A(?:nt\|pple) | Ant, Apple |
| \n | Backreference to the substring matched by the nth capturing group (n = positive integer) | (\d)\+(\d)=\2\+\1 | 5+6=6+5 |
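
As a rough illustration of how the three group types differ in what they remember, here is a short JavaScript sketch. The group name `suffix` and the variable names are assumptions made for this example only.

```javascript
// Capturing group: the text matched inside the parentheses is stored at index 1.
const captured = "Apple".match(/A(nt|pple)/);
console.log(captured[0], captured[1]); // Apple pple

// Named capturing group: the same text is also available under groups.suffix.
const named = "Ant".match(/A(?<suffix>nt|pple)/);
console.log(named.groups.suffix); // nt

// Non-capturing group: groups the alternatives but stores nothing extra.
const nonCapturing = "Apple".match(/A(?:nt|pple)/);
console.log(nonCapturing.length); // 1

// Backreferences: \1 and \2 must repeat what groups 1 and 2 matched.
// The literal "+" is escaped so it is not read as a quantifier.
const mirroredSum = /(\d)\+(\d)=\2\+\1/;
const mirroredMatches = mirroredSum.test("5+6=6+5");  // true
const repeatedMatches = mirroredSum.test("5+6=5+6");  // false
```
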
Quantifiers
| Characters | Legend | Example | Sample Match |
|---|---|---|---|
| x* | Matches the preceding item “x” 0 or more times | a* | a, aa, aaa |
| x+ | Matches the preceding item “x” 1 or more times, equivalent to {1,} | a+ | a, aa, aaa |
| x? | Matches the preceding item “x” 0 or 1 time | ab? | a, ab |
| x{n} | Matches the preceding item “x” exactly n times (n = non-negative integer) | ab{5}c | abbbbbc |
| x{n,} | Matches the preceding item “x” at least n times (n = non-negative integer) | ab{2,}c | abbc, abbbc, abbbbc |
| x{n,m} | Matches the preceding item “x” at least n times and at most m times (n ≤ m) | ab{2,3}c | abbc, abbbc |
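
The quantifier rows can be checked with a small JavaScript sketch like the one below. The patterns are wrapped in ^ and $ (an addition for this illustration) so that each test covers the whole input string; `quantifierChecks` is a hypothetical name.

```javascript
// Each entry: an anchored pattern, inputs that should match, inputs that should not.
const quantifierChecks = [
  { pattern: /^a*$/, yes: ["", "a", "aaa"], no: ["b"] },
  { pattern: /^a+$/, yes: ["a", "aa"], no: [""] },
  { pattern: /^ab?$/, yes: ["a", "ab"], no: ["abb"] },
  { pattern: /^ab{5}c$/, yes: ["abbbbbc"], no: ["abbbbc"] },
  { pattern: /^ab{2,}c$/, yes: ["abbc", "abbbbc"], no: ["abc"] },
  { pattern: /^ab{2,3}c$/, yes: ["abbc", "abbbc"], no: ["abc", "abbbbc"] },
];

// true for an entry when all "yes" inputs match and no "no" input does.
const quantifierResults = quantifierChecks.map(
  ({ pattern, yes, no }) =>
    yes.every((s) => pattern.test(s)) && no.every((s) => !pattern.test(s))
);
console.log(quantifierResults);
```

Note that `a*` also matches the empty string, since zero repetitions are allowed.
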
NOTE
By default quantifiers are greedy (they try to match as much of the string as possible).
The ? character after the quantifier makes the quantifier non-greedy (it will stop as soon as it finds a match).
For example, against the test string 12345, \d+? matches only 1, while \d+ matches the entire string 12345.
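
The difference between greedy and non-greedy matching can be sketched in JavaScript as follows. The `markup` string is a made-up input added for illustration; the 12345 case is the one from the note.

```javascript
const digits = "12345";

// Greedy: \d+ takes as many digits as it can.
const greedyDigits = digits.match(/\d+/)[0];
// Non-greedy: \d+? stops after the first digit.
const lazyDigits = digits.match(/\d+?/)[0];
console.log(greedyDigits, lazyDigits); // 12345 1

// The same effect on a longer (hypothetical) input with two tags.
const markup = "<b>one</b><b>two</b>";
const greedyTag = markup.match(/<b>.*<\/b>/)[0];
const lazyTag = markup.match(/<b>.*?<\/b>/)[0];
console.log(greedyTag); // <b>one</b><b>two</b>
console.log(lazyTag);   // <b>one</b>
```
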
Flags
Flags are placed after the closing delimiter of the regular expression and modify how it behaves.
For example, /a/ matches only a lowercase a, but adding the flag i (/a/i) makes it match both a and A.
| Flag | Legend |
|---|---|
| d | Generate indices for substring matches |
| g | Global search |
| i | Case-insensitive search |
| m | Multi-line search |
| s | Allows . to match newline characters |
| u | Treats a pattern as a sequence of Unicode code points |
| y | Perform a sticky search that matches starting at the current position in the target string |
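
Here is a brief JavaScript sketch of several of these flags. The input strings and variable names are invented for illustration, and the d flag is left out because it needs a fairly recent engine.

```javascript
// i: case-insensitive search.
const withoutI = /a/.test("A");  // false
const withI = /a/i.test("A");    // true

// g: global search returns every match instead of only the first.
const allAs = "banana".match(/a/g);
console.log(allAs); // [ 'a', 'a', 'a' ]

// m: ^ and $ also match at the start and end of each line.
const lineStarts = "one\ntwo".match(/^\w+/gm);
console.log(lineStarts); // [ 'one', 'two' ]

// s: lets . match a newline character.
const withoutS = /a.b/.test("a\nb");  // false
const withS = /a.b/s.test("a\nb");    // true

// u: treats the pattern as Unicode code points, so . matches a whole emoji.
const withoutU = /^.$/.test("😀");  // false
const withU = /^.$/u.test("😀");    // true

// y: sticky search matches only at lastIndex in the target string.
const sticky = /foo/y;
const stickyAtStart = sticky.test("barfoo"); // false, "foo" is not at index 0
sticky.lastIndex = 3;
const stickyAtThree = sticky.test("barfoo"); // true, "foo" starts at index 3
```

Several flags can be combined, as in /^\w+/gm above.
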
If you wish to test your knowledge:

