RegEx 101

#javascript #webdev #developer

Regular expression or in short Regex is a string of text that lets you create patterns that help match, locate, and manage text. It’s an important tool in a wide variety of computing applications, from programming languages like JS, Java and Perl, to text processing tools like grep, sed, and vim.

Here are a few helpers to refresh your mind when you need some ‘simple’ regex to do the job.

Characters

Characters	Legend	Example	Sample Match
[abc], [a-c]	Match the given characters/range of characters	abc[abc]	abca, abcb, abcc
[^abc], [^a-c]	Negate and match the given characters/range of characters	abc[^abc]	abcd, abce, abc1
.	Any character except line break	bc.	bca, bcd, bc1, b.
\d	Any numeric character (equivalent to [0-9])	c\d	c1, c2, c3
\D	Any non-numeric character (equivalent to [^0-9])	c\D	ca, c., c*
\w	Any alphanumeric character (equivalent to [A-Za-z0-9_])	a\w	aa, a1, a_
\W	Any non-alphanumeric character (equivalent to [A-Za-z0-9_])	a\W	a), a$, a?
\s	Usually used for white space , but can be used for new line , tab , etc	a\s	a
\S	Not a white space or equivalent like new line , tab , etc	a\S	aa
\t	Matches a horizontal tab	T\tab	T ab
\r	Matches a carriage return	AB\r\nCD	AB
CD
\n	Matches a linefeed	AB\r\nCD	AB
CD
\	Escapes special characters	\d	0, 1
x	y	Matches either “x” or “y”	a

Assertions

Characters	Legend	Example	Sample Match
^	Start of string or start of line depending on multiline mode	^abc.*	abc, abd, abcd
$	End of string or start of line depending on multiline mode	.*xyz$	xyz, wxyz, abcdxyz
\b	Matches a word character is not followed by another word-character	My.*\bpie	My apple pie
\B	Matches a non-word boundary	c.*\Bcat	copycat
x(?=y)	Lookahead assertion : Matches “x” only if “x” is followed by “y”	\d+(?=€)	$1 = 0. 9 8€
x(?!y)	Negative Lookahead assertion : Matches “x” only if “x” is followed not by “y”	\d+\b(?!€)	$ 1 = 0.98€
(?<=y)x	Lookbehind assertion : Matches “x” only if “x” is preceded by “y”	(?<=\d)\d	$1 = 0.98€
(?<!y)x	Negative Lookbehind assertion : Matches “x” only if “x” is not preceded by “y”	(?<!\d)\d	$ 1 = 0. 9 8€

Groups

Characters	Legend	Example	Sample Match
(x)	Capturing group : Matches x and remembers the match	A(nt	pple)
(?x)	Capturing group : Matches x and stores it in the mentioned variable	A(?nt	pple)
(?:name>x)	Non-capturing group : Matches x and does not remember the match	A(?:nt	pple)
_n_	Back reference to the last substring matching the n parenthetical	(\d)+(\d)=\2+\1	5+6=6+5

Quantifiers

Characters	Legend	Example	Sample Match
x*	Matches the preceding item “x” 0 or more times	a*	a, aa, aaa
x+	Matches the preceding item “x” 1 or more times, equivalent to {1,}	a+	aa, aaa, aaaa
x?	Matches the preceding item “x” 0 or 1 time	ab?	a, ab
x{n}	Matches the preceding item “x” n times (n = positive integer )	ab{5}c	abbbbbc
x{n,}	Matches the preceding item “x” at least n times (n = positive integer )	ab{2,}c	abbc, abbbc, abbbbc
x{n,m}	Matches the preceding item “x” at least n times & at most m times (n<m)	ab{2,3}c	abbc, abbbc

NOTE

By default quantifiers are greedy (they try to match as much of the string as possible).

The ? character after the quantifier makes the quantifier non-greedy (it will stop as soon as it finds a match).

For Example: \d+? for a test string 12345 will match only 1, but \d+ will match the entire string 12345

Flags

Flags are put at the end of the regular expression. They are used to modify how the regular expression behaves.

For Example: /a/ for a test string a will match a only, but adding the flag i (/a/i) would match both a and A

Characters	Legend
d	Generate indices for substring matches
g	Global search
i	Case-insensitive search
m	Multi-line search
s	Allows `.` to match newline characters
u	Treats a pattern as a sequence of Unicode code points
y	Perform a sticky search that matches starting at the current position in the target string

If you wish to test your knowledge:

Have a good weekend!

DEV Community