My experience with regex
I have always stayed far away from regex. One of my first-year computer science labs had an exercise that involved regex, and I think that was the very first time I was introduced to it. I thought it was cool at the time, but it seemed too hard, so I've been avoiding it or just googling how to solve a particular regex problem. But I finally took some time to learn it properly.
After reading some resources and dabbling around, it is safe to say I am not afraid of regex anymore! I found myself using it in many of the coding exercises I've been doing. All it takes is practice! Below is a cheatsheet (with examples) I've compiled of the regex I've learned and the resources I used.
The cheatsheet
I've included some regex features I've learned that are not available in JavaScript; those are commented out. Remember the "g" modifier if you need it! For most of my examples, I left modifiers out.
let regex;
/* matching a specific string */
regex = /hello/; // looks for the string between the forward slashes (case-sensitive)... matches "hello", "hello123", "123hello123", "123hello"; doesn't match for "hell0", "Hello"
regex = /hello/i; // looks for the string between the forward slashes (case-insensitive)... matches "hello", "HelLo", "123HelLO"
regex = /hello/g; // looks for multiple occurrences of string between the forward slashes...
/* wildcards */
regex = /h.llo/; // the "." matches any one character other than a new line character... matches "hello", "hallo" but not "h\nllo"
regex = /h.*llo/; // the ".*" matches any character (other than a new line) zero or more times... matches "hello", "heeeeeello", "hllo", "hwarwareallo"
/* shorthand character classes */
regex = /\d/; // matches any digit
regex = /\D/; // matches any non-digit
regex = /\w/; // matches any word character (a-z, A-Z, 0-9, _)
regex = /\W/; // matches any non-word character
regex = /\s/; // matches any white space character (space, \r (carriage return), \n (new line), \t (tab), \f (form feed))
regex = /\S/; // matches any non-white space character
/* specific characters */
regex = /[aeiou]/; // matches any character in square brackets
regex = /[ck]atherine/; // matches catherine or katherine
regex = /[^aeiou]/; // matches anything except the characters in square brackets
/* character ranges */
regex = /[a-z]/; // matches all lowercase letters
regex = /[A-Z]/; // matches all uppercase letters
regex = /[e-l]/; // matches lowercase letters e to l (inclusive)
regex = /[F-P]/; // matches all uppercase letters F to P (inclusive)
regex = /[0-9]/; // matches all digits
regex = /[5-9]/; // matches any digit from 5 to 9 (inclusive)
regex = /[a-zA-Z]/; // matches all lowercase and uppercase letters
regex = /[^a-zA-Z]/; // matches non-letters
/* matching repetitions using quantifiers */
regex = /(hello){4}/; // matches "hellohellohellohello"
regex = /hello{3}/; // matches "hellooo" and "helloooo" but not "helloo"
regex = /\d{3}/; // matches 3 digits ("312", "122", "111", "12312321" but not "12")
regex = /\d{3,7}/; // matches digits that occur between 3 and 7 times (inclusive)
regex = /\d{3,}/; // matches digits that occur at least 3 times
/* matching repetitions using star and plus */
regex = /ab*c/; // matches zero or more repetitions of "b" (matches "abc", "abbbbc", "ac")
regex = /ab+c/; // matches one or more repetitions of "b" (matches "abc", "abbbbc", but not "ac")
/* matching beginning and end items */
regex = /^[A-Z]\w*/; // matches "H", "Hello", but not "hey"
regex = /\w*s$/; // matches "cats", "dogs", "avocados", but not "javascript"
/* matching word boundaries
positions of word boundaries:
1. before the first character in string (if first character is a word character)
2. after the last character in the string, if the last character is a word character
3. between two characters in string, where one is a word character and the other isn't */
regex = /\bmeow\b/; // matches "hey meow lol", "hey:meow:lol", but not "heymeow lol"
/* groups */
regex = /it is (ice )?cold outside/; // matches "it is ice cold outside" and "it is cold outside"
regex = /it is (?:ice )?cold outside/; // same as above except it is a non-capturing group
regex = /do (cats) like taco \1/; // matches "do cats like taco cats"
regex = /do (cats) like (taco)\? do \2 \1 like you\?/; // matches "do cats like taco? do taco cats like you?"
//branch reset group (available in Perl, PHP, R, Delphi... commented out because this is a js file)
// regex = /(?|(cat)|(dog))\1/; // matches "catcat" and "dogdog"
/* alternative matching */
regex = /i like (tacos|boba|guacamole)\./; // matches "i like tacos.", "i like boba.", and "i like guacamole."
/* forward reference (available in Perl, PHP, Java, Ruby, etc... commented out because this is a js file) */
// regex = /(\2train|(choo))+/; // matches "choo", "choochoo", "chootrain", "choochootrain", but not "train"
/* lookaheads */
regex = /z(?=a)/; // positive lookahead... matches the "z" before the "a" in "pizza" but not the first "z"
regex = /z(?!a)/; // negative lookahead... matches the first "z" but not the "z" before the "a"
/* lookbehinds */
regex = /(?<=[aeiou])\w/; // positive lookbehind... matches any word character that is preceded by a vowel
regex = /(?<![aeiou])\w/; // negative lookbehind... matches any word character that is not preceded by a vowel
/* functions I find useful */
regex.test("hello"); // returns true if found a match, false otherwise
regex.exec("hello"); // returns result array, null otherwise
"football".replace(/foot/,"basket"); // returns "basketball"; replaces the first match with the second argument (every match if the regex has the "g" flag)
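To see what these functions actually return, here is a small sketch you can paste into the Chrome Developer Console or Node. The sample strings and variable names are made up for illustration and are not part of the cheatsheet.

```javascript
// Hypothetical sample data; only the methods themselves come from the cheatsheet.
const greeting = /hello/i;

// test() returns a boolean.
const found = greeting.test("Well, HeLLo there");

// exec() returns an array-like result (with an "index" property), or null.
const result = /(\d{3})-(\d{4})/.exec("call 555-1234 now");
const whole = result[0];       // the full match: "555-1234"
const firstGroup = result[1];  // the first capture group: "555"
const position = result.index; // where the match starts: 5

// Without the "g" flag, replace() only changes the first match.
const once = "foot to foot".replace(/foot/, "basket");   // "basket to foot"
const every = "foot to foot".replace(/foot/g, "basket"); // "basket to basket"

// matchAll() requires the "g" flag and yields one result per match.
const digits = [..."a1b22c333".matchAll(/\d+/g)].map((m) => m[0]); // ["1", "22", "333"]

console.log(found, whole, firstGroup, position, once, every, digits);
```

One thing to watch for: a regex created with the "g" flag remembers where its last match ended (its lastIndex), so calling test() or exec() repeatedly on the same regex object can give different answers each time.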
Thank you Sarthak for creating a GitHub gist of my cheatsheet and Xian-an for translating to Chinese!
Resources
- The "Regular Expressions" challenges that are part of the "JavaScript Algorithms and Data Structures Certification" on FreeCodeCamp
- MDN Regular Expression Docs
- RegexOne
- Regex101 for testing (you can also use the Chrome Developer Console)
- HackerRank regex challenges for practice
That's it folks! Hope this was helpful!
Top comments (42)
I would like to share my two favorite tools to create, edit, visualize and debug regex:
Debuggex.com
My favorite tool. It can draw a diagram of how your regex will work, you can add multiple lines to test whether the regex matches the strings you expect, and it also has a cheatsheet.
RegExr.com
Similar to Debuggex except it doesn't generate the diagram, but in my opinion its cheatsheet is cleaner and makes it easier to find what you need.
Regex101 also seems like a great and quick site to know if your regex works!
Hey @catherinecodes , I think you've missed the greedy vs lazy matching:
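A minimal sketch of the greedy vs. lazy distinction, not necessarily the example the commenter had in mind. The sample HTML string is made up for illustration.

```javascript
// Hypothetical sample string.
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: ".*" grabs as much as possible, so the match runs to the last ">".
const greedy = html.match(/<.*>/)[0]; // "<b>bold</b> and <i>italic</i>"

// Lazy: ".*?" grabs as little as possible, so the match stops at the first ">".
const lazy = html.match(/<.*?>/)[0]; // "<b>"

// Adding "?" after "+" (or "{n,}") makes those quantifiers lazy as well.
const greedyDigits = "12345".match(/\d+/)[0]; // "12345"
const lazyDigits = "12345".match(/\d+?/)[0];  // "1"

console.log(greedy, lazy, greedyDigits, lazyDigits);
```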
Nice! I didn't learn those yet
You can achieve laziness with negated set too:
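A rough sketch of the negated-set approach, assuming the goal is to match a single tag. The sample string is made up, and this may not be the exact pattern the commenter posted.

```javascript
// Hypothetical sample string.
const html = "<b>bold</b> and <i>italic</i>";

// "[^>]*" can never run past a ">", so the match ends at the first one
// even though "*" itself is still greedy.
const tag = html.match(/<[^>]*>/)[0]; // "<b>"

// With the "g" flag, match() returns every tag.
const allTags = html.match(/<[^>]*>/g); // ["<b>", "</b>", "<i>", "</i>"]

console.log(tag, allTags);
```

The negated set states exactly which characters are allowed inside the match, which is often easier to reason about than a lazy quantifier.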
Just to nitpick, the second wildcard example should be .*, and your group examples will fail because you didn't capture a space ("it is ice cold outside" and "it is  cold outside" match but not "it is cold outside").
Regex is great! And not at all as hard as it looks. Although I still don't have the hang of lookahead/lookbehinds yet.
Good catch Ryan! Thank you! I've updated the cheatsheet.
I still need some more practice as well. Lookaheads and lookbehinds were totally new to me! I didn't know about them during my CS lab.
Regex = Black Magic
Convince me otherwise! I'm never sure why something works. I just don't touch it, and I put the script inside as many folders as possible in case some evil spirit leaks out into our world.
Hide it away!!!
I am going to bookmark the Gist version of your very wonderful cheat sheet. I look forward to using it every time I work with regex. The samples appear in a very logical progression that makes it very easy to understand and use. Thank you!
So happy you find it useful!!!
You should add a note about lookbehind compatibility. It's not available in all browsers quite yet.
github.com/tc39/proposal-regexp-lo...
tc39.github.io/proposal-regexp-loo...
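One way to handle this, sketched under the assumption that you want a runtime check: build the lookbehind with the RegExp constructor inside try/catch and fall back to a capture group when it throws. The helper names supportsLookbehind and charAfterVowel are hypothetical.

```javascript
// Hypothetical helper: reports whether this engine accepts lookbehind syntax.
function supportsLookbehind() {
  try {
    // Use the constructor on purpose: an unsupported regex *literal* is a
    // parse-time SyntaxError for the whole script, which try/catch cannot catch.
    new RegExp("(?<=a)b");
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: returns the first word character preceded by a vowel.
function charAfterVowel(text) {
  if (supportsLookbehind()) {
    const m = new RegExp("(?<=[aeiou])\\w").exec(text);
    return m ? m[0] : null;
  }
  // Fallback without lookbehind: capture the vowel, then read the next group.
  const m = /([aeiou])(\w)/.exec(text);
  return m ? m[2] : null;
}

console.log(supportsLookbehind(), charAfterVowel("pizza"));
```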
A very nice, brain-teasing regex challenge is RegexGolf - alf.nu/RegexGolf :)
Another view on "regex games" is Regex Crosswords. Very nice idea. regexcrossword.com/
These look fun! Thanks for sharing :)
If you are adding useful regex functions at the end of the code, you could also add a note that the replace function can take a callback as its second argument.
const p = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';
p.replace('dog', (match, matchIndex, originalString) =>{
return match+"test"+matchIndex // match based replacer
})
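The snippet above passes a plain string as the pattern, so only the first "dog" is replaced. A small sketch of the same idea with a regex, where the callback runs once per match; the "Doe, Jane" sample and the variable names are made up for illustration.

```javascript
const p = "The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?";

// With a regex and the "g" flag, the callback is called for every match.
// Its arguments are the matched text, any capture groups, then the match index.
const tagged = p.replace(/dog/g, (match, matchIndex) => `${match}@${matchIndex}`);

// Capture groups arrive between the match and the index, in order.
const swapped = "Doe, Jane".replace(
  /(\w+), (\w+)/,
  (match, last, first) => `${first} ${last}`
); // "Jane Doe"

console.log(tagged);
console.log(swapped);
```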
Lookaheads and lookbehinds would be the features closest to "if-then" logic one uses while coding.
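As a rough illustration of that "if-then" feel, several lookaheads at the start of a pattern each act like a separate condition on the same string. The rule set below (at least 8 characters, one digit, one uppercase letter) is a made-up example.

```javascript
// Hypothetical rules: every lookahead must succeed before ".{8,}" is tried.
//   (?=.*\d)     -> "if there is a digit somewhere"
//   (?=.*[A-Z])  -> "and if there is an uppercase letter somewhere"
//   .{8,}        -> "then accept 8 or more characters"
const strongEnough = /^(?=.*\d)(?=.*[A-Z]).{8,}$/;

const ok = strongEnough.test("Passw0rdX");      // true
const noUpper = strongEnough.test("password1"); // false: no uppercase letter
const tooShort = strongEnough.test("Ab1");      // false: fewer than 8 characters

console.log(ok, noUpper, tooShort);
```

Lookaheads do not consume any characters, so each condition is checked from the same starting position.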
Just finished Bonnie Schulkin's course on regular expressions on Udemy. Great course, learned a lot about regex.
Another really great resource is regular-expressions.info/