TextKit

Posted on Mar 5

Regex Cheat Sheet: 10 Patterns That Handle 90% of Real Work

#regex #javascript #beginners #webdev

I've been writing regex for years and I still look things up constantly. The problem with most cheat sheets is they list every possible syntax token without telling you which ones you'll actually use.

Below are my working references, the patterns I reach for over and over.

The six character classes you need to know

\d  →  any digit (0-9)
\w  →  any word character (letter, digit, underscore)
\s  →  any whitespace (space, tab, newline)
\D  →  any NON-digit
\W  →  any NON-word character
\S  →  any NON-whitespace

Uppercase = inverse. That's the whole pattern.
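Here's a quick sketch of those classes in action. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample string, used only for illustration.
const sample = 'Order #42 ships 2024-03-05';

// \D is the inverse of \d: strip every non-digit and keep the digits.
const digitsOnly = sample.replace(/\D/g, '');

// \w+ grabs runs of word characters (letters, digits, underscore).
const words = sample.match(/\w+/g);

// \s+ splits on any run of whitespace.
const parts = sample.split(/\s+/);

console.log(digitsOnly); // "4220240305"
console.log(words);      // ["Order", "42", "ships", "2024", "03", "05"]
console.log(parts);      // ["Order", "#42", "ships", "2024-03-05"]
```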

Quantifiers

+      one or more
*      zero or more
?      zero or one (optional)
{3}    exactly 3
{2,5}  between 2 and 5
{3,}   3 or more

The * vs + distinction matters: \d* matches an empty string (zero digits is fine). \d+ requires at least one digit. When in doubt, you want +.
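A small sketch of the difference. The ^ and $ anchors aren't covered above; I'm assuming them here to pin each pattern to the whole string so the empty-match behavior is easy to see.

```javascript
// ^ and $ anchor the pattern to the start and end of the string.
const emptyWithStar = /^\d*$/.test(''); // zero digits is allowed
const emptyWithPlus = /^\d+$/.test(''); // at least one digit is required

// {2,5} accepts 2 to 5 repetitions, inclusive.
const twoToFive = ['1', '12', '12345', '123456'].map((s) => /^\d{2,5}$/.test(s));

console.log(emptyWithStar); // true
console.log(emptyWithPlus); // false
console.log(twoToFive);     // [false, true, true, false]
```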

The 10 patterns I copy-paste the most

1. Email

[\w.-]+@[\w.-]+\.\w{2,}

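Not RFC-perfect. Doesn't need to be. Handles most real-world emails, though it cuts plus-addressed ones like user+tag@example.com short (add + to the first character class if you need them).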

// match() returns null when nothing matches, so fall back to an empty array
const emails = text.match(/[\w.-]+@[\w.-]+\.\w{2,}/g) ?? [];

2. URLs

https?:\/\/[\w\-._~:\/?#\[\]@!$&'()*+,;=%]+

The s? makes "s" optional; it catches both http and https.

// match() returns null when nothing matches, so fall back to an empty array
const urls = text.match(/https?:\/\/[\w\-._~:\/?#\[\]@!$&'()*+,;=%]+/g) ?? [];

3. US phone numbers

\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

Handles 123-456-7890, (123) 456-7890, 123.456.7890, and 1234567890.
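A minimal sketch that runs the pattern over the four formats listed above and then normalizes each hit to bare digits. The input string and variable names are hypothetical.

```javascript
const phoneRe = /\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g;

// Hypothetical input containing the four formats from the text.
const contactText = 'Call 123-456-7890, (123) 456-7890, 123.456.7890 or 1234567890.';

const phones = contactText.match(phoneRe) ?? [];

// Strip everything that is not a digit to get one canonical form.
const normalized = phones.map((p) => p.replace(/\D/g, ''));

console.log(phones);
// ["123-456-7890", "(123) 456-7890", "123.456.7890", "1234567890"]
console.log(normalized);
// ["1234567890", "1234567890", "1234567890", "1234567890"]
```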

4. IP addresses (IPv4)

\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

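The \b word boundaries are important. Without them you'd match numbers inside longer strings. Note that this only checks the shape: each octet can be any one to three digits, so 999.999.999.999 also matches.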
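If you need real addresses rather than address-shaped strings, one option is to let the regex find candidates and check the 0-255 range in code. This is a sketch with a made-up log line, not the only way to do it.

```javascript
const ipRe = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g;

// Hypothetical log line: one valid address and two out-of-range ones.
const logLine = 'Failed login from 192.168.1.10, then 10.0.0.256 and 999.1.1.1';

// The regex only checks the shape (four dot-separated groups of 1-3 digits).
const candidates = logLine.match(ipRe) ?? [];

// Range-check each octet separately.
const validIps = candidates.filter((ip) =>
  ip.split('.').every((octet) => Number(octet) <= 255)
);

console.log(candidates); // ["192.168.1.10", "10.0.0.256", "999.1.1.1"]
console.log(validIps);   // ["192.168.1.10"]
```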

5. Dates (YYYY-MM-DD)

\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])

Validates the format and checks that the month is 01-12 and the day is 01-31. It does not know how many days each month has, so 2024-02-31 still passes.
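To reject impossible dates, a common approach is to pair the regex with a calendar check. The helper below (isRealDate is a hypothetical name) is a sketch that assumes four-digit years from 0100 up, since Date.UTC treats years 0-99 as 1900-1999.

```javascript
// Same pattern as above, anchored with ^ and $ to validate a whole string.
const dateRe = /^\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: format check first, then a calendar round-trip.
function isRealDate(s) {
  if (!dateRe.test(s)) return false;
  const [y, m, d] = s.split('-').map(Number);
  // Date rolls invalid days over (Feb 31 becomes Mar 2), so compare the parts back.
  const dt = new Date(Date.UTC(y, m - 1, d));
  return (
    dt.getUTCFullYear() === y &&
    dt.getUTCMonth() === m - 1 &&
    dt.getUTCDate() === d
  );
}

console.log(isRealDate('2024-02-29')); // true  (leap year)
console.log(isRealDate('2023-02-29')); // false (not a leap year)
console.log(isRealDate('2024-13-01')); // false (regex rejects month 13)
```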

6. Hex colors

#(?:[0-9a-fA-F]{3}){1,2}\b

Matches both short #fff and long #ff00aa format.
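A quick sketch against a made-up CSS snippet. The last value has a non-hex character, so it should be skipped.

```javascript
const hexRe = /#(?:[0-9a-fA-F]{3}){1,2}\b/g;

// Hypothetical CSS: two valid colors and one invalid value (#12345g).
const css = 'a { color: #fff; background: #ff00aa; border-color: #12345g; }';

const colors = css.match(hexRe) ?? [];

console.log(colors); // ["#fff", "#ff00aa"]
```

The trailing \b is what rejects #12345g: after the first three hex digits the next character is still a word character, so there is no boundary to match.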

7. Everything between double quotes

"([^"]*)"

The capture group ([^"]*) grabs the content. [^"]* means "any character except a quote, zero or more times."
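To get just the captured content without the surrounding quotes, matchAll works well with the g flag. A minimal sketch, with a hypothetical input line:

```javascript
const quoteRe = /"([^"]*)"/g;

// Hypothetical input with two quoted phrases.
const line = 'She said "hello" and then "goodbye".';

// Each match object holds the full match at [0] and the capture group at [1].
const quoted = Array.from(line.matchAll(quoteRe), (m) => m[1]);

console.log(quoted); // ["hello", "goodbye"]
```

This pattern has no notion of escaped quotes, so a string like "say \"hi\"" would be split at the inner quote.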

8. Whole word match

\bword\b

\b is the word boundary anchor. \bcat\b matches "cat" but not "catch" or "concatenate".
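A short sketch of the boundary behavior. The sentence is made up, and the i flag in the second pattern is an addition of mine to show case-insensitive matching.

```javascript
// Hypothetical sentence with "cat" standing alone and buried in other words.
const sentence = 'The cat will catch a bobcat, then concatenate. Cat!';

// Case-sensitive whole-word match.
const exact = sentence.match(/\bcat\b/g) ?? [];

// Adding the i flag also picks up "Cat".
const anyCase = sentence.match(/\bcat\b/gi) ?? [];

console.log(exact);   // ["cat"]
console.log(anyCase); // ["cat", "Cat"]
```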

9. Numbers with optional decimals

-?\d+\.?\d*

Matches 42, 3.14, -7, -0.5. It also accepts a trailing dot (42.) and won't match a leading-dot value like .5 in full.
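Here's a sketch comparing the pattern above with a slightly stricter variant that only takes the dot when digits follow it. The stricter pattern is my suggestion, not part of the original ten, and the input string is hypothetical.

```javascript
const looseRe = /-?\d+\.?\d*/g;       // the pattern from the cheat sheet
const strictRe = /-?\d+(?:\.\d+)?/g;  // assumed alternative: dot requires digits after it

// Hypothetical input ending in a number followed by a sentence period.
const values = 'Values: 42, 3.14, -7, -0.5, and 10.';

const loose = values.match(looseRe) ?? [];
const strict = values.match(strictRe) ?? [];

console.log(loose);  // ["42", "3.14", "-7", "-0.5", "10."]
console.log(strict); // ["42", "3.14", "-7", "-0.5", "10"]
```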

10. Multiple whitespace (for cleanup)

\s{2,}

Finds two or more consecutive whitespace characters so you can replace them with a single space. Since \s includes newlines, runs of line breaks collapse too.

const clean = text.replace(/\s{2,}/g, ' ');

The three mistakes I see constantly

1. Not escaping periods. An unescaped . matches any character (except a newline, by default). \. matches an actual period. This one bites everyone at some point.

2. Greedy vs lazy. On the input "hello" and "world", the pattern ".*" matches the entire string, from the first quote to the last, because .* grabs as much as it can. Use ".*?" to match as little as possible: "hello" and "world" come back as two separate matches.

3. Forgetting the g flag. Without it, match() returns only the first match. Add g (global) to get them all.

// Only first match
'abc 123 def 456'.match(/\d+/)    // ["123"]

// All matches
'abc 123 def 456'.match(/\d+/g)   // ["123", "456"]
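The first two mistakes can be sketched the same way. The file name and input string are made up for illustration.

```javascript
// Mistake 1: an unescaped period matches any character.
const looseDot = /file.txt/.test('file_txt');   // "." happily matches "_"
const strictDot = /file\.txt/.test('file_txt'); // "\." needs a real period

// Mistake 2: greedy vs lazy on the same input.
const quotedInput = '"hello" and "world"';
const greedy = quotedInput.match(/".*"/g);  // as much as possible
const lazy = quotedInput.match(/".*?"/g);   // as little as possible

console.log(looseDot);  // true
console.log(strictDot); // false
console.log(greedy);    // ['"hello" and "world"']
console.log(lazy);      // ['"hello"', '"world"']
```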

Try it live

I built a regex tester with real-time match highlighting and a built-in cheat sheet. Paste a pattern, paste some text, see matches instantly. Runs in-browser, nothing stored.

Full version of this cheat sheet with lookahead/lookbehind and more examples: textkit.dev/blog/regex-cheat-sheet-beginners

DEV Community