I've been writing regex for 8 years. I still google basic patterns.
The problem isn't regex itself — it's that every cheatsheet shows you the theory without the practice. Here's the one I wish existed when I started.
The Essentials (90% of what you'll ever need)
| Pattern | Means | Example | Matches |
|---|---|---|---|
| `.` | Any character except newline | `h.t` | hat, hit, hot |
| `\d` | Any digit | `\d{3}` | 123, 456 |
| `\w` | Letter, digit, _ | `\w+` | hello, test_1 |
| `\s` | Whitespace | `\s+` | spaces, tabs |
| `^` | Start of string | `^Hello` | "Hello world" |
| `$` | End of string | `end$` | "the end" |
| `*` | 0 or more | `ab*c` | ac, abc, abbc |
| `+` | 1 or more | `ab+c` | abc, abbc |
| `?` | 0 or 1 | `colou?r` | color, colour |
| `{n}` | Exactly n | `\d{4}` | 2026 |
| `{n,m}` | Between n and m | `\d{2,4}` | 12, 123, 1234 |
| `[abc]` | a, b, or c | `[aeiou]` | vowels |
| `[^abc]` | Not a, b, or c | `[^0-9]` | non-digits |
| `(...)` | Capture group | `(\d+)px` | captures "12" from "12px" |
| `\|` | Or | `cat\|...` | |
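A quick way to get comfortable with the table is to run a few of its rows. The sketch below checks a handful of them with `re.findall`; the sample strings are made up for illustration.

```python
import re

# Each entry: (pattern, sample text, what re.findall should return)
checks = [
    (r'h.t', 'hat hit hot', ['hat', 'hit', 'hot']),
    (r'\d{3}', 'call 123 or 456', ['123', '456']),
    (r'colou?r', 'color and colour', ['color', 'colour']),
    (r'ab*c', 'ac abc abbc', ['ac', 'abc', 'abbc']),
    (r'[^0-9]+', 'abc123def', ['abc', 'def']),
]

for pattern, text, expected in checks:
    found = re.findall(pattern, text)
    assert found == expected, (pattern, found)

# A capture group hands back only the part inside the parentheses
captured = re.search(r'(\d+)px', 'width: 12px').group(1)
print(captured)  # 12
```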
Real-World Patterns (Copy-Paste Ready)
Email Validation
import re
# Simple but effective
pattern = r'^[\w.+-]+@[\w-]+\.[\w.]+$'
re.match(pattern, 'user@example.com') # ✓
re.match(pattern, 'user@.com') # ✗
re.match(pattern, 'user+tag@gmail.com') # ✓
URL Extraction
pattern = r'https?://[\w.-]+(?:/[\w./-]*)?'
text = 'Visit https://example.com/path and http://test.org'
urls = re.findall(pattern, text)
# ['https://example.com/path', 'http://test.org']
Phone Numbers (US)
pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
# Matches: (555) 123-4567, 555-123-4567, 5551234567, 555.123.4567
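The pattern is permissive on purpose, so it helps to see what it accepts. A small sketch, assuming the whole string should be a phone number (hence `fullmatch`); the `samples` list and the unbalanced input are illustrative.

```python
import re

phone = re.compile(r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}')

samples = ['(555) 123-4567', '555-123-4567', '5551234567', '555.123.4567']
results = {s: bool(phone.fullmatch(s)) for s in samples}
print(results)  # every sample maps to True

# Each parenthesis is optional on its own, so an unbalanced one still passes
unbalanced = bool(phone.fullmatch('(555 123-4567'))
print(unbalanced)  # True
```

If unbalanced parentheses matter for your data, you would need two alternatives (with and without the pair) rather than two independent `?` quantifiers.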
IP Address
pattern = r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'
text = 'Server 192.168.1.1 responded, backup at 10.0.0.1'
ips = re.findall(pattern, text)
# ['192.168.1.1', '10.0.0.1']
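Note that this pattern only checks the dotted shape, so something like `999.1.1.1` matches too. One possible way to filter those out is sketched below with the standard-library `ipaddress` module; `extract_ipv4` is a hypothetical helper name, not part of any library.

```python
import ipaddress
import re

IP_CANDIDATE = re.compile(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b')

def extract_ipv4(text):
    """Return regex candidates from text that parse as valid IPv4 addresses."""
    valid = []
    for candidate in IP_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # right shape, but an octet is out of range
        valid.append(candidate)
    return valid

print(extract_ipv4('Server 192.168.1.1 responded, bogus 999.1.1.1 ignored'))
# ['192.168.1.1']
```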
HTML Tag Content
pattern = r'<(\w+)[^>]*>(.*?)</\1>'
html = '<div class="main">Hello World</div>'
match = re.search(pattern, html)
# group(1) = 'div', group(2) = 'Hello World'
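This works for flat markup, but it stops at the first closing tag when the same tag is nested. The sketch below shows that failure and, as one possible alternative, a tiny collector built on the standard-library `html.parser`; `TextCollector` is a hypothetical class for illustration and ignores void tags like `<br>`.

```python
import re
from html.parser import HTMLParser

nested = '<div class="main">Hello <div>inner</div> World</div>'

# The regex stops at the first </div>, cutting the outer element short
regex_text = re.search(r'<(\w+)[^>]*>(.*?)</\1>', nested).group(2)
print(regex_text)  # Hello <div>inner

class TextCollector(HTMLParser):
    """Record (enclosing tag, text) pairs while tracking open tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.found = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and data.strip():
            self.found.append((self.stack[-1], data.strip()))

collector = TextCollector()
collector.feed(nested)
collector.close()
print(collector.found)
# [('div', 'Hello'), ('div', 'inner'), ('div', 'World')]
```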
Date (Multiple Formats)
# YYYY-MM-DD or DD/MM/YYYY
pattern = r'\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}'
text = 'Created 2026-03-25, updated 25/03/2026'
dates = re.findall(pattern, text)
# ['2026-03-25', '25/03/2026']
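The pattern checks digit counts, not calendar validity, so `2026-13-45` would be found as well. A minimal sketch of one way to weed those out, assuming the two formats above map to `%Y-%m-%d` and `%d/%m/%Y`; `parse_dates` is a hypothetical helper.

```python
import re
from datetime import date, datetime

DATE_PATTERN = re.compile(r'\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}')
FORMATS = ('%Y-%m-%d', '%d/%m/%Y')

def parse_dates(text):
    """Return date objects for candidates that are real calendar dates."""
    parsed = []
    for candidate in DATE_PATTERN.findall(text):
        for fmt in FORMATS:
            try:
                parsed.append(datetime.strptime(candidate, fmt).date())
                break  # first format that parses wins
            except ValueError:
                continue  # wrong format or impossible date
    return parsed

dates = parse_dates('Created 2026-03-25, updated 25/03/2026, bogus 2026-13-45')
print(dates)  # [datetime.date(2026, 3, 25), datetime.date(2026, 3, 25)]
```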
Password Strength
# At least 8 chars, 1 upper, 1 lower, 1 digit, 1 special
pattern = r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$'
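Each `(?=...)` is a lookahead that checks one requirement without consuming characters, and the final character class limits which characters are allowed at all. A short usage sketch; `is_strong` and the sample passwords are made up for illustration.

```python
import re

PASSWORD = re.compile(
    r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$'
)

def is_strong(password):
    """True when the password satisfies every rule in the pattern."""
    return PASSWORD.match(password) is not None

print(is_strong('Passw0rd!'))   # True
print(is_strong('password1!'))  # False: no uppercase letter
print(is_strong('Passw0rd#'))   # False: '#' is not in the allowed set
print(is_strong('Pw0!'))        # False: shorter than 8 characters
```

Keep in mind that the last class rejects any character outside the listed set, so a password containing `#` or a space fails even if it is otherwise strong.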
CSV Line Parser
# Handle quoted fields with commas inside
pattern = r'"([^"]*)"|([^,]+)'
line = 'John,"Smith, Jr.",42,"New York, NY"'
fields = [g1 or g2 for g1, g2 in re.findall(pattern, line)]
# ['John', 'Smith, Jr.', '42', 'New York, NY']
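The regex handles this line, but it silently skips empty fields and does not understand doubled quotes inside a quoted field. For real files, the standard-library `csv` module is probably the safer choice; a minimal sketch, with the second input invented to show those two cases:

```python
import csv

line = 'John,"Smith, Jr.",42,"New York, NY"'

# csv.reader accepts any iterable of lines, so a one-item list is enough
fields = next(csv.reader([line]))
print(fields)  # ['John', 'Smith, Jr.', '42', 'New York, NY']

# Empty fields are kept, and "" inside quotes becomes a literal quote
sparse = next(csv.reader(['a,,"say ""hi"""']))
print(sparse)  # ['a', '', 'say "hi"']
```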
Python vs JavaScript Differences
| Feature | Python | JavaScript |
|---|---|---|
| Named groups | `(?P<name>...)` | `(?<name>...)` |
| Non-greedy | `.*?` | `.*?` |
| Flags | `re.IGNORECASE` | `/pattern/i` |
| Multiline | `re.MULTILINE` | `/pattern/m` |
| Find all | `re.findall()` | `str.matchAll()` |
| Replace | `re.sub()` | `str.replace()` |
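To make the Python column concrete, here is a small sketch that uses named groups, combined flags, and `re.sub`. The sample strings are made up. One difference worth knowing: `re.sub` replaces every match by default, while JavaScript's `str.replace` with a regex only replaces the first match unless the `g` flag is set.

```python
import re

DATE = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

# Named groups come back as a dict
parts = DATE.search('Created 2026-03-25').groupdict()
print(parts)  # {'year': '2026', 'month': '03', 'day': '25'}

# Flags are combined with |
words = re.findall(r'^hello', 'Hello\nhello\nHELLO', re.IGNORECASE | re.MULTILINE)
print(words)  # ['Hello', 'hello', 'HELLO']

# \g<name> refers to a named group in the replacement string
swapped = DATE.sub(r'\g<day>/\g<month>/\g<year>', 'From 2026-03-25 to 2026-04-01')
print(swapped)  # From 25/03/2026 to 01/04/2026
```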
Pro Tips
- Always use raw strings in Python: `r'\d+'` not `'\\d+'`
- Be lazy, not greedy: `.*?` instead of `.*` when possible
- Use named groups for readability: `(?P<year>\d{4})`
- Test at regex101.com, which explains your pattern step by step
- Don't parse HTML with regex (use BeautifulSoup/lxml instead)
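The first two tips are easy to see in a few lines. A minimal sketch with an invented input string:

```python
import re

html = '<b>one</b> and <b>two</b>'

# Greedy: .* runs to the last </b> it can reach
greedy = re.findall(r'<b>(.*)</b>', html)
print(greedy)  # ['one</b> and <b>two']

# Lazy: .*? stops at the first </b>
lazy = re.findall(r'<b>(.*?)</b>', html)
print(lazy)  # ['one', 'two']

# A raw string and its double-escaped twin are the same pattern
same = r'\d+' == '\\d+'
print(same)  # True
```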
Resources
- regex101.com — Interactive regex tester
- Web Scraping Cheatsheet — Scraping patterns
- Python Project Template — Start coding
What's the most complex regex you've ever written? Or the one that took you the longest to debug? 👇
Tutorials at dev.to/0012303