Every time I need a regex, I end up Googling the same patterns. So I made myself a reference — here are 15 of the most useful from my full 50-pattern cookbook.
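Most work as-is in Python, JavaScript, Ruby, and any PCRE engine, with a few caveats. Go's regexp package (RE2) has no lookaheads, so the password pattern won't compile there. In Ruby, ^ and $ match at every line boundary, so use \A and \z when validating. Replacements are written with $1 and $2 (JavaScript style); Python and Ruby spell those \1 and \2.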
Validation
Email (simplified RFC 5322)
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
URL (http/https)
^https?://[^\s/$.?#].[^\s]*$
Strong password (8+ chars, mixed case, digit, special)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
UUID v4 (lowercase; add the case-insensitive flag to accept uppercase hex)
^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$
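A minimal sketch of how the four validation patterns could be wired up in Python. The `PATTERNS` dict and `is_valid` helper are names made up for this example, not part of the cookbook. It uses `re.fullmatch`, which makes the `^` and `$` anchors redundant and avoids Python's `$` also matching just before a trailing newline.

```python
import re

# Illustrative names. The patterns are the ones listed above, minus ^ and $,
# because re.fullmatch already requires the whole string to match.
PATTERNS = {
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "url": re.compile(r"https?://[^\s/$.?#].[^\s]*"),
    "password": re.compile(
        r"(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}"
    ),
    # IGNORECASE so uppercase hex digits are accepted as well
    "uuid4": re.compile(
        r"[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
        re.IGNORECASE,
    ),
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the whole of `value` matches the named pattern."""
    return PATTERNS[kind].fullmatch(value) is not None
```

For example, `is_valid("email", "user@example.com")` is True, while the same address with a trailing newline is rejected.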
Extraction
All URLs from text
https?://[^\s<>"{}\\^`\[\]]+
Email addresses from text
\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b
Currency amounts
\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?
Version numbers (semver core only: MAJOR.MINOR.PATCH, no pre-release or build suffix)
\b\d+\.\d+\.\d+\b
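Here is a short sketch of three of the extraction patterns run with `findall` over one string. The sample text and variable names are made up for illustration. One thing it shows: the URL pattern happily swallows a sentence-ending period, so this sketch trims trailing punctuation afterwards (that trimming step is my assumption about what you'd usually want, not part of the pattern).

```python
import re

URL_RE = re.compile(r'https?://[^\s<>"{}\\^`\[\]]+')
MONEY_RE = re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?")
SEMVER_RE = re.compile(r"\b\d+\.\d+\.\d+\b")

# Made-up sample text
sample = (
    "Release 2.14.0 is out: see https://example.com/notes. "
    "The Pro plan is $1,299.00, up from $999."
)

# Trim punctuation that belongs to the sentence rather than the URL
urls = [u.rstrip(".,;:!?)") for u in URL_RE.findall(sample)]
amounts = MONEY_RE.findall(sample)
versions = SEMVER_RE.findall(sample)

print(urls)      # ['https://example.com/notes']
print(amounts)   # ['$1,299.00', '$999']
print(versions)  # ['2.14.0']
```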
Cleaning
Strip HTML tags
Pattern: <[^>]+> — replace with empty string
Collapse whitespace runs (spaces, tabs, newlines)
Pattern: \s+ — replace with single space
Mask credit card numbers (16 digits, no separators)
Pattern: \b(\d{4})\d{8}(\d{4})\b — replace with $1********$2
CamelCase to snake_case
Pattern: ([a-z0-9])([A-Z]) — replace with $1_$2, then lowercase
Log Parsing
Apache/Nginx log line
^(\S+) \S+ \S+ \[([\w:/]+\s[+\-]\d{4})\] "(\S+) (.*?) (\S+)" (\d{3}) (\d+|-)
Captures: IP, timestamp, method, path, protocol, status code, bytes
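A sketch of the log pattern in Python, with the seven captures turned into named groups so the result reads as a dict. The group names and the sample line are made up for this example; the pattern itself is otherwise unchanged.

```python
import re

LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[\w:/]+\s[+\-]\d{4})\] '
    r'"(?P<method>\S+) (?P<path>.*?) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

# Made-up line in Common Log Format
line = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326'
)

match = LOG_RE.match(line)
fields = match.groupdict() if match else {}

print(fields["ip"], fields["method"], fields["path"], fields["status"])
# 203.0.113.7 GET /index.html 200
```

All values come back as strings, and `bytes` can be `-` when no body was sent, so convert with care.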
Python stack trace line
File "([^"]+)", line (\d+), in (\S+)
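A quick sketch of pulling frames out of a traceback with this pattern. The traceback text is made up. The function-name group here is `\S+` so that frames such as `<module>` or `<lambda>`, which contain angle brackets, are matched too.

```python
import re

FRAME_RE = re.compile(r'File "([^"]+)", line (\d+), in (\S+)')

# Made-up traceback text
traceback_text = '''Traceback (most recent call last):
  File "/app/main.py", line 12, in <module>
    run()
  File "/app/worker.py", line 48, in run
    process(job)
ValueError: bad job
'''

# Each frame becomes (path, line number as int, function name)
frames = [
    (path, int(lineno), func)
    for path, lineno, func in FRAME_RE.findall(traceback_text)
]

print(frames)
# [('/app/main.py', 12, '<module>'), ('/app/worker.py', 48, 'run')]
```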
HTTP status code from log
\s([1-5]\d{2})\s
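A small sketch that tallies status classes across log lines. It assumes a version of the pattern whose group captures the whole three-digit code, and it takes only the first match per line. The sample lines are made up. Any three-digit number surrounded by whitespace will match (a byte count, say), so for anything serious the full log-line pattern is the safer bet.

```python
import re
from collections import Counter

# The group captures the full status code, e.g. "404"
STATUS_RE = re.compile(r"\s([1-5]\d{2})\s")

# Made-up log lines
lines = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.8 - - [10/Oct/2023:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153',
    '203.0.113.9 - - [10/Oct/2023:13:55:38 +0000] "POST /api HTTP/1.1" 500 0',
    '203.0.113.7 - - [10/Oct/2023:13:55:39 +0000] "GET /about HTTP/1.1" 200 1024',
]

counts = Counter()
for line in lines:
    m = STATUS_RE.search(line)  # first match only
    if m:
        counts[m.group(1)[0] + "xx"] += 1

print(dict(counts))  # {'2xx': 2, '4xx': 1, '5xx': 1}
```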
Using them in Python
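import re
# Sample inputs, made up so the snippet runs on its own
text = 'Contact alice@example.com or bob@example.org'
html_text = '<p>Hello <b>world</b></p>'
messy_data = 'a,b;c|d\te'
# Compile once, reuse
EMAIL_RE = re.compile(r'\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b')
emails = EMAIL_RE.findall(text)
# Replace
cleaned = re.sub(r'<[^>]+>', '', html_text)
# Split on multiple delimiters (| needs no escape inside a character class)
tokens = re.split(r'[,;|\t]', messy_data)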
The full 50
These 15 are the ones I use most. The full cookbook has 50 patterns in 7 categories:
- Validation (10) — email, URL, phone, IP, UUID, date, card, password, hex, slug
- Extraction (10) — URLs, emails, hashtags, mentions, numbers, currency, HTML, markdown, code blocks, versions
- Cleaning (10) — strip HTML, collapse whitespace, mask PII, CamelCase convert
- Splitting (6) — CSV, multi-delimiter, camelCase, sentences, words
- Log parsing (4) — Apache, Python, stack traces, HTTP status
- File paths (5) — extensions, Windows, Unix, Docker, Git SHA
- Advanced (5) — balanced parens, strings, JSON keys, SQL
Full cookbook: payhip.com/b/q8xTn ($9)
Also available:
- Python Automation Toolkit — 10 standalone scripts ($12)
- Python Quick Reference Cheat Sheet ($5)
- AI Prompt Pack for Developers — 50 prompts for Claude/ChatGPT ($9)
What regex patterns do you reach for most?