Regex Cheatsheet & Pattern Library
Regular expressions don't have to be write-only code. This pack provides a complete regex syntax reference plus a library of 100+ tested, commented patterns organized by use case — input validation, log parsing, data extraction, URL manipulation, and text transformation. Each pattern includes the regex, an explanation of what each part does, test cases that pass and fail, and language-specific usage notes for Python, JavaScript, and Go.
What's Included
- Syntax Reference — Anchors, quantifiers, character classes, groups, lookaheads, lookbehinds
- Input Validation — Email, phone, URL, IP address, credit card, password strength, UUID
- Log Parsing — Apache/Nginx logs, syslog, JSON log lines, stack traces, timestamps
- Data Extraction — HTML tags, CSV fields, key-value pairs, version numbers, file paths
- Text Transformation — Case conversion, whitespace normalization, comment stripping, slug generation
- Language-Specific Guides — Python `re`, JavaScript `RegExp`, Go `regexp`, PCRE differences
- Performance Tips — Catastrophic backtracking, atomic groups, possessive quantifiers
- Interactive Test Cases — Every pattern includes strings that should match and strings that should not
Preview / Sample Content
Regex Syntax — Quick Reference
ANCHORS
^ Start of string (or line in multiline mode)
$ End of string (or line in multiline mode)
\b Word boundary
\B Not a word boundary
QUANTIFIERS
* 0 or more (greedy)
+ 1 or more (greedy)
? 0 or 1 (greedy)
{3} Exactly 3
{3,} 3 or more
{3,7} Between 3 and 7
*? 0 or more (lazy / non-greedy)
+? 1 or more (lazy / non-greedy)
CHARACTER CLASSES
. Any character (except newline)
\d Digit [0-9]
\D Not a digit
\w Word character [a-zA-Z0-9_]
\W Not a word character
\s Whitespace [ \t\n\r\f]
\S Not whitespace
[abc] Character set (a, b, or c)
[^abc] Negated set (not a, b, or c)
[a-z] Range
GROUPS & REFERENCES
(abc) Capturing group
(?:abc) Non-capturing group
(?P<name>) Named group (Python)
(?<name>) Named group (JS/PCRE)
\1 Back-reference to group 1
(?=abc) Positive lookahead
(?!abc) Negative lookahead
(?<=abc) Positive lookbehind
(?<!abc) Negative lookbehind
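A few of the entries above are easier to see in action. The following is a minimal Python sketch, with made-up sample strings, that contrasts greedy and lazy quantifiers and shows a named back-reference and a lookahead:

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: .* takes as much as it can, so one match spans the whole string.
greedy = re.findall(r'<.*>', html)

# Lazy: .*? stops at the first '>' that lets the match succeed.
lazy = re.findall(r'<.*?>', html)

# Named group plus back-reference: (?P=word) must repeat what (?P<word>...) captured.
doubled = re.search(r'\b(?P<word>\w+)\s+(?P=word)\b', 'this is is a test')

# Positive lookahead: digits that are followed by "px", without consuming "px".
px_values = re.findall(r'\d+(?=px)', 'width: 100px; height: 20em; margin: 5px')

print(greedy)
print(lazy)
print(doubled.group('word'))
print(px_values)
```

The lazy form is usually what you want when matching delimited chunks, though a negated class such as `<[^>]*>` tends to be both clearer and faster.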
Validation Patterns — The Ones You'll Use Most
import re
# Email (simplified from RFC 5322; covers common address shapes, not the full grammar)
EMAIL = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
# ✅ user@example.com, first.last+tag@company.co.uk
# ❌ @example.com, user@.com, user@com
# URL (http/https with optional path, query, fragment)
URL = r'^https?://[a-zA-Z0-9][-a-zA-Z0-9]*(\.[a-zA-Z0-9][-a-zA-Z0-9]*)+(:\d+)?(/[-a-zA-Z0-9._~:/?#\[\]@!$&\'()*+,;=%]*)?$'
# ✅ https://example.com, http://sub.example.com:8080/path?q=1
# ❌ ftp://example.com, example.com, http://
# IPv4 Address
IPV4 = r'^(?:(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$'
# ✅ 192.168.1.1, 10.0.0.255, 0.0.0.0
# ❌ 256.1.1.1, 192.168.1, 1.2.3.4.5
# UUID v4
UUID_V4 = r'^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$'
# ✅ 550e8400-e29b-41d4-a716-446655440000
# ❌ 550e8400-e29b-51d4-a716-446655440000 (version 5, not 4)
# Strong password (8+ chars with upper, lower, digit, and one of !@#$%^&*; other characters are rejected)
STRONG_PASS = r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*])[A-Za-z\d!@#$%^&*]{8,}$'
# ✅ MyP@ss1rd, Str0ng!Pass
# ❌ password, Pass1234, Sh0rt!1 (only 7 characters)
# Semantic version (major.minor.patch with optional pre-release)
SEMVER = r'^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-([\da-zA-Z-]+(?:\.[\da-zA-Z-]+)*))?$'
# ✅ 1.0.0, 2.1.3, 1.0.0-alpha.1
# ❌ 1.0, v1.0.0, 01.0.0
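One way to apply these patterns, sketched here with two of them copied from above. The helper name `is_valid` is hypothetical, not part of the pack. It uses `fullmatch` rather than `match` because in Python `$` also matches just before a trailing newline, so `match` alone would accept `'1.2.3.4\n'`.

```python
import re

IPV4 = re.compile(
    r'^(?:(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}'
    r'(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$'
)
SEMVER = re.compile(
    r'^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)'
    r'(?:-([\da-zA-Z-]+(?:\.[\da-zA-Z-]+)*))?$'
)

def is_valid(pattern, text):
    """Hypothetical helper: True only if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None

for candidate in ['192.168.1.1', '256.1.1.1', '1.2.3.4\n']:
    print(repr(candidate), is_valid(IPV4, candidate))

# Capturing groups give the version parts: major, minor, patch, pre-release.
parts = SEMVER.fullmatch('1.0.0-alpha.1').groups()
print(parts)
```

Compiling once with `re.compile` and reusing the object keeps the call sites short when the same pattern validates many inputs.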
Log Parsing Patterns
# Apache/Nginx combined log format
APACHE_LOG = r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
# Extracts: ip, timestamp, method, path, status code, response size
# ISO 8601 timestamp
ISO_TIMESTAMP = r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})'
# ✅ 2024-01-15T14:30:00Z, 2024-01-15T14:30:00.123+05:30
# Python/Java stack trace lines: header, "Caused by:", or frame (use re.MULTILINE on multi-line text)
STACK_TRACE = r'^(?:Traceback|Exception|Error|Caused by:|\s+at\s+|\s*File\s+")'
# Key-value pairs from log messages
KV_PAIRS = r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)'
# Matches: user="John Doe" action=login status=200 duration=1.23s
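As a rough illustration of how the named groups come back out, the sketch below runs the access-log and key-value patterns from above against sample input. The log line is illustrative rather than taken from a real server, and `parse_kv` is a hypothetical helper name.

```python
import re

APACHE_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)
KV_PAIRS = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

# Illustrative sample line in combined log format.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'

match = APACHE_LOG.match(line)
fields = match.groupdict() if match else {}

def parse_kv(message):
    """Hypothetical helper: key=value pairs as a dict, without surrounding quotes."""
    pairs = {}
    for key, value in KV_PAIRS.findall(message):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        pairs[key] = value
    return pairs

kv = parse_kv('user="John Doe" action=login status=200 duration=1.23s')

print(fields)
print(kv)
```

All captured values are strings; converting `status` or `size` to integers (and handling a `-` size) is left to the caller.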
Quick Reference Table
| Pattern | Regex | Matches |
|---|---|---|
| Digits only | `^\d+$` | `123`, `0`, `99999` |
| Hex color | `^#[0-9a-fA-F]{6}$` | `#FF5733`, `#000000` |
| Date (YYYY-MM-DD) | `^\d{4}-\d{2}-\d{2}$` | `2024-01-15` |
| Time (HH:MM:SS) | `^\d{2}:\d{2}:\d{2}$` | `14:30:00` |
| Slug | `^[a-z0-9]+(-[a-z0-9]+)*$` | `my-page-slug` |
| File extension | `\.([a-zA-Z0-9]+)$` | `.py`, `.tar.gz` (captures `gz`) |
| HTML tag | `<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>` | `<div class="x">` |
| Whitespace trim | `^\s+\|\s+$` | leading or trailing whitespace |
| Duplicate words | `\b(\w+)\s+\1\b` | `the the`, `is is` |
| Non-ASCII | `[^\x00-\x7F]` | `é`, `ñ`, `中` |
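Several of the table's patterns are most useful inside `re.sub` or `re.search` rather than as validators. A small sketch follows; the function names are hypothetical and chosen only for this example.

```python
import re

def trim(text):
    """Remove leading and trailing whitespace with the 'Whitespace trim' pattern."""
    return re.sub(r'^\s+|\s+$', '', text)

def dedupe_words(text):
    """Collapse an immediately repeated word into one occurrence (single pass)."""
    return re.sub(r'\b(\w+)\s+\1\b', r'\1', text)

def file_extension(name):
    """Return the last extension without the dot, or None if there is none."""
    found = re.search(r'\.([a-zA-Z0-9]+)$', name)
    return found.group(1) if found else None

print(repr(trim('  hi there \n')))
print(dedupe_words('the the cat is is here'))
print(file_extension('archive.tar.gz'))
```

Because `re.sub` does not rescan replaced text, a triple such as `the the the` is reduced to `the the` in one pass; run `dedupe_words` again if that matters.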
Comparison: Regex Engines
| Feature | Python `re` | JavaScript | Go `regexp` | PCRE/PCRE2 |
|---|---|---|---|---|
| Lookahead | Yes | Yes | No | Yes |
| Lookbehind | Fixed-width | Yes (ES2018+) | No | Fixed-width per alternative (bounded variable-width in recent PCRE2) |
| Named Groups | `(?P<n>)` | `(?<n>)` | `(?P<n>)` | Both syntaxes |
| Unicode | `\p{L}` (third-party `regex` module) | `\p{L}` with `/u` | `\p{L}` native | `\p{L}` |
| Atomic Groups | Yes (3.11+) | No | No | `(?>...)` |
| Verbose Mode | `re.VERBOSE` | No | No | `(?x)` |
| Recursion | No | No | No | `(?R)` |
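Two of the Python rows can be checked directly. The sketch below, using hypothetical names `DATE` and `compiles`, shows `re.VERBOSE` carrying inline comments and confirms that the standard `re` module rejects a variable-width lookbehind at compile time.

```python
import re

# re.VERBOSE ignores unescaped whitespace and treats '#' as a comment start.
DATE = re.compile(r'''
    (?P<year>\d{4}) -    # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
''', re.VERBOSE)

def compiles(pattern):
    """Hypothetical helper: True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(DATE.fullmatch('2024-01-15').groupdict())
print(compiles(r'(?<=USD )\d+'))     # fixed-width lookbehind
print(compiles(r'(?<=USD\s+)\d+'))   # variable-width lookbehind
```

A probe like `compiles` is also a cheap way to test whether a pattern copied from a PCRE or JavaScript source will load in Python before relying on it.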
Usage Tips
- Start with the validation patterns — email, URL, and IP patterns are battle-tested and handle edge cases.
- Use named groups (`(?P<name>...)` in Python) — they make regex self-documenting and results easier to access.
- Always use raw strings in Python (`r'pattern'`) to avoid backslash escaping issues.
- Test with the included test cases — every pattern lists strings that should and should not match.
- Read the performance section before using regex in loops — catastrophic backtracking can freeze your program.
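To make the backtracking warning concrete, here is a small timing sketch. The two patterns are illustrative and not taken from the pack: both accept the same strings, but the nested quantifier in `RISKY` gives the engine many ways to split a run of `a` characters, all of which it tries when the match fails. The input sizes are kept small so the script finishes quickly; actual timings depend on the machine.

```python
import re
import time

RISKY = re.compile(r'^(a+)+$')  # nested quantifiers: many ways to split the same run
SAFE = re.compile(r'^a+$')      # accepts the same strings with a single way to match

def time_match(pattern, text):
    """Return (match result, elapsed seconds) for one match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start

for n in (10, 14, 18):
    text = 'a' * n + '!'  # the trailing '!' forces the match to fail
    _, risky_s = time_match(RISKY, text)
    _, safe_s = time_match(SAFE, text)
    print(f'n={n:2d}  risky={risky_s:.4f}s  safe={safe_s:.6f}s')
```

The work for `RISKY` roughly doubles with each extra character of failing input, which is why a pattern that looks fine in tests can stall on one long line in production.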
This is 1 of 11 resources in the Cheatsheet Reference Pro toolkit. Get the complete [Regex Cheatsheet & Patterns] with all files, templates, and documentation for $9.
Or grab the entire Cheatsheet Reference Pro bundle (11 products) for $79 — save 30%.