Regular Expressions: The Guide I Always Wanted (2026)
Regex looks like gibberish until it clicks — then it's a superpower. Here's the guide that makes regex actually make sense.
The Mental Model
Think of regex as a pattern-matching machine:
Input string: "hello@example.com"
Pattern: \w+@\w+\.\w+
The engine scans left-to-right, trying to match your pattern
at each position. When the full pattern matches → SUCCESS!
Key insight: Regex isn't about "finding text" — it's about
defining the SHAPE of what you're looking for.
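To make the left-to-right scan concrete, here is a minimal sketch using `exec()`, which reports where the match started. The helper name `firstMatch` and the sample sentences are made up for illustration.

```javascript
// Hypothetical helper: returns the matched text and the index where it began.
function firstMatch(pattern, input) {
  const m = pattern.exec(input);
  return m === null ? null : { text: m[0], index: m.index };
}

const emailShape = /\w+@\w+\.\w+/;

// The engine tries index 0 ("Contact" is not followed by "@"), fails,
// moves on, and succeeds at the first index where the whole pattern fits.
console.log(firstMatch(emailShape, 'Contact: hello@example.com today'));
// { text: 'hello@example.com', index: 9 }

console.log(firstMatch(emailShape, 'no address here'));
// null
```

The match starts at index 9 because that is the first position from which every piece of the shape can be satisfied in order.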
Character Classes: What Are We Matching?
// Literal characters (match exactly)
/hello/ // Matches "hello" in "say hello there"
// Dot (wildcard) — matches ANY single character except newline
/h.t/ // "hat", "hot", "hit", but NOT "h" or "ht"
// Character classes — match ONE character from a set:
/[aeiou]/ // Any vowel
/[a-z]/ // Any lowercase letter
/[A-Z]/ // Any uppercase letter
/[0-9]/ // Any digit (same as /\d/)
/[a-zA-Z0-9]/ // Any alphanumeric (/\w/ is this plus underscore)
/[^0-9]/ // NOT a digit (negation with ^ inside [])
/[^aeiou]/ // Not a vowel
// Shorthand classes (use these!):
/\d/ // Digit: [0-9]
/\D/ // Non-digit: [^0-9]
/\w/ // Word char: [a-zA-Z0-9_]
/\W/ // Non-word char: [^\w]
/\s/ // Whitespace: [ \t\r\n\f\v]
/\S/ // Non-whitespace: [^\s]
// Examples:
/\d{3}-\d{4}/ // Phone-like: "555-1234"
/[A-Z][a-z]+/ // Capitalized word: "Hello", "World"
/#[0-9a-fA-F]{6}/ // Hex color code: "#ff5500"
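A quick way to check what a class really accepts is to run it against a few inputs. The sketch below assumes a hypothetical helper, `isHexColor`, built from the hex color class above; the `^` and `$` anchors it uses are covered in the anchors section.

```javascript
// \w is [a-zA-Z0-9_], so it accepts an underscore that [a-zA-Z0-9] rejects.
const wordClassOk = /^\w+$/.test('snake_case');
const alnumClassOk = /^[a-zA-Z0-9]+$/.test('snake_case');
console.log(wordClassOk);  // true
console.log(alnumClassOk); // false

// Hypothetical helper: true only for "#" followed by exactly six hex digits.
function isHexColor(value) {
  return /^#[0-9a-fA-F]{6}$/.test(value);
}

console.log(isHexColor('#ff5500'));  // true
console.log(isHexColor('#ff55zz'));  // false ("z" is outside a-f)
console.log(isHexColor('#ff55001')); // false (seven hex digits)
```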
Quantifiers: How Many Times?
// Quantifiers apply to the PRECEDING element:
/a*/ // Zero or more a's ("" is valid match!)
/a+/ // One or more a's (at least one)
/a?/ // Zero or one a (optional)
/a{3}/ // Exactly 3 a's ("aaa")
/a{2,4}/ // 2 to 4 a's ("aa", "aaa", "aaaa")
/a{2,}/ // 2 or more a's (no upper limit)
// ⚠️ Greedy vs Lazy (CRITICAL concept!)
// GREEDY (default): Match as MUCH as possible
const html = '<div>first</div><div>second</div>';
html.match(/<div>.*<\/div>/);
// Matches: '<div>first</div><div>second</div>' (greedy eats everything!)
// LAZY: Match as LITTLE as possible (add ? after quantifier)
html.match(/<div>.*?<\/div>/);
// Matches: '<div>first</div>' (lazy stops at first opportunity)
// Real-world example: Extract content between quotes
const str = 'He said "hello" and she said "world"';
str.match(/"(.*?)"/g); // ['"hello"', '"world"'] (with g, match() returns whole matches, quotes included)
// Without lazy: would match from first " to last "
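If you want the text inside the quotes rather than the quoted matches, one option is `matchAll()`, which keeps capture groups available for every hit. This is a small sketch; the variable names are illustrative.

```javascript
const sentence = 'He said "hello" and she said "world"';

// matchAll() yields one match object per hit, so group 1 stays available.
const quoted = [...sentence.matchAll(/"(.*?)"/g)].map((m) => m[1]);
console.log(quoted); // [ 'hello', 'world' ]

// Greedy version for comparison: a single match from the first " to the last ".
const greedy = [...sentence.matchAll(/"(.*)"/g)].map((m) => m[1]);
console.log(greedy); // [ 'hello" and she said "world' ]
```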
Anchors & Boundaries
// Anchors don't match characters — they match POSITIONS:
/^Hello/ // "Hello" at START of string only
/world$/ // "world" at END of string only
/^Hello world$/ // Exact full-string match
// Word boundaries (\b) — position between word char and non-word char:
/\bcat\b/ // Matches "cat" but NOT "catalog" or "scattered"
// Useful for whole-word search!
// Common patterns using anchors:
/^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$/i // Email validation
/^https?:\/\/[^\s]+$/i // URL detection
/^\d{4}-\d{2}-\d{2}$/ // Date YYYY-MM-DD
/^[+-]?\d+(\.\d+)?$/ // Integer or decimal number
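Here is a short sketch of what the anchors change in practice, using the date pattern from above. The variable names are made up for the example.

```javascript
const dateShape = /\d{4}-\d{2}-\d{2}/;   // no anchors
const exactDate = /^\d{4}-\d{2}-\d{2}$/; // anchored at both ends

console.log(dateShape.test('due 2024-06-11!')); // true  (found inside the string)
console.log(exactDate.test('due 2024-06-11!')); // false (extra text on both sides)
console.log(exactDate.test('2024-06-11'));      // true

// The pattern checks shape only; it knows nothing about calendars.
console.log(exactDate.test('2024-99-99'));      // true

// \b keeps "cat" from matching inside longer words.
const wholeWords = 'cat catalog scattered'.match(/\bcat\b/g);
console.log(wholeWords); // [ 'cat' ]
```

Note the `2024-99-99` case: a regex can confirm the format, but checking that the date actually exists needs real date logic.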
Groups & Capturing
// Capturing groups (...) — extract parts of the match:
const date = '2024-06-11';
date.match(/(\d{4})-(\d{2})-(\d{2})/);
// Full match: "2024-06-11"
// Group 1 (year): "2024"
// Group 2 (month): "06"
// Group 3 (day): "11"
// Named capture groups (much more readable!):
const result = 'user@domain.com'.match(/(?<name>\w+)@(?<domain>\w+)\.(?<tld>\w+)/);
result.groups.name; // "user"
result.groups.domain; // "domain"
result.groups.tld; // "com"
// Non-capturing group (?:...) — group without capturing:
/(?:https?:\/\/)?(www\.\w+\.\w+)/
// First group doesn't capture, second does
// Backreferences — refer to earlier captured group:
/\b(\w+) \1\b/ // Matches "word word" (repeated word!); the \b's stop "this is" from matching as "is is"
/<(\w+)>.*?<\/\1>/ // Matches <b>...</b> but NOT <b></i>
// Lookahead assertions (match based on what FOLLOWS):
/\d+(?= dollars)/ // Match numbers only if followed by " dollars"
// "I have 100 dollars" → matches "100"
// "I have 100 euros" → no match
// Negative lookahead:
/\d+(?!\d| dollars)/ // Match whole numbers NOT followed by " dollars"
// "I have 100 euros" → matches "100"
// "I have 100 dollars" → no match
// (without the \d alternative, backtracking would match "10" here)
// Lookbehind (what PRECEDES):
/(?<=\$)\d+/ // Match digits preceded by $
// "Price: $99" → matches "99"
/(?<![$\d])\d+/ // Match whole numbers NOT preceded by $
// "Price: $99, qty 5" → matches "5" only
// (without \d in the lookbehind, the second "9" of "$99" would match)
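Named groups also work in replacement strings via `$<name>`, which pairs nicely with the date example. The sketch below uses a hypothetical helper, `toDayFirst`, and a made-up price string to show a lookbehind with the `g` flag.

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Hypothetical helper: rewrites the first YYYY-MM-DD as DD/MM/YYYY.
// $<name> in the replacement string refers to a named group.
function toDayFirst(text) {
  return text.replace(isoDate, '$<day>/$<month>/$<year>');
}

console.log(toDayFirst('Released on 2024-06-11.')); // "Released on 11/06/2024."

// Lookarounds assert context without consuming it, so the "$" is
// checked but never becomes part of the match.
const prices = 'Price: $99, shipping: $5'.match(/(?<=\$)\d+/g);
console.log(prices); // [ '99', '5' ]
```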
Practical Examples You'll Actually Use
// === Validation ===
function validateEmail(email) {
return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
// Simple but practical. Don't try RFC-compliant email regex!
}
function validatePassword(password) {
return /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{12,}$/.test(password);
}
function validateURL(url) {
try { new URL(url); return true; }
catch { return false; } // URL constructor > regex for URLs!
}
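The single password pattern only answers yes or no. If you want to tell the user which rule failed, one possible approach is to split the lookaheads into separate checks. The rule list, messages, and the helper name `passwordProblems` below are illustrative assumptions, not part of any library.

```javascript
// Each rule mirrors one piece of the combined pattern above.
const PASSWORD_RULES = [
  { message: 'needs a lowercase letter', pattern: /[a-z]/ },
  { message: 'needs an uppercase letter', pattern: /[A-Z]/ },
  { message: 'needs a digit', pattern: /\d/ },
  { message: 'needs one of !@#$%^&*', pattern: /[!@#$%^&*]/ },
  { message: 'needs at least 12 characters', pattern: /^.{12,}$/ },
];

// Hypothetical helper: returns the message for every rule the password fails.
function passwordProblems(password) {
  return PASSWORD_RULES
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.message);
}

console.log(passwordProblems('Correct-Horse-9!')); // []
console.log(passwordProblems('short1A'));
// [ 'needs one of !@#$%^&*', 'needs at least 12 characters' ]
```

An empty array means the password passes every rule, so this accepts the same inputs as the one-line version while being easier to extend.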
// === Extraction & Transformation ===
// Extract all hashtags:
'Check out #JavaScript and #WebDev'.match(/#\w+/g);
// ["#JavaScript", "#WebDev"]
// Format phone number:
'5551234567'.replace(/(\d{3})(\d{3})(\d{4})/, '($1) $2-$3');
// "(555) 123-4567"
// CamelCase to snake_case:
'myVariableName'.replace(/[A-Z]/g, '_$&').toLowerCase();
// "my_variable_name"
// Truncate words to max length:
'this is a very long sentence that needs shortening'.replace(/\b(\w{8})\w+/g, '$1');
// "this is a very long sentence that needs shorteni" (words over 8 chars are cut to 8)
// Remove duplicate lines:
text.split('\n').filter((line, i, arr) => arr.indexOf(line) === i).join('\n');
// Find unquoted words (complex!):
// Quoted strings are matched first and discarded; only unquoted words land in group 1:
/'[^']*'|"[^"]*"|(\b\w{3,}\b)/g // Keep only matches where group 1 is set
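The trick here is "match what you want to skip, capture what you want to keep". A minimal sketch of how that pattern might be applied follows; the helper name `unquotedWords` and the sample input are made up for illustration.

```javascript
// Alternatives are tried left to right: quoted strings are consumed
// first (and not captured), so group 1 is set only for unquoted words.
const wordOrQuoted = /'[^']*'|"[^"]*"|(\b\w{3,}\b)/g;

// Hypothetical helper: returns words of 3+ characters that sit outside quotes.
function unquotedWords(text) {
  return [...text.matchAll(wordOrQuoted)]
    .filter((m) => m[1] !== undefined)
    .map((m) => m[1]);
}

console.log(unquotedWords(`print "hello world" then 'skip this' done`));
// [ 'print', 'then', 'done' ]
```

This assumes quotes are balanced and never escaped; an apostrophe inside a word would throw the pairing off.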
// === Search & Replace in Code ===
// Add console.log before each function line (for debugging):
code.replace(/^(function\s+\w+)/gm, 'console.log("$1"); $1');
// Convert require() to import:
code.replace(/const\s+(\w+)\s*=\s*require\(['"]([^'"]+)['"]\);?/g, 'import $1 from "$2";');
// Remove console.log statements (before deploy):
code.replace(/^\s*console\.(log|debug|info)\(.*\);\s*$/gm, '');
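For the `require()` conversion, a worked run helps show which details matter. This is a sketch only: `requireToImport` is a hypothetical helper, it does a purely textual rewrite, and it does not handle destructured requires such as `const { a } = require('x')`.

```javascript
// [^'"]+ stops at the closing quote, so two require() calls on one line
// are not merged into a single match, and the optional ;? consumes the
// original semicolon so the output doesn't end in ";;".
function requireToImport(code) {
  return code.replace(
    /const\s+(\w+)\s*=\s*require\(['"]([^'"]+)['"]\);?/g,
    'import $1 from "$2";'
  );
}

const before = [
  "const fs = require('fs');",
  'const path = require("path")',
].join('\n');

console.log(requireToImport(before));
// import fs from "fs";
// import path from "path";
```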
Testing & Debugging Regex
// Test in browser console or Node.js:
const pattern = /your-regex-here/;
pattern.test('test string'); // true/false
'test string'.match(pattern); // Array of matches or null
'test string'.replace(pattern, 'replacement'); // First match only; replaceAll() needs the g flag and throws a TypeError without it
// Debugging tips:
// 1. Start simple, build up piece by piece
// 2. Use regex101.com or regexr.com for visual debugging
// 3. Use .source to see the actual pattern string:
console.log(pattern.source);
// 4. Break complex patterns into pieces:
const WORD = /[a-zA-Z]+/;
const SPACE = /\s+/;
const sentencePattern = new RegExp(`^${WORD.source}(${SPACE.source}${WORD.source})*$`);
// 5. Common gotchas:
// - In JS strings, backslashes must be escaped: new RegExp('\\d+') not new RegExp('\d+')
// - . doesn't match newline by default; use the s (dotAll) flag or [\s\S] if needed
// - test() returns true on a partial match; use ^ and $ anchors for full-string validation
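The gotchas above are easy to verify in a console. A short sketch, with illustrative variable names:

```javascript
// Gotcha: in a string literal, "\d" collapses to "d", so the backslash
// has to be doubled to survive into the pattern.
const unescaped = new RegExp('\d+').source;
const escaped = new RegExp('\\d+').source;
console.log(unescaped); // d+
console.log(escaped);   // \d+

// Gotcha: "." skips newlines unless the s (dotAll) flag is set.
const twoLines = 'first\nsecond';
const dotPlain = /first.second/.test(twoLines);
const dotClass = /first[\s\S]second/.test(twoLines);
const dotAll = /first.second/s.test(twoLines);
console.log(dotPlain, dotClass, dotAll); // false true true

// Gotcha: test() succeeds on a partial match; anchors force a full match.
const partial = /\d+/.test('abc123');
const full = /^\d+$/.test('abc123');
console.log(partial, full); // true false
```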
What's your favorite regex trick? What regex problem has been haunting you?
Follow @armorbreak for more practical developer guides.