Regular Expressions: The Guide I Always Wanted (2026)
Regex isn't magic. It's a mini-language for pattern matching, and once you learn the basics, you'll use it everywhere.
The Mental Model
A regex is a pattern that matches (or doesn't match) text.
Think of it as: "Find all strings that look like THIS"
Components:
→ Literals: exact characters to match (a, b, 1, @)
→ Character classes: WHAT can match ([a-z], \d, \w)
→ Quantifiers: HOW MANY times (+, *, ?, {3})
→ Anchors: WHERE in the string (^, $, \b)
→ Groups: capture parts of the match ((...))
→ Alternation: OR logic (|)
The secret to reading regex: left to right, character by character.
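To make the component list concrete, here is a small sketch that reads one pattern left to right. The pattern and the `orderId` name are made up for illustration; they are not from any real system.

```javascript
// Hypothetical pattern, read left to right:
//   ^          anchor: start of the string
//   (INV|ORD)  group + alternation: "INV" or "ORD"
//   -          literal hyphen
//   (\d{4})    group: character class \d with quantifier {4}
//   \b         anchor: word boundary after the digits
const orderId = /^(INV|ORD)-(\d{4})\b/;

const hit = "ORD-2026 shipped".match(orderId);
console.log(hit[0]); // "ORD-2026" (full match)
console.log(hit[1]); // "ORD"      (group 1)
console.log(hit[2]); // "2026"     (group 2)

// A fifth digit means there is no word boundary after the first four.
console.log(orderId.test("ORD-20261")); // false
```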
Character Classes
// Exact match
/hello/ // Matches "hello" in "say hello there"
/Hello/ // Case sensitive! Won't match "hello"
// Dot (matches any single character EXCEPT newline)
/h.t/ // "hat", "hot", "hit", "h3t"... but NOT "ht"
// Character classes [ ] — match ONE from the set
/[aeiou]/ // Any vowel
/[a-z]/ // Any lowercase letter
/[A-Z]/ // Any uppercase letter
/[a-zA-Z0-9]/ // Alphanumeric
/[^0-9]/ // NOT a digit (^ inside [] = negation)
// Shorthand classes
/\d/ // Digit = [0-9]
/\D/ // Non-digit = [^0-9]
/\w/ // Word char = [a-zA-Z0-9_]
/\W/ // Non-word char
/\s/ // Whitespace (space, tab, newline)
/\S/ // Non-whitespace
// Examples
/\d{5}/ // 5 digits in a row (ZIP code); unanchored, so it also matches inside longer numbers
/[A-Z]\w+/ // Capital letter + word chars (PascalCase identifier)
/[^ \t]+/ // One or more non-space/tab chars (a word)
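A quick way to check the claims above is to run a few inputs through `test()`. This is a small sketch with made-up sample strings. The last line goes slightly beyond the list above: it assumes a runtime that supports Unicode property escapes (`\p{L}` with the `u` flag).

```javascript
// The dot needs exactly one character, and that character cannot be a newline.
const samples = ["hat", "h3t", "ht", "h\nt"];
const dotResults = samples.map((s) => /h.t/.test(s));
console.log(dotResults); // [true, true, false, false]

// \w is [a-zA-Z0-9_], so accented letters are not word characters.
const asciiWord = /^\w+$/.test("cafe");       // true
const accentedWord = /^\w+$/.test("caf\u00e9"); // false ("café")

// Assumption: Unicode property escapes are available (u flag).
const anyLetter = /^\p{L}+$/u.test("caf\u00e9"); // true
console.log(asciiWord, accentedWord, anyLetter);
```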
Quantifiers
// ? — Zero or one (optional)
/colou?r/ // Matches "color" AND "colour"
/https?/ // Matches "http" AND "https"
// * — Zero or more (greedy: matches as many as possible)
/a*c/ // "c", "ac", "aac", "aaaac"...
// + — One or more
/\d+/ // One or more digits: "123", "007"
/\S+@\S+\.\S+/ // Basic email pattern (simplified!)
// {n} — Exactly n times
/\d{4}/ // Exactly 4 digits (year)
/[A-Z]{2}/ // Exactly 2 uppercase letters (country code)
// {n,m} — Between n and m times
/\d{1,3}/ // 1-3 digits (IP address octet)
/\w{3,16}/ // Username: 3-16 word characters
// {n,} — n or more times
/\d{2,}/ // 2+ digits
// ⚠️ Greedy vs Lazy (CRITICAL!)
/<.+>/ // GREEDY: "<div>hi</div>" → matches ENTIRE string
/<.+?>/ // LAZY: "<div>hi</div>" → matches "<div>" only
// Add ? after any quantifier to make it lazy (match minimum)
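The greedy/lazy difference is easiest to see side by side. The sketch below also shows a negated character class, which is a common alternative to a lazy quantifier when you know which character ends the match.

```javascript
const html = "<div>hi</div>";

// Greedy: .+ runs to the end, then backs off to the LAST ">".
const greedy = html.match(/<.+>/)[0];
console.log(greedy); // "<div>hi</div>"

// Lazy: .+? expands one character at a time and stops at the FIRST ">".
const lazy = html.match(/<.+?>/)[0];
console.log(lazy); // "<div>"

// Negated class: "anything except >" cannot run past the closing bracket.
const tags = html.match(/<[^>]+>/g);
console.log(tags); // ["<div>", "</div>"]
```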
Anchors & Boundaries
/^Hello/ // "Hello" at START of string only
/world$/ // "world" at END of string only
/^Hello world$/ // Exact full-string match
// Word boundary (\b) — position between word and non-word
/\bcat\b/ // Matches "cat" but NOT "catalog" or "scatter"
/\b\w+\b/ // Match whole words only
// Multiline mode (m flag)
/^pattern/m // With 'm' flag, ^/$ match start/end of each LINE
/^$/m // Empty lines (useful for removing blank lines)
// Lookahead (match IF followed by...)
/foo(?=bar)/ // "foo" only if followed by "bar" ("foobar" ✓, "food" ✗)
/foo(?!bar)/ // "foo" only if NOT followed by "bar" ("food" ✓, "foobar" ✗)
// Lookbehind (match IF preceded by...)
/(?<=\$)\d+/ // Digits only if preceded by "$" ($100 ✓, 100 ✗)
/(?<![$\d])\d+/ // Digits NOT preceded by "$" (100 ✓, $100 ✗); plain (?<!\$)\d+ would still match "00" in "$100"
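Lookarounds check context without consuming it, which is easy to get subtly wrong. The sketch below uses a made-up sample string to show one pitfall: a negative lookbehind for `$` alone still lets the match start one digit later, so the stricter version also excludes a preceding digit.

```javascript
// Lookahead: the "bar" is checked but not part of the match.
const withBar = "foobar food".match(/foo(?=bar)/g);     // ["foo"]
const withoutBarAt = "foobar food".search(/foo(?!bar)/); // 7 (the "foo" in "food")

const prices = "cost $100, qty 25, total $2500";

// Positive lookbehind: digits right after "$".
const dollars = prices.match(/(?<=\$)\d+/g);
console.log(dollars); // ["100", "2500"]

// Naive negative lookbehind: skips the first digit after "$", keeps the rest.
const naive = prices.match(/(?<!\$)\d+/g);
console.log(naive); // ["00", "25", "500"]

// Stricter: the match may not start after "$" or after another digit.
const strict = prices.match(/(?<![$\d])\d+/g);
console.log(strict); // ["25"]
```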
Groups & Capturing
// Capturing groups (...) — extract matched parts
/(\d{4})-(\d{2})-(\d{2})/ // Date pattern with 3 groups
const str = "Date: 2026-05-30";
const match = str.match(/(\d{4})-(\d{2})-(\d{2})/);
if (match) {
match[0]; // "2026-05-30" (full match)
match[1]; // "2026" (group 1: year)
match[2]; // "05" (group 2: month)
match[3]; // "30" (group 3: day)
}
// Named groups (much more readable!)
/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
const named = str.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/); // New name: `match` is already declared above
if (named) {
const { year, month, day } = named.groups; // Destructured!
}
// Non-capturing group (?:...) — group without capturing
/(?:https?:\/\/)?(?:www\.)?example\.com/
// Groups exist for grouping only, not extracted
// Backreferences — refer to earlier group in same pattern
/(\w+) \1/ // Repeated word: "the the" ✓, "the cat" ✗
/"([^"]*)"/ // Quoted string, extract content without quotes
/([A-Z])\w* \1\w*/ // Words starting with same letter: "Big Bad" ✓
// Replacement with backreferences
"John Smith".replace(/(\w+) (\w+)/, "$2, $1"); // → "Smith, John"
// Clean up phone number: "(555) 123-4567".replace(/\D/g, "") → "5551234567"
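Named groups also work in replacement strings (`$<name>`) and as backreferences (`\k<name>`). A minimal sketch, with illustrative variable names, that also shows why the repeated-word pattern benefits from word boundaries:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// $<name> in the replacement string refers to a named group.
const reordered = "2026-05-30".replace(isoDate, "$<day>/$<month>/$<year>");
console.log(reordered); // "30/05/2026"

// Without boundaries, "the the" is found inside "the then".
const loose = /(\w+) \1/.test("the then"); // true

// \k<word> is the named form of \1; \b keeps both words whole.
const doubled = /\b(?<word>\w+) \k<word>\b/;
const strictHit = doubled.test("the the cat"); // true
const strictMiss = doubled.test("the then");   // false
console.log(loose, strictHit, strictMiss);
```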
Practical Patterns You'll Actually Use
// Email (practical, not RFC-compliant — that's impossible)
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
emailRegex.test("user@example.com"); // true
emailRegex.test("invalid@email"); // false
// URL (basic)
const urlRegex = /^https?:\/\/[^\s<>"]+$/i;
urlRegex.test("https://example.com/path?q=1"); // true
// Username (3-20 alphanumeric + underscore)
const usernameRegex = /^[a-zA-Z0-9_]{3,20}$/;
// Password (at least 8 chars, mixed case, number, special)
const pwdRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
// IPv4 address
const ipRegex = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;
// Hex color code
const hexRegex = /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/;
// Date formats
const dateRegex = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/; // YYYY-MM-DD
// Slug (URL-friendly string)
const slugRegex = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
slugRegex.test("my-blog-post-2026"); // true
slugRegex.test("My_Blog_Post"); // false
// Credit card (Luhn algorithm needed too, this is format only)
const ccRegex = /\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})\b/;
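Since the pattern above only checks the format, here is a sketch of the Luhn checksum it mentions. `passesLuhn` and `looksLikeCard` are hypothetical helper names, and the card format regex is copied from the line above. The number used is a widely published test number, not a real card.

```javascript
// Format check copied from the article's ccRegex.
const cardFormat = /\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})\b/;

// Luhn checksum: from the right, double every second digit,
// subtract 9 from results above 9, and require sum % 10 === 0.
function passesLuhn(cardNumber) {
  const digits = cardNumber.replace(/\D/g, ""); // drop spaces and dashes
  if (digits.length === 0) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = Number(digits[i]);
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Hypothetical combined check: format first, then checksum.
function looksLikeCard(input) {
  const digits = input.replace(/\D/g, "");
  return cardFormat.test(digits) && passesLuhn(digits);
}

console.log(looksLikeCard("4111 1111 1111 1111")); // true
console.log(looksLikeCard("4111 1111 1111 1112")); // false (checksum fails)
```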
// Time (24h format)
const timeRegex = /^([01]\d|2[0-3]):([0-5]\d)(:[0-5]\d)?$/;
// Extract hashtags from text
const hashtagRegex = /#\w+/g;
"Check out #coding and #webdev #2026".match(hashtagRegex);
// → ["#coding", "#webdev", "#2026"]
// Remove HTML tags (basic)
const stripHtml = /<[^>]*>/g;
"<p>Hello <b>world</b></p>".replace(stripHtml, ""); // → "Hello world"
// Trim whitespace (alternative to .trim())
const trimRegex = /^\s+|\s+$/g;
" hello ".replace(trimRegex, ""); // → "hello"
// CamelCase to snake_case
"camelCaseString".replace(/[A-Z]/g, '_$&').toLowerCase();
// → "camel_case_string"
// Snake_case to camelCase
"snake_case_string".replace(/_([a-z])/g, (_, c) => c.toUpperCase());
// → "snakeCaseString"
// Format number with commas
"1000000".replace(/\B(?=(\d{3})+(?!\d))/g, ","); // → "1,000,000"
Regex Methods in JavaScript
const text = "Hello World! Hello Universe!";
const pattern = /hello/gi; // g = global, i = case-insensitive
// test() — does it match? (boolean)
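pattern.test(text); // true
pattern.lastIndex = 0; // /g regexes remember position: test() left lastIndex at 5, so reset before exec()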
// exec() — find match with details (one at a time, even with /g)
let match;
while ((match = pattern.exec(text)) !== null) {
console.log(match[0]); // Matched text
console.log(match.index); // Position where it started
}
// String.match() — all matches (with /g returns array of strings)
text.match(pattern); // ["Hello", "Hello"]
// String.matchAll() — ALL matches with groups (requires /g!)
for (const m of text.matchAll(/(\w+)!/g)) {
console.log(m[0]); // Full match: "World!", then "Universe!"
console.log(m[1]); // Group 1: "World", then "Universe"
}
// String.replace() — replace matches
text.replace(/hello/gi, "hi"); // "hi World! hi Universe!"
// Replace with function (powerful!)
"price: 100, tax: 50".replace(/\d+/g, (num) => `$${(Number(num) * 1.1).toFixed(2)}`);
// → "price: $110.00, tax: $55.00"
// String.replaceAll() — simpler than /g for fixed strings
"a b c d e".replaceAll(" ", "-"); // "a-b-c-d-e"
// String.split() — split by regex
"a, b, c, d".split(/\s*,\s*/); // ["a", "b", "c", "d"]
// Trims spaces around commas!
// String.search() — find position of first match
text.search(/world/i); // 6 (index of "World")
// Flags:
// g — global (find all matches)
// i — case insensitive
// m — multiline (^ and $ match per line)
// s — dotAll (dot matches newlines too)
// u — Unicode support (emoji, etc.)
// y — sticky (matches only from lastIndex)
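The flags are easier to remember once you have seen each one change a result. A short sketch with made-up inputs; the first part shows that a regex with `g` (or `y`) is stateful, which surprises people who reuse one regex object across calls.

```javascript
const sample = "Hello World! Hello Universe!";
const greeting = /hello/gi;

// g: each successful test() moves lastIndex past the match.
const first = greeting.test(sample);  // true
const afterFirst = greeting.lastIndex; // 5
const second = greeting.test(sample); // true (second "Hello")
const third = greeting.test(sample);  // false, and lastIndex resets to 0
console.log(first, afterFirst, second, third, greeting.lastIndex);

// m: ^ and $ match at every line, not only at the ends of the string.
const perLine = "a\nb".match(/^\w$/gm); // ["a", "b"]
const wholeOnly = "a\nb".match(/^\w$/g); // null

// s: the dot also matches newlines.
const dotPlain = /a.b/.test("a\nb"); // false
const dotAll = /a.b/s.test("a\nb");  // true

// y: the match must start exactly at lastIndex.
const sticky = /\d+/y;
const atZero = sticky.test("abc 123"); // false: index 0 is "a"
sticky.lastIndex = 4;
const atFour = sticky.test("abc 123"); // true: index 4 is "1"
console.log(perLine, wholeOnly, dotPlain, dotAll, atZero, atFour);
```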
Debugging Regex
// When your regex doesn't work:
// 1. Break it into pieces
// Test each part separately
/\d{4}-\d{2}-\d{2}/
// Does \d{4} work? Yes → try \d{4}-
// Does that work? Yes → keep building up
// 2. Use an online tool (regex101.com, regexr.com)
// Visual breakdown, explanation, test cases
// 3. Common gotchas:
// Forgot to escape: . * + ? ^ $ | \ ( ) [ ] { } /
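// → Escape them with a backslash in a literal (/\./ matches a real dot); for user input, use RegExp.escape() where your runtime supports it, or a small escape helper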
// In JavaScript strings, backslashes need escaping:
new RegExp("\\d{4}-\\d{2}-\\d{2}") // NOT "\d{4}" (the string drops the backslash, leaving "d{4}", which matches "dddd")
// Better: use regex literal when possible: /\d{4}-\d{2}-\d{2}/
// 4. Greedy quantifier eating too much
/<div>.*<\/div>/ // Might match across multiple divs!
/<div>.*?<\/div>/ // Lazy: stops at first </div>
// 5. Forgetting /g flag
"hello hello".replace(/l/, "L"); // Only first: "heLlo hello"
"hello hello".replace(/l/g, "L"); // All: "heLLo heLLo"
// 6. ^ and $ behavior
// Without /m: matches start/end of entire string
// With /m: matches start/end of each LINE
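For gotcha 3, the risky case is building a pattern from text you do not control. Below is a minimal sketch of a manual escape helper; `escapeRegExp` is a hypothetical name, not a built-in, and the sample input is made up.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "1+1=2 (approx.)";
const haystack = "we know 1+1=2 (approx.)";

// Unescaped, "+" becomes a quantifier and "(...)" a group, so the literal text is not found.
const unsafe = new RegExp(userInput).test(haystack);
console.log(unsafe); // false

// Escaped, every character is matched literally.
const escaped = escapeRegExp(userInput);
console.log(escaped); // 1\+1=2 \(approx\.\)
const safe = new RegExp(escaped).test(haystack);
console.log(safe); // true
```

If all you need is a literal substring check, `haystack.includes(userInput)` avoids regex entirely.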
What's the most useful regex trick you know? What regex nightmare have you survived?
Follow @armorbreak for more practical developer guides.