Regular Expressions: The Guide I Always Wanted (2026)
Regex is everywhere — in your editor, in your code, in your data pipeline. Stop fearing it and start using it like a pro.
The Mental Model
Think of regex as a pattern-matching language:
"Find text that matches this pattern"
→ Pattern = rules describing what you want
→ Match = specific text that fits the rules
Core concept: Position matters
The cat sat on the mat
^ Start of string (anchor: the position before "T")
$ End of string (anchor: the position after the final "t")
A regex combines two kinds of matching:
1. Literal matching: "cat" finds "cat" exactly
2. Pattern matching: "c.t" finds "cat", "cot", "cut" (. = any char)
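To make literal matching, pattern matching, and anchors concrete, here is a minimal sketch run against the sample sentence above. The variable names are made up for illustration.

```javascript
// Sample sentence from the diagram above.
const sentence = "The cat sat on the mat";

// Literal matching: the pattern is just the text you are looking for.
const hasCat = /cat/.test(sentence); // true

// Pattern matching: "." stands for any single character (except a newline).
const threeLetterHits = sentence.match(/.at/g); // ["cat", "sat", "mat"]

// Anchors match positions, not characters.
const startsWithThe = /^The/.test(sentence); // true
const endsWithCat = /cat$/.test(sentence);   // false, the string ends with "mat"

console.log(hasCat, threeLetterHits, startsWithThe, endsWithCat);
```

Note that `/cat$/` fails even though "cat" appears in the sentence: the anchor ties the match to a position, which is why position matters as much as the characters themselves.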
The Essential Syntax
// === Character Classes ===
. // Any single character except newline
\d // Digit [0-9]
\D // Non-digit [^0-9]
\w // Word character [a-zA-Z0-9_]
\W // Non-word character
\s // Whitespace [ \t\r\n\f\v]
\S // Non-whitespace
// Custom classes:
[abc] // a, b, or c
[a-c] // Same as above, written as a range
[^abc] // NOT a, b, or c (negation)
[a-z] // Any lowercase letter
[A-Z0-9] // Uppercase letter or digit
// Predefined (shorthand):
[0-9] == \d
[a-zA-Z0-9_] == \w
[ \t\r\n\f\v] == \s // Approximately: \s also matches Unicode spaces such as \u00a0
// === Anchors (position, not characters!) ===
^ // Start of string (or of each line with the m flag)
$ // End of string (or of each line with the m flag)
\b // Word boundary (between \w and \W)
\B // Non-word boundary
// Examples:
/^Hello/ // Matches "Hello world" but not "Say Hello"
/end$/ // Matches "the end" but not "ending"
/\bcat\b/ // Matches "cat" but not "catalog" or "scatter"
// === Quantifiers (how many times) ===
* // Zero or more times (greedy: as many as possible)
+ // One or more times
? // Zero or one time (optional)
{3} // Exactly 3 times
{2,4} // Between 2 and 4 times
{2,} // 2 or more times
{0,3} // Up to 3 times (JavaScript treats {,3} as literal text, so write the 0)
// ⚠️ Greedy vs Lazy:
// * + {n,m} are GREEDY (match as much as possible)
// Add ? to make them LAZY (match as little as possible)
"a<b>bold</b> and <b>italic</b>".match(/<b>.+<\/b>/)
// → "<b>bold</b> and <b>italic</b>" (greedy: goes to LAST </b>)
"a<b>bold</b> and <b>italic</b>".match(/<b>.+?<\/b>/)
// → "<b>bold</b>" (lazy: stops at FIRST </b>)
// === Groups and Alternation ===
(abc) // Capturing group (remembers what matched)
(?:abc) // Non-capturing group (doesn't remember)
a|b|c // a OR b OR c (alternation)
// Capturing groups let you extract parts of the match:
const match = "user@domain.com".match(/^(\w+)@(\w+\.\w+)$/);
if (match) {
console.log(match[1]); // "user"
console.log(match[2]); // "domain.com"
}
// Named capture groups (more readable!):
const emailMatch = "alice@example.com".match(
/^(?<name>\w+)@(?<domain>\w+\.(?<tld>\w+))$/
);
if (emailMatch) {
console.log(emailMatch.groups.name); // "alice"
console.log(emailMatch.groups.domain); // "example.com"
console.log(emailMatch.groups.tld); // "com"
}
// Backreferences (refer to earlier captured group):
/(\w+)\s+\1/ // Matches "hello hello" but not "hello world"
// \1 refers to whatever the first group captured
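The pieces above (a capturing group, a lazy quantifier, and a backreference) can be combined in one pattern. The sketch below pulls out quoted substrings, requiring the closing quote to be the same character as the opening one. The names `quotedRegex`, `source`, and `quoted` are illustrative only.

```javascript
// Group 1 captures the opening quote; \1 requires the same quote to close it.
// The lazy .*? stops at the first matching close quote.
const quotedRegex = /(["'])(.*?)\1/g;

const source = `say "it's fine" or 'plain "text"' here`;

// matchAll yields one match object per hit; index 2 is the text between the quotes.
const quoted = Array.from(source.matchAll(quotedRegex), (m) => m[2]);

console.log(quoted); // ["it's fine", 'plain "text"']
```

Without the backreference, a pattern like `/["'].*?["']/` would end the first match at the apostrophe in "it's", because any quote character could close it.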
Practical Examples You'll Use Every Day
// === Email validation (practical, not RFC-compliant) ===
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// Explanation: one or more non-space, non-@ chars, then @, then the same again, a dot, and the same again
// Simple and practical: catches most typos, but cannot confirm the address exists
// === URL extraction ===
const urlRegex = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z()]{2,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/=]*)/; // final class includes \/ so paths are matched
const text = "Visit https://example.com/path?q=1 for more info";
text.match(urlRegex)[0]; // "https://example.com/path?q=1"
// === Phone number normalization ===
const phoneRegex = /^\+?(\d{1,3})?[-.\s]?\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/;
// Groups: $1 = country code (optional), $2 = area code, $3 = prefix, $4 = line number
"+1 (555) 123-4567".replace(phoneRegex, "+$1$2$3$4"); // "+15551234567"
// === Password strength check ===
function checkPasswordStrength(password) {
const checks = {
length: password.length >= 8,
lowercase: /[a-z]/.test(password),
uppercase: /[A-Z]/.test(password),
digit: /\d/.test(password),
special: /[!@#$%^&*(),.?":{}|<>]/.test(password),
};
const score = Object.values(checks).filter(Boolean).length; // number of checks passed (0-5)
if (score <= 2) return 'weak';
if (score <= 4) return 'medium';
return 'strong';
}
// === HTML tag stripping ===
const htmlRegex = /<[^>]*>/g;
"<p>Hello <b>world</b></p>".replace(htmlRegex, ''); // "Hello world"
// === Finding duplicate words ===
const duplicateRegex = /\b(\w+)\s+\1\b/gi;
"This is a test test of the regex".replace(duplicateRegex, '$1');
// "This is a test of the regex"
// === CSV parsing (simple cases) ===
const csvLine = '"Smith, John",25,"New York, NY",developer';
const csvRegex = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;
csvLine.split(csvRegex);
// ['"Smith, John"', '25', '"New York, NY"', 'developer']
// === Date format conversion ===
const dateStr = "05/31/2026";
const dateRegex = /^(\d{2})\/(\d{2})\/(\d{4})$/;
const [, month, day, year] = dateStr.match(dateRegex);
console.log(`${year}-${month}-${day}`); // "2026-05-31"
// === Log level extraction ===
const logRegex = /^\[(DEBUG|INFO|WARN|ERROR)\]\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\] (.+)/;
const logLine = "[INFO][2026-06-04T10:30:00] User logged in successfully";
const [, level, timestamp, message] = logLine.match(logRegex);
// === Number extraction from text ===
const numRegex = /-?\d+(?:\.\d+)?/g;
"The price is $42.99 and discount is 15%".match(numRegex);
// ["42.99", "15"]
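One caveat for the destructuring examples above: `match()` and `exec()` return `null` when nothing matches, so destructuring the result directly throws a `TypeError` on unexpected input. A minimal sketch of a null-safe wrapper follows, using the same log pattern rewritten with named groups. `LOG_LINE` and `parseLogLine` are hypothetical names, not part of any library.

```javascript
// Same log pattern as above, rewritten with named groups for readability.
const LOG_LINE = /^\[(?<level>DEBUG|INFO|WARN|ERROR)\]\[(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\] (?<message>.+)/;

// Hypothetical helper: returns a plain object, or null when the line does not fit.
function parseLogLine(line) {
  const m = LOG_LINE.exec(line);
  if (m === null) return null; // exec() returns null on failure
  const { level, timestamp, message } = m.groups;
  return { level, timestamp, message };
}

console.log(parseLogLine("[INFO][2026-06-04T10:30:00] User logged in successfully"));
// { level: 'INFO', timestamp: '2026-06-04T10:30:00', message: 'User logged in successfully' }
console.log(parseLogLine("not a log line")); // null
```

The same guard applies to the date example: check the match result before destructuring whenever the input is not fully under your control.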
Regex Methods in JavaScript
const str = "Hello World! Hello Universe!";
const pattern = /hello/gi; // g = global, i = case-insensitive
// test() — Does it match? (true/false)
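pattern.test(str); // true
// Careful: with the g flag, test() and exec() resume from pattern.lastIndex,
// so reset it before reusing the same regex object.
pattern.lastIndex = 0;
// exec() — Find match with details (returns null or match object)
let match;
while ((match = pattern.exec(str)) !== null) {
console.log(`Found "${match[0]}" at index ${match.index}`);
}
// Found "Hello" at index 0
// Found "Hello" at index 13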
// match() — Find all matches (with g flag returns array of strings)
str.match(pattern); // ["Hello", "Hello"]
// Without g flag: returns first match with groups info
// matchAll() — All matches with groups (ES2020!)
for (const m of str.matchAll(/hello (\w+)/gi)) {
console.log(m[0], m[1]); // Full match, then captured group
}
// "Hello World" "World" (\w+ stops before the "!")
// "Hello Universe" "Universe"
// replace() — Replace matches
str.replace(/hello/gi, 'hi'); // "hi World! hi Universe!"
str.replace(/hello/gi, (match, offset) => `[${match.toUpperCase()}]`);
// "[HELLO] World! [HELLO] Universe!"
// replaceAll() — Replace every occurrence of a plain string, no regex needed (ES2021)
'2026-06-04'.replaceAll('-', '/'); // "2026/06/04"
// split() — Split by regex
"one,two;three,four".split(/[,;]/); // ["one", "two", "three", "four"]
// search() — Find index of first match
str.search(/world/i); // 6 (index where "world" starts)
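One behavior of these methods is easy to miss: a regex object created with the `g` (or `y`) flag stores where its last match ended in `lastIndex`, and `test()` and `exec()` resume from there. The sketch below shows the effect; the variable names are illustrative.

```javascript
const globalPattern = /hello/gi;
const greeting = "Hello World!";

const first = globalPattern.test(greeting);      // true, lastIndex is now 5
const indexAfterFirst = globalPattern.lastIndex; // 5
const second = globalPattern.test(greeting);     // false, search resumed at index 5; lastIndex resets to 0
const third = globalPattern.test(greeting);      // true again

// A regex without g/y always searches from the start.
const plainPattern = /hello/i;
const stable = [plainPattern.test(greeting), plainPattern.test(greeting)];

console.log(first, indexAfterFirst, second, third, stable);
// true 5 false true [ true, true ]
```

For simple yes/no checks, a regex without the `g` flag avoids this state entirely; keep `g` for `exec()` loops, `matchAll()`, and replace-all operations.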
Common Gotchas & How to Avoid Them
// ❌ Forgot global flag (only replaces first occurrence!)
"aaa".replace("a", "b"); // "baa" (only first!)
"aaa".replace(/a/g, "b"); // "bbb" (all!)
// ❌ Dot doesn't match newlines by default!
/multi.*string/.test("multi\nline\nstring"); // false!
// Fix: Use [\s\S] instead of . or enable dotall mode with the s flag:
/multi[\s\S]*string/.test("multi\nline\nstring"); // true!
/multi.*string/s.test("multi\nline\nstring"); // true!
// ❌ Quantifiers are greedy (causes unexpected matches)
"<div><div>content</div></div>".replace(/<div>.*<\/div>/g, '');
// Removes EVERYTHING from first <div> to last </div>!
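// Fix: Make it lazy with ? so the match stops at the FIRST </div>
// (still not nesting-aware: for nested tags, use a real parser)
/<div>.*?<\/div>/g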
// ❌ Not escaping special characters
"price: $100 (USD)".replace(/\$(\d+)/, "$1 USD"); // "price: 100 USD (USD)"
// If you forget to escape $, it means "end of string" in regex, and the pattern never matches here!
// Characters that MUST be escaped: \ ^ $ . | * + ? ( ) [ ] { } /
const escaped = str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
// ❌ Using regex for complex parsing (HTML, JSON, etc.)
// Don't parse HTML with regex! Use a proper parser.
// Don't parse JSON with regex! Use JSON.parse.
// Regex is for PATTERN MATCHING, not parsing structured formats.
// ✅ Performance tip: Be specific!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i // Good (specific patterns)
/.+@.+\..+/ // Bad (matches too much, slow on long strings)
// ✅ Compile regex once if reusing (in loops):
const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
function validateEmail(email) {
return EMAIL_REGEX.test(email); // Compiled once, reused
}
// vs (bad): new RegExp('...') inside loop (recompiles every iteration)
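The escaping expression above matters most when a pattern is built from user input with `new RegExp()`. A minimal sketch follows, assuming two hypothetical helpers, `escapeRegExp` and `countOccurrences`, that wrap that expression.

```javascript
// Same escaping expression as above, wrapped in a hypothetical helper.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: count literal occurrences of a user-supplied term.
function countOccurrences(haystack, term) {
  if (term === '') return 0; // an empty pattern would match at every position
  const pattern = new RegExp(escapeRegExp(term), 'g');
  return (haystack.match(pattern) || []).length; // match() returns null when nothing is found
}

const priceList = "Total: $5.00, was $5.00, not $5x00";

console.log(countOccurrences(priceList, "$5.00")); // 2
console.log(new RegExp("$5.00").test(priceList));  // false, unescaped "$" means end of string
```

Escaping also keeps the `.` literal, which is why "$5x00" is not counted. Here the regex is deliberately built per call because the term changes; the compile-once advice applies to fixed patterns.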
What's your favorite regex trick? What regex nightmare have you survived?
Follow @armorbreak for more practical developer guides.