DEV Community

楊東霖
楊東霖

Posted on • Originally published at devplaybook.cc

How to Test Regular Expressions: Complete Beginner Guide

Regular expressions (regex) can feel intimidating at first. A pattern like ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$ looks like line noise. But regex is a learnable skill, and once you understand the building blocks, you can read and write patterns with confidence.

This guide takes you from zero to writing and testing your own regular expressions. We'll cover the core syntax, how to test patterns step by step, common patterns every developer should know, and how to avoid the mistakes that trip up beginners.


What Is a Regular Expression?

A regular expression is a sequence of characters that defines a search pattern. You use regex to:

  • Check whether a string matches a pattern (validation)
  • Find occurrences of a pattern in text (search)
  • Extract parts of a string that match a pattern (capture)
  • Replace text that matches a pattern (substitution)

Regex is supported in almost every programming language and many text editors. The syntax is largely standardized, though small differences exist between flavors (JavaScript, Python, PCRE, etc.).


The Absolute Basics

Start with these building blocks before anything else.

Literal Characters

The simplest regex is a literal string. The pattern cat matches the word "cat" anywhere in the input:

Pattern: cat
Input:   "I have a cat and a catfish"
Matches: "cat" (at positions 9 and 18)
Enter fullscreen mode Exit fullscreen mode

The Dot (.)

The . matches any single character except a newline:

Pattern: c.t
Input:   "cat cut coat"
Matches: "cat", "cut"
Enter fullscreen mode Exit fullscreen mode

coat doesn't match because oa is two characters between c and t.

Character Classes [ ]

Square brackets match any one character from a set:

Pattern: [aeiou]
Input:   "hello"
Matches: "e", "o"

Pattern: [a-z]      (range: any lowercase letter)
Pattern: [A-Z]      (range: any uppercase letter)
Pattern: [0-9]      (range: any digit)
Pattern: [a-zA-Z0-9] (letters and digits)
Enter fullscreen mode Exit fullscreen mode

A caret inside brackets negates the class:

Pattern: [^aeiou]
Input:   "hello"
Matches: "h", "l", "l" (everything that's NOT a vowel)
Enter fullscreen mode Exit fullscreen mode

Quantifiers

Quantifiers specify how many times a character or group must match:

Quantifier Meaning
* 0 or more
+ 1 or more
? 0 or 1 (optional)
{3} Exactly 3
{3,} 3 or more
{3,6} Between 3 and 6
Pattern: ab+c
Input:   "ac abc abbc abbbc"
Matches: "abc", "abbc", "abbbc"
Enter fullscreen mode Exit fullscreen mode

ac doesn't match because b+ requires at least one b.

Anchors

Anchors match a position, not a character:

Anchor Matches
^ Start of string (or line with m flag)
$ End of string (or line with m flag)
\b Word boundary
\B Non-word boundary
Pattern: ^hello
Input:   "hello world"  → matches ("hello" at start)
Input:   "say hello"    → no match ("hello" not at start)

Pattern: world$
Input:   "hello world"  → matches
Input:   "worldwide"    → no match
Enter fullscreen mode Exit fullscreen mode

Shorthand Character Classes

These shortcuts save typing:

Shorthand Equivalent Matches
\d [0-9] Any digit
\D [^0-9] Any non-digit
\w [a-zA-Z0-9_] Word character
\W [^a-zA-Z0-9_] Non-word character
\s [ \t\r\n\f\v] Whitespace
\S [^ \t\r\n\f\v] Non-whitespace

Flags (Modifiers)

Flags change how the regex engine interprets the pattern:

Flag Name Effect
g global Find all matches, not just first
i case-insensitive Match regardless of case
m multiline ^ and $ match line starts/ends
s dotAll . matches newlines too
u unicode Full Unicode support

In JavaScript:

const regex = /pattern/gi;  // global + case-insensitive

// Equivalent with RegExp constructor
const regex2 = new RegExp('pattern', 'gi');
Enter fullscreen mode Exit fullscreen mode

Capture Groups

Parentheses create capture groups that let you extract specific parts of a match:

Pattern: (\d{4})-(\d{2})-(\d{2})
Input:   "Meeting on 2026-03-15"
Full match: "2026-03-15"
Group 1: "2026" (year)
Group 2: "03"   (month)
Group 3: "15"   (day)
Enter fullscreen mode Exit fullscreen mode

In JavaScript:

const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const result = '2026-03-15'.match(datePattern);
console.log(result[1]); // "2026" — year
console.log(result[2]); // "03"   — month
console.log(result[3]); // "15"   — day
Enter fullscreen mode Exit fullscreen mode

Named Capture Groups

Named groups make code more readable:

const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const result = '2026-03-15'.match(datePattern);
console.log(result.groups.year);  // "2026"
console.log(result.groups.month); // "03"
console.log(result.groups.day);   // "15"
Enter fullscreen mode Exit fullscreen mode

Non-Capturing Groups

If you need grouping for quantifiers but don't need to capture the value, use (?:...):

Pattern: (?:https?://)?[\w.]+
Enter fullscreen mode Exit fullscreen mode

This makes the https?:// part optional without creating a capture group.


How to Test a Regex Step by Step

Testing a regex before using it in production prevents bugs. Here's a systematic approach.

Step 1: Use an Interactive Tester

DevPlaybook's Regex Tester shows matches highlighted in real time as you type. Paste your test string, enter your pattern, and see immediately what matches and what doesn't.

Benefits of an interactive tester:

  • Visual highlighting shows exactly what's matching
  • Match count and position details
  • Group captures displayed separately
  • Test against multiple strings at once
  • No restart required when you adjust the pattern

Step 2: Test Against Known-Good Inputs

Start with examples you're certain should match:

For an email validation pattern, try:

  • user@example.com — should match
  • user+tag@subdomain.example.co.uk — should match
  • admin@company.io — should match

Step 3: Test Against Known-Bad Inputs

Then test inputs that should NOT match:

For email validation:

  • notanemail — should not match
  • @nodomain.com — should not match
  • user@ — should not match
  • user @example.com — should not match (space)
  • user@.com — should not match (no domain name)

Step 4: Test Edge Cases

Think about the boundaries of your pattern:

  • Empty string: ""
  • Minimum valid input: "a@b.co"
  • Maximum length strings
  • Special characters: !, #, %, unicode characters
  • Whitespace: leading/trailing spaces, tabs, newlines

Step 5: Check for Catastrophic Backtracking

Some patterns cause the regex engine to slow down exponentially on certain inputs. This is called catastrophic backtracking (or ReDoS). Test your pattern with a long string of repeated characters that should not match:

Input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaax"
Enter fullscreen mode Exit fullscreen mode

If the test hangs for more than a second, your pattern has a backtracking problem. Nested quantifiers like (a+)+ or (a*)* are common culprits.


Testing Regex in Code

JavaScript

// test() — returns true/false
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
console.log(emailRegex.test('user@example.com')); // true
console.log(emailRegex.test('notanemail'));        // false

// match() — returns match array or null
const input = 'Contact us at support@example.com or sales@company.io';
const allEmails = input.match(/[^\s@]+@[^\s@]+\.[^\s@]+/g);
console.log(allEmails); // ["support@example.com", "sales@company.io"]

// exec() — returns detailed match with groups
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;
let match;
const dates = [];
while ((match = dateRegex.exec('Events: 2026-03-15 and 2026-04-01')) !== null) {
  dates.push(match.groups);
}
console.log(dates);
// [{ year: "2026", month: "03", day: "15" }, { year: "2026", month: "04", day: "01" }]

// replace() — substitute matches
const censored = 'Call 555-1234 or 555-5678'.replace(/\d{3}-\d{4}/g, '[PHONE]');
console.log(censored); // "Call [PHONE] or [PHONE]"
Enter fullscreen mode Exit fullscreen mode

Python

import re

# search() — find first match anywhere in string
match = re.search(r'\d+', 'There are 42 apples')
if match:
    print(match.group())  # "42"

# match() — match at start of string only
match = re.match(r'\d+', '42 is the answer')
print(match.group())  # "42"

# findall() — return list of all matches
emails = re.findall(r'[^\s@]+@[^\s@]+\.[^\s@]+', 'Contact support@example.com or sales@company.io')
print(emails)  # ['support@example.com', 'sales@company.io']

# sub() — substitute
result = re.sub(r'\d{3}-\d{4}', '[PHONE]', 'Call 555-1234 or 555-5678')
print(result)  # "Call [PHONE] or [PHONE]"

# Capture groups
pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
match = pattern.search('Meeting: 2026-03-15')
if match:
    print(match.group('year'))   # "2026"
    print(match.group('month'))  # "03"

# Compile for reuse (better performance)
email_re = re.compile(r'^[^\s@]+@[^\s@]+\.[^\s@]+$')
print(email_re.match('user@example.com'))  # Match object
print(email_re.match('notanemail'))        # None
Enter fullscreen mode Exit fullscreen mode

10 Regex Patterns Every Developer Should Know

1. Email Validation (Practical)

^[^\s@]+@[^\s@]+\.[^\s@]+$
Enter fullscreen mode Exit fullscreen mode

Not a perfect email validator (the full spec is extremely complex), but catches 99% of invalid inputs in practice.

2. URL

https?://[\w\-\.]+\.[a-z]{2,}(\/[\w\-\.\?=&%#]*)*
Enter fullscreen mode Exit fullscreen mode

Matches http:// and https:// URLs with optional paths and query strings.

3. IP Address (IPv4)

^(25[0-5]|2[0-4]\d|[01]?\d\d?)(\.(25[0-5]|2[0-4]\d|[01]?\d\d?)){3}$
Enter fullscreen mode Exit fullscreen mode

Validates each octet is between 0 and 255.

4. Phone Number (flexible)

[\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,6}
Enter fullscreen mode Exit fullscreen mode

Matches common formats: (555) 123-4567, 555.123.4567, +1-555-123-4567.

5. Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Enter fullscreen mode Exit fullscreen mode

Validates year format and restricts months to 01-12, days to 01-31.

6. Strong Password

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Enter fullscreen mode Exit fullscreen mode

Requires at least: 8 characters, one lowercase, one uppercase, one digit, one special character. Uses lookaheads.

7. Hex Color Code

^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
Enter fullscreen mode Exit fullscreen mode

Matches #RGB (3-digit) and #RRGGBB (6-digit) hex color codes.

8. Credit Card Number (basic format check)

^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})$
Enter fullscreen mode Exit fullscreen mode

Basic format check for Visa (starts with 4), Mastercard (51-55), AmEx (34/37). Never use regex alone for payment validation — use a proper library.

9. Slug (URL-friendly string)

^[a-z0-9]+(?:-[a-z0-9]+)*$
Enter fullscreen mode Exit fullscreen mode

Validates lowercase alphanumeric strings with hyphens (no leading/trailing hyphens, no consecutive hyphens).

10. HTML Tag Extraction

<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>
Enter fullscreen mode Exit fullscreen mode

Extracts content from HTML tags. Note: use a real HTML parser for production — regex can't handle nested tags reliably.


Common Regex Mistakes to Avoid

Forgetting to Escape Special Characters

These characters have special meaning in regex: . * + ? ^ $ { } [ ] | ( ) \

To match them literally, escape with \:

Pattern to match "3.14": 3\.14  (not 3.14 which matches "3X14" too)
Pattern to match "$5":   \$5
Pattern to match "(a)":  \(a\)
Enter fullscreen mode Exit fullscreen mode

Using Greedy Quantifiers When You Want Lazy

Input: "<div>first</div><div>second</div>"

Pattern: <div>.*</div>     (greedy)
Match:   "<div>first</div><div>second</div>" — matches everything

Pattern: <div>.*?</div>    (lazy)
Matches: "<div>first</div>" and "<div>second</div>" — two matches
Enter fullscreen mode Exit fullscreen mode

Not Anchoring Validation Patterns

// Wrong — matches any string containing an email
const emailRegex = /[^\s@]+@[^\s@]+\.[^\s@]+/;
emailRegex.test('not-an-email but has@something.here'); // true

// Right — must match the whole string
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
emailRegex.test('not-an-email but has@something.here'); // false
Enter fullscreen mode Exit fullscreen mode

Overcomplicating Simple Patterns

Regex becomes unreadable quickly. If a pattern exceeds two lines, consider splitting the validation into multiple simpler checks in code rather than one complex regex.


Using DevPlaybook Regex Tester

DevPlaybook's Regex Tester provides:

  • Real-time highlighting — matches highlight as you type the pattern
  • Match list — see every match with its position in the string
  • Capture group viewer — see what each () group captures
  • Flag toggles — turn g, i, m, s on and off with checkboxes
  • Multiple test strings — test your pattern against several inputs at once
  • Replace mode — test substitution patterns before using them in code

Bookmark it for daily use — it's faster than switching to an IDE, writing a test script, and running it every time you need to check a pattern.


Conclusion

Regular expressions are a power tool. The learning curve is real, but the syntax follows consistent rules: literals match themselves, metacharacters have special meanings, quantifiers control repetition, and anchors constrain position.

Start by learning the core building blocks in this guide, then practice by writing patterns for real problems you encounter. Use DevPlaybook's Regex Tester to build and verify patterns interactively — seeing the highlighting in real time accelerates learning significantly.

The most important habit: always test with both matching and non-matching cases before shipping a regex to production.


Need to extract data from JWT claims using regex patterns? The JWT Decoder shows the full payload structure. Working with JSON data that needs pattern-matching? The JSON Formatter Pro helps you see the structure clearly.


Level Up Your Dev Workflow

Found this useful? Explore DevPlaybook — cheat sheets, tool comparisons, and hands-on guides for modern developers.

🛒 Get the DevToolkit Starter Kit on Gumroad — 40+ browser-based dev tools, source code + deployment guide included.

Top comments (0)