The 5 Regex Patterns That Cover 80 Percent of Real-World Use Cases


I have been writing regular expressions for 12 years. I use maybe 15 patterns regularly. Five of them cover the vast majority of what I need in production code.

These are not the clever ones. They are the patterns that show up in pull requests, log parsers, form validators, and data cleaning scripts week after week.

1. Extract all URLs from a block of text

https?://[^\s<>"']+

This matches http:// or https:// followed by any run of characters that are not whitespace, angle brackets, or quotes. It is intentionally loose. Stricter patterns that follow the RFC 3986 grammar exist, but they are long and hard to read, and almost nobody uses them in application code.

The loose version catches real URLs in emails, Slack messages, log files, and documentation. It sometimes grabs a trailing period or comma from sentences like "Visit https://example.com." but trimming punctuation from the end is a one-line post-processing step.

const urls = text.match(/https?:\/\/[^\s<>"']+/g) || [];
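The trailing-punctuation cleanup mentioned above can be sketched like this. The helper name extractUrls and the exact set of characters to trim are my assumptions for illustration, not a standard.

```javascript
// Hypothetical helper: extract URLs with the loose pattern, then trim
// sentence punctuation that the pattern tends to pick up at the end.
function extractUrls(text) {
  const matches = text.match(/https?:\/\/[^\s<>"']+/g) || [];
  // Strip trailing . , ; : ! ? and closing brackets. Which characters count
  // as "sentence punctuation" is an assumption; adjust the set as needed.
  return matches.map((url) => url.replace(/[.,;:!?)\]]+$/, ""));
}

const sample =
  "Visit https://example.com. Docs: https://example.com/a?b=1, (see https://example.com/x).";
console.log(extractUrls(sample));
// [ 'https://example.com', 'https://example.com/a?b=1', 'https://example.com/x' ]
```

The trade-off: a URL that legitimately ends in ")" or "." loses that character. If that matters for your data, trim a smaller set.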

I reach for this pattern at least once a week.

2. Validate email format (the practical version)

^[^\s@]+@[^\s@]+\.[^\s@]+$

This checks for: something, then @, then something, then a dot, then something. No spaces anywhere.

It will accept a@b.c, which is technically valid. It will reject user @domain.com, which is technically invalid. For a form field where you need to catch typos and obvious non-emails, this is enough.

The "correct" email regex per RFC 5322 handles quoted strings, IP address literals, comments, and nested parentheses. It is several hundred characters long, close to unreadable, and still cannot tell you whether the mailbox exists. Do not use it.

const isEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input);

The real email validation happens when you send a confirmation link. The regex is just a first filter for obvious mistakes.
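To show what "first filter" means in practice, here is a small sketch that runs the pattern over a few inputs. The helper name looksLikeEmail is hypothetical, and trimming the input before testing is my assumption about how a form field would be handled.

```javascript
const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper. Trimming first is an assumption: it treats stray
// leading or trailing whitespace from copy-paste as harmless.
function looksLikeEmail(input) {
  return EMAIL_SHAPE.test(String(input).trim());
}

const samples = [
  "a@b.c",              // true
  "user@example.com",   // true
  "  user@example.com ",// true (whitespace trimmed first)
  "user @domain.com",   // false (space inside)
  "user@domain",        // false (no dot after the @)
  "user@@example.com",  // false (two @ signs)
  "user@example..com",  // true (the pattern does not inspect the dots)
];
for (const s of samples) {
  console.log(JSON.stringify(s), looksLikeEmail(s));
}
```

The last sample is the point: the pattern checks shape, not deliverability. The confirmation link does the real work.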

3. Extract numbers from messy strings

-?\d+\.?\d*

This matches an optional minus sign, one or more digits, an optional decimal point, and zero or more decimal digits. It handles integers, decimals, and negative numbers.

Given the string "Temperature: -12.5°C (dropped 3.2 degrees from yesterday's 8.7)", this pattern extracts [-12.5, 3.2, 8.7].

const numbers = text.match(/-?\d+\.?\d*/g)?.map(Number) || [];

I use this constantly when parsing CSVs with inconsistent formatting, scraping data from web pages, and cleaning log files.
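Here is the temperature example as runnable code, plus two inputs where the pattern does something you may not want. The helper name extractNumbers is hypothetical.

```javascript
const NUMBER = /-?\d+\.?\d*/g;

// Hypothetical helper: returns every number-like token as a JS number.
function extractNumbers(text) {
  return text.match(NUMBER)?.map(Number) || [];
}

console.log(
  extractNumbers("Temperature: -12.5°C (dropped 3.2 degrees from yesterday's 8.7)")
);
// [ -12.5, 3.2, 8.7 ]

// Hyphens in dates are read as minus signs.
console.log(extractNumbers("2020-01-15"));
// [ 2020, -1, -15 ]

// Thousands separators split one value into two.
console.log(extractNumbers("1,234.50"));
// [ 1, 234.5 ]
```

For dates and formatted currency I strip or normalize the separators first, then run the pattern.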

4. Split on any whitespace (including tabs, newlines, multiple spaces)

\s+

Three characters. The most useful regex that exists.

const words = text.trim().split(/\s+/);

This handles tabs, multiple spaces, newlines, carriage returns, and any combination. It turns "hello world\tgoodbye\nfriend" into ["hello", "world", "goodbye", "friend"].

Calling text.split(" ") fails on tabs and creates empty strings for double spaces. Splitting on \s+ after trim() handles both. The trim() matters: without it, leading whitespace produces an empty first element.
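A quick side-by-side makes the difference concrete. The splitWords helper and its filter(Boolean) step are my additions, there to cover the one case the one-liner misses: empty or whitespace-only input.

```javascript
const messy = "  hello  world\tgoodbye\nfriend ";

// Splitting on a single space keeps tabs and newlines inside the pieces
// and leaves empty strings wherever two spaces touch.
console.log(messy.split(" "));
// [ '', '', 'hello', '', 'world\tgoodbye\nfriend', '' ]

console.log(messy.trim().split(/\s+/));
// [ 'hello', 'world', 'goodbye', 'friend' ]

// Edge case: "".trim().split(/\s+/) returns [""], not [].
// Hypothetical helper that filters the empty string out.
function splitWords(input) {
  return input.trim().split(/\s+/).filter(Boolean);
}

console.log(splitWords("   "));
// []
```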

5. Replace template variables

\{\{(\w+)\}\}

This matches {{variableName}} and captures the variable name in group 1.

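// Own-property check so {{constructor}} or {{toString}} is not filled in from Object.prototype.
const result = template.replace(/\{\{(\w+)\}\}/g, (match, key) => {
  return Object.hasOwn(data, key) && data[key] !== undefined ? data[key] : match;
});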

This is a basic template engine in three lines. It powers quick string interpolation in config files, email templates, and generated reports where pulling in a full template library is overkill.
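Wrapped in a function, the same idea looks like this. The name render is hypothetical. The own-property check is a defensive choice in this sketch: it keeps names like constructor from resolving to inherited built-ins. It assumes a runtime with Object.hasOwn (Node 16.9 or later, current browsers).

```javascript
// Hypothetical helper: fills {{name}} placeholders from a plain object and
// leaves unknown placeholders untouched so they are easy to spot.
function render(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.hasOwn(data, key) && data[key] !== undefined
      ? String(data[key])
      : match
  );
}

const out = render(
  "Hi {{name}}, you have {{count}} new {{thing}}. {{constructor}}",
  { name: "Ada", count: 3 }
);
console.log(out);
// Hi Ada, you have 3 new {{thing}}. {{constructor}}
```

Because the capture group is \w+, placeholders with spaces or dots, such as {{ name }} or {{user.name}}, are not matched and pass through unchanged.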

The pattern I avoid

Parsing HTML with regex. It is technically possible for simple cases. It is a maintenance nightmare for anything beyond extracting a single known tag. Use a DOM parser. Always.

The one exception: stripping all HTML tags from a string for plain-text extraction. /<[^>]+>/g works for that specific case on simple markup and nothing else. It is not a sanitizer.
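A short sketch of the tag-stripping case, with inputs that show where it stops being reliable. The helper name stripTags is hypothetical.

```javascript
// Hypothetical helper: removes anything that looks like <...>.
// Fine for pulling plain text out of simple markup; not safe as a sanitizer.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, "");
}

console.log(stripTags("<p>Hello <b>world</b></p>"));
// Hello world

// A ">" inside an attribute value ends the match early and leaks markup.
console.log(stripTags('<a title="a > b">link</a>'));
//  b">link

// Script contents are text as far as the regex is concerned.
console.log(stripTags("<script>alert(1)</script>"));
// alert(1)

// Entities are left as-is; decoding them is a separate step.
console.log(stripTags("<p>Fish &amp; chips</p>"));
// Fish &amp; chips
```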

I test all my patterns at zovo.one/free-tools/regex-tester before putting them in production code. Seeing matches highlighted in real time catches edge cases that staring at the pattern never does.

I'm Michael Lip. I build free developer tools at zovo.one. 500+ tools, all private, all free.