Hommies

Posted on May 21

LeetCode Solution: 10. Regular Expression Matching

#leetcode #dsa #programming #tutorial

Cracking the Regex Code: LeetCode 10 - Regular Expression Matching (Beginner-Friendly DP!)

Hey fellow coders! 👋

Today, we're diving into a classic LeetCode problem that might seem daunting at first glance but becomes incredibly elegant with the right approach: 10. Regular Expression Matching.

If you've ever used regular expressions (regex) in your programming, you know how powerful they are. This problem asks us to implement a simplified version of that power from scratch! Don't worry, we'll break it down step by step.

The Problem: Regular Expression Matching 🤯

We're given two strings: an input string (s) and a pattern (p). Our mission? Determine if the pattern p entirely matches the input string s.

The tricky part comes with two special characters in our pattern:

. (Dot): This little guy is a wildcard! It matches any single character. Think of it as a Joker card in a deck.
* (Asterisk): This one is a bit more complex. It matches zero or more occurrences of the preceding element. The "preceding element" could be a regular character (like a in a*) or even a dot (. in .*).

Crucial Note: The match must cover the entire input string s, not just a part of it.

Let's look at a few examples to make it super clear:

Example 1:
- s = "aa", p = "a"
- Output: false
- Why? The pattern "a" only matches the first 'a'. It doesn't cover the entire "aa".
Example 2:
- s = "aa", p = "a*"
- Output: true
- Why? The * means "zero or more of the preceding 'a'". If we repeat 'a' once, it becomes "aa", which matches s.
Example 3:
- s = "ab", p = ".*"
- Output: true
- Why? . matches "any single character". * means "zero or more of the preceding element". So, .* means "zero or more of any character". This can match "ab" (specifically, it matches a once, then b once).

Constraints:

s and p are pretty short (length up to 20). This is a hint that solutions with complexity like O(len(s) * len(p)) might pass!
s only has lowercase English letters.
p has lowercase English letters, ., and *.
Every * in p is guaranteed to have a valid preceding character (no *abc).

The Intuition: "Aha!" - It's a Matching Game! 🎮

When you see problems asking if one string "matches" another, especially with wildcards or characters that depend on previous elements, your brain should immediately start thinking about Dynamic Programming (DP).

Why DP? Because matching involves checking if sub-parts of s match sub-parts of p. If s[0...i] matches p[0...j], it likely depends on whether s[0...i-1] matched p[0...j-1], or some other combination. We can build up our solution from smaller, simpler matches!

Think of it like this: If you're trying to match s and p, you look at their last characters.

If they're simple characters and match, great! You just need to check if the rest of s matches the rest of p.
If p has a . (dot), it's a free pass! It matches any character in s. So again, check the rest.
If p has a * (asterisk), this is where it gets interesting. A * can represent zero occurrences, one occurrence, or multiple occurrences. This means we have choices, and DP is excellent for exploring choices efficiently!

The Approach: Building Our DP Table 🏗️

Let's define our DP state:

dp[i][j] will be a boolean value, true if the first i characters of s (i.e., s[0...i-1]) match the first j characters of p (i.e., p[0...j-1]), and false otherwise.

We'll create a (m+1) x (n+1) 2D array (or list of lists in Python), where m = len(s) and n = len(p). The extra row/column is for handling empty prefixes.

1. Initialization (Base Cases)

dp[0][0] = true: An empty string "" always matches an empty pattern "". This is our starting point!
dp[i][0] = false for i > 0: A non-empty string s can never match an empty pattern "". These will remain false by default.
dp[0][j] for j > 0: An empty string s can match patterns like "a*" or "a*b*".
- If p[j-1] is *, then dp[0][j] can be true if dp[0][j-2] is true. This means the X* part (where X is p[j-2]) is effectively skipped, matching zero occurrences of X.
- Example: For s="", p="a*b*".
  - dp[0][0] = true
  - p[1] is *. dp[0][2] (for a*) = dp[0][0] = true.
  - p[3] is *. dp[0][4] (for a*b*) = dp[0][2] = true.

2. Filling the DP Table (Iterative Logic)

We'll iterate i from 1 to m (for s) and j from 1 to n (for p).

For each dp[i][j], we consider s[i-1] (the current character in s) and p[j-1] (the current character in p).

Case 1: p[j-1] is NOT * (it's a regular character or .)

If p[j-1] is a regular letter OR p[j-1] is . (the wildcard):

We need s[i-1] to match p[j-1] (or p[j-1] is . which matches anything).
If they match, then dp[i][j] is true if and only if the previous parts also matched: dp[i-1][j-1].

if p[j-1] == s[i-1] or p[j-1] == '.':
dp[i][j] = dp[i-1][j-1]

Case 2: p[j-1] IS *

This is the most complex part, as * gives us two main options for the preceding element p[j-2]:

Option A: * matches zero occurrences of p[j-2]
- If * matches zero occurrences, it's like p[j-2] and p[j-1] (*) aren't even there.
- So, dp[i][j] would be true if s[0...i-1] matches p[0...j-3] (which is dp[i][j-2]).
- dp[i][j] = dp[i][j-2] (initial state for this cell)
Option B: * matches one or more occurrences of p[j-2]
- This is only possible if s[i-1] (the current character in s) actually matches p[j-2] (the character before *). Remember . also counts as a match!
- If s[i-1] matches p[j-2] (or p[j-2] is .):
  - * matches one occurrence: s[0...i-1] matches p[0...j-1] if s[0...i-2] matched p[0...j-1] (meaning s[i-1] is another occurrence of p[j-2]). This is dp[i-1][j].
  - We combine this with Option A: dp[i][j] = dp[i][j] OR dp[i-1][j]
- if p[j-2] == s[i-1] or p[j-2] == '.': dp[i][j] = dp[i][j] or dp[i-1][j]

Putting it all together for *:

elif p[j-1] == '*':
dp[i][j] = dp[i][j-2] # Option A: '*' matches zero preceding elements
if p[j-2] == s[i-1] or p[j-2] == '.':
dp[i][j] = dp[i][j] or dp[i-1][j] # Option B: '*' matches one or more

Finally, after filling the entire table, our answer is simply dp[m][n].

The Code ✨

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        m, n = len(s), len(p)

        # dp[i][j] will be true if s[0...i-1] matches p[0...j-1]
        # (m+1) rows for string s, (n+1) columns for pattern p
        dp = [[False] * (n + 1) for _ in range(m + 1)]

        # Base case: Empty string matches empty pattern
        dp[0][0] = True

        # Handle patterns like "a*", "a*b*", etc., matching an empty string s
        # dp[i][0] for i > 0 will remain False as a non-empty string cannot match an empty pattern.
        for j in range(1, n + 1):
            # If the current pattern character is '*', it might match zero occurrences
            # of the preceding element. So, dp[0][j] depends on dp[0][j-2].
            if p[j - 1] == '*':
                dp[0][j] = dp[0][j - 2]

        # Fill the DP table
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                # Case 1: Current pattern character is a letter or '.'
                # If s[i-1] matches p[j-1] (or p[j-1] is '.'), then the match depends on
                # whether the preceding parts (s[0...i-2] and p[0...j-2]) matched.
                if p[j - 1] == s[i - 1] or p[j - 1] == '.':
                    dp[i][j] = dp[i - 1][j - 1]

                # Case 2: Current pattern character is '*'
                elif p[j - 1] == '*':
                    # Option A: '*' matches zero occurrences of the preceding element p[j-2]
                    # In this scenario, we effectively ignore p[j-2] and p[j-1] ('*').
                    # So, dp[i][j] is true if dp[i][j-2] is true.
                    dp[i][j] = dp[i][j - 2]

                    # Option B: '*' matches one or more occurrences of the preceding element p[j-2]
                    # This is only possible if s[i-1] matches p[j-2] (or p[j-2] is '.').
                    # If it matches, then dp[i][j] can also be true if dp[i-1][j] is true.
                    # dp[i-1][j] means s[0...i-2] already matched p[0...j-1], and now s[i-1]
                    # is another instance matching p[j-2], so the '*' consumes s[i-1].
                    if p[j - 2] == s[i - 1] or p[j - 2] == '.':
                        dp[i][j] = dp[i][j] or dp[i - 1][j]

        return dp[m][n]

Time & Space Complexity Analysis ⏱️

Time Complexity: O(m * n)
- We are filling an m x n DP table. Each cell calculation takes constant time (a few comparisons and lookups).
- Given m, n <= 20, this is 20 * 20 = 400 operations, which is extremely fast!
Space Complexity: O(m * n)
- We use a 2D dp array of size (m+1) x (n+1) to store our results.
- Similarly, for m, n <= 20, this is 21 * 21 = 441 booleans, a very small memory footprint.

Key Takeaways ✨

Dynamic Programming (DP) for String Matching: Many string-related problems, especially those involving subproblems and choices (like * representing zero or more), are excellent candidates for DP.
Careful Base Cases: Correctly initializing your DP table, especially for empty strings and patterns, is crucial.
Handling Wildcards: Special characters like . and * require breaking down their behavior into distinct cases (. is simple, * involves "zero occurrences" vs. "one or more occurrences").
Building from Smaller Problems: The beauty of DP is seeing how dp[i][j] relies on previously computed dp values, allowing us to build up the complete solution efficiently.

This problem is a fantastic way to solidify your understanding of DP and string manipulation. Keep practicing, and you'll be a regex master in no time!

Author Account: p1Hzd8mRM8
Published: 2026-05-21 14:35:24

DEV Community