The extraordinary behavior of match()

Jalal ・2 min read

If you've had a very rough week like I have, let's do some coding therapy: something healing and productive at the same time.

There are many ways to describe my relationship with regex. Complicated at best, confusing most of the time.

It's something I usually try to avoid, but eventually, you have to face it and
get over it. And yes, no matter how much I pretend to know it, inside, I know I don't.

That was the case when our friends from the lovely dev community pointed out another solution for counting and getting statistics from a string: using match instead of split.

String.prototype.match()

match() itself is simple. It searches a given string and returns an array of results based on a regular expression, or null when nothing matches.

const regex = /cat/g;
"fat cat, fast cat".match(regex);

// (2) ["cat", "cat"]

/cat/g looks for c followed by a followed by t, and the g flag returns every match rather than just the first. Let's see the result for this one:

"category: fat cat, fast cat".match(regex);

// (3) ["cat", "cat", "cat"];

Unexpected? Maybe. But it's also clear: you got what you asked for, since cat is in category. Need a different output? Make the pattern more specific.
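To see what match() hands back in different situations, here is a small sketch. The variable names are illustrative only, and the non-global call is shown purely for contrast with the g flag used in this post.

```javascript
const sentence = "fat cat, fast cat";

// With the g flag: an array containing every matched substring.
const allMatches = sentence.match(/cat/g);
console.log(allMatches); // ["cat", "cat"]

// Without the g flag: only the first match, with extra properties such as index.
const firstMatch = sentence.match(/cat/);
console.log(firstMatch[0], firstMatch.index); // "cat" 4

// No match at all: null, not an empty array.
const noMatch = sentence.match(/dog/g);
console.log(noMatch); // null
```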

Let's change the pattern. I need to match cat preceded by whitespace \s, then character c followed by a followed by t, and ending with whitespace, a dot, or a comma: [\s.,]. Inside a character class there is no need for |, which would be matched as a literal pipe character.

const regex = /\s(cat)[\s.,]/g;
"category. fat cat, fast cat. category".match(regex);

// (2) [" cat,", " cat."]

A better result indeed. At least category is not counted.
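The whitespace-delimited approach needs a character on both sides of cat, so it misses a cat at the very start or end of the string and keeps the delimiters in the result. One possible alternative, not used in the original solution, is the word boundary \b. A minimal sketch with a made-up sentence:

```javascript
const text = "cat. category, fat cat, fast cat";

// \b matches the position between a word character and a non-word character
// (or the start/end of the string) without consuming any character.
const wholeWords = text.match(/\bcat\b/g);
console.log(wholeWords); // ["cat", "cat", "cat"]

// The whitespace-delimited pattern skips the first and last "cat"
// and includes the surrounding characters in the match.
const delimited = text.match(/\s(cat)[\s.,]/g);
console.log(delimited); // [" cat,"]
```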

So, to continue what we've already started in the previous post, let's recap some shorthands we need to know before we start counting:

\w: matches a word character: a letter, digit, or underscore [a-zA-Z0-9_]
+: matches the preceding token one or more times

Which means \w+ matches a whole word.

"fat cat".match(/\w+/g);
// (2) ["fat", "cat"]
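Because \w only covers letters, digits, and underscore, \w+ treats apostrophes and hyphens as word breaks. A quick sketch with a made-up phrase shows the effect, which matters if the word count needs to be exact:

```javascript
const phrase = "don't over-think snake_case";

// The apostrophe and the hyphen are not word characters, so they split tokens.
// The underscore is a word character, so snake_case stays in one piece.
const tokens = phrase.match(/\w+/g);
console.log(tokens); // ["don", "t", "over", "think", "snake_case"]
```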

\n: matches newline

"fat cat".match(/\n/g);
// null

"fat cat \n fast cat".match(/\n/g);
// (1) ["↵"]

A string with no newline characters is still one line, so the line count is the number of \n matches plus 1.
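The line-counting rule can be sketched as a tiny helper. The name countLines is hypothetical and only used here for illustration:

```javascript
// Hypothetical helper: number of lines = number of "\n" matches + 1.
function countLines(input) {
  // match() returns null when there is no newline, so fall back to [].
  const newlines = (input.match(/\n/g) || []).length;
  return newlines + 1;
}

console.log(countLines("fat cat")); // 1
console.log(countLines("fat cat \n fast cat")); // 2
```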

\s: matches a whitespace character including newline \n and tab \t

"fat cat, fast cat".match(/\s/g);
// (3) [" ", " ", " "]

"fat cat\n fast cat".match(/\s/g);
// (4) [" ", "↵", " ", " "]

spaces = str.match(/\s/g).length - str.match(/\n/g).length (with null treated as an empty array)
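Here is that subtraction as a runnable sketch. The helper name countSpaces is an assumption for illustration. Note that \s also matches tabs, so a tab still counts as a "space" with this approach.

```javascript
// Hypothetical helper: whitespace matches minus newline matches.
function countSpaces(input) {
  const whitespace = (input.match(/\s/g) || []).length;
  const newlines = (input.match(/\n/g) || []).length;
  return whitespace - newlines;
}

console.log(countSpaces("fat cat\n fast cat")); // 3
console.log(countSpaces("fat\tcat")); // 1, because the tab is matched by \s
```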

Building count()

const str = "Hello World\n How are you doing";

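function count(str) {
  // Newline matches; null (no match) falls back to an empty array.
  const lines = (str.match(/\n/g) || []).length;
  // match: (1) ["↵"]

  // All whitespace matches, newlines included.
  const spaces = (str.match(/\s/g) || []).length;
  // match: (6) [" ", "↵", " ", " ", " ", " "]
  // 6 - 1 = 5 once the newlines are subtracted below

  const words = str.match(/\w+/g) || [];
  // (6) ["Hello", "World", "How", "are", "you", "doing"]

  const total = str.length;
  // 30

  return {
    lines: lines + 1, // a string with no newline is still one line
    spaces: spaces - lines, // whitespace minus newlines
    words, // the array of matched words, not a count
    total,
  };
}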

Note: str.match(reg) || [] is used because match returns null, not an empty array, when nothing matches.
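To try it out, here is a short usage sketch. It repeats the same logic as count() in compact form so the snippet runs on its own, and it includes an empty string to show why the || [] fallback matters.

```javascript
// Same logic as count() above, repeated so this snippet is self-contained.
function count(str) {
  const lines = (str.match(/\n/g) || []).length;
  const spaces = (str.match(/\s/g) || []).length;
  const words = str.match(/\w+/g) || [];
  return { lines: lines + 1, spaces: spaces - lines, words, total: str.length };
}

const stats = count("Hello World\n How are you doing");
console.log(stats.lines, stats.spaces, stats.words.length, stats.total); // 2 5 6 30

// An empty string matches nothing, so every match() call returns null;
// without the || [] fallback, reading .length would throw a TypeError.
const empty = count("");
console.log(empty.lines, empty.spaces, empty.words.length, empty.total); // 1 0 0 0
```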

Here's a good resource for learning regex: github/learn-regex. You can also practice regex live via regexr.


Please leave ⭐️ if you like it. Feedback is more than welcome 👋👋👋

Posted on Apr 5 by:

Jalal

@jalal246

I love to create things in JS.
