DEV Community

Yeom suyun

A few things about regular expressions in JavaScript

Regular expressions are a powerful tool for matching and manipulating text in JavaScript. They have been part of the language since the ES3 specification in 1999, reflecting JavaScript's early role in processing text on web pages.
A complex regular expression can be slower than hand-optimized JavaScript logic, but for typical string processing a regular expression is often faster than the equivalent hand-written code, and usually much shorter.

Processing order of regular expressions

JavaScript regular expressions work in three steps.

  1. When we declare a regular expression, the JavaScript engine compiles it (engines may defer this until the first use).
  2. When we call a method on the regular expression or on a string, the compiled regular expression is run against the string and produces match data.
  3. The called method uses the string and the match data to build its return value.

// 1. A regex that matches every zero-width position preceded by "AAA" or followed by "AAA"
const regex = /(?<=AAA)|(?=AAA)/g

// 2. The regex matches six zero-width ranges: 0~0, 1~1, ..., 5~5.
"AAAAA".replace(regex, "B")

// 3. Each matched range is replaced with "B", so the result is "BABABABABAB".
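
The match data from step 2 can be inspected directly. The following is a minimal sketch, assuming an engine that supports lookbehind and String.prototype.matchAll (both ES2018 or later); the names positionRegex, matchIndices, and replaced are only illustrative.

```javascript
const positionRegex = /(?<=AAA)|(?=AAA)/g;
const input = "AAAAA";

// Step 2: run the regex against the string and collect the match data.
// Every match here is an empty string, so only its start index is interesting.
const matchIndices = [...input.matchAll(positionRegex)].map((m) => m.index);

// Step 3: replace() builds its result by inserting "B" at each matched position.
const replaced = input.replace(positionRegex, "B");

console.log(matchIndices); // [0, 1, 2, 3, 4, 5]
console.log(replaced);     // "BABABABABAB"
```

The same match data feeds both calls; only the method decides what is returned from it.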

Reading order of regular expressions

JavaScript regular expressions find a match by backtracking: when one way of matching fails, the engine goes back and tries the next alternative. If the string ultimately does not match, every alternative has to be exhausted, which can lead to catastrophic backtracking.
For example, in /(a+)+b/.test("aaac") the a's can be divided between the inner a+ and the outer (...)+ in many ways, and each of them is tried before the match fails. The number of these candidates can be expressed as a function of the number of a's, and it doubles with every additional a.

// Starting at position s, the remaining n - s a's can be split into groups
// in 2 ** (n - s - 1) ways. Summing over every start position gives 2 ** n - 1.
function cases(n) {
  return 2 ** n - 1
}

/(a+)+b/.test("ac") // 1: (a)c
/(a+)+b/.test("aac") // 3: (aa)c, (a)(a)c, a(a)c
/(a+)+b/.test("aaac") // 7: (aaa)c, (aa)(a)c, (a)(aa)c, (a)(a)(a)c, a(aa)c, a(a)(a)c, aa(a)c
/(a+)+b/.test("aaaac") // 15: ...
/(a+)+b/.test("a".repeat(30) + "c") // 1073741823 (this call can take a long time)
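
One common way to avoid the problem is to remove the nested quantifier when it is redundant. The sketch below assumes that only the boolean result of test() matters (the capture group contents differ between the two patterns), and the helper timeTest is hypothetical. Timings vary by engine and machine, so none are shown.

```javascript
const nested = /(a+)+b/; // nested quantifier: many ways to split a run of a's
const flat = /a+b/;      // rewrite: only one way to split the same run

// Time a single test() call in milliseconds.
function timeTest(regex, text) {
  const start = performance.now();
  const result = regex.test(text);
  return { result, ms: performance.now() - start };
}

// 20 a's keeps the slow case short; each extra "a" makes it noticeably slower.
const failing = "a".repeat(20) + "c";

console.log(timeTest(nested, failing)); // result: false
console.log(timeTest(flat, failing));   // result: false
console.log(nested.test("aaab"), flat.test("aaab")); // true true
```

Both patterns accept exactly the strings that contain one or more a's followed by b, so flattening changes the cost of a failed match, not the answer.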

Lookaround's evaluation method

Lookaround in regular expressions can be considered a kind of conditional statement.
It evaluates the condition at the current position.

Pattern    Type                 Matches
X(?=Y)     Positive lookahead   X if followed by Y
X(?!Y)     Negative lookahead   X if not followed by Y
(?<=Y)X    Positive lookbehind  X if after Y
(?<!Y)X    Negative lookbehind  X if not after Y
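
The four rows can be checked against one sample string. This is a small sketch; the string sample and the helper indicesOf are made up for illustration, and lookbehind requires an ES2018 engine.

```javascript
const sample = "XY XZ YX ZX";

// Return the start index of every match of a global regex.
function indicesOf(regex, text) {
  return [...text.matchAll(regex)].map((m) => m.index);
}

console.log(indicesOf(/X(?=Y)/g, sample));  // [0]
console.log(indicesOf(/X(?!Y)/g, sample));  // [3, 7, 10]
console.log(indicesOf(/(?<=Y)X/g, sample)); // [7]
console.log(indicesOf(/(?<!Y)X/g, sample)); // [0, 3, 10]
```

In every case only the X is matched; the Y is inspected but never becomes part of the result.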

For example, the regular expression /a(b)c(?=.*\1)/g first matches the string "abc", then checks if the first group, "b", is present in the following characters.

"abczb".match(/a(b)c(?=.*\1)/g) //=> ["abc"]
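
A lookahead checks the following text without consuming it. The sketch below uses exec() to make that visible; the variable names are illustrative only.

```javascript
const lookaheadRegex = /a(b)c(?=.*\1)/g;

const match = lookaheadRegex.exec("abczb");
console.log(match[0]);                 // "abc" (the lookahead text "zb" is not included)
console.log(match[1]);                 // "b"
console.log(lookaheadRegex.lastIndex); // 3 (the next search resumes right after "abc")

// With no later "b", the lookahead fails and so does the whole match.
console.log(/a(b)c(?=.*\1)/.test("abcz")); // false
```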

A positive lookahead succeeds as soon as one way of matching its pattern is found. A negative lookahead is the opposite: it succeeds only after every possible way of matching its pattern has been tried and has failed.
This can be used to determine that a string does not contain a specific character, even without using the $ symbol.

console.log(/^[^a]*$/.test("bcdef")) //=> true
console.log(/^[^a]*$/.test("bcdefa")) //=> false
console.log(/^(?!.*a)/.test("bcdef")) //=> true
console.log(/^(?!.*a)/.test("bcdefa")) //=> false
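
The two patterns are not fully interchangeable. The sketch below points out two differences: what each pattern consumes, and how each treats a line break. The variable names are illustrative, and the s (dotAll) flag requires an ES2018 engine.

```javascript
// The anchored class consumes the whole string; the lookahead consumes nothing.
const consumed = "bcdef".match(/^[^a]*$/)[0];  // "bcdef"
const asserted = "bcdef".match(/^(?!.*a)/)[0]; // ""

// "." does not match a line break by default, so the lookahead cannot see
// an "a" on a later line unless the s (dotAll) flag is set.
const multiline = "b\na";
console.log(/^[^a]*$/.test(multiline));   // false
console.log(/^(?!.*a)/.test(multiline));  // true
console.log(/^(?!.*a)/s.test(multiline)); // false
```

For input that may contain line breaks, the s flag (or [^] in place of the dot) keeps the lookahead version consistent with the character class version.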

Conclusion

I personally like regular expressions very much. They can improve performance and simplify JavaScript code while also reducing its size, unlike WASM, which adds size through glue code and the binary itself. I hope you will keep these caveats in mind when you use regular expressions.

Thank you.
