Use stateful JavaScript Regular Expressions

Anthony Ng

When I ran this regular expression, I got back the result that I expected.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

But when I ran it a second time, I got back null.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

regex.exec(text);
// => null

Strange 🤔

And when I ran it a third time, it worked.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

regex.exec(text);
// => null

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

The regular expression works every other time.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

regex.exec(text);
// => null

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

regex.exec(text);
// => null

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

What's happening?

I was using a regular expression with a global flag. This creates a stateful Regular Expression object (RegExp).

JavaScript RegExp objects are stateful when they have the global or sticky flags set (e.g. /foo/g or /foo/y).

Source - MDN

The RegExp has a property called "lastIndex". It holds the index where the RegExp will start its next search. After a successful match with the global flag set, "lastIndex" moves to the position just past the end of that match, so the RegExp object effectively remembers where its previous search ended.

We can print out "lastIndex" and see how it changes.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
console.log(regex.lastIndex)
// => 3

regex.exec(text);
console.log(regex.lastIndex)
// => 0

regex.exec(text)
console.log(regex.lastIndex)
// => 3

regex.exec(text);
console.log(regex.lastIndex)
// => 0

After the first search, the RegExp "remembers" that it ended its previous search at index 3. The next time the RegExp runs, it starts its search at index 3. It looks at the end of the string for another match, and it can't find one. So it returns null.

There are rules about how "lastIndex" resets itself. See MDN for more details.

The lastIndex is a read/write integer property of regular expression instances that specifies the index at which to start the next match.

Source - MDN
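Because "lastIndex" is writable, you can also set it yourself to choose where the next search begins. Here is a minimal sketch; the variable names and the sample text are made up for illustration.

```javascript
const wordRegex = /abc/g;
const repeatedText = 'abc abc abc';

// Start the search at index 1, which is past the beginning of the first "abc".
wordRegex.lastIndex = 1;

const nextMatch = wordRegex.exec(repeatedText);

console.log(nextMatch.index);
// => 4 (the second "abc")

console.log(wordRegex.lastIndex);
// => 7 (just past the end of that match)
```

This only has an effect on RegExps with the global or sticky flag. Without those flags, exec() always starts at index 0.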

In our scenario, "lastIndex" resets to 0 when the search fails to find a match. This explains why the results alternated between a match and null.
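The same alternating behavior shows up with test(), which also reads and updates "lastIndex". It is easy to hit when one global RegExp is shared across calls. In this small sketch, "hasAbc" is a hypothetical helper, not something from a library.

```javascript
// One RegExp object shared by every call. The "g" flag makes it stateful.
const containsAbc = /abc/g;

// Hypothetical helper that looks harmless but depends on leftover state.
function hasAbc(text) {
  return containsAbc.test(text);
}

const results = ['abc', 'abc', 'abc', 'abc'].map(hasAbc);

console.log(results);
// => [ true, false, true, false ]
```

Every input is identical, yet every second call starts at index 3 and reports false.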

Workarounds

One workaround is to reset "lastIndex" to 0 after every search.

const regex = /abc/g;

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]
regex.lastIndex = 0;

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]
regex.lastIndex = 0;

This tells the RegExp to start the next search at index 0, which is the start of the string.
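If you reset "lastIndex" in many places, you could wrap the reset in a small function. This is only a sketch, and "execFromStart" is a name I made up for it.

```javascript
// Hypothetical helper: always search from the start of the text.
function execFromStart(regex, text) {
  regex.lastIndex = 0; // discard any state left over from earlier searches
  return regex.exec(text);
}

const sharedRegex = /abc/g;

const firstResult = execFromStart(sharedRegex, 'abc');
const secondResult = execFromStart(sharedRegex, 'abc');

console.log(firstResult[0], firstResult.index);
// => abc 0

console.log(secondResult[0], secondResult.index);
// => abc 0
```

Resetting before the search, rather than after it, means the helper still works if some other code used the same RegExp in between.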

Or remove the "global" flag if you don't actually need it.

const regex = /abc/; // removed "global" flag

const text = 'abc';

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

regex.exec(text)
// => [ 'abc', index: 0, input: 'abc', groups: undefined ]

// "lastIndex" stays at 0 without the global flag, so there is nothing to reset.

So be on the lookout for this "lastIndex" if you see any weirdness with your regular expressions.

What does the global flag actually do?

Regular expressions with the global flag don't stop after finding their first match. This has some useful applications.

You can do global string replacements.

const nonGlobalRegex = /foobar/;
let string = 'foobar foobar foobar';
let result = string.replace(nonGlobalRegex, 'marsha');

console.log(result);
// => marsha foobar foobar

const globalRegex = /foobar/g;
result = string.replace(globalRegex, 'marsha');
console.log(result);
// => marsha marsha marsha

You can also iterate over your string for each Regular Expression match that you find.

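let string = 'foobar_foobar_foobar_foobar';

let regex = /foobar/g;

// Each call to exec() resumes from regex.lastIndex, so the loop visits every match.
let currentMatch = regex.exec(string);
while (currentMatch !== null) {
  console.log(currentMatch);
  currentMatch = regex.exec(string);
}
// The loop ends when exec() returns null, which also resets lastIndex to 0.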
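If your environment supports String.prototype.matchAll (ES2020), it gives you the same iteration without the manual exec() loop. A rough sketch with made-up variable names:

```javascript
const underscoreText = 'foobar_foobar_foobar_foobar';
const foobarRegex = /foobar/g; // matchAll throws a TypeError without the "g" flag

const matchIndexes = [];
for (const match of underscoreText.matchAll(foobarRegex)) {
  matchIndexes.push(match.index);
}

console.log(matchIndexes);
// => [ 0, 7, 14, 21 ]

// matchAll works on an internal copy of the RegExp, so ours is left untouched.
console.log(foobarRegex.lastIndex);
// => 0
```

Because the original RegExp keeps its "lastIndex" at 0, this version avoids the alternating-result surprise from earlier in the post.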
