re: Exploring the Linguistics Behind Regular Expressions VIEW POST

VIEW FULL DISCUSSION

I love it :D I'm a regex fiend and I never get tired of them.

The one thing I wanted to mention, and it might just be I'm misreading, but I know it took me a long time to figure out when I first learned about it so I always like to clarify it: my understanding and experience thus far is that, assuming no flags, /+/ and /*/ are both greedy and NOT possessive in isolation; it is specifically appending + to itself or a /*/ that creates the non-backtracking.

ie:
/r*r/ will match "narwhal" and "narrative". It will also match the contrived "narrrrrrrp". It will match one less than the pattern needs, giving one back.

Similarly, as expected, /r+r/ will not match "narwhal" but will match "narrative". Same for "narrrrrrrp". (I feel like I missed a pirate joke opportunity...) If you do "narrrp" you can see that it is not possessive: it will match, whereas otherwise it shouldn't because it wouldn't allow the pattern to backtrack for that last "r".

/r*+r/ will match none of those, no matter how long a pirate noise you make.

I played around some on Regex101 with it--I added a capture group just to make it clearer. The /*/ and /+/ always matches one less in its capture group 1 than in the full capture group--until the match flat-out fails with /*+/. Same for "abcorgi"-- /.*corgi/ and /.+corgi/ both match "abcorgi"; but I believe /.*?corgi/ matches "bcorgi", /.+corgi/ matches "abcorgi", and /.*+corgi/ fails. The final version is "take any character as an option, as many as possible, and don't give any back"--which is going to stop just about everything else.

Thank you for taking the time to write out these examples! You're correct that I have typos in two of the string examples I gave (which I'll update now).

my understanding and experience thus far is that, assuming no flags, /+/ and // are both greedy and NOT possessive in isolation; it is specifically appending + to itself or a // that creates the non-backtracking

You taught me something new today, thank you. :)

code of conduct - report abuse