DEV Community

Discussion on: Daily Challenge #232 - Regex Pattern

qm3ster profile image
Mihail Malo • Edited

// JavaScript

const regex_contains_all = chars =>
  new RegExp(
    chars
      .split("")
      .map(c => `(?=.*${c})`)
      .join("")
  )

// Test

const abc = 'abc'
const re = regex_contains_all(abc)
;[
  re.test('bca'),    // true
  re.test('baczzz'), // true
  re.test('ac'),     // false: no 'b'
  re.test('cb')      // false: no 'a'
]

This uses "positive lookahead" assertions to check several patterns against the same span of input: each lookahead starts at the same position and consumes nothing.
This allows me to avoid listing all the permutations.
That is not to say that the runtime will be anything but atrocious.
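To make the generated pattern concrete, here is a minimal sketch that prints the regex source for `'abc'`. It also escapes regex metacharacters, which the version above does not (a `.` in `chars` would otherwise match any character). The names `escapeRegex` and `regexContainsAllEscaped` are made up for this illustration.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex
const escapeRegex = c => c.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// Same idea as regex_contains_all, with each character escaped first
const regexContainsAllEscaped = chars =>
  new RegExp([...chars].map(c => `(?=.*${escapeRegex(c)})`).join(''))

const reAbc = regexContainsAllEscaped('abc')
console.log(reAbc.source) // (?=.*a)(?=.*b)(?=.*c)

// With escaping, '.' must appear literally in the tested string
const reDot = regexContainsAllEscaped('a.')
console.log(reDot.test('.a'), reDot.test('ab')) // true false
```

Note that `.` does not match line terminators by default, so a required character that only appears after a newline would not be found unless the `s` flag is added.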

Truly a one-liner:

const regex_contains_all = chars => new RegExp(chars.replace(/./g,'(?=.*$&)'))

kenbellows profile image
Ken Bellows • Edited

Winner right here. Lookaheads are the one regex feature I have not spent nearly enough time investigating. From what I've seen, they dramatically broaden the scope of what you can do with regexes.

quoll profile image
Paula Gearon

I don't consider myself a regex expert, but this is a feature I've never used and had forgotten about. Very nice, thank you!

craigmc08 profile image
Craig McIlwrath

I was interested in how the positive lookahead compares to a list of all permutations. I did some basic benchmarking and found that your solution appears to run in logarithmic time, which seems strange. This could be an error in my testing. Whatever the case, listing all permutations is certainly much, much slower. Here's a chart (the y-axis is in nanoseconds, on a logarithmic scale):

graph

The data for "All Permutations" isn't complete. 7 characters is all the Node.js regex engine can handle (any longer and the regex is too large). I only did 6 because I was getting impatient waiting.
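For reference, an "all permutations" regex could be built roughly like this. This is only a sketch of one plausible construction, not necessarily the one benchmarked; `permutations` and `regexAllPermutations` are made-up names, and it assumes distinct characters with no regex metacharacters.

```javascript
// Generate every ordering of the given characters (n! strings)
const permutations = chars =>
  chars.length <= 1
    ? [chars]
    : [...chars].flatMap((c, i) =>
        permutations(chars.slice(0, i) + chars.slice(i + 1)).map(rest => c + rest)
      )

// One alternation branch per ordering, e.g. a.*b.*c|a.*c.*b|...
const regexAllPermutations = chars =>
  new RegExp(permutations(chars).map(p => [...p].join('.*')).join('|'))

const rePerm = regexAllPermutations('abc')
console.log(rePerm.source)
// a.*b.*c|a.*c.*b|b.*a.*c|b.*c.*a|c.*a.*b|c.*b.*a
```

The number of branches grows as n!, so 7 characters already means 5040 alternatives, while the lookahead version only needs one group per character.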

qm3ster profile image
Mihail Malo

Wow!
How long are the strings you test against?
I'd be interested in ridiculously long strings that contain a lot of some of the required letters but none of the others, and in how finally including one of the missing letters at the beginning affects the runtime.

craigmc08 profile image
Craig McIlwrath

Each test was on 100k strings of length [30..39] composed of random lowercase letters. The patterns were slices of increasing length from the beginning of the alphabet ('a', 'ab', 'abc', etc.). The same test cases were used for all data points.
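A setup along those lines might look like the sketch below. It is not the actual benchmark code: the helper names (`randomString`, `makeCases`, `timePerTest`) are hypothetical, the case count is reduced to 1000 to keep it quick, and it assumes Node.js for `process.hrtime.bigint()`.

```javascript
// Random lowercase string of the given length
const randomString = len =>
  Array.from({ length: len }, () =>
    String.fromCharCode(97 + Math.floor(Math.random() * 26))
  ).join('')

// `count` strings with lengths drawn uniformly from 30..39
const makeCases = count =>
  Array.from({ length: count }, () =>
    randomString(30 + Math.floor(Math.random() * 10))
  )

// Run re.test over every case; report match count and mean nanoseconds per test
const timePerTest = (re, cases) => {
  const start = process.hrtime.bigint()
  let matches = 0
  for (const s of cases) if (re.test(s)) matches++
  const elapsed = process.hrtime.bigint() - start
  return { matches, nsPerTest: Number(elapsed) / cases.length }
}

// The same cases are reused for every pattern: 'a', 'ab', 'abc', ...
const cases = makeCases(1000)
for (const pattern of ['a', 'ab', 'abc', 'abcd']) {
  const re = new RegExp([...pattern].map(c => `(?=.*${c})`).join(''))
  const { matches, nsPerTest } = timePerTest(re, cases)
  console.log(pattern, matches, nsPerTest.toFixed(1))
}
```

A single pass like this is noisy; repeating the loop several times and discarding the first runs would give steadier numbers.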

If I have some time tomorrow, I may do some more tests with longer strings/placement of the required letters. I'll update you!

Maybe I'll add in a regex-less implementation for comparison too.
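One possible regex-less version for that comparison, as a minimal sketch (`containsAll` is a made-up name, and this is just one obvious way to write it):

```javascript
// True when every character of `chars` occurs somewhere in `str`
const containsAll = (chars, str) => [...chars].every(c => str.includes(c))

console.log([
  containsAll('abc', 'bca'),    // true
  containsAll('abc', 'baczzz'), // true
  containsAll('abc', 'ac'),     // false
  containsAll('abc', 'cb')      // false
])
```

Each `includes` call scans the string at most once, so the work is bounded by the number of required characters times the string length.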