DEV Community

Discussion on: Regex was taking 5 days to run. So I built a tool that did it in 15 minutes.

Vinay Pai

Out of curiosity, how long would a simple replace() take on your document set? Regular expressions are a good tool when you need to do complex matches, but they are pretty inefficient when you're doing a simple text replacement.

str.replace() is likely to be far more efficient than re.sub() when you're just doing simple string matching and not really using any of the power of regular expressions.
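
For anyone who wants to measure this on their own data, here is a minimal sketch of the comparison. The sample text and the "fox"/"cat" terms are made up for illustration, and no timings are claimed here since they depend on the Python version and the input.

```python
import re
import timeit

# Hypothetical sample document; swap in real text to get meaningful numbers.
text = "the quick brown fox jumps over the lazy dog " * 1000


def with_str_replace(s):
    # Plain substring replacement, no regex engine involved.
    return s.replace("fox", "cat")


# Compile once so the benchmark does not include pattern compilation.
_literal = re.compile(re.escape("fox"))


def with_re_sub(s):
    # Same literal replacement, routed through the regex engine.
    return _literal.sub("cat", s)


# Both approaches should produce identical output for a literal term.
assert with_str_replace(text) == with_re_sub(text)

if __name__ == "__main__":
    t_str = timeit.timeit(lambda: with_str_replace(text), number=100)
    t_re = timeit.timeit(lambda: with_re_sub(text), number=100)
    print(f"str.replace: {t_str:.4f}s  re.sub: {t_re:.4f}s")
```

Note that this only compares a single literal term; it says nothing about how either approach scales to thousands of terms.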

Vinay Pai

To be clear, it's still likely to be a good bit slower than FlashText, but I'm just curious what the difference is.

Vikash Singh

Hey Vinay,

I had 10K+ terms. It simply didn't make sense to do 20K replace calls. Plus, I need word boundaries to be honoured, so the only choice for me was some re library. Hope that answers your question.

PS: each str.replace() call goes over the entire document/string, so 20K calls * number of docs would be far too much work. It also won't take word boundaries into consideration.
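
A small sketch of both points, using a made-up two-term mapping (the terms, replacements, and function names are illustrative assumptions, not taken from the original data set). The first function makes one full pass over the text per term and ignores word boundaries; the second uses the standard `re` module with `\b` and a single alternation, which is the regex approach described above, not FlashText.

```python
import re

# Hypothetical term -> replacement mapping; the real one had 10K+ entries.
replacements = {"java": "Java", "py": "Python"}


def replace_naive(text, mapping):
    # One str.replace() call per term: each call scans the whole text,
    # and substrings inside longer words are replaced too.
    for term, new in mapping.items():
        text = text.replace(term, new)
    return text


def replace_with_boundaries(text, mapping):
    # Single pass with one compiled alternation; \b keeps matches on
    # whole words only. Longer terms are listed first so they win.
    terms = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    )
    return pattern.sub(lambda m: mapping[m.group(0)], text)


sample = "i like java and javascript, py and pypi"

print(replace_naive(sample, replacements))
# i like Java and Javascript, Python and Pythonpi

print(replace_with_boundaries(sample, replacements))
# i like Java and javascript, Python and pypi
```

The naive version mangles "javascript" and "pypi" because it has no notion of word boundaries. The regex version gets the boundaries right, but a very large alternation is exactly the kind of pattern that became slow in the case discussed in the post.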