DEV Community

Discussion on: Regex was taking 5 days to run. So I built a tool that did it in 15 minutes.

Collapse
 
matthewsj profile image
Jacob Matthews

I would be interested in the comparison between flashtext versus RE2. In my experience you almost always want to use RE2 in any language that supports it. It is vastly faster than any other real RE implementation I've used and its limitations (i.e., you can't use the non-regular PCRE features) only stop you from doing things you shouldn't be doing with a regular expression anyway.

Collapse
 
dbeecham profile image
Daniel Beecham

I agree. I would be even more interested in a comparison between flashtext and a well designed Ragel parser (or similar; I've heard that the Rust RE is pretty good too).

Thread Thread
 
vi3k6i5 profile image
Vikash Singh

I will checkout RE2 and Ragel. I had tried Regex in java also and some advanced Regex libraries in python.

The conclusion I came to was that FlashText is fast because it doesn't do backtracking. Any Regex library that does backtracking would be slower, time complexity wise. C/C++ might execute faster, but backtracking will be slow Algo.

I will still checkout the libraries you guy suggested, Thanks :)

Thread Thread
 
matthewsj profile image
Jacob Matthews

Yes -- RE2 doesn't do backtracking precisely so that it can run faster, which is why I thought it would be an interesting comparison. Here is a page that describes the basic idea, and here is a larger pool of resources for efficient regex implementation (considering "real" regular expressions only, not PCRE-compatible regexes that may contain non-regular constructs that would need to be implemented with backtracking).