DEV Community

Discussion on: Regex was taking 5 days to run. So I built a tool that did it in 15 minutes.

 
vi3k6i5 profile image
Vikash Singh

I will checkout RE2 and Ragel. I had tried Regex in java also and some advanced Regex libraries in python.

The conclusion I came to was that FlashText is fast because it doesn't do backtracking. Any Regex library that does backtracking would be slower, time complexity wise. C/C++ might execute faster, but backtracking will be slow Algo.

I will still checkout the libraries you guy suggested, Thanks :)

Thread Thread
 
matthewsj profile image
Jacob Matthews

Yes -- RE2 doesn't do backtracking precisely so that it can run faster, which is why I thought it would be an interesting comparison. Here is a page that describes the basic idea, and here is a larger pool of resources for efficient regex implementation (considering "real" regular expressions only, not PCRE-compatible regexes that may contain non-regular constructs that would need to be implemented with backtracking).