
ib-dev-cpp


A NEW REGEX ENGINE IS HERE!

Over the past week, I've been working on Mregex, a new free and open regular expression engine that is still under development.

Enough talking; let's explore its features.

Like any regex engine, it supports the following:

  • sub-expressions:
#include <Mregex.h>

int main ( void ) {
    // The backslash is doubled so that "\0" reaches the engine instead of a NUL byte
    char regex [] = "(he), \\0llo", str [] = "he, hello";
    short c = Regex ( regex, str, NULL, 0 );
    // Will return SUCCES
}

The code above defines two strings. The regex string contains
one sub-expression that matches "he"; the pattern then matches
", ", matches the first sub-expression again (a back-reference),
and finally matches "llo".
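
One C detail is worth keeping in mind with back-references: inside a C string literal, `\0` is the NUL terminator, so the backslash has to be doubled for the two characters `\` and `0` to reach the engine. The sketch below uses only the standard library and makes no Mregex calls; `pattern_length`, `truncated` and `escaped` are hypothetical names chosen for illustration.

```c
#include <string.h>

/* Hypothetical helper: how many bytes of the pattern a C string
   function (and so any engine that takes a char *) can see. */
size_t pattern_length(const char *pattern) {
    return strlen(pattern);
}

/* "\0" in a literal is a NUL byte: the pattern stops after "(he), ". */
static const char truncated[] = "(he), \0llo";

/* "\\0" is a backslash followed by '0': the whole pattern survives. */
static const char escaped[] = "(he), \\0llo";
```

Calling `pattern_length(truncated)` gives 6, while `pattern_length(escaped)` gives 11, so only the second literal carries the back-reference and the trailing "llo".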

  • Anchors:
    ^: Match the beginning of the string, or the beginning of the line in multi-line mode (more on that later).
    $: Match the end of the string, or the end of the line in multi-line mode.
#include <Mregex.h>

int main ( void ) {
    char regex [] = "^ $", str [] = " ";
    short c = Regex ( regex, str, NULL, 0 );
    // Will return SUCCES
}
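
To make the anchor rules concrete, here is a minimal sketch of the two position checks as described above. It is not Mregex's implementation; `at_line_start` and `at_line_end` are hypothetical helpers, and the `multiline` argument stands in for the multi-line mode. The caller is assumed to pass an index no larger than the string length.

```c
#include <stddef.h>

/* Does '^' hold at position i? Assumed semantics: the start of the
   string, or (multi-line mode only) right after a '\n'. */
int at_line_start(const char *text, size_t i, int multiline) {
    if (i == 0) return 1;
    return multiline && text[i - 1] == '\n';
}

/* Does '$' hold at position i? Assumed semantics: the end of the
   string, or (multi-line mode only) right before a '\n'. */
int at_line_end(const char *text, size_t i, int multiline) {
    if (text[i] == '\0') return 1;
    return multiline && text[i] == '\n';
}
```

For the text "ab\ncd", position 3 (the 'c') is a line start only when `multiline` is non-zero, and position 2 (the '\n') is a line end only in that mode.
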
  • Character class:
    [char-group]: Match any single character in the char-group.
    [start-end]: Char range: Match any character between the start and end in the ASCII Table.
    . : Match any single character in the ASCII table except '\n' and the NUL byte
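
The three rules above can be sketched in a few lines of plain C. This is an illustration of the semantics only, not how Mregex does it; `in_char_class` and `dot_matches` are hypothetical helpers, and the class is passed as the text between the brackets (for example "a-z0-9_").

```c
#include <stddef.h>

/* Hypothetical helper: does c belong to the class body `cls`,
   i.e. the text between '[' and ']'? */
int in_char_class(const char *cls, char c) {
    for (size_t i = 0; cls[i] != '\0'; i++) {
        /* "x-y" is a range, unless '-' is the last character. */
        if (cls[i + 1] == '-' && cls[i + 2] != '\0') {
            if (cls[i] <= c && c <= cls[i + 2]) return 1;
            i += 2;  /* skip the '-' and the range end */
        } else if (cls[i] == c) {
            return 1;
        }
    }
    return 0;
}

/* '.' as described above: anything except '\n' and the NUL byte. */
int dot_matches(char c) {
    return c != '\n' && c != '\0';
}
```

With the class body "a-z0-9_", the characters 'q', '5' and '_' are accepted and 'Q' is rejected. Treating a trailing '-' as a literal is a choice made for this sketch.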

  • Quantifier:
    *: Match the previous element zero or more times
    +: Match the previous element one or more times
    ?: Match the previous element zero or one time
    {min, max}: Match the previous element at least min times but no more than max times
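
All four quantifiers can be read as special cases of {min, max}. The sketch below shows that idea for a single literal character, matching greedily and without backtracking; it is a simplification for illustration, not Mregex's algorithm, and `match_repeat` is a hypothetical helper.

```c
#include <limits.h>

/* Hypothetical helper: greedily match `c` repeated between min and
   max times at the start of s. Returns the number of characters
   consumed, or -1 if fewer than min repetitions are present.
   '*' is {0, INT_MAX}, '+' is {1, INT_MAX}, '?' is {0, 1}. */
int match_repeat(const char *s, char c, int min, int max) {
    int count = 0;
    while (count < max && s[count] != '\0' && s[count] == c) {
        count++;
    }
    return count >= min ? count : -1;
}
```

For example, `match_repeat("aaab", 'a', 2, 3)` consumes 3 characters, while `match_repeat("ab", 'a', 2, 3)` fails with -1 because only one 'a' is available.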

  • Modes:
    NoCase: case-insensitive matching
    NoWhite: ignore all white space
    MultiLine: Use multiline mode, where ^ and $ match the beginning and end of each line (instead of the beginning and end of the input string)

The Regex function

The Regex function takes 4 parameters:
the first parameter is the regex string
the second parameter is the text string (the string to be matched)
the third parameter is not yet implemented, so pass NULL
the fourth parameter is the mode flags: either 0 or one or more modes combined with |, such as NoCase | NoWhite

Regex ( regex, str, NULL, MultiLine | NoWhite )
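
Combining modes with `|` suggests they are bit flags. Under that assumption, the sketch below shows how such flags are typically defined and tested. The `SKETCH_` names and their values are made up for this example; the real constants come from Mregex.h and may differ.

```c
/* Assumed values, for illustration only: one distinct bit per mode
   so that several modes can be combined with '|'. */
enum sketch_mode {
    SKETCH_NO_CASE    = 1 << 0,
    SKETCH_NO_WHITE   = 1 << 1,
    SKETCH_MULTI_LINE = 1 << 2
};

/* Test whether a given mode is set in a combined flags value. */
int mode_enabled(int flags, enum sketch_mode mode) {
    return (flags & mode) != 0;
}
```

With `SKETCH_MULTI_LINE | SKETCH_NO_WHITE` as the flags value, `mode_enabled` reports the multi-line and no-white modes as set and the no-case mode as clear. Passing 0 leaves every mode off.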

  • IMPORTANT: Whenever you use '.+' or '.*', make sure that the next character is a literal character or an escaped character, like 'v' or '\(', but MAKE SURE IT'S NOT an unescaped '('

So, what are you waiting for? Go ahead and try it out.
Found a bug? Let us know at modernbinaryutility@gmail.com
