
The beauty of VerbalExpressions

George Marr ・1 min read

At some point in our lives we've all had to encounter the wonders of Regular Expressions (Regex). If you don't know what Regex is, in short it's a special type of string that describes a pattern to search for. It tends to be very useful in development, and I commonly find myself using it a lot (sometimes way too much). It generally looks something along the lines of this:

[Image: an example regular expression]

This looks a little messy and confusing. Although Regex is very powerful in a lot of fields, I don't think anybody likes writing it. That was until I discovered a little beauty called VerbalExpressions. VerbalExpressions takes normal Regex and boils it down a lot to make it more efficient to write and easier to use. Here, for example:

[Image: an example written with VerbalExpressions]

Compared to the above, this is a lot easier to follow and more efficient to work with, and it makes your code a whole lot cleaner. Personally I love using this tool. For a long time I found that working in Regex would always cause me a lot of problems, seeing as one missing or misplaced character would make the entire thing a pain to fix.
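To give a rough idea of how a fluent builder of this kind can work, here is a minimal self-contained sketch. It is not the VerbalExpressions library itself: the `PatternBuilder` class, its method names and the `escapeLiteral` helper are made up for illustration, and the real library's API may differ.

```javascript
// Hypothetical mini-builder for illustration only (NOT the VerbalExpressions API).

// Escape regex metacharacters so literal text is matched as-is.
function escapeLiteral(text) {
  return text.replace(/[.*+?^${}()|[\]\\-]/g, "\\$&");
}

class PatternBuilder {
  constructor() {
    this.source = "";
  }
  // Anchor the pattern at the start of the input.
  startOfLine() { this.source += "^"; return this; }
  // Anchor the pattern at the end of the input.
  endOfLine() { this.source += "$"; return this; }
  // Require this exact text.
  literal(text) { this.source += `(?:${escapeLiteral(text)})`; return this; }
  // Allow this exact text to appear zero or one time.
  maybe(text) { this.source += `(?:${escapeLiteral(text)})?`; return this; }
  // Match any run of characters except the ones listed.
  anythingBut(chars) { this.source += `[^${escapeLiteral(chars)}]*`; return this; }
  // Compile the accumulated source into a RegExp.
  toRegExp() { return new RegExp(this.source); }
}

const urlPattern = new PatternBuilder()
  .startOfLine()
  .literal("http")
  .maybe("s")
  .literal("://")
  .maybe("www.")
  .anythingBut(" ")
  .endOfLine()
  .toRegExp();

console.log(urlPattern.test("https://www.example.com")); // true
console.log(urlPattern.test("not a url"));               // false
```

Each method appends a small, already-escaped piece, so a misplaced character is much harder to introduce than when typing the whole pattern by hand.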

And the best thing about it? It's available in 32 different languages, all of which can be found on their GitHub: https://github.com/VerbalExpressions


Discussion

 

"I don't think anybody likes writing it."
=> I do !

RegEx are life, RegEx are everything !

 

Same. Writing a complex RegEx is like a mini-game.

 

Hey buddy!!!

Can you please suggest a good tutorial for learning regex?

 

Regex has slightly different syntax in every language. Personally I learnt it from this website: regular-expressions.info/tutorial....

I was looking for Java RegEx.

Getting prompt suggestions from experts is the beauty of this site!!
Thanks a lot George.

May code bless you... 🙂

rubular.com is another great resource. The basics are going to be pretty much the same from language to language.

 

I have always turned to regexr.com/ when I have needed help with RegEx.

 

Not a tutorial per se, but this puzzle game is quite fun practice for them: regexcrossword.com

 
 

I don't like this syntax. It seems overly bloated compared to the equivalent regex. Your example also doesn't include grouping, which is vital to extracting information from regex-parsed strings. It also doesn't show how counted repetitions, like in the regex graphic, would be used.

I'm afraid this would be ridiculously verbose for non-trivial expressions.
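For anyone unfamiliar with the two features mentioned here, this is roughly what grouping and counted repetition look like in plain JavaScript regex. The date pattern and the `parseIsoDate` helper are only an illustration and are not taken from the article's example.

```javascript
// Named groups extract pieces of the match; {4} and {2} are counted repetitions.
const isoDate = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/;

// Returns the date parts as numbers, or null when the text does not match.
function parseIsoDate(text) {
  const match = isoDate.exec(text);
  if (match === null) return null;
  const { year, month, day } = match.groups;
  return { year: Number(year), month: Number(month), day: Number(day) };
}

console.log(parseIsoDate("2017-08-15")); // { year: 2017, month: 8, day: 15 }
console.log(parseIsoDate("17-8-15"));    // null
```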

 

Personally I kind of see it per project: how much it'll be used and the language in general. For some minor usage, without having to make something overly complicated, I find this would be perfect for what is needed. However, I do agree with your points; in some of my projects (Java and C# specifically) I've stuck with standard Regex.

 

Perhaps there is a happy middle ground. Certainly when I'm doing basic matching I definitely prefer functions like endsWith, contains and startsWith compared to the equivalent regex.
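A small sketch of that middle ground in JavaScript, where the "contains" check is spelled `includes`. The file names are made up for illustration.

```javascript
const fileName = "report-2017.final.csv";

// Built-in string methods: no escaping needed and the intent is obvious.
const viaMethods = {
  starts: fileName.startsWith("report-"),
  contains: fileName.includes(".final"),
  ends: fileName.endsWith(".csv"),
};

// Equivalent regexes: each dot must be escaped or it matches any character.
const viaRegex = {
  starts: /^report-/.test(fileName),
  contains: /\.final/.test(fileName),
  ends: /\.csv$/.test(fileName),
};

// Forgetting the escape silently accepts input that endsWith rejects.
const unescapedRegex = /.csv$/.test("data_csv");     // true
const plainMethod = "data_csv".endsWith(".csv");     // false

console.log(viaMethods, viaRegex, unescapedRegex, plainMethod);
```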

 

Don't use the email regex given in the example; it is wrong.
Here are some email addresses that are valid according to RFC 822 but will not match the regex:

  • me@example.museum (the tld contains 6 characters)
  • John Smith <john@smith.com> (Yes, you can put a name in front of an enclosed email address like that)

Sometimes, using RegEx (or VerbalExpressions) is not appropriate. In this case, you should probably write a parser. Here is an implementation of such a parser in Go: golang.org/src/net/mail/message.go...

 

A few months ago, I needed a really complicated Regex for a project that I was working on. After struggling for a very long time with the syntax, I ended up writing a small library to generate the Regex for me. The thing is: I never used it again, so I never got around to writing decent documentation.

Using my library, your example would look like this:

import { is, start, end, maybe, not, notBefore } from 'frogjs';

new RegExp(is(
  start,
  'http', maybe('s'), '://',
  notBefore(' ', maybe('www.')),
  end
));

If you wanna check it out, take a look here: github.com/danielbastos11/frogjs
The test folder is quite complete (I'm surprised, actually).

 

This is awesome! How is this not a billion times more popular than it is?? Thanks for sharing!

 

I think this is a very bad idea; it is much better to learn regex from regular-expressions.info.

I think having a single-line complex regex is a code smell. It is a much better approach to build it from sub-patterns, use multiple regex patterns, or not use regex at all. I have a small example as evidence (note that the code is not tested, or even written in an editor):

Using a single complex pattern (common golden hammer mistake):

function checkIP(address){
    if (!(/^(?:\d|[1-9]\d|1\d{2,2}|2[0-4]\d|25[0-5])(?:\.(?:\d|[1-9]\d|1\d{2,2}|2[0-4]\d|25[0-5])){3,3}$/).test(address))
        throw new InvalidIP();
}

Using multiple small patterns in combination with language features:

function checkIP(address){
    let parts = address.split(".");
    if (parts.length !== 4)
        throw new InvalidIP();
    parts.forEach((part) => {
        // one to three digits only
        if (!(/^\d{1,3}$/).test(part))
            throw new InvalidIP();
        // no leading zeros such as "01"
        if ((/^0./).test(part))
            throw new InvalidIP();
        let num = parseInt(part, 10);
        if (num > 255)
            throw new InvalidIP();
    });
}

Building the pattern from sub-patterns:

function checkIP(address){
    // String.raw keeps the backslashes; in a normal string "\d" would become "d"
    let lessThan256 = String.raw`(?:\d|[1-9]\d|1\d{2,2}|2[0-4]\d|25[0-5])`;
    let pattern = String.raw`^${lessThan256}(?:\.${lessThan256}){3,3}$`;
    if (!(new RegExp(pattern)).test(address))
        throw new InvalidIP();
}
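One thing to watch when building patterns from strings in JavaScript: a backslash inside a normal string literal is consumed by the string before the regex engine ever sees it. A minimal sketch of the pitfall and two ways around it (the variable names are only illustrative):

```javascript
// In a normal string literal "\d" is just "d": the backslash is dropped.
const plain = "^\d+$";

// Doubling the backslash, or using String.raw, keeps it in the pattern.
const doubled = "^\\d+$";
const raw = String.raw`^\d+$`;

console.log(plain === "^d+$");                // true
console.log(new RegExp(plain).test("123"));   // false: digits no longer match
console.log(new RegExp(raw).test("123"));     // true
console.log(raw === doubled);                 // true
```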
 

Wow man! you've made my life 100x easier!
Thanks

 

Man I freaking hate regex because it makes me feel so dumb. No matter how much I try to learn it, I never seem to really understand it. I hate it lol