DEV Community

Asking for review on non-string regular expressions

Rémy 🤖 on July 26, 2019

There is this idea I have and that I want to push forward about "non-string regular expressions". I explained things as I could on the repo. ...
 
rootfs.ext2.gz

I had a brief look at it, and whilst I'm predominantly a Java developer, I have done some Python scripting in the past so I can look at the code and feel comfortable with it.

I'll try and answer your questions:

Do I understand what this is?

I think so - I believe it's a way to search for a value in complicated objects, and those values and/or objects may or may not be strings.

It feels like it is making Regex more human readable.

Do I see applications for this?

Kind of - I saw right at the bottom that the performance was quoted as being "terrible", so it's not something I'd happily use in a production environment. I can imagine it would be excellent for searching for a piece of data in a complex JSON or XML object, but only if the performance is decent.

Does the API look nice?

This is where being predominantly a Java developer will probably fail me. To me, it actually reminds me a lot of old-school Java, in that it's very verbose to express something simple. What I imagine would be nice is something like

re.on(datatype).match(expression), with a limited set of expressions available for each datatype. But I expect that would be lengthier to code, and maintaining a codebase like that would be hell.

But then again I'm not an expert in Python
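
To make the shape of that suggestion concrete, here is a minimal sketch of an `on(datatype).match(expression)` style entry point. Everything in it is hypothetical: `on`, `TypedMatcher`, the `EXPRESSIONS` table, and the extra `items` argument are illustrative names, not part of nsre, and it only checks each item against one named expression rather than matching a sequence.

```python
from typing import Any, Callable, Dict, Sequence

# Hypothetical sketch: none of these names exist in nsre.
# Each data type gets a small, fixed vocabulary of named expressions.
EXPRESSIONS: Dict[type, Dict[str, Callable[[Any], bool]]] = {
    dict: {
        "is_image": lambda d: d.get("type") == "image",
        "is_text": lambda d: d.get("type") == "text",
    },
    str: {
        "is_blank": lambda s: s.strip() == "",
    },
}


class TypedMatcher:
    """Matcher bound to one data type and its allowed expressions."""

    def __init__(self, datatype: type) -> None:
        if datatype not in EXPRESSIONS:
            raise TypeError(f"no expressions registered for {datatype.__name__}")
        self.datatype = datatype
        self.allowed = EXPRESSIONS[datatype]

    def match(self, expression: str, items: Sequence[Any]) -> bool:
        """Return True if every item has the bound type and satisfies the expression."""
        if expression not in self.allowed:
            raise ValueError(f"{expression!r} is not valid for {self.datatype.__name__}")
        predicate = self.allowed[expression]
        return all(isinstance(item, self.datatype) and predicate(item) for item in items)


def on(datatype: type) -> TypedMatcher:
    """Entry point mirroring the on(datatype).match(expression) shape."""
    return TypedMatcher(datatype)
```

With this sketch, `on(dict).match("is_image", items)` is allowed, while `on(dict).match("is_blank", items)` is rejected because that expression is only registered for strings.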

What features would I want to see around that?

Mainly efficient, easy-to-understand regex formatting with various data types, like JSON, XML, CSS, perhaps even just a massive String which represents a text file.

What would I want before using this in production?

Mostly speed, to be honest, and possibly support for the above data types? But that might be a stretch. The API is a nice-to-have, but I'd rather not enforce that on any Python developer, as I'm someone who is interested in the code but isn't an expert in the language.


Honestly I think what you made is cool, even if it is in another language that I don't use! 😂 Oh well. It's a good start and I think it definitely has promise!

Keep up the good work! 👍

 
anpos231

Maybe I fail to understand, but how is this:

from nsre import *

re = AnyNumber(
    Symbol(KeyHasValue("type", "image")) + Maybe(KeyHasValue("type", "caption"))
) + Range(KeyHasValue("type", "text"), min=1)

assert re.match(
    [
        {"type": "image", "url": "https://img1.jpg"},
        {"type": "image", "url": "https://img2.jpg"},
        {"type": "image", "url": "https://img3.jpg"},
        {"type": "caption", "text": "Image 3"},
        {"type": "image", "url": "https://img4.jpg"},
        {"type": "caption", "text": "Image 4"},
        {"type": "image", "url": "https://img5.jpg"},
        {"type": "text", "text": "Hello"},
        {"type": "text", "text": "Foo"},
        {"type": "text", "text": "Bar"},
    ]
)

Better than this:

[
  {"type": "image", "url": "https://img1.jpg"},
  {"type": "image", "url": "https://img2.jpg"},
  {"type": "image", "url": "https://img3.jpg"},
  {"type": "caption", "text": "Image 3"},
  {"type": "image", "url": "https://img4.jpg"},
  {"type": "caption", "text": "Image 4"},
  {"type": "image", "url": "https://img5.jpg"},
  {"type": "text", "text": "Hello"},
  {"type": "text", "text": "Foo"},
  {"type": "text", "text": "Bar"},
]
  .filter(x => (x.type === "image") || (x.type === "caption"))
  .filter(x => x.text)
  .map(x => x.text.length)
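
For comparison, here is a minimal sketch of what the nsre pattern above appears to express, written with Python's built-in `re` module by encoding each item's type as a single letter. This is not how nsre works internally: the letter codes and the helper name `matches_layout` are assumptions made for illustration, and the sketch assumes the pattern has to cover the whole list.

```python
import re
from typing import Dict, List

# Assumed one-letter codes for each item type (illustrative only).
TYPE_CODES = {"image": "i", "caption": "c", "text": "t"}

# (image, optional caption) repeated any number of times, then one or more texts.
LAYOUT = re.compile(r"(?:ic?)*t+")


def matches_layout(items: List[Dict[str, str]]) -> bool:
    """Check item order by encoding types as letters and running a string regex."""
    # Unknown or missing types become "?" so they can never match the pattern.
    encoded = "".join(TYPE_CODES.get(item.get("type", ""), "?") for item in items)
    return LAYOUT.fullmatch(encoded) is not None
```

The difference this is meant to show: a filter chain tests each item on its own, while the pattern constrains the order in which the items appear.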
 
megazear7

The power of regular expressions, in my opinion, is being able to encapsulate complex logic in a few characters with a standardized syntax. Being able to do the same type of thing on an array of JavaScript objects might open up some options, but the stream operations of map, reduce, etc. are already powerful and flexible, so this would need to provide something different, cleaner, simpler, or more concise. However, I do understand what it is doing, and sometimes it's good to build tools and then see what unforeseen things they can do once you have them to mess around with.