Ben Halpern

Posted on Jan 5, 2022

How do you regex?

#regex #discuss

I made a thread yesterday called How do you feel about regex?

Lots of fun discussion. Lots of people weighed in on their tools and tactics, but I'd love to move the conversation that way.

How do you go about using regex?

What tools do you use? (if not from full memory)
How do you encapsulate/label/comment regex?
What types of problems do you most solve with regex?

Looking forward to any and all comments!

Latest comments (23)

Riccardo Bernardini • Jan 14 '22 • Edited

Tools? None, most of the time. For the most complex regexes, I wrote myself a nice "regexp generator" library that lets me "build" a regexp in a more readable way.
Problems? It depends on the language, actually. In Ruby I use them mostly to parse simple line-based text files. In Ada I use them more sparingly, usually when I need some kind of lexical analyzer. The reason for this difference is that in Ruby using a regexp is much easier: just write something like /[a-z][0-9]+/; in Ada it is a bit more involved.
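To illustrate the "build a regexp from readable pieces" idea, here is a minimal JavaScript sketch. It is not the commenter's library (which isn't shown in the thread); the helper names charRange, oneOrMore, sequence and wholeString are hypothetical and exist only for this example.

```javascript
// Hypothetical helpers (not from any real library): each returns a regex source string.
const charRange = (from, to) => `[${from}-${to}]`;
const oneOrMore = (part) => `(?:${part})+`;
const sequence = (...parts) => parts.join("");

// Assumption for this sketch: we anchor the pattern so it must match the whole string.
const wholeString = (source, flags = "") => new RegExp(`^${source}$`, flags);

// Readable equivalent of /[a-z][0-9]+/ from the comment above, anchored.
const identifier = wholeString(
  sequence(charRange("a", "z"), oneOrMore(charRange("0", "9")))
);

console.log(identifier.source);      // ^[a-z](?:[0-9])+$
console.log(identifier.test("x42")); // true
console.log(identifier.test("42x")); // false
```

The point of the design is that each helper has a name that says what it does, so the final expression reads top-down instead of symbol by symbol.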

lepinekong • Jan 9 '22 • Edited

My approach is to decompose the regex iteratively, and I write down the mental process behind it; otherwise I just learn and forget, learn and forget, "ad vitam aeternam", like I used to in the past :D Example below. I'm using a FigJam document (free with figma.com) to create code notes for this, with the help of a plugin I'm building to generate the whole expression from the parts. It's especially important to be able to understand a regex you wrote weeks or months before, so I also note matching and non-matching samples above each regex part. I also embed the regexr.com/ playground in the FigJam doc. In the future, by improving the plugin, I will be able to have a direct real-time playground while playing with the parts.

That's how I've become much more confident writing my own regex without googling, which, by the way, is an increase in productivity ;)
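The decomposition idea can also be sketched directly in code. This is only an illustration, not the commenter's FigJam plugin: the parts array, the date example, and the failingSamples helper are all assumptions made up for this sketch.

```javascript
// Hypothetical parts of a YYYY-MM-DD pattern. Each part carries samples
// that should match and samples that should not, as described above.
const parts = [
  { name: "year", source: "\\d{4}", matching: ["2022"], nonMatching: ["22", "20222"] },
  { name: "month", source: "(?:0[1-9]|1[0-2])", matching: ["01", "12"], nonMatching: ["00", "13"] },
  { name: "day", source: "(?:0[1-9]|[12]\\d|3[01])", matching: ["09", "31"], nonMatching: ["00", "32"] },
];

// Test one part in isolation and return the samples that behave unexpectedly.
function failingSamples(part) {
  const whole = new RegExp(`^${part.source}$`);
  return [
    ...part.matching.filter((sample) => !whole.test(sample)),
    ...part.nonMatching.filter((sample) => whole.test(sample)),
  ];
}

// Generate the whole expression from the parts, one named group per part.
const dateRegex = new RegExp(
  "^" + parts.map((p) => `(?<${p.name}>${p.source})`).join("-") + "$"
);

for (const part of parts) {
  console.log(part.name, failingSamples(part)); // an empty array means the samples agree
}
console.log(dateRegex.exec("2022-01-09").groups.month); // 01
```

Keeping the samples next to each part means the notes double as a small regression check whenever a part is edited.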

૮༼⚆︿⚆༽つ • Jan 7 '22

ask copilot
got dumb answer? so $askagain
test that! curl_cat_whatever $something | rg -n -w $regex
bug? visualize regex as railroad diagram (I'm still searching CLI similar to npeg but for regex while being a stand-alone binary)
ditch regex and just use PEG 😂

Matt Ellen-Tsivintzeli • Jan 6 '22

I mostly use regex for searching files for something. A string I sort of remember, but not quite, or if I want to double check the "find all instances of" type thing in an IDE.

For whatever reason I find grep easier to use than "find in files" of most IDEs.

I don't usually have to look up how anything works, because I've learnt that now, with the exception of look ahead and look behind, and sometimes I forget which is the end anchor and which is the start anchor.
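As a quick reminder of the two things mentioned here, this small JavaScript sketch shows the anchors and the positive lookaround forms. The variable names and sample strings are made up for illustration, and other regex flavours may spell or support these differently.

```javascript
// ^ anchors the start of the input, $ anchors the end.
const startsWithDigit = /^\d/;
const endsWithDigit = /\d$/;

console.log(startsWithDigit.test("1abc")); // true
console.log(startsWithDigit.test("abc1")); // false
console.log(endsWithDigit.test("abc1"));   // true

// Lookahead (?=...) checks what follows the match; lookbehind (?<=...)
// checks what precedes it. Neither becomes part of the matched text.
const numberBeforePx = /\d+(?=px)/;
const numberAfterDollar = /(?<=\$)\d+/;

console.log("12px 40em".match(numberBeforePx)[0]);    // 12
console.log("cost: $30".match(numberAfterDollar)[0]); // 30
```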

The problem I hit most often is forgetting what I have to escape. For example, what I have to escape in emacs is different to what I have to escape in bash.
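The comment is about emacs and bash, but the same escaping problem shows up when a JavaScript regex is built from literal text. A minimal sketch, assuming JavaScript's regex syntax; escapeRegExp is a commonly used helper pattern written out here, not a built-in.

```javascript
// Escape every character that has a special meaning in a JavaScript regex,
// so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literal = "1+1=2 (approx.)";

// Unescaped, "+", "(", ")" and "." are interpreted as regex syntax.
const unescaped = new RegExp(literal);
// Escaped, the pattern matches the text character for character.
const escaped = new RegExp(escapeRegExp(literal));

console.log(unescaped.test(literal)); // false
console.log(escaped.test(literal));   // true
console.log(escapeRegExp("a.b"));     // a\.b
```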

Rafi • Jan 6 '22

I use the CLI tool grex; it generates a regex from example text to match.

Bernd Wechner • Jan 6 '22

My contexts for RE use, in order of frequency of late:

Python
Bash
.NET (C#)

What tools do I use? In order again:

My IDE (as in I just write the thing, been writing REs since the '80s so pretty familiar with them)
The documentation for the tool (because different flavours trip me up from time to time of course)
General on-line search (which generally takes me to the first)
On-line testing and diagnostic tools if I can't work out why my RE isn't matching when I think it should, including:
- regexr.com/
- regexper.com/

What types of problems?

RE problems, doh! ;-).
More seriously, the class of problem REs are ideal for, which includes primarily:
- Any spot need in Python or C# where I have to detect or extract specific patterns in a string
- When using CLI tools in bash or writing shell scripts, it's not long before an RE in a grep or sed or such is called for.
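A small sketch of the "detect or extract specific patterns in a string" case. It is written in JavaScript rather than Python or C# to stay consistent with the other snippets in this thread; the log line and pattern are invented for illustration.

```javascript
// Hypothetical input: key=value pairs embedded in a line of text.
const logLine = "user=alice id=42 user=bob id=7";

// Detect: is there at least one numeric id in the line?
const hasNumericId = /\bid=\d+\b/.test(logLine);

// Extract: collect every key=value pair using named groups and matchAll
// (matchAll requires the g flag).
const pairPattern = /(?<key>\w+)=(?<value>\w+)/g;
const pairs = [...logLine.matchAll(pairPattern)].map(
  (match) => [match.groups.key, match.groups.value]
);

console.log(hasNumericId); // true
console.log(pairs.length); // 4
console.log(pairs[1]);     // [ 'id', '42' ]
```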

Yeah, could be an age or generation thing but I call them REs not regexes so much.

Essentially REs fill the gap between:

The very basic string find, extract and split tools that many languages provide, like Excel does for example, and that the basic string types in many modern languages provide
Full on grammar parsing.

In between these two extremes is a rich territory of spot pattern testing and string manipulation that a terse pattern definition language serves, and the one that essentially came to dominate is called "regular" ;-), probably mainly because from its earliest inceptions it was designed, and intended, to be supported by diverse tools in the *nix landscape of the day.

Vicente G. Reyes • Jan 6 '22

I wrote (WIP) an article explaining regex, which I shared with my colleagues. These colleagues have little to no knowledge of how regex works, hence the urge to write it and help them get started, since we're the last people who work on clients' projects before giving them back. Our work includes making sure the data's clean and consistent. notion.so/vicentereyes/Introductio...

scottshipp • Jan 5 '22 • Edited

Some people, when confronted with a problem, think 'I know, I’ll use regular expressions.' Now they have two problems.

—Jamie Zawinski

Hence, I only use regex when I have to. And I usually just end up using the built-in language features for it.

Calin Baenen • Jan 5 '22

What tools do you use? (if not from full memory)

regex101, as stated yesterday.
Otherwise I try to do things from memory, or look shiz up.

How do you encapsulate/label/comment regex?

I don't.

What types of problems do you most solve with regex?

ParseJS with "abstract tokens", that'd allow for you to make a programming language with usable identifiers.

Derek Enos • Jan 5 '22 • Edited

I usually work from memory and try to use as many named capturing groups as possible because I find that it serves to provide basic, inline, documentation of the pattern itself, and provides a more expressive way of accessing the groups on the match result:

const regex = /(?<first>[^\-])-(?<second>[^\-])-(?<rest>.+)/

const { groups } = regex.exec("1-2-3-4-5")

groups.first
'1'
groups.second
'2'
groups.rest
'3-4-5'
