
Ghazanfar Uruj


Find the regex that can freeze your Python service - before it ships

A regex like ^(\w+)+$ looks harmless. But feed it a long string that almost matches, such as a run of word characters followed by a single "!", and the engine backtracks exponentially: each extra character roughly doubles the work. A couple dozen characters can take seconds, and a slightly longer input can peg a CPU core and stall the thread for hours or more. That's a ReDoS (Regular-expression Denial of Service), and a single user-facing input field is all it takes.
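To see the growth for yourself, here is a minimal timing sketch using only the standard library. The helper name time_near_miss is made up for illustration, and the input sizes are kept deliberately small so the script finishes quickly; absolute timings will vary by machine and Python version.

```python
import re
import time

PATTERN = re.compile(r"^(\w+)+$")


def time_near_miss(n):
    """Return the seconds spent failing to match n word characters plus '!'."""
    text = "a" * n + "!"  # almost matches: only the last character is wrong
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' can never satisfy \w
    return elapsed


if __name__ == "__main__":
    # Sizes are intentionally small; going much past 25 can take a long time.
    for n in (8, 12, 16, 20):
        print(f"{n:2d} chars: {time_near_miss(n):.4f}s")
```

Each step of four characters should cost noticeably more than the last, which is the signature of exponential backtracking.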

I wanted those patterns caught automatically, in CI, so I built redos — a small, dependency-free CLI.

[Screenshot: redos flagging a vulnerable regex]

How it works

redos never imports or runs your code. It parses each .py file with the standard-library ast module, collects every literal pattern passed to re / regex (re.compile, re.match, re.search, …), then parses each one with Python's own regular-expression parser and inspects the parsed structure for the two classic causes of catastrophic backtracking:

  • Nested quantifiers — a repeated group that itself repeats, like (a+)+, (a*)*, or ([a-z]+)*.
  • Ambiguous alternation under a repeat — branches that can match the same text, like (a|a)+.
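The pipeline described above can be sketched roughly as follows. This is not the tool's actual implementation: the function names, the list of re functions, and the deliberately narrow heuristic (an unbounded repeat that directly wraps another unbounded repeat) are all assumptions made for illustration. It covers only the nested-quantifier case, and it relies on CPython's private regex parser module, whose internals are not a stable API.

```python
import ast

try:
    from re import _parser as sre_parse  # Python 3.11+
except ImportError:
    import sre_parse  # Python 3.10 and earlier

# Hypothetical subset of the module and function names worth checking.
REGEX_MODULES = {"re", "regex"}
REGEX_FUNCS = {"compile", "match", "search", "fullmatch", "findall", "sub", "split"}
REPEATS = (sre_parse.MAX_REPEAT, sre_parse.MIN_REPEAT)


def literal_patterns(source):
    """Yield (line, pattern) for string literals passed to re.<func>(...)."""
    for node in ast.walk(ast.parse(source)):  # parse only, never execute
        if not (isinstance(node, ast.Call) and node.args):
            continue
        func, first = node.func, node.args[0]
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id in REGEX_MODULES
                and func.attr in REGEX_FUNCS
                and isinstance(first, ast.Constant)
                and isinstance(first.value, str)):
            yield node.lineno, first.value


def unwrap(items):
    """Peel off group wrappers that enclose exactly one item."""
    items = list(items)
    while len(items) == 1 and items[0][0] is sre_parse.SUBPATTERN:
        items = list(items[0][1][-1])  # last field of the group is its body
    return items


def is_unbounded_repeat(item):
    op, av = item
    return op in REPEATS and av[1] == sre_parse.MAXREPEAT


def has_nested_quantifier(items):
    """True if an unbounded repeat directly wraps another unbounded repeat."""
    for op, av in items:
        if op in REPEATS:
            inner = unwrap(av[2])
            if (is_unbounded_repeat((op, av)) and len(inner) == 1
                    and is_unbounded_repeat(inner[0])):
                return True
            children = [av[2]]
        elif op is sre_parse.SUBPATTERN:
            children = [av[-1]]
        elif op is sre_parse.BRANCH:
            children = av[1]
        else:
            children = []
        if any(has_nested_quantifier(child) for child in children):
            return True
    return False


def scan(source):
    """Return sorted (line, pattern) pairs whose pattern looks risky."""
    return sorted(
        (line, pattern)
        for line, pattern in literal_patterns(source)
        if has_nested_quantifier(sre_parse.parse(pattern))
    )


if __name__ == "__main__":
    sample = 'import re\nNAME = re.compile(r"^(\\w+)+$")\nWORD = re.compile(r"\\w+")\n'
    for line, pattern in scan(sample):
        print(f"line {line}: {pattern}")
```

The heuristic is intentionally conservative: a pattern such as (a+b)+ is not flagged, because the inner repeat is followed by a literal that removes the ambiguity. A real checker would also need to handle invalid patterns, aliased imports, and the ambiguous-alternation case.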

Because it analyses the pattern after Python's own optimiser runs, it doesn't flag patterns the engine already makes safe — (abc|abd)+ has its common prefix factored out, (\w|\d)+ collapses to a single character class. That keeps false positives low, which is the thing that actually kills adoption of a linter.
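You can observe the parser's rewriting with a short sketch like the one below. The helper names iter_ops and keeps_alternation are hypothetical, and the check simply asks whether any BRANCH node survives parsing. It uses CPython's private parser module, so the exact behaviour is an implementation detail that may differ between Python versions.

```python
try:
    from re import _parser as sre_parse  # Python 3.11+
except ImportError:
    import sre_parse  # Python 3.10 and earlier


def iter_ops(items):
    """Yield every opcode in a parsed pattern, descending into sub-items."""
    for op, av in items:
        yield op
        if op in (sre_parse.MAX_REPEAT, sre_parse.MIN_REPEAT):
            yield from iter_ops(av[2])  # body of the repeat
        elif op is sre_parse.SUBPATTERN:
            yield from iter_ops(av[-1])  # body of the group
        elif op is sre_parse.BRANCH:
            for branch in av[1]:  # each alternative
                yield from iter_ops(branch)


def keeps_alternation(pattern):
    """True if the parsed pattern still contains a BRANCH node."""
    return sre_parse.BRANCH in iter_ops(sre_parse.parse(pattern))


if __name__ == "__main__":
    for pattern in (r"(abc|abd)+", r"(\w|\d)+", r"(ab|cd)+"):
        verdict = "BRANCH remains" if keeps_alternation(pattern) else "no BRANCH left"
        print(pattern, "->", verdict)
```

On recent CPython versions the first two patterns lose their alternation during parsing, while (ab|cd)+ keeps it because its branches share no prefix and are longer than one character.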

Using it

pip install redos
redos .                  # scan the current project
redos . --format json    # machine-readable output

It exits non-zero when it finds a risk, so redos . drops straight into CI or a pre-commit hook:

repos:
  - repo: https://github.com/gazzycodes/redos
    rev: v0.1.0
    hooks:
      - id: redos

Why a static tool?

An AI assistant can spot a dangerous regex if you ask it about that specific line — but redos is deterministic, runs headless in CI on every commit in milliseconds, reasons over your entire project at once, and never sends your code anywhere. It's the cheap, reliable gate, the same reason linters and type checkers keep thriving.

It's open source (MIT) and early — feedback and contributions welcome: https://github.com/gazzycodes/redos
