DEV Community

Tony Metzidis

Streaming regex scanner — regexpscanner

Go's regexp package falls short for stream processing: nearly all of its methods require a string or []byte. The regexpscanner module makes it easy to extract tokens that match regular expression patterns from a stream.

https://pkg.go.dev/github.com/tonymet/regexpscanner

Install Module

go get github.com/tonymet/regexpscanner@latest

Example Usage

Use ProcessTokens when a simple callback-based stream tokenizer is needed.
ProcessTokens calls handler(string) for each matching token from the Scanner.

package main

import (
    "fmt"
    "regexp"
    "strings"

    rs "github.com/tonymet/regexpscanner"
)

func main() {
    rs.ProcessTokens(
        strings.NewReader("<html><body><p>Welcome to My Website</p></body></html>"),
        regexp.MustCompile(`</?[a-z]+>`),
        func(text string) {
            fmt.Println(text)
        })
}

Output

<html>
<body>
<p>
</p>
</body>
</html>

Give it a try, and see the Go Module Page for more examples.
