Go's regexp package falls short for stream processing: nearly all of its methods require a string or []byte. The regexpscanner module makes it easy to extract tokens that match regular expression patterns from a stream.
https://pkg.go.dev/github.com/tonymet/regexpscanner
Install Module
go get github.com/tonymet/regexpscanner@latest
Example Usage
Use ProcessTokens when you need a simple callback-based stream tokenizer. It calls handler(string) for each matching token from the Scanner.
package main

import (
	"fmt"
	"regexp"
	"strings"

	rs "github.com/tonymet/regexpscanner"
)

func main() {
	rs.ProcessTokens(
		strings.NewReader("<html><body><p>Welcome to My Website</p></body></html>"),
		regexp.MustCompile(`</?[a-z]+>`),
		func(text string) {
			fmt.Println(text)
		})
}
Output
<html>
<body>
<p>
</p>
</body>
</html>
Give it a try, and see the Go Module Page for more examples.