Mainak Bhattacharjee

From Regex Rampage to Lazy Bliss: My rjq Performance Adventure

Hey there, fellow Rustaceans 🦀!

I've been building a JSON filter tool called rjq, inspired by the awesome jq. But things took a turn for the worse when I hit a performance wall during lexing. The culprit? Compiling regular expressions in a hot loop. It turns out regexes are like hungry hippos: they chomp up performance if you're not careful!
Here's the story of how I tamed the regex beast and saved my program from a slow, sluggish fate:

The Regex Rampage 🦖:

At first, I naively compiled the regex patterns within the lexing loop. This meant every iteration created a brand new regex object. Think of it like baking a whole new pizza for every bite: inefficient, right? This constant recompilation became a major bottleneck, consuming roughly 80% of the execution time.

The LazyLock Solution 🧙‍♂️:

Thankfully, the Rust gods (and some helpful folks on the r/Rust subreddit) pointed me towards lazy initialization, a technique long associated with the lazy_static crate and now available in the standard library as std::sync::LazyLock. It allowed me to compile each regex only once and store it in a thread-safe static. Now it's like having a box of pizza with fresh slices ready whenever you need one: much more efficient!

The Lazy Bliss ✨:

The impact was phenomenal! Performance soared, and my lexing code became as smooth as butter. No more regex rampage, just happy filtering.
Want to See the Code?
Curious about the details? Head over to my GitHub repo for rjq: https://github.com/mainak55512/rjq

Lessons Learned 📚:

  • Regex compilation can be expensive, so keep it out of hot loops!
  • Embrace lazy initialization for performance gains.
  • There's always a better way to do things in Rust (and life!)

So, the next time you encounter a performance bottleneck, remember – there might be a lazy solution waiting to be discovered!

P.S. If you have any other tips or tricks for optimizing JSON filtering in Rust, leave a comment below!

But wait, there's more!

Let's dive deeper into the technical aspects of this adventure.
Understanding lazy_static and LazyLock

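  • lazy_static: This macro, from the lazy_static crate, declares static variables that are initialized lazily on first access, and only once, even in a multi-threaded environment.
  • LazyLock: This type comes from the standard library (std::sync::LazyLock, stable since Rust 1.80), not from the lazy_static crate. It provides the same one-time, thread-safe lazy initialization without a macro or an extra dependency.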
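
The "initialized only once, even across threads" guarantee can be observed directly. The following is a small sketch using only the standard library (Rust 1.80 or newer); `INIT_CALLS`, `EXPENSIVE`, and `init_calls_after_concurrent_use` are hypothetical names, and the `String` is just a stand-in for an expensive value such as a compiled regex.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;
use std::thread;

// Counts how many times the initializer closure actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for an expensive value such as a compiled regex (hypothetical).
static EXPENSIVE: LazyLock<String> = LazyLock::new(|| {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    "compiled once".to_string()
});

/// Reads the lazy static from several threads and returns how many
/// times the initializer ran.
fn init_calls_after_concurrent_use() -> usize {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| EXPENSIVE.len()))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

However many threads touch `EXPENSIVE`, the closure runs a single time; later accesses only read the stored value.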

Here's a simplified example of how I used LazyLock to optimize the regex compilation in rjq:

Outside the hot loop:

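use regex::Regex;
use std::sync::LazyLock;
// Compiled once, on first access. Matches an integer or a decimal at the start of the input.
static MATCH_NUMBER: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\d+(\.\d+)?").unwrap());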

...and so on

Inside the hot loop:

    // A single `find` both tests for a match and returns it,
    // so a separate `is_match` call is not needed.
    if let Some(m) = MATCH_NUMBER.find(&source_string[cursor..]) {
        let val = m.as_str();
        cursor += val.len();
        token_array.push_back(token(TokenType::NUMBER, val.to_string()));
    } else if ... so on

As you can see, MATCH_NUMBER is declared as a LazyLock static, so the regex is compiled only once, the first time it is accessed, instead of on every iteration. LazyLock also guarantees that this one-time initialization is thread-safe.

Additional Performance Tips

  • Profiling: Use tools like perf or cargo-flamegraph to identify other performance bottlenecks in your code.
  • Data Structures: Choose appropriate data structures for your use case. For example, consider using HashMap for efficient lookups.
  • Algorithms: Optimize algorithms to reduce computational complexity.
  • Memory Management: Be mindful of memory allocations and deallocations.
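
As one illustration of the memory-management tip, a lexer can hand out tokens that borrow from the input instead of allocating a new `String` per token. This is a generic sketch rather than anything taken from rjq; the `Token` type, `split_tokens`, and the whitespace-based splitting are hypothetical.

```rust
/// Hypothetical token type that borrows its text from the source string
/// instead of owning a freshly allocated `String`.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    text: &'a str,
}

/// Splits on whitespace and returns borrowed tokens.
fn split_tokens(source: &str) -> Vec<Token<'_>> {
    // Rough capacity guess to avoid repeated reallocation as the Vec grows.
    let mut tokens = Vec::with_capacity(source.len() / 2 + 1);
    for text in source.split_whitespace() {
        tokens.push(Token { text });
    }
    tokens
}
```

The trade-off is that borrowed tokens cannot outlive the source string, so this fits best when the input stays alive for the whole filtering run.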

By following these tips and leveraging techniques like lazy initialization, you can significantly improve the performance of your Rust applications.

Happy coding 🎉!
