DEV Community

amaendeepm

How Flamegraph Helped Me Optimize a Rust Application for Intensive Data Transformation and Migration

Rust has a reputation for performance and speed. Even so, when you work on performance-critical Rust applications, especially those doing intensive data transformation and migration, finding the bottlenecks that hold performance back can be challenging. I faced exactly this challenge with a Rust application responsible for processing and migrating large datasets. The initial implementation was slow, and I needed a way to pinpoint the performance issues. Flamegraph, a visualization tool, helped me do that and optimize my code effectively. Here's how I used Flamegraph to transform my application's performance.

The Problem: Slow Data Migration
My Rust application was designed to migrate data from one system to another, involving complex transformations and large datasets. Despite Rust's reputation for performance, the process was taking hours to complete. I suspected inefficiencies in the code, but without concrete data, optimizing it felt like guesswork.

That's when I decided to use Flamegraph to visualize where the CPU cycles were being spent. Flamegraph provided a clear, hierarchical view of my application's execution, making it easy to identify hotspots.

What is Flamegraph?
Flamegraph is a profiling tool that generates an interactive SVG graph of your application's sampled call stacks. The width of each stack frame corresponds to the amount of time spent in that function and the functions it calls, allowing you to quickly identify performance bottlenecks.
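To make the "width equals time" idea concrete, here is a minimal sketch of how sampled stacks can be collapsed into counts and how a frame's relative width falls out of those counts. The sample data and the `fold_stacks` and `width_of` helpers are hypothetical and only illustrate the principle; this is not how cargo-flamegraph is implemented internally.

```rust
use std::collections::BTreeMap;

/// Collapse sampled call stacks into "folded" form: one entry per unique
/// stack, with the number of samples in which that stack was observed.
fn fold_stacks(samples: &[Vec<&str>]) -> BTreeMap<String, usize> {
    let mut folded = BTreeMap::new();
    for stack in samples {
        // Frames are joined root-first, e.g. "main;migrate;transform".
        *folded.entry(stack.join(";")).or_insert(0) += 1;
    }
    folded
}

/// Fraction of all samples in which `function` appears anywhere on the stack.
/// This is the total share of the graph's width taken by that function.
fn width_of(samples: &[Vec<&str>], function: &str) -> f64 {
    if samples.is_empty() {
        return 0.0;
    }
    let hits = samples
        .iter()
        .filter(|stack| stack.contains(&function))
        .count();
    hits as f64 / samples.len() as f64
}

fn main() {
    // Hypothetical samples, as a sampling profiler might capture them.
    let samples = vec![
        vec!["main", "migrate", "transform"],
        vec!["main", "migrate", "transform"],
        vec!["main", "migrate", "persist"],
        vec!["main", "load"],
    ];
    for (stack, count) in fold_stacks(&samples) {
        println!("{stack} {count}");
    }
    println!(
        "transform width: {:.0}%",
        width_of(&samples, "transform") * 100.0
    );
}
```

A function that shows up in many samples gets a wide frame, regardless of how many times it was called.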

For Rust, cargo-flamegraph makes it easy to generate flamegraphs. It plugs into Cargo as a subcommand and provides a straightforward way to profile your application.

Setting Up Flamegraph for Rust
Here’s how I set up Flamegraph for my Rust application:

Install cargo-flamegraph:

cargo install flamegraph

Profile Your Application:
Run your application with cargo flamegraph. This generates a flamegraph SVG file (flamegraph.svg by default) in the current directory.

cargo flamegraph --bin my_app

Analyze the Flamegraph:

Open the generated SVG file in your browser. The flamegraph shows a hierarchical view of your program's execution: each box is a function, callers sit below the functions they call, and the widest boxes are the most time-consuming.

What the Flamegraph Revealed
When I first generated the flamegraph for my application, the results were striking and immediately piqued my curiosity. Here's what I discovered:

Inefficient Database Queries:
The flamegraph showed that a significant amount of time was spent executing database queries. This was due to fetching large datasets in a single query without proper batching.

Expensive Data Transformation:
A hotspot was identified in the transmute_response function, which was responsible for transforming data. The function was processing data sequentially, leading to unnecessary delays.

Unnecessary Cloning:
The flamegraph highlighted that a lot of time was spent cloning data structures, particularly in the persist_identifiable_message function. This was due to passing owned data instead of references.
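The cloning finding is easy to reproduce in miniature. The sketch below contrasts a function that takes ownership, which forces a caller to clone whenever it still needs the value, with one that borrows. The `Message` type and all function names are hypothetical stand-ins, not the application's actual types.

```rust
/// Hypothetical record type standing in for the application's messages.
#[derive(Clone, Debug)]
struct Message {
    trace_id: String,
    payload: Vec<u8>,
}

/// Takes ownership: a caller that still needs the message must clone it first.
fn size_owned(message: Message) -> usize {
    message.trace_id.len() + message.payload.len()
}

/// Borrows instead: calling this needs no allocation and no copy.
fn size_borrowed(message: &Message) -> usize {
    message.trace_id.len() + message.payload.len()
}

fn total_with_clones(messages: &[Message]) -> usize {
    // Every iteration deep-copies the String and the Vec<u8>.
    messages.iter().map(|m| size_owned(m.clone())).sum()
}

fn total_with_borrows(messages: &[Message]) -> usize {
    // Same result, but nothing is copied.
    messages.iter().map(size_borrowed).sum()
}

fn main() {
    let messages = vec![
        Message { trace_id: "trace-1".to_string(), payload: vec![0; 1024] },
        Message { trace_id: "trace-2".to_string(), payload: vec![0; 2048] },
    ];
    println!("with clones:  {}", total_with_clones(&messages));
    println!("with borrows: {}", total_with_borrows(&messages));
}
```

In a flamegraph, the cloning version tends to show up as time spent in `clone` and in the allocator underneath the function that takes ownership.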

Optimizing the Code
With this information, I made the following changes to my application:

Batched Database Queries:
I refactored the database queries to fetch data in smaller batches. This reduced memory usage and improved query performance.

Parallel Data Transformation:
I reworked the code around the transmute_response function to use futures::future::join_all, so that multiple records in a batch are processed concurrently rather than one at a time. This significantly reduced the time spent on data transformation.

Reduced Cloning:
I refactored the code to avoid unnecessary cloning by using references wherever possible. This reduced memory allocations and improved overall performance.
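As a rough illustration of the batching idea, the sketch below pages through a data source a fixed number of rows at a time instead of loading everything at once. The in-memory slice and the `fetch_batch` helper are assumptions standing in for a real paginated database query; the actual queries in the application are not shown in this post.

```rust
/// Hypothetical stand-in for a paginated query, for example one that uses
/// LIMIT and OFFSET. Returns at most `limit` rows starting at `offset`.
fn fetch_batch(source: &[u32], offset: usize, limit: usize) -> &[u32] {
    let start = offset.min(source.len());
    let end = (start + limit).min(source.len());
    &source[start..end]
}

/// Process every row while holding at most `batch_size` rows at a time.
/// Returns the number of batches fetched and a checksum of the rows seen.
fn migrate_in_batches(source: &[u32], batch_size: usize) -> (usize, u64) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut offset = 0;
    let mut batches = 0;
    let mut checksum = 0u64;
    loop {
        let batch = fetch_batch(source, offset, batch_size);
        if batch.is_empty() {
            break; // no rows left
        }
        // Placeholder for the real transform-and-persist step.
        checksum += batch.iter().map(|&row| u64::from(row)).sum::<u64>();
        offset += batch.len();
        batches += 1;
    }
    (batches, checksum)
}

fn main() {
    let source: Vec<u32> = (1..=12).collect();
    let (batches, checksum) = migrate_in_batches(&source, 5);
    println!("fetched {batches} batches, checksum {checksum}");
}
```

The right batch size depends on row size and on the database, so it is worth measuring a few values rather than guessing.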

Here’s a snippet of the optimized code:

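use futures::future::join_all;

// Process the records in small batches so that only `batch_size`
// futures are in flight at any one time.
let batch_size = 5;
for chunk in records.chunks(batch_size) {
    let futures = chunk.iter().map(|rsm34| {
        // Transform the payload for this record.
        let transmuted_message = transmute_response(rsm34.complete_payload.clone().into()).unwrap();
        // Build the future that persists the transformed message.
        persist_identifiable_message(
            db_conn_pool.clone(),
            rsm34.peek_trace_id.clone().into(),
            transmuted_message, // moved rather than cloned: it is not used again
        )
    });

    // Drive all futures of this batch concurrently and wait for every one.
    let results = join_all(futures).await;
    for result in results {
        if let Err(err) = result {
            panic!("Error {:?}", err);
        }
    }
}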
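One caveat worth keeping in mind: join_all runs its futures concurrently on the task that awaits them, so it mostly helps when the work spends its time waiting on I/O such as database writes. If the transformation itself is CPU-heavy, spreading it across threads may help more. Below is a minimal sketch of that alternative using the standard library's scoped threads in place of join_all; `transform` and `transform_all` are hypothetical stand-ins, not the application's actual code.

```rust
use std::thread;

/// Hypothetical CPU-bound transformation standing in for `transmute_response`.
fn transform(payload: &str) -> Result<String, String> {
    if payload.is_empty() {
        return Err("empty payload".to_string());
    }
    Ok(payload.to_uppercase())
}

/// Transform records in batches of `batch_size`, using one scoped thread per
/// record in the batch. Returns the first error instead of panicking.
fn transform_all(records: &[String], batch_size: usize) -> Result<Vec<String>, String> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut output = Vec::with_capacity(records.len());
    for chunk in records.chunks(batch_size) {
        let results: Vec<Result<String, String>> = thread::scope(|scope| {
            // Spawn every worker first, then join, so the batch runs in parallel.
            let handles: Vec<_> = chunk
                .iter()
                .map(|record| scope.spawn(move || transform(record)))
                .collect();
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker thread panicked"))
                .collect()
        });
        // Results come back in input order; stop at the first failure.
        for result in results {
            output.push(result?);
        }
    }
    Ok(output)
}

fn main() {
    let records: Vec<String> = ["alpha", "beta", "gamma"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    match transform_all(&records, 2) {
        Ok(transformed) => println!("{transformed:?}"),
        Err(err) => eprintln!("migration failed: {err}"),
    }
}
```

Returning a `Result` rather than panicking also lets the caller decide whether one bad record should abort the whole migration.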

The Results

After applying these optimizations, I regenerated the flamegraph to verify the improvements. The new flamegraph showed a much more balanced distribution of CPU time, with the previously identified hotspots no longer dominating the execution.

The most satisfying part? The data migration, which used to take far too long, now completes much sooner. The application is not only faster but also more efficient in its memory usage.

Lessons Learned

Profiling is Essential:
Without profiling tools like Flamegraph, optimizing performance is a guessing game. Profiling gives you concrete data to work with.

Rust's Performance is Only as Good as Your Code:
While Rust is a high-performance language, it's still possible to write inefficient code. Tools like Flamegraph help you stay on track.

Optimization is Iterative:
Profiling and optimization are iterative processes. Each round of profiling can reveal new bottlenecks to address.

Give Flamegraph a Try!

If you're working on performance-critical Rust applications, I highly recommend giving Flamegraph a try. It's easy to set up, and the insights it provides are invaluable. Whether you're dealing with data transformation, migration, or any other CPU-intensive task, Flamegraph can help you identify and eliminate bottlenecks.
