
rav1e 0.2.0 - Winter Solstice

Luca Barbato (luzero)

Yesterday we made the second release of rav1e.

Winter Solstice Saturnalia Yuletide

The second official release of rav1e was focused mainly on speed. Later I'll write down what we used and what, in my opinion, are the best practices when you have to optimize a codebase like this one.

What's new

Since we focused mainly on speed, there aren't many user-facing changes.

New in the command line

  • --benchmark: I had been using GNU time to get some coarse information, so I decided to add a getrusage() report directly to the CLI. It is currently Unix-like only; if you use Windows, a pull request adding the same feature is very welcome!
  • advanced --save-config / advanced --load-config: These options depend on building with --features=serialize. They let you dump and restore the full encoder configuration, which is quite useful if you want to tweak settings.

Building features

  • tracing: Enables hawktracer probes.
  • serialize: Enables serde support. It is optional mainly because it would nearly double the build time. Suggestions for something smaller are welcome.

More performance


Speed was the focus of this release, and we tried our best not to increase the resident set or the overall memory usage.


The memory fragmentation that was quite significant in 0.1.0 has been addressed fairly well in 0.2.0, and we will probably return to it in 0.4.0.

We are still not using as much CPU and memory as we could, and there are still large inefficiencies to address in the future. Some of them will be solved by the rayon developers, some are on us.

In the future, the lookahead computation we have in send_frame will happen in a separate thread, so it will not stall the rest of the encoding process this much.
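To give an idea of what moving the lookahead off the calling thread could look like, here is a minimal sketch using only the Rust standard library. It is not rav1e code: Frame, Analysis, analyze and run_pipeline are hypothetical names made up for the illustration, and the "lookahead" is just an average of the samples. The only point is that the caller pays for a channel send while the analysis runs on a worker thread.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a video frame: a single plane of luma samples.
struct Frame {
    index: u64,
    luma: Vec<u8>,
}

/// Hypothetical lookahead result; a real encoder would compute much more.
struct Analysis {
    index: u64,
    mean_luma: u64,
}

/// Toy "lookahead" work: the average sample value of the frame.
fn analyze(frame: &Frame) -> Analysis {
    let sum: u64 = frame.luma.iter().map(|&s| u64::from(s)).sum();
    let mean_luma = if frame.luma.is_empty() {
        0
    } else {
        sum / frame.luma.len() as u64
    };
    Analysis { index: frame.index, mean_luma }
}

/// Runs the analysis on a worker thread, so the submitting loop only
/// enqueues frames and never waits for the analysis itself.
fn run_pipeline(frames: Vec<Frame>) -> Vec<Analysis> {
    let (frame_tx, frame_rx) = mpsc::channel::<Frame>();
    let (result_tx, result_rx) = mpsc::channel::<Analysis>();

    let worker = thread::spawn(move || {
        // The loop ends once every sender for frame_rx has been dropped.
        for frame in frame_rx {
            if result_tx.send(analyze(&frame)).is_err() {
                break; // nobody is listening for results anymore
            }
        }
    });

    for frame in frames {
        // Stand-in for a send_frame-like call: enqueue and return at once.
        frame_tx.send(frame).expect("worker thread is alive");
    }
    drop(frame_tx); // signal the end of the stream

    // Collect results in submission order (single worker, FIFO channels).
    let results: Vec<Analysis> = result_rx.iter().collect();
    worker.join().expect("worker thread panicked");
    results
}

fn main() {
    let frames: Vec<Frame> = (0..4u64)
        .map(|i| Frame { index: i, luma: vec![(i as u8) * 10; 16] })
        .collect();
    for a in run_pipeline(frames) {
        println!("frame {}: mean luma {}", a.index, a.mean_luma);
    }
}
```

An unbounded channel keeps the sketch short. A real encoder would probably want a bounded queue (for example std::sync::mpsc::sync_channel) so that queued frames cannot grow the memory usage without limit.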

Future plans

The focus for 0.2.0 was speed and we got between 40% and 70% faster in about 40 days.

The focus for 0.3.0 will be quality, while striving to keep the encoding speed as fast as it is now, if not faster. We'll see how it goes next year.

Thanks for reading!

Discussion

jeikabu

I'd be interested in seeing more details regarding specific perf/memory optimizations.
If reducing allocations was accomplished by reusing buffers, how thorny was that in Rust?

The first link in the post needs a v in front of the tag name, i.e. v0.2.0.

Luca Barbato Author

You are right, updated!

I'll try to write more about optimizations soon :)