The fastest and safest AV1 encoder.
rav1e is an AV1 video encoder. It is designed to eventually cover all use cases, though in its current form it is most suitable for cases where libaom (the reference encoder) is too slow.
- Intra, inter, and switch frames
- 64x64 superblocks
- 4x4 to 64x64 RDO-selected square and 2:1/1:2 rectangular blocks
- DC, H, V, Paeth, smooth, and all directional prediction modes
- DCT, (FLIP-)ADST and identity transforms (up to 64x64, 16x16 and 32x32 respectively)
- 8-, 10- and 12-bit depth color
- 4:2:0 (full support), 4:2:2 and 4:4:4 (limited) chroma sampling
- 11 speed settings (0-10)
- Near real-time encoding at high speed levels
- Constant quantizer and target bitrate (single- and multi-pass) encoding modes
- Still picture mode
If you want to know what rav1e is, this is a good place to start.
- Rate: Rate of bitstream output of the encoder.
- Distortion: Deviation from the original image.
- RD cost: Rate-distortion cost, the trade-off between bitrate and quality loss. The encoder tries to minimize this cost.
- Lambda: Lagrangian multiplier used to optimize the RD cost.
- tx: Transform
- CDEF: Constrained Directional Enhancement Filter
- LRF: Loop Restoration Filter
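To make the Distortion entry concrete, here is a minimal sketch that measures deviation as the sum of squared errors (SSE) between source and reconstructed pixels. SSE is an assumption chosen for illustration, and the helper name `sse` is hypothetical; neither is taken from rav1e's API.

```rust
/// Sum of squared errors between two equally sized 8-bit pixel blocks.
/// Hypothetical helper for illustration only; not part of rav1e's API.
fn sse(source: &[u8], reconstructed: &[u8]) -> u64 {
    assert_eq!(source.len(), reconstructed.len(), "blocks must match in size");
    source
        .iter()
        .zip(reconstructed)
        .map(|(&s, &r)| {
            // Widen before subtracting so the difference cannot underflow.
            let diff = i64::from(s) - i64::from(r);
            (diff * diff) as u64
        })
        .sum()
}
```

A distortion of zero means the reconstruction matches the source exactly; larger values mean a larger deviation.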
rav1e uses rate-distortion optimization (RDO) techniques to reach the saddle point in the trade-off between compression and quality.
An exhaustive search for the best RD cost across all tools and their combinations is infeasible, since the cost grows exponentially with every dimension that a tool adds to the search.
Many tools are pruned based on the speed setting, and some tools, such as motion search, are approximate by default.
At the slowest speed setting, however, rav1e searches among nearly all of its available tools.
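The growth of the search space can be illustrated with a little arithmetic. The sketch below multiplies the number of options each tool contributes, then applies a made-up pruning rule that caps the options per tool. The function names, the cap rule, and the option counts are all hypothetical; rav1e's actual pruning depends on its speed presets and is not reproduced here.

```rust
/// Number of combinations an exhaustive search would have to evaluate when
/// each tool contributes an independent set of options.
fn exhaustive_candidates(options_per_tool: &[u64]) -> u64 {
    options_per_tool.iter().product()
}

/// Hypothetical pruning rule: keep at most `cap` options per tool.
/// This only illustrates the effect of pruning; it is not rav1e's rule.
fn prune(options_per_tool: &[u64], cap: u64) -> Vec<u64> {
    options_per_tool.iter().map(|&n| n.min(cap)).collect()
}
```

With the made-up counts `[4, 16, 13, 10]`, the exhaustive search has 8320 combinations, while capping every tool at 4 options leaves 256.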
During the encoding process, we measure RD cost using distortion.
cost = distortion + lambda * rate
During the post-processing filter stage, we measure RD cost using error.
cost = error + lambda * rate
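Both formulas have the same shape, so one helper can express them. The following is a minimal sketch of computing the cost and picking the cheapest of several candidates. The `Candidate` type and the function names are hypothetical and do not come from rav1e's source.

```rust
/// A candidate coding choice with its measured distortion (or error) and rate.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Candidate {
    distortion: f64,
    rate: f64,
}

/// cost = distortion + lambda * rate
fn rd_cost(c: Candidate, lambda: f64) -> f64 {
    c.distortion + lambda * c.rate
}

/// Index of the candidate with the lowest RD cost, or `None` if the slice is empty.
fn best_candidate(candidates: &[Candidate], lambda: f64) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| rd_cost(**a, lambda).total_cmp(&rd_cost(**b, lambda)))
        .map(|(index, _)| index)
}
```

A larger lambda makes rate more expensive, so the choice shifts toward candidates that spend fewer bits even if they distort more.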
- Inter modes
- Inter Compound modes
- Tx type
- Tx size
- Intra modes
- Chroma mode RDO for predicted luma
- Partition decision
- CDEF + LRF combination
Each tile is encoded independently of the other tiles. RD optimization is used throughout the encoding process; even the post-processing filters are chosen with RD costs taken into consideration.
1. Build coarse motion vectors of the tile for INTER blocks only.
2. For each superblock in tile:
   1. Motion estimation: Using either Diamond search or Full search on subsampled blocks.
   2. encode_partition_bottomup / encode_partition_topdown.
   3. rdo_loop_decision.
3. CDEF and LRF.
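The order of these steps can be sketched as a small control-flow skeleton. It only records which stage runs when; it does no actual encoding. The `PartitionSearch` enum and the `encode_tile_stages` function are hypothetical, and only the stage names are taken from the list above.

```rust
/// Which partition search is used for each superblock.
/// Hypothetical enum; the variants mirror the two function names in the list.
#[derive(Clone, Copy)]
enum PartitionSearch {
    BottomUp,
    TopDown,
}

/// Returns the stages visited, in order, when encoding one tile.
/// This is a trace of the control flow only, not an encoder.
fn encode_tile_stages(superblocks: usize, search: PartitionSearch) -> Vec<String> {
    // Step 1: coarse motion vectors for the whole tile.
    let mut stages = vec!["coarse motion vectors (inter blocks only)".to_string()];
    // Step 2: per-superblock work.
    for sb in 0..superblocks {
        stages.push(format!("sb {}: motion estimation", sb));
        let partition = match search {
            PartitionSearch::BottomUp => "encode_partition_bottomup",
            PartitionSearch::TopDown => "encode_partition_topdown",
        };
        stages.push(format!("sb {}: {}", sb, partition));
        stages.push(format!("sb {}: rdo_loop_decision", sb));
    }
    // Step 3: post-processing filters.
    stages.push("CDEF and LRF".to_string());
    stages
}
```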
- RDO for modes
- RDO during partitioning
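As an illustration of RDO during partitioning, the sketch below compares the RD cost of coding a square block whole against the summed cost of its four quadrants, recursing down to a minimum block size. It is a simplification under stated assumptions: only square four-way splits are considered (no rectangular partitions), the cost of signalling the split is ignored, and `block_cost` is a hypothetical callback standing in for a real RD cost measurement. None of these names come from rav1e.

```rust
/// Outcome of the partition decision for one square block.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum Partition {
    /// Code the block as a single unit.
    Whole,
    /// Split into four quadrants, each decided recursively.
    Split(Vec<Partition>),
}

/// Returns the cheaper partitioning of the block at (x, y) and its RD cost.
/// `block_cost(x, y, size)` is assumed to return the RD cost of coding that
/// block without splitting it.
fn decide_partition(
    x: usize,
    y: usize,
    size: usize,
    min_size: usize,
    block_cost: &dyn Fn(usize, usize, usize) -> f64,
) -> (Partition, f64) {
    let whole_cost = block_cost(x, y, size);
    if size <= min_size {
        return (Partition::Whole, whole_cost);
    }
    let half = size / 2;
    let offsets = [(0, 0), (half, 0), (0, half), (half, half)];
    let mut split_cost = 0.0;
    let mut children = Vec::with_capacity(4);
    for &(dx, dy) in offsets.iter() {
        let (child, cost) = decide_partition(x + dx, y + dy, half, min_size, block_cost);
        split_cost += cost;
        children.push(child);
    }
    // Keep the split only if it is strictly cheaper than coding the block whole.
    if split_cost < whole_cost {
        (Partition::Split(children), split_cost)
    } else {
        (Partition::Whole, whole_cost)
    }
}
```

Because every quadrant is resolved before the parent is decided, this follows a bottom-up order; a top-down search would instead decide at the parent first and descend only where splitting looks promising.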