This is the last of a series of articles, in which I examined in detail the Huffman algorithm and its connection to HPACK for HTTP/2. In previous ones, I explained what the Huffman algorithm is, and started to explain how the encoding process of messages works, and in the last article HPACK: Huffman translation matrix. I already started the topic of decoding and have described the first part of the decoding process.

The algorithm for decoding the canonical Huffman algorithm for HPACK is being executed based on a matrix, where the Huffman tree is shown in the form of the 2-dimensional table and is made for a specific number of bits being read at the time.

In this last article, we decided that our decoder will decode the Huffman sequence by reading 2 bits at a time. For this purpose, we created a matrix, which will enable us to reverse the coded message back to the original content.

Path | ID | SYM | LFT | 00 | 01 | 10 | 11 |
---|---|---|---|---|---|---|---|

// | 0 | - | - | 1 | 2 | 3 | 6 |

//00 | 1 | A | 0 | - | - | - | - |

//01 | 2 | B | 0 | - | - | - | - |

//10 | 2 | B | 0 | - | - | - | - |

//10 | 3 | - | - | 4 | 4 | 5 | 5 |

//100X | 4 | C | 1 | - | - | - | - |

//101X | 5 | D | 1 | - | - | - | - |

//11 | 6 | E | 0 | - | - | - | - |

As an example we are using a sequence of characters in the order: A, D, and B.

```
ADE = 0010101
```

The Huffman sequence will be decoded by reading 2 bits at a time. Every reading begins at the root symbol `//`

. First, we read the first two bits `00`

. In line one of the matrix at `ID=0`

, we need to check where this code leads, or if it corresponds to any of the characters. Read bits lead to the second line with `ID=1`

and they represent the letter A.

The process is repeated for the next 2 bits `10`

. This code leads us to the line with `ID=3`

which doesnβt represent a character, so we continue the process for the next 2 bits `10`

. This code then leads us to line 5, representing the letter D. Here we can see that the value of the column `LFT=1`

, meaning that there is a leftover 1. This means that in order to continue reading bits we have to shift to one bit back and continue the process there.

We position ourselves back to the root position while keeping the last bit `0`

, and keep reading until we reach the sum of 2 bits. This means that we need to read only 1 bit `1`

. Code `01`

corresponds with character B and with this we conclude the decoding process.

```
00XXXXX => A
XX10XXX => continue
XXXX10X => D
XXXXX01 => B
```

With the use of the translation matrix, which we created to read 2 bits at a time, we successfully decoded the Huffman sequence back into readable characters. This is how HPACK in HTTP/2 decodes header literal values. The process is optimal, while it is best for web servers to read more bits at a time. Considering that the shortest Huffman code for an individual character is 5 bits long, itβs optimal, for the best ratio between speed and used resources, to read 4 bits at a time. More bits at a time mean faster decoding but at the same time a larger translation table and with it a higher memory footprint.

I made a complete decoder implemented in Rust. It is available open-source at the public Github repository.

## Top comments (0)