DEV Community

Johannes Lichtenberger
Rolling, secure hashes for nodes in a tree / How to reduce on-disk space consumption?

Hi all,

I've implemented the storage of rolling, secure hashes for a temporal document store called SirixDB.

During bulk inserts, hashes are built while traversing the tree in postorder. During updates (deletes, inserts, or value updates), the hashes of the ancestor nodes are adapted. Every node has a unique node ID, and a node's hash also takes its neighbour nodes into account (the 64-bit node IDs pointing to the sibling nodes).

For instance, during an update the old hash is subtracted from the parent node's hash and the new hash is added, and this change bubbles up through all ancestors.
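To illustrate the subtract-and-add update described above, here is a minimal Python sketch. It is not SirixDB's actual code: the `Node` class, the function names, and the choice of addition modulo 2^128 are assumptions made for the example, and the sibling node IDs that go into the real hash are left out.

```python
MASK_128 = (1 << 128) - 1  # assumption: hashes are unsigned 128-bit values


class Node:
    """Hypothetical tree node, for illustration only."""

    def __init__(self, own_hash, parent=None):
        self.own_hash = own_hash & MASK_128  # hash of this node's own content
        self.subtree_hash = self.own_hash    # own hash plus all descendant hashes
        self.parent = parent


def _bubble_up(node, old_hash, new_hash):
    # Subtract the old hash and add the new one on every ancestor.
    ancestor = node.parent
    while ancestor is not None:
        ancestor.subtree_hash = (ancestor.subtree_hash - old_hash + new_hash) & MASK_128
        ancestor = ancestor.parent


def insert_leaf(parent, own_hash):
    leaf = Node(own_hash, parent)
    _bubble_up(leaf, 0, leaf.subtree_hash)  # nothing to subtract on insert
    return leaf


def update_value(node, new_own_hash):
    old = node.subtree_hash
    node.subtree_hash = (old - node.own_hash + new_own_hash) & MASK_128
    node.own_hash = new_own_hash & MASK_128
    _bubble_up(node, old, node.subtree_hash)


def delete_leaf(leaf):
    _bubble_up(leaf, leaf.subtree_hash, 0)  # nothing to add on delete
    leaf.parent = None
```

Each update touches only the path from the changed node to the root, so the cost is proportional to the depth of the node rather than to the size of the tree.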

To keep the probability of collisions to a minimum, I used SHA-256 truncated to 128 bits.
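As a rough sketch of what "SHA-256 truncated to 128 bits" could look like, the following Python snippet hashes a node's ID, its sibling IDs, and its value, then keeps the first 16 bytes of the digest. The function name, the field order, the signed big-endian encoding of the 64-bit IDs, and the use of -1 for a missing sibling are all assumptions for illustration, not SirixDB's actual layout.

```python
import hashlib


def node_hash_128(node_key, left_sibling_key, right_sibling_key, value):
    """Return the first 128 bits of SHA-256 over the node's fields, as an int."""
    digest = hashlib.sha256()
    for key in (node_key, left_sibling_key, right_sibling_key):
        # Assumption: node IDs are signed 64-bit integers, encoded big-endian.
        digest.update(key.to_bytes(8, "big", signed=True))
    digest.update(value)  # value is the node's content as bytes
    return int.from_bytes(digest.digest()[:16], "big")  # truncate to 16 bytes


# Hypothetical usage: node 1 with no left sibling (-1) and right sibling 2.
h = node_hash_128(1, -1, 2, b"value")
```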

However, every node now optionally stores this hash, which costs an additional 16 bytes per node.

My idea would be to store all node hashes at the beginning of the variable-sized page using delta encoding, for instance by subtracting each hash from its predecessor and storing the differences in some kind of variable-length encoding.
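The delta-encoding idea can be sketched in Python as follows, so that its effect can be measured on real page data. This is only one possible reading of the idea: taking deltas modulo 2^128 and writing them as LEB128-style varints (7 payload bits per byte) are assumptions, and all function names are hypothetical.

```python
MASK_128 = (1 << 128) - 1  # hashes are unsigned 128-bit values


def encode_varint(value):
    """Encode a non-negative int with 7 payload bits per byte (LEB128 style)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_hashes(hashes):
    """Store each hash as the difference to its predecessor, modulo 2^128."""
    out = bytearray()
    prev = 0
    for h in hashes:
        out += encode_varint((h - prev) & MASK_128)
        prev = h
    return bytes(out)


def decode_hashes(buf, count):
    hashes = []
    prev = 0
    pos = 0
    for _ in range(count):
        delta, pos = decode_varint(buf, pos)
        prev = (prev + delta) & MASK_128
        hashes.append(prev)
    return hashes
```

Comparing `len(encode_hashes(hashes))` with `16 * len(hashes)` on a real page shows whether anything is gained. One caveat: the output of a truncated SHA-256 is close to uniformly distributed, so the deltas are likely to be about as large as the hashes themselves, and a full 128-bit delta takes 19 bytes in this varint format instead of 16.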

Do you have any ideas on how best to "compress" the hashes on disk? Currently at most 512 nodes are stored in a page, which means up to 512 * 16 bytes = 8192 bytes (8 KiB) per page for the hashes alone.

Kind regards
Johannes
