DEV Community

Johannes Lichtenberger


What are the advantages/disadvantages of memory mapped files for a DBMS?


I'm developing a temporal, open-source data store[1] with features such as:

  • the storage engine is written from scratch
  • completely isolated read-only transactions running concurrently with a single read/write transaction; one lock guards the writer. Readers are never blocked by the read/write transaction and run without any latches or locks. Likewise, the writer is not blocked by read-only transactions
  • variable-sized pages
  • lightweight buffer management with a "kind of" pointer swizzling
  • dropping the need for a write-ahead log due to atomic switching of an UberPage
  • optionally, a rolling Merkle hash tree of all nodes, built during updates
  • an ID-based diff algorithm to determine differences between revisions, optionally taking the (secure) hashes into account
  • a non-blocking REST API, which also uses the hashes to throw an error if a subtree has been modified concurrently during an update
  • versioning through a huge persistent and durable, variable-sized page tree using copy-on-write
  • storing delta page-fragments using a patented sliding snapshot algorithm
  • using a special trie, which is especially good for storing records with numerically dense, monotonically increasing 64-bit integer IDs. We make heavy use of bit shifting to calculate the path to fetch a record
  • time or modification counter-based auto-commit
  • versioned, user-defined secondary index structures
  • a versioned path summary
  • indexing every revision, such that a timestamp is only stored once in a RevisionRootPage. The resources stored in SirixDB are based on a huge, persistent (functional) and durable tree
  • sophisticated time travel queries
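
To illustrate the bit-shifting trie lookup mentioned in the list, here is a minimal sketch. It is not the actual SirixDB code: the class name `TriePath`, the fan-out of 1024 slots per level (10 bits), and the depth of 4 levels are assumptions chosen only for the example.

```java
import java.util.Arrays;

public class TriePath {
    // Assumed for illustration only: 10 bits (1024 slots) per level, 4 levels.
    // With these values only the lowest 40 bits of an ID are addressed.
    static final int BITS_PER_LEVEL = 10;
    static final int LEVELS = 4;
    static final long MASK = (1L << BITS_PER_LEVEL) - 1;

    /** Returns the slot index to follow at each trie level, root first. */
    static int[] pathFor(long recordId) {
        int[] path = new int[LEVELS];
        for (int level = 0; level < LEVELS; level++) {
            // The root consumes the most significant bits, the leaf level the least.
            int shift = BITS_PER_LEVEL * (LEVELS - 1 - level);
            path[level] = (int) ((recordId >>> shift) & MASK);
        }
        return path;
    }

    public static void main(String[] args) {
        // 1_050_000 = 1 * 1024^2 + 1 * 1024 + 400
        System.out.println(Arrays.toString(pathFor(1_050_000L))); // prints [0, 1, 1, 400]
    }
}
```

Because the IDs are dense and monotonically increasing, consecutive records end up in neighbouring slots of the same leaf page, and the path needs no key comparisons at all.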

I've read quite a bit about memory-mapped files, but I'm still not sure in which cases a memory-mapped file would be the better choice, and whether I should map the whole file or just the (potentially rather small) page fragments.

It seems MongoDB, LMDB, and other data stores are very fast, in part because they use memory-mapped files.

I think it would also remove the need to cache page fragments altogether.

In my case the files can get very big as I'm using an append-only paradigm without segment files.

So, besides wanting to better understand the impact, I wonder whether I should map the whole file or just the page-fragment regions. The latter could be rather small due to fine-grained storage, in some cases maybe only a few hundred bytes.
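
To make the two options concrete, here is a minimal sketch using Java NIO. The class and method names (`MapOptions`, `readViaWholeFileMapping`, `readViaFragmentMapping`) and the tiny demo file are made up for this example and are not how the storage engine is actually structured. One thing the sketch does show: a single `FileChannel.map` call cannot map more than `Integer.MAX_VALUE` bytes, so "map the whole file" would really mean several mappings once the file grows beyond 2 GiB.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapOptions {
    /** Option 1: map the whole file once, then read a fragment by offset. */
    static byte[] readViaWholeFileMapping(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Throws if ch.size() > Integer.MAX_VALUE; larger files need several mappings.
            MappedByteBuffer whole = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] fragment = new byte[length];
            whole.position(Math.toIntExact(offset));
            whole.get(fragment);
            return fragment;
        }
    }

    /** Option 2: map only the region that holds one page fragment. */
    static byte[] readViaFragmentMapping(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer region = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] fragment = new byte[length];
            region.get(fragment);
            return fragment;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fragments", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "header|fragment-A|fragment-B".getBytes(StandardCharsets.US_ASCII));
        // "fragment-A" starts at byte offset 7 and is 10 bytes long.
        System.out.println(new String(readViaWholeFileMapping(tmp, 7, 10), StandardCharsets.US_ASCII));
        System.out.println(new String(readViaFragmentMapping(tmp, 7, 10), StandardCharsets.US_ASCII));
    }
}
```

Both methods return the same bytes; the open question is the cost. The operating system maps memory at page granularity, so a mapping for a fragment of a few hundred bytes still occupies at least one full page, and every mapping is a separate system call.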

Kind regards

