I am studying for my EECS 484 DBMS final and decided to try to get some extra value out of studying by making this blog post. I will explain every concept in my own words that we have learned since the midterm. This includes the database internals portion of the course. I will be as helpful as possible for those trying to learn for themselves instead of enduring the pain of taking the full class.
Physical Latency
First, we will establish the ground rules of physics. A DBMS assumes that the database lives on the non-volatile disk (SSD/HDD), but has to move this data to volatile memory (DRAM) before it can be put to use.
The job of the DBMS is to optimize this data transfer process. As you move further from the CPU (CPU -> cache -> DRAM -> disk), the time it takes to read data skyrockets.
Because random access on disk is insanely slow, the DBMS tries to optimize for sequential access of data. This means reading and writing in sequential blocks rather than jumping around.
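To make the sequential-vs-random gap concrete, here is a rough, order-of-magnitude illustration. The seek time and bandwidth figures below are assumed ballpark numbers for a spinning disk, not measurements:

```python
# Assumed ballpark figures for an HDD (illustrative only):
PAGE = 4 * 1024            # 4 KB page
DATA = 1 * 1024**3         # 1 GB of data to read

SEEK_TIME = 0.010          # ~10 ms per random access (assumed)
SEQ_BANDWIDTH = 100e6      # ~100 MB/s sequential read (assumed)

# Random access: pay a seek for every single page.
random_secs = (DATA // PAGE) * SEEK_TIME

# Sequential access: one pass at full bandwidth.
sequential_secs = DATA / SEQ_BANDWIDTH

print(f"random:     {random_secs:.0f} s")
print(f"sequential: {sequential_secs:.1f} s")
```

Even with generous assumptions, random page-at-a-time reads are hundreds of times slower for the same data, which is why the DBMS works so hard to read and write in sequential blocks.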
Who manages memory?
Some of you might be asking:
Why doesn't the database just use mmap and let the Operating system manage memory?
The OS turns out to be a terrible option:
- Transactional Safety: The DBMS needs to flush dirty pages (pages with unwritten changes) to disk in a specific order so that data is not lost during a crash. The OS doesn't care about transaction logs; it flushes whenever it wants.
- Performance: The DBMS has insider information that the OS does not. It knows the query pattern. It knows it is about to scan an entire table and can intelligently prefetch data. The OS just sees a file.
File Storage
The DBMS stores the database as a collection of files (>=1) on the disk typically in a proprietary format.
- The OS has no clue what is in these files
The storage manager is responsible for maintaining these database files. It organizes the files as a collection of pages while tracking data read/written to pages and available space.
The page is a fixed-size block of data. Each page has a specific page type (tuples, metadata, indexes, logs...), and systems do not mix different page types' data within a single page.
Each page is also given its own PageID. When the DBMS needs a page, the storage manager uses an indirection layer to translate that PageID into the page's actual physical location. This decouples the logical model from the physical address, allowing the storage manager to reorganize or grow storage.
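A minimal sketch of that indirection layer (all names here are illustrative, not any real DBMS's API): a page table maps a logical PageID to a physical (file, offset) location, so pages can be moved without breaking anyone who holds the PageID.

```python
PAGE_SIZE = 4096  # bytes; a common default

class StorageManager:
    def __init__(self):
        # page_id -> (file_name, byte_offset)
        self.page_table = {}

    def register_page(self, page_id, file_name, offset):
        self.page_table[page_id] = (file_name, offset)

    def locate(self, page_id):
        # Translate the logical PageID into a physical address.
        return self.page_table[page_id]

    def relocate(self, page_id, new_file, new_offset):
        # Storage can be reorganized by updating only the mapping;
        # callers holding the PageID are unaffected.
        self.page_table[page_id] = (new_file, new_offset)

sm = StorageManager()
sm.register_page(7, "data.db", 3 * PAGE_SIZE)
print(sm.locate(7))        # ('data.db', 12288)
sm.relocate(7, "data2.db", 0)
print(sm.locate(7))        # ('data2.db', 0)
```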
It is easy to find pages if there is only one heap file. However, given many database files, you need a page directory in order to know:
- Which files contain which pages
- Which pages are allocated
- Which pages are free
- Where free space exists for new tuples.
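A page directory can be sketched as a small bookkeeping structure (hypothetical, simplified to a single free-space map): it tracks which pages are allocated and how much room each has, so an insert can find a home page without scanning every file.

```python
class PageDirectory:
    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.free_space = {}   # page_id -> free bytes remaining

    def allocate_page(self, page_id):
        # A freshly allocated page is entirely free space.
        self.free_space[page_id] = self.page_size

    def record_insert(self, page_id, tuple_size):
        self.free_space[page_id] -= tuple_size

    def find_page_for(self, tuple_size):
        # Return any allocated page with room for the tuple, else None.
        for pid, free in self.free_space.items():
            if free >= tuple_size:
                return pid
        return None

d = PageDirectory()
d.allocate_page(0)
d.record_insert(0, 4000)   # page 0 now has only 96 bytes free
d.allocate_page(1)
print(d.find_page_for(500))   # 1
```

Real systems also keep the directory itself on pages (and keep it crash-consistent), but the core idea is the same lookup.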
Page Layout
Every page contains a header containing metadata about the page's contents:
- Page Size
- Checksum
- DBMS Version
- Transaction Visibility
- Compression Info
Some systems require pages to be self-contained.
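As an illustration, a header like the one listed above could be packed into a fixed binary layout with Python's standard struct module. The exact fields and their widths vary by system; these are assumed for the sketch:

```python
import struct

# Assumed layout: page_size (u32), checksum (u32), dbms_version (u16),
# txn_visibility (u64), compression_info (u16); little-endian, unpadded.
HEADER_FMT = "<IIHQH"
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def write_header(page_size, checksum, version, txn_vis, compression):
    return struct.pack(HEADER_FMT, page_size, checksum, version,
                       txn_vis, compression)

def read_header(raw):
    return struct.unpack_from(HEADER_FMT, raw)

hdr = write_header(4096, 0xDEADBEEF, 14, 1024, 0)
print(HEADER_SIZE)        # 20 bytes for this layout
print(read_header(hdr))
```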
Let's assume our pages only store tuples. There are two major architectural philosophies for handling tuples in a page:
- Tuple-oriented storage
- Log-structured storage
We will focus on tuple oriented.
One commonly proposed approach to storing tuples in a page is keeping a count in the header and appending every new tuple to the end of the page. However, this introduces issues when deleting (it leaves a hole in the page) and when handling variable-length attributes (updating a field could overwrite the next tuple).
The most common layout scheme for tuples in a page is called slotted pages. The slot array maps "slots" to the tuples' starting offsets, while the header keeps track of the number of used slots and the offset of the last used slot.
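Here is a toy slotted page (sized at 256 bytes for readability; real pages are typically 4-16 KB). The slot array grows forward after the header while tuple data grows backward from the end of the page, and the slot number, not the byte offset, is the tuple's stable handle:

```python
PAGE_SIZE = 256  # toy size; real pages are larger

class SlottedPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.slots = []              # slot array: (offset, length), or None if deleted
        self.free_end = PAGE_SIZE    # tuple data is written ending here, moving left

    def insert(self, tuple_bytes):
        offset = self.free_end - len(tuple_bytes)
        self.data[offset:self.free_end] = tuple_bytes
        self.free_end = offset
        self.slots.append((offset, len(tuple_bytes)))
        return len(self.slots) - 1   # the slot number is the stable handle

    def get(self, slot):
        off, length = self.slots[slot]
        return bytes(self.data[off:off + length])

    def delete(self, slot):
        # Only the slot is invalidated; the hole in the data region
        # can be compacted later without changing slot numbers.
        self.slots[slot] = None

p = SlottedPage()
s0 = p.insert(b"alice,100")
s1 = p.insert(b"bob,2500")
p.delete(s0)
print(p.get(s1))   # b'bob,2500'
```

Because external references (like index entries) point at slots rather than raw offsets, the page can shuffle tuple bytes around internally to reclaim deleted space.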
Storage Models
Database Workloads
- On-Line Transaction Processing (OLTP)
- Small quick operations that hit very few tuples per request
- e.g. banking requests, e-commerce sites
- On-Line Analytical Processing (OLAP)
- Huge read-only operations that access massive chunks of the table.
- e.g. dashboards, analytical queries
- Hybrid Transaction + Analytical Processing (HTAP)
- OLTP + OLAP together on the same database instance
The database can store tuples in different ways that benefit either OLTP or OLAP workloads. So far we have been assuming the n-ary storage model (aka "row storage", NSM).
N-Ary Storage Model (NSM)
NSM is ideal for OLTP workloads, where queries tend to operate on individual tuples, and for insert-heavy workloads.
Advantages:
- fast inserts, updates, deletes. (entire tuple stored in one place).
- Good for queries that need entire tuples
Disadvantages:
- Not good for scanning large portions of the table on a subset of attributes. (fetches unused data)
Decomposition Storage Model (DSM)
In Decomposition Storage Model (DSM), the DBMS stores the values of a single attribute of a tuple contiguously in a page ("column store"). This storage model is ideal for OLAP where read-only queries perform large scans over a subset of a table's attributes.
Now, to map the value back to its tuple there are 2 options:
- Fixed-length Offsets
- Each value is the same length for an attribute
- Embedded Tuple Ids
- Each value is stored with its tuple id in a column
Advantages:
- Reduces amount of wasted I/O since we only fetch necessary attributes.
- Better query processing and data compression
Disadvantages:
- Slow for point queries, inserts, updates, deletes (has to find each attribute of the tuple: tuple splitting/stitching)
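The trade-off between the two models can be sketched on one tiny table. With fixed-length offsets, value i in every column belongs to tuple i, so a column scan touches only the attribute it needs, while a point query must stitch the tuple back together:

```python
# The same table in both storage models (toy, in-memory sketch).
rows = [(1, "alice", 100), (2, "bob", 2500), (3, "carol", 40)]

# NSM ("row store"): each tuple's attributes are stored together.
nsm = list(rows)

# DSM ("column store"): each attribute's values are stored together,
# aligned by position (the fixed-length-offset scheme above).
ids      = [r[0] for r in rows]
names    = [r[1] for r in rows]
balances = [r[2] for r in rows]

# OLAP-style scan over one attribute: DSM reads only `balances`,
# while NSM would drag every attribute of every tuple through memory.
print(sum(balances))                      # 2640

# OLTP-style point query: NSM finds the tuple in one place; DSM must
# "stitch" it back together from three separate arrays.
print(nsm[1])                             # (2, 'bob', 2500)
print((ids[1], names[1], balances[1]))    # (2, 'bob', 2500)
```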
