
Ahmed Khaled

Inside the Database: A Deep Dive into Disk-Oriented DBMS.

When it comes to databases, everyone talks about queries, indexes, and optimizations, but the way data is actually stored on disk is often overlooked. Yet understanding physical storage is crucial for anyone building scalable applications: it influences everything from read/write performance to how efficiently the system handles large datasets. In this blog, we'll dive into two key methods of organizing data in a DBMS, heap files and log-structured storage, and explore the trade-offs each one makes between reads, writes, and space.


Heap Files: The "Junk Drawer" of Data Storage
Let’s start with heap files. Imagine you have a drawer where you randomly throw in things. You don’t really care about order—you just need to store your stuff. That’s basically how heap files work in databases.

In a heap file, records (or rows) are inserted wherever there’s space, without any specific order. The system keeps track of where the data lives using pages (fixed-size blocks of storage) and slots within those pages. When you need to retrieve a record, the DBMS scans the heap file to find it.


Page Layout: How Data is Physically Stored
At the simplest level, a page is a fixed-size block, usually between 4KB and 16KB. Each page contains multiple tuples, and the DBMS uses slots to keep track of where each tuple is stored within the page. Think of slots as pointers or indexes that point to the location of each tuple.

Here’s a breakdown of how a page might be structured:

  • Page Header: The page begins with metadata, like how many slots are in use, how much free space is left, and other housekeeping information.

  • Slot Array: Following the header is the slot array. Each slot points to the actual location of a tuple in the page. Slots allow for flexible storage of variable-length records because they let the DBMS keep track of where tuples are, even if they don't follow each other consecutively.

  • Tuples: The actual data records are stored in the remaining part of the page. For fixed-length tuples, the size is known in advance, so storing and retrieving them is straightforward. For variable-length tuples, each slot entry records the tuple's starting offset and its length, so the DBMS can locate the tuple without assuming a fixed size.
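To make the layout concrete, here is a minimal sketch of a slotted page in Python. The 4 KB page size, the two-field header, and the `(offset, length)` slot format are illustrative choices for this example, not the layout of any particular DBMS:

```python
import struct

PAGE_SIZE = 4096
HEADER_FMT = "<HH"   # (slot_count, free_space_offset), 4 bytes total
SLOT_FMT = "<HH"     # (tuple_offset, tuple_length), 4 bytes per slot

def empty_page() -> bytearray:
    """A blank page: header says 0 slots, free space ends at PAGE_SIZE."""
    page = bytearray(PAGE_SIZE)
    struct.pack_into(HEADER_FMT, page, 0, 0, PAGE_SIZE)
    return page

def insert_tuple(page: bytearray, data: bytes) -> int:
    """Store the tuple at the end of free space; grow the slot array from the front."""
    slot_count, free_end = struct.unpack_from(HEADER_FMT, page, 0)
    slot_array_end = struct.calcsize(HEADER_FMT) + slot_count * struct.calcsize(SLOT_FMT)
    if free_end - len(data) < slot_array_end + struct.calcsize(SLOT_FMT):
        raise ValueError("page full")
    start = free_end - len(data)
    page[start:free_end] = data                         # tuple grows toward the middle
    struct.pack_into(SLOT_FMT, page, slot_array_end, start, len(data))
    struct.pack_into(HEADER_FMT, page, 0, slot_count + 1, start)
    return slot_count                                   # slot id of the new tuple

def read_tuple(page: bytearray, slot_id: int) -> bytes:
    """Follow the slot entry to the tuple's bytes."""
    off = struct.calcsize(HEADER_FMT) + slot_id * struct.calcsize(SLOT_FMT)
    start, length = struct.unpack_from(SLOT_FMT, page, off)
    return bytes(page[start:start + length])
```

Because the slot array grows forward from the header while tuple data grows backward from the end of the page, the free space is always a single gap in the middle, which is exactly what the header's free-space pointer tracks.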


Different Approaches to Storing Tuples in Pages
There are a couple of key strategies the DBMS might use to organize tuples within pages. These approaches determine how the data grows and how space is managed:

  1. Packed Representation: In this model, tuples are packed tightly, filling up the page as much as possible. When a new tuple is inserted, it’s placed in the next available spot. If a tuple is deleted, it leaves a gap. To manage this, some DBMS systems may use compaction techniques to periodically shift tuples around to eliminate gaps, but this can be expensive in terms of performance.

  2. Slotted Pages: As mentioned earlier, the page uses a slot array to reference the actual data. The slot array grows from the beginning of the page, while the tuples are added from the end of the page, moving toward the middle. This gives the DBMS flexibility when inserting or deleting data because it only has to update the slot array, not the actual location of the tuples. Deletions don’t require immediate compaction of the data, just an update to the slot array, making operations more efficient.
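Why deletions are cheap under the slotted approach is easy to show with a small in-memory model. This is an illustrative sketch (the class and method names are made up for this example), not how a real DBMS lays out bytes:

```python
class SlottedPage:
    """Slot array gives each tuple a stable id; tuples never move on delete."""
    def __init__(self):
        self.tuples = []   # storage area: tuples appended as they arrive
        self.slots = []    # slot i -> index into self.tuples, or None if deleted

    def insert(self, record):
        self.slots.append(len(self.tuples))
        self.tuples.append(record)
        return len(self.slots) - 1      # slot id handed back to the caller

    def delete(self, slot_id):
        self.slots[slot_id] = None      # no compaction: just invalidate the slot

    def get(self, slot_id):
        idx = self.slots[slot_id]
        return None if idx is None else self.tuples[idx]
```

Callers hold stable slot ids, so deleting one record invalidates a single slot without moving any other tuple; a later vacuum pass can reclaim the dead space.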

Advantages of Heap Files:

  • Fast inserts: Since you’re just dropping data wherever there’s space, inserts are quick.
  • Simplicity: The structure is straightforward, making it easy to implement.

Drawbacks of Heap Files:

  • Slow reads: Without any order, the database has to scan the entire file to find specific data.

  • Fragmentation: As data is deleted or updated, the file can get fragmented, slowing down performance over time.


Dealing with the Cons of Heap Files

Problem: Slow Reads (Full Scans)
The biggest downside of heap files is the lack of any inherent order in how records are stored. If you're searching for a particular record, you may have to scan through the entire heap file (or multiple pages) to find it.

Solution: Indexing
To mitigate this issue, most DBMSs use indexes to speed up lookups. An index is like a roadmap, pointing to where specific records are stored in the heap file. Instead of scanning every page, the DBMS can use the index to jump directly to the right page and slot. Common indexing methods include B-trees and hash indexes, which help locate records quickly.
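As a toy illustration of the idea, the sketch below builds a hash-style index over a heap file, mapping each key to a (page, slot) pair. A plain Python dict stands in for a real hash index here; production systems would more often use a B-tree:

```python
# A heap file of pages; each page holds records in arbitrary order.
heap = [
    [{"id": 7, "name": "alice"}, {"id": 3, "name": "bob"}],   # page 0
    [{"id": 12, "name": "carol"}],                            # page 1
]

# Build the index: key -> (page number, slot number).
index = {}
for page_no, page in enumerate(heap):
    for slot_no, record in enumerate(page):
        index[record["id"]] = (page_no, slot_no)

def lookup(key):
    """Jump straight to the right page and slot instead of scanning every page."""
    page_no, slot_no = index[key]
    return heap[page_no][slot_no]
```

The index turns a full scan into a single page access, at the cost of keeping the index up to date on every insert, delete, and update.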

Problem: Fragmentation and Wasted Space
Over time, as tuples are inserted, deleted, and updated, heap files can become fragmented. This leads to wasted space within pages and slower performance during reads and writes.

Solution: Vacuuming and Compaction
Some DBMSs periodically perform a process called vacuuming or compaction, where they reorganize pages to eliminate gaps left by deleted tuples. This ensures that space is used efficiently and helps improve performance. The downside is that vacuuming can be resource-intensive, so it's often scheduled during low-traffic periods.
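The core of such a pass is straightforward: copy the live tuples into a fresh, gap-free layout and remember where each one moved, so slot arrays and indexes can be fixed up afterward. A simplified sketch (real vacuuming works page by page and must also handle concurrent access):

```python
def vacuum(tuples):
    """tuples: list of records, with None marking a deleted tuple.
    Returns (compacted, remap) where remap[old_index] = new_index,
    so any slot array or index can be updated to the new positions."""
    compacted, remap = [], {}
    for old_idx, record in enumerate(tuples):
        if record is not None:
            remap[old_idx] = len(compacted)   # record where the tuple moved
            compacted.append(record)
    return compacted, remap
```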

Log-Structured Storage (LSM): High-Speed Data Writes and Efficient Compaction

Log-Structured Merge (LSM) trees were designed to tackle one major bottleneck in databases: slow writes. If you’re running an application that’s writing data frequently—such as logging systems or time-series databases—LSM trees offer a compelling solution by optimizing how data is written and later reorganized.

How LSM Trees Work: Sequential Writes to the Rescue

LSM trees follow a simple principle: append everything to a log. Rather than inserting new data into whatever free spot a heap file offers, LSM trees append all writes to a sequential log-like file. This keeps writes fast, because appending data sequentially is far more efficient than performing random disk I/O.

Here’s the basic workflow:

  1. Write to Memory First: When new data arrives, it’s first written to an in-memory data structure called a memtable (usually implemented as a sorted tree structure like a Red-Black tree).
  2. Flush to Disk as a Log: Once the memtable fills up, it is flushed to disk as an immutable file, typically called an SSTable (Sorted String Table). This file is a sorted, compressed representation of the data.
  3. Compaction: Over time, multiple SSTables accumulate on disk. Since these files are immutable, the DBMS periodically merges them through a process called compaction. Compaction combines and reorganizes the data to reduce fragmentation, clean up obsolete entries, and optimize the overall storage layout.
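The three steps above can be sketched in a few lines of Python. Everything here is deliberately simplified: the memtable is a plain dict instead of a balanced tree, SSTables are in-memory sorted lists instead of files, and the flush threshold is tiny so the behavior is visible:

```python
import bisect

MEMTABLE_LIMIT = 4   # tiny threshold for illustration; real systems use megabytes

class MiniLSM:
    def __init__(self):
        self.memtable = {}   # in-memory writes (a dict stands in for a sorted tree)
        self.sstables = []   # immutable runs, oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self._flush()

    def _flush(self):
        # Write the memtable out as an immutable, sorted run (an "SSTable").
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:                 # newest data first
            return self.memtable[key]
        for table in reversed(self.sstables):    # then SSTables, newest to oldest
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None
```

Note the read path: the memtable is checked first, then the SSTables from newest to oldest, because a newer entry for the same key shadows any older one.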

Entry Log Structure: How Data is Organized in LSM Trees

In LSM trees, data is written to disk in SSTables, which store key-value pairs in a sorted order. Each SSTable typically contains:

  1. Data Block: The actual key-value pairs are stored here, ordered by keys. Each block is compressed to save space.
  2. Index Block: A separate index for quick lookups is maintained. The index maps keys to their position in the data block, allowing the DBMS to jump directly to the relevant data during reads.
  3. Bloom Filters: To avoid unnecessary disk reads, LSM trees often use Bloom filters—a probabilistic data structure that helps quickly determine whether a key is not present in an SSTable. This reduces the need to scan multiple tables unnecessarily.
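A Bloom filter itself is just a bit array plus a few hash functions. The sketch below uses salted SHA-256 as a stand-in for the faster non-cryptographic hashes real systems use, and the sizes are arbitrary:

```python
import hashlib

class BloomFilter:
    """Probabilistic membership test: answers 'definitely absent' or 'maybe present'."""
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key before hashing.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False => the key is certainly absent: this SSTable can be skipped entirely.
        return all(self.bits[pos] for pos in self._positions(key))
```

A `False` answer is guaranteed correct, which is what makes the filter safe as a read-path shortcut; a `True` answer only means the SSTable must actually be checked.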

Compaction: Managing Multiple Log Files

As more SSTables accumulate on disk, the database needs to balance between fast writes and efficient reads. Compaction solves this problem by merging multiple smaller SSTables into a single larger one, eliminating duplicate or outdated entries in the process. This makes subsequent reads faster, as fewer SSTables need to be checked.
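In its simplest form, compacting two runs is a merge where the newer run wins on duplicate keys and deletion markers (tombstones) are dropped. A minimal sketch, using a sentinel string as the tombstone:

```python
TOMBSTONE = "<deleted>"   # illustrative sentinel marking a deleted key

def compact(older, newer):
    """Merge two sorted SSTables (lists of (key, value)) into one.
    Newer entries shadow older ones; tombstones and stale versions are dropped."""
    merged = dict(older)   # start from the old run...
    merged.update(newer)   # ...and let the newer run overwrite duplicates
    return sorted((k, v) for k, v in merged.items() if v != TOMBSTONE)
```

Real engines stream this merge without materializing a dict, but the outcome is the same: one sorted run with a single, current version of each live key.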

However, compaction requires additional resources and can impact performance if not managed well. Most DBMSs perform compaction in the background to minimize its impact on live queries.


Indexes in LSM Trees: Optimizing Read Performance

While LSM trees are great for writes, they need help when it comes to reads. Without any optimizations, searching for a specific record would require scanning multiple SSTables. Here’s how DBMSs improve read performance:

  1. Bloom Filters: As mentioned earlier, Bloom filters quickly determine whether a record is not in a particular SSTable, reducing unnecessary scans.

  2. Summary Table: A summary table can also be used to keep track of the range of keys within each SSTable. This way, the DBMS can check if a key falls within a specific range before looking inside the file, reducing the number of SSTables it has to scan.
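A summary table can be as simple as one (min key, max key) pair per SSTable. The sketch below (names are illustrative) prunes the candidate list before any file is opened:

```python
def tables_to_check(summary, key):
    """summary: list of (min_key, max_key, sstable_name), one entry per SSTable.
    Only SSTables whose key range covers `key` need to be opened at all."""
    return [name for lo, hi, name in summary if lo <= key <= hi]
```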

  3. Indexes: The DBMS builds indexes over keys, stored alongside the SSTables. These indexes allow for efficient point lookups and range queries, helping locate records quickly even when they are spread across multiple SSTables.


Cons of Log-Structured Storage: Trade-Offs to Consider

While LSM trees offer impressive write performance, they come with some challenges, particularly during reads and compaction:

1. Slower Reads:

Since data is spread across multiple SSTables, reading a specific record might require scanning several files. Without proper indexing or Bloom filters, reads can become slow compared to traditional storage methods like B-trees.

Solution: This is mitigated by using Bloom filters, summary tables, and indexes to narrow down the search space.

2. Compaction Overhead:

Compaction can be resource-intensive. As SSTables accumulate, compaction must merge and rewrite data, which consumes disk I/O and CPU resources. If not managed properly, compaction can slow down the system, especially during high-traffic periods.

Solution: DBMSs often schedule compaction during low-activity times or stagger compaction processes to avoid overwhelming the system.

3. Space Amplification:

Before compaction happens, multiple versions of the same data might exist across different SSTables, leading to space amplification. This means the database might use more storage than necessary to temporarily hold redundant or outdated data.

Solution: By running compaction frequently enough and balancing it with system performance, DBMSs can keep space amplification in check.

Conclusion: The Developer’s Edge

Understanding how a DBMS stores data gives developers an edge when designing databases, writing queries, and tuning performance. Different storage approaches, such as heap files and log-structured storage, offer unique advantages and challenges, and knowing how they work internally helps developers make informed decisions, optimize applications, and avoid potential bottlenecks.

This knowledge empowers developers to:

  • Select appropriate indexes.
  • Choose data types wisely.
  • Tackle performance issues before they become problems.
