Athreya aka Maneshwar

Components of an RDBMS: From SQL to Disk

Hello, I'm Maneshwar. I'm currently building FreeDevTools online: one place for all dev tools, cheat codes, and TLDRs. It is a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.

As we move deeper into database systems, it’s time to look inside a Relational Database Management System (RDBMS) and understand how it actually works under the hood.

While SQL feels simple on the surface, a lot of sophisticated machinery is involved in turning a query into results.

This post breaks down the core components of a typical SQL-based RDBMS and walks through how a simple query is executed.

High-Level Architecture of an RDBMS

A typical RDBMS is divided into two major subsystems:

  1. Frontend (SQL Translator / Query Preprocessor)
  2. Backend (Query Execution Engine and Supporting Subsystems)

The frontend is responsible for understanding SQL, while the backend is responsible for executing it efficiently and safely.

Frontend: Translating SQL into Something Executable

The frontend takes raw SQL statements and converts them into an internal representation that the database engine can process.

This internal form varies across systems: some generate parse trees, some generate bytecode, and others even produce native machine code.

Regardless of the approach, the goal is the same: prepare the query for execution.

1. Parser

The parser is the first stop for any SQL statement.

  • Breaks the SQL query into tokens
  • Validates syntax
  • Builds a parse tree representing the structure of the query

If there’s a syntax error, the query never goes beyond this stage.
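
To make the tokenize, validate, and build-a-tree steps concrete, here is a minimal Python sketch that only understands statements of the form SELECT <columns> FROM <table>. The names tokenize and parse_select and the dictionary-shaped parse tree are assumptions made for this example; real parsers are usually generated from a full SQL grammar.

```python
import re

# Hypothetical token pattern: identifiers/keywords, '*', ',' and ';'.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\*|,|;)")

def tokenize(sql):
    """Split a SQL string into tokens, raising on anything unexpected."""
    tokens, pos = [], 0
    sql = sql.strip()
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if not match:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_select(sql):
    """Build a tiny parse tree for: SELECT col[, col...] FROM table [;]"""
    tokens = tokenize(sql)
    if not tokens or tokens[0].upper() != "SELECT":
        raise SyntaxError("expected SELECT")
    try:
        from_index = [t.upper() for t in tokens].index("FROM")
    except ValueError:
        raise SyntaxError("expected FROM") from None
    columns = [t for t in tokens[1:from_index] if t != ","]
    rest = [t for t in tokens[from_index + 1:] if t != ";"]
    if not columns or len(rest) != 1:
        raise SyntaxError("expected a column list and exactly one table name")
    return {"type": "select", "columns": columns, "table": rest[0]}

tree = parse_select("SELECT * FROM Students;")
print(tree)  # {'type': 'select', 'columns': ['*'], 'table': 'Students'}
```

A misspelled keyword such as SELEC raises SyntaxError here, which mirrors the point above: a statement with a syntax error never reaches the later stages.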

2. Optimizer

The optimizer rewrites the parse tree into an equivalent form that is cheaper to execute.

  • Chooses access paths (table scans vs indexes)
  • Reorders joins
  • Applies cost-based or rule-based optimizations

This step is critical for performance: two logically equivalent queries can have vastly different execution costs.
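
As a rough sketch of what a cost-based choice between a table scan and an index could look like, the Python function below compares two made-up cost estimates. The function name choose_access_path and the cost formulas (one unit per scanned row, four units per row fetched through an index) are assumptions for illustration only; real optimizers rely on table statistics and far richer cost models.

```python
def choose_access_path(table_rows, selectivity, has_index):
    """Pick the cheaper of a full table scan and an index lookup.

    The cost formulas are illustrative assumptions: a full scan touches
    every row, while an index lookup pays extra for each matching row.
    """
    candidates = {"table_scan": table_rows}
    if has_index:
        matching_rows = table_rows * selectivity
        # Assumed: each matching row costs about 4 units (index probe + row fetch).
        candidates["index_lookup"] = 4 * matching_rows
    return min(candidates, key=candidates.get)

# A very selective predicate favors the index...
print(choose_access_path(1_000_000, 0.001, has_index=True))  # index_lookup
# ...while a predicate matching half the table favors a plain scan.
print(choose_access_path(1_000_000, 0.5, has_index=True))    # table_scan
```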

3. Code Generator

The code generator converts the optimized tree into a format the execution engine understands.

  • Internal bytecode (common in lightweight systems)
  • Execution plans
  • Engine-specific instruction structures

At this point, SQL has been fully translated into an executable form.
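
SQLite is one system that takes the bytecode route, and its EXPLAIN statement lets you look at the generated program. The snippet below uses Python's built-in sqlite3 module with an in-memory database; the Students schema is an assumption for the example, and the exact opcodes you see depend on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema, only here so the query has a table to compile against.
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT)")

# EXPLAIN returns one row per bytecode instruction:
# (addr, opcode, p1, p2, p3, p4, p5, comment)
program = conn.execute("EXPLAIN SELECT * FROM Students").fetchall()
opcodes = [row[1] for row in program]

for addr, opcode, *operands in program:
    print(addr, opcode, operands[:3])
```

Each printed row is one instruction for SQLite's virtual machine, which is the "executable form" this stage hands to the backend.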

Backend: Executing Queries Reliably and Efficiently

The backend is where the real work happens. It acts as a virtual machine that interprets the frontend’s output and interacts with disk, memory, and concurrent transactions.

Key Backend Subsystems

1. Storage Management

  • Manages data stored on physical devices (files on disk)
  • Handles page layouts, record formats, and file organization

Users never interact with files directly; the DBMS handles this abstraction entirely.
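
To give a feel for what "page layouts and record formats" means, here is a small Python sketch that packs fixed-width records into a single page. The 4096-byte page size, the two-byte record-count header, and the (id, 20-byte name) record format are all assumptions chosen for the example, not the layout of any specific database.

```python
import struct

PAGE_SIZE = 4096                  # assumed page size
HEADER = struct.Struct("<H")      # record count (2 bytes)
RECORD = struct.Struct("<I20s")   # id (4 bytes) + name (20 bytes, null-padded)

def pack_page(records):
    """Serialize (id, name) tuples into one fixed-size page."""
    capacity = (PAGE_SIZE - HEADER.size) // RECORD.size
    if len(records) > capacity:
        raise ValueError(f"page holds at most {capacity} records")
    body = b"".join(RECORD.pack(rid, name.encode("utf-8")) for rid, name in records)
    page = HEADER.pack(len(records)) + body
    return page.ljust(PAGE_SIZE, b"\x00")  # pad the unused space with zeros

def unpack_page(page):
    """Read the records back out of a page produced by pack_page."""
    (count,) = HEADER.unpack_from(page, 0)
    rows = []
    for i in range(count):
        rid, raw = RECORD.unpack_from(page, HEADER.size + i * RECORD.size)
        rows.append((rid, raw.rstrip(b"\x00").decode("utf-8")))
    return rows

page = pack_page([(1, "Ada"), (2, "Linus")])
print(len(page), unpack_page(page))  # 4096 [(1, 'Ada'), (2, 'Linus')]
```

Production systems typically use more flexible layouts (variable-length records, slot directories, checksums), but the idea of a fixed-size page with a header and a body is the same.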

2. Buffer (Cache) Management

  • Maintains an in-memory cache of disk pages
  • Decides which pages to load, keep, or evict
  • Minimizes expensive disk I/O operations

This subsystem has a massive impact on overall database performance.
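
Here is a minimal sketch of the load, keep, or evict decision using a least-recently-used (LRU) policy. LRU is just one common choice, and the BufferPool class and its read_page_from_disk callback are hypothetical names for this example. The sketch also ignores dirty pages and pinning, which a real buffer manager has to handle.

```python
from collections import OrderedDict

class BufferPool:
    """Tiny LRU page cache; the disk reader is injected as a callback."""

    def __init__(self, capacity, read_page_from_disk):
        self.capacity = capacity
        self.read_page_from_disk = read_page_from_disk
        self.pages = OrderedDict()  # page_id -> page, oldest first
        self.hits = 0
        self.misses = 0

    def get_page(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        page = self.read_page_from_disk(page_id)
        self.pages[page_id] = page
        return page

# Stand-in for a real disk read.
pool = BufferPool(capacity=2, read_page_from_disk=lambda page_id: f"page-{page_id}")
for page_id in [1, 2, 1, 3, 2]:
    pool.get_page(page_id)
print(pool.hits, pool.misses)  # 1 4
```

Every miss stands for a disk read, so the hit ratio of this cache is a good first-order proxy for how much I/O the database avoids.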

3. Concurrency Control

  • Coordinates simultaneous access by multiple transactions
  • Prevents data races and inconsistencies
  • Allows maximum safe parallelism

Without this, concurrent users could corrupt shared data.
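
One classic way to coordinate access is locking with shared (S) and exclusive (X) modes: many readers may share a resource, but a writer needs it alone. The LockTable class below is a hypothetical, single-threaded sketch of that compatibility rule. It reports conflicts instead of blocking, and it has no deadlock handling, which real lock managers need.

```python
class LockTable:
    """Tracks shared (S) and exclusive (X) locks per resource."""

    def __init__(self):
        self.locks = {}  # resource -> (mode, set of transaction ids)

    def try_lock(self, txn, resource, mode):
        """Return True if the lock was granted, False on a conflict."""
        held = self.locks.get(resource)
        if held is None:
            self.locks[resource] = (mode, {txn})
            return True
        held_mode, owners = held
        if mode == "S" and held_mode == "S":
            owners.add(txn)  # any number of readers can share
            return True
        if owners == {txn}:
            # The sole owner may re-acquire, or upgrade from S to X.
            if mode == "X":
                self.locks[resource] = ("X", owners)
            return True
        return False

    def release_all(self, txn):
        """Drop every lock held by a transaction (e.g. at commit)."""
        for resource in list(self.locks):
            _mode, owners = self.locks[resource]
            owners.discard(txn)
            if not owners:
                del self.locks[resource]

table = LockTable()
print(table.try_lock("T1", "Students", "S"))  # True
print(table.try_lock("T2", "Students", "S"))  # True
print(table.try_lock("T2", "Students", "X"))  # False, T1 is still reading
```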

4. Recovery Management

  • Restores the database after crashes or failures
  • Uses logs and checkpoints
  • Handles partial transactions and rollbacks

This ensures that committed changes survive system failures (durability) and that incomplete transactions are rolled back.
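
The role of the log can be sketched in a few lines of Python. In this simplified model, which is an assumption for illustration, every write is appended to a log before it counts, and recovery rebuilds the state by replaying only the writes of committed transactions. Real recovery managers are more involved: they use checkpoints to avoid replaying the whole log and also undo changes that reached disk before a crash.

```python
def recover(log):
    """Rebuild state from a log, applying only committed transactions."""
    committed = {record[1] for record in log if record[0] == "commit"}
    state = {}
    for record in log:
        if record[0] == "write" and record[1] in committed:
            _kind, _txn, key, value = record
            state[key] = value
    return state

# Hypothetical log at the moment of a crash: T1 committed, T2 did not.
log = [
    ("write", "T1", "a", 1),
    ("write", "T2", "b", 2),
    ("commit", "T1"),
    ("write", "T2", "a", 99),
]
print(recover(log))  # {'a': 1}
```

T2's writes disappear after recovery because no commit record for T2 ever made it into the log.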

5. Transaction Management

  • Groups database operations into transactions
  • Enforces ACID properties:
    • Atomicity
    • Consistency
    • Isolation
    • Durability

Applications rely on this to keep data correct and reliable.
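
Atomicity is easy to see in practice. The example below uses Python's built-in sqlite3 module; the accounts table, the CHECK constraint, and the transfer helper are assumptions made for the demo. When the second UPDATE violates the constraint, the first UPDATE is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(transfer(conn, "alice", "bob", 500))  # False, alice cannot go negative
print(dict(conn.execute("SELECT name, balance FROM accounts")))
# {'alice': 100, 'bob': 0}
```

Bob's balance is untouched even though his UPDATE ran first, which is exactly the all-or-nothing behavior applications depend on.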

6. Interpreter (Execution Engine)

  • Executes the generated code or plan
  • Calls storage, buffer, and transaction subsystems as needed
  • Produces query results for the client

This is the final stage where SQL turns into actual data movement.
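
To show what "interpreting the generated code" can look like, here is a toy virtual machine in Python. The instruction set (OPEN, JUMP_IF_END, EMIT, NEXT, HALT) is invented for this example and does not match the opcodes of any real engine; in-memory lists also stand in for the storage and buffer subsystems.

```python
def run(program, tables):
    """Interpret a tiny, made-up instruction set against in-memory tables."""
    results, rows, cursor, pc = [], [], 0, 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "OPEN":            # open a cursor on the named table
            rows, cursor = tables[arg], 0
        elif op == "JUMP_IF_END":   # leave the loop when the cursor is exhausted
            if cursor >= len(rows):
                pc = arg
                continue
        elif op == "EMIT":          # hand the current row to the client
            results.append(rows[cursor])
        elif op == "NEXT":          # advance the cursor and jump back
            cursor += 1
            pc = arg
            continue
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return results

# Hypothetical program for: SELECT * FROM Students
program = [
    ("OPEN", "Students"),   # 0
    ("JUMP_IF_END", 4),     # 1
    ("EMIT", None),         # 2
    ("NEXT", 1),            # 3
    ("HALT", None),         # 4
]
tables = {"Students": [(1, "Ada"), (2, "Grace")]}
print(run(program, tables))  # [(1, 'Ada'), (2, 'Grace')]
```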

A Simple Example: SELECT * FROM Students

Let’s see how all this fits together with a basic query:

SELECT * FROM Students;

Behind the scenes, the engine may perform the following steps:

  1. Open the database file containing the Students table
  2. Position the file pointer at the start of the table
  3. Loop through the table:
    • Read column values from the current record
    • Return the row to the caller
    • Move to the next record
  4. Close the file

All of this happens transparently. The application only sees rows being returned, never file operations or disk access.
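
The four steps above can be sketched as a plain sequential scan in Python. The file format here (fixed-size records holding an id and a 20-byte name) and the scan_table function are assumptions for the example; a real engine would read whole pages through its buffer manager rather than one record at a time.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<I20s")  # assumed layout: id + fixed-width name

def scan_table(path):
    """Yield every (id, name) record from a file of fixed-size records."""
    with open(path, "rb") as table_file:          # steps 1 and 2: open at offset 0
        while True:
            chunk = table_file.read(RECORD.size)  # step 3: read the current record
            if len(chunk) < RECORD.size:
                break                             # end of table
            student_id, raw_name = RECORD.unpack(chunk)
            yield student_id, raw_name.rstrip(b"\x00").decode("utf-8")
    # step 4: leaving the with-block closes the file

# Build a throwaway "Students" file so the scan has something to read.
with tempfile.NamedTemporaryFile(delete=False) as f:
    for student_id, name in [(1, "Ada"), (2, "Grace")]:
        f.write(RECORD.pack(student_id, name.encode("utf-8")))
    students_path = f.name

rows = list(scan_table(students_path))
os.remove(students_path)
print(rows)  # [(1, 'Ada'), (2, 'Grace')]
```

The caller of scan_table only ever sees tuples, which matches the point above: file handling stays hidden behind the interface.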

Bigger Picture: Databases and Computer Systems

This chapter also reinforces how database systems sit on top of general computer systems:

  • Hardware provides CPU, memory, disks, and network devices
  • Operating systems manage these resources and expose file APIs
  • Processes and threads execute applications
  • File systems store persistent data, but lack transactional guarantees

Because file operations are not inherently atomic, databases cannot rely solely on the OS.

This is why DBMSs implement their own mechanisms for consistency, recovery, and concurrency.
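
To see why plain file writes are not enough, consider what an application has to do just to replace one file safely. The Python sketch below uses the common "write to a temporary file, then rename" pattern; atomic_write is a hypothetical helper name, and the pattern only protects a single file, which is far from what a DBMS needs for multi-page, multi-table updates.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the file at `path` so readers never observe a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # ask the OS to push the bytes to storage
        os.replace(tmp_path, path)   # rename is atomic on POSIX file systems
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)      # clean up the temporary file on failure
        raise

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "data.bin")
    atomic_write(target, b"version 1")
    atomic_write(target, b"version 2")
    with open(target, "rb") as fh:
        final_contents = fh.read()
    leftover_files = os.listdir(workdir)

print(final_contents, leftover_files)  # b'version 2' ['data.bin']
```

Even this small helper needs a temporary file, an explicit sync, and a rename. Logging, locking, and recovery in a DBMS generalize the same concern to many pages changing together.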

From Data Modeling to Transactions

To summarize the broader journey so far:

  • Data is modeled using entities, attributes, and relationships
  • Relational databases store data as tables (relations)
  • SQL is used for data definition and manipulation
  • Indexes (hash and B-tree) are added for performance
  • Transactions protect data correctness under failures and concurrency

The DBMS abstracts all of this complexity so application developers can focus on business logic rather than low-level data management.

Closing Thoughts

Understanding RDBMS internals makes it clear why databases are far more than “just files with SQL on top.”

Each query triggers a coordinated effort between parsers, optimizers, execution engines, memory managers, and recovery systems.

This layered design is what allows databases to be fast, reliable, and safe even under heavy concurrent workloads.

FreeDevTools

👉 Check out: FreeDevTools

Feedback and contributions are welcome!

It’s online, open-source, and ready for anyone to use.

⭐ Star it on GitHub: freedevtools
