Athreya aka Maneshwar

Posted on Jan 1

Components of an RDBMS: From SQL to Disk

#webdev #database #programming #architecture

Hello, I'm Maneshwar. I'm currently working on FreeDevTools online, building **one place for all dev tools, cheat codes, and TLDRs**: a free, open-source hub where developers can quickly find and use tools without the hassle of searching all over the internet.

As we move deeper into database systems, it’s time to look inside a Relational Database Management System (RDBMS) and understand how it actually works under the hood.

While SQL feels simple on the surface, a lot of sophisticated machinery is involved in turning a query into results.

This post breaks down the core components of a typical SQL-based RDBMS and walks through how a simple query is executed.

High-Level Architecture of an RDBMS

A typical RDBMS is divided into two major subsystems:

Frontend (SQL Translator / Query Preprocessor)
Backend (Query Execution Engine and Supporting Subsystems)

The frontend is responsible for understanding SQL, while the backend is responsible for executing it efficiently and safely.

Frontend: Translating SQL into Something Executable

The frontend takes raw SQL statements and converts them into an internal representation that the database engine can process.

This internal form varies across systems: some generate parse trees, some generate bytecode, and others even produce native machine code.

Regardless of the approach, the goal is the same: prepare the query for execution.

1. Parser

The parser is the first stop for any SQL statement.

Breaks the SQL query into tokens
Validates syntax
Builds a parse tree representing the structure of the query

If there’s a syntax error, the query never goes beyond this stage.
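To make the parser's job concrete, here is a minimal sketch that handles only statements shaped like `SELECT * FROM <table>;`. The token categories and the `tokenize` and `parse_select` helpers are made up for illustration; a real parser works from a full SQL grammar.

```python
import re

# Hypothetical token categories for a tiny SQL subset (illustration only).
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:SELECT|FROM)\b"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("STAR", r"\*"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


def tokenize(sql):
    """Break the statement into (kind, text) tokens, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {sql[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_select(sql):
    """Validate the token sequence and build a small dict that stands in for a parse tree."""
    tokens = tokenize(sql)
    kinds = [kind for kind, _ in tokens]
    if kinds != ["KEYWORD", "STAR", "KEYWORD", "IDENT", "SEMI"]:
        raise SyntaxError(f"unsupported or malformed statement: {sql!r}")
    if tokens[0][1].upper() != "SELECT" or tokens[2][1].upper() != "FROM":
        raise SyntaxError(f"expected SELECT ... FROM in: {sql!r}")
    return {"type": "select", "columns": ["*"], "table": tokens[3][1]}


tree = parse_select("SELECT * FROM Students;")
print(tree)
```

A malformed statement such as `SELECT FROM Students;` raises `SyntaxError` inside `parse_select`, so nothing downstream ever sees it.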

2. Optimizer

The optimizer rewrites the parse tree into a logically equivalent form that is cheaper to execute.

Chooses access paths (table scans vs indexes)
Reorders joins
Applies cost-based or rule-based optimizations

This step is critical for performance: two logically equivalent queries can have vastly different execution costs.
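One way to watch an optimizer pick an access path is SQLite's `EXPLAIN QUERY PLAN`, reachable from Python's standard `sqlite3` module. The sketch below uses a made-up `students` table and index name, and it shows SQLite's behavior only; other engines expose their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO students (name) VALUES (?)",
    [("Asha",), ("Ravi",), ("Meera",)],
)

query = "SELECT * FROM students WHERE name = 'Ravi'"


def plan_details(connection, sql):
    """Return the human-readable detail text of each step in SQLite's chosen plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


# Same query, planned before and after an index exists on the filtered column.
plan_without_index = plan_details(conn, query)
conn.execute("CREATE INDEX idx_students_name ON students (name)")
plan_with_index = plan_details(conn, query)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan text differs between SQLite versions, but the first plan should describe a scan of the table and the second should mention `idx_students_name`.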

3. Code Generator

The code generator converts the optimized tree into a format the execution engine understands.

Internal bytecode (common in lightweight systems)
Execution plans
Engine-specific instruction structures

At this point, SQL has been fully translated into an executable form.
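SQLite is one system that compiles SQL to internal bytecode, and its `EXPLAIN` statement lists the generated instructions. The following sketch prints them for the query used later in this post; the table definition is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT)")

# EXPLAIN returns one row per bytecode instruction instead of running the query.
program = conn.execute("EXPLAIN SELECT * FROM Students").fetchall()
opcodes = [row[1] for row in program]  # column 1 holds the opcode name

for row in program:
    print(row[0], row[1])  # instruction address and opcode
```

The listing typically includes an instruction that opens the table for reading, a loop that emits one result row per iteration, and a final halt, which is the shape of the scan described in the example at the end of this post.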

Backend: Executing Queries Reliably and Efficiently

The backend is where the real work happens. It acts as a virtual machine that interprets the frontend’s output and interacts with disk, memory, and concurrent transactions.

Key Backend Subsystems

1. Storage Management

Manages data stored on physical devices (files on disk)
Handles page layouts, record formats, and file organization

Users never interact with files directly; this abstraction is handled entirely by the DBMS.
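As a rough illustration of what a page layout and record format can look like, the sketch below packs fixed-width records into a fixed-size page. The 4096-byte page size, the two-byte count header, and the `(id, name)` record shape are all assumptions for the example, not the format of any particular database.

```python
import struct

PAGE_SIZE = 4096                 # assumed page size in bytes
HEADER = struct.Struct("<H")     # page header: number of records stored
RECORD = struct.Struct("<I20s")  # record: 4-byte id + 20-byte padded name


def pack_page(rows):
    """Serialize (id, name) rows into one zero-padded page."""
    capacity = (PAGE_SIZE - HEADER.size) // RECORD.size
    if len(rows) > capacity:
        raise ValueError(f"page holds at most {capacity} records")
    body = b"".join(
        RECORD.pack(row_id, name.encode("utf-8")) for row_id, name in rows
    )
    return (HEADER.pack(len(rows)) + body).ljust(PAGE_SIZE, b"\x00")


def unpack_page(page):
    """Read the record count from the header, then decode each record slot."""
    (count,) = HEADER.unpack_from(page, 0)
    rows = []
    for slot in range(count):
        offset = HEADER.size + slot * RECORD.size
        row_id, raw_name = RECORD.unpack_from(page, offset)
        rows.append((row_id, raw_name.rstrip(b"\x00").decode("utf-8")))
    return rows


page = pack_page([(1, "Asha"), (2, "Ravi")])
print(len(page), unpack_page(page))
```

Real systems usually support variable-length records and richer page headers, but the idea of a fixed-size unit with a known internal layout is the same.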

2. Buffer (Cache) Management

Maintains an in-memory cache of disk pages
Decides which pages to load, keep, or evict
Minimizes expensive disk I/O operations

This subsystem has a massive impact on overall database performance.
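A simple way to picture the load, keep, or evict decision is a least-recently-used (LRU) cache. The `BufferPool` class below is a hypothetical sketch; LRU is only one possible policy, and production buffer managers also track dirty pages and pinned pages.

```python
from collections import OrderedDict


class BufferPool:
    """Tiny LRU page cache sitting in front of a slow page reader."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page  # callable: page_id -> bytes (stands in for disk)
        self.pages = OrderedDict()  # ordered from least to most recently used
        self.disk_reads = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # cache hit: mark as recently used
            return self.pages[page_id]
        self.disk_reads += 1                 # cache miss: go to "disk"
        page = self.read_page(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return page


pool = BufferPool(capacity=2, read_page=lambda pid: f"page-{pid}".encode())
for pid in [1, 2, 1, 3, 1, 2]:
    pool.get(pid)

print(pool.disk_reads)   # 4 reads for 6 requests
print(list(pool.pages))  # [1, 2]
```

Two of the six requests are served from memory here, which is exactly the saving a buffer manager is after.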

3. Concurrency Control

Coordinates simultaneous access by multiple transactions
Prevents data races and inconsistencies
Allows maximum safe parallelism

Without this, concurrent users could corrupt shared data.
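The sketch below shows the problem in miniature: several threads update the same value, and a lock makes each read-modify-write step indivisible. The `Account` class is hypothetical, and a single lock is a deliberate simplification; real databases use lock managers, multiversion schemes, or both.

```python
import threading


class Account:
    """Shared record protected by a lock so updates cannot interleave."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:  # only one thread at a time may read-modify-write
            current = self.balance
            self.balance = current + amount


def run_deposits(account, workers, deposits_per_worker):
    """Start several threads that each deposit 1 repeatedly, then wait for all."""

    def work():
        for _ in range(deposits_per_worker):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


account = Account(0)
run_deposits(account, workers=8, deposits_per_worker=1000)
print(account.balance)  # 8000: no update is lost
```

Without the lock, two threads could read the same `current` value and one deposit would silently overwrite the other.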

4. Recovery Management

Restores the database after crashes or failures
Uses logs and checkpoints
Handles partial transactions and rollbacks

This ensures durability even in the presence of system failures.
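The role of the log can be sketched with a redo-only replay: after a crash, only the changes of committed transactions are reapplied. The log record shapes and the `recover` function are assumptions for illustration; real recovery also handles undo, checkpoints, and on-disk page state.

```python
def recover(log):
    """Rebuild key/value state by replaying writes of committed transactions only."""
    committed = {txid for kind, txid, *_ in log if kind == "COMMIT"}
    state = {}
    for kind, txid, *payload in log:
        if kind == "SET" and txid in committed:
            key, value = payload
            state[key] = value
    return state


# Transaction 1 committed; transaction 2 was still in progress at the "crash".
log = [
    ("BEGIN", 1),
    ("SET", 1, "x", 10),
    ("COMMIT", 1),
    ("BEGIN", 2),
    ("SET", 2, "x", 99),
    ("SET", 2, "y", 5),
]

recovered = recover(log)
print(recovered)  # {'x': 10}
```

The partial work of transaction 2 leaves no trace in the recovered state, which is the rollback behavior described above.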

5. Transaction Management

Groups database operations into transactions
Enforces ACID properties:
- Atomicity
- Consistency
- Isolation
- Durability

Applications rely on this to keep data correct and reliable.
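Atomicity is easy to see with SQLite through Python's `sqlite3` module. In this sketch, which assumes a made-up `accounts` table with a non-negative balance constraint, a transfer that would overdraw the source account fails halfway and the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)  # False {'alice': 100, 'bob': 50}
```

The credit to `bob` had already been applied when the debit failed, yet it does not survive, because both statements belong to one transaction.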

6. Interpreter (Execution Engine)

Executes the generated code or plan
Calls storage, buffer, and transaction subsystems as needed
Produces query results for the client

This is the final stage where SQL turns into actual data movement.

A Simple Example: `SELECT * FROM Students`

Let’s see how all this fits together with a basic query:

SELECT * FROM Students;

Behind the scenes, the engine may perform the following steps:

Open the database file containing the Students table
Position the file pointer at the start of the table
Loop through the table:
- Read column values from the current record
- Return the row to the caller
- Move to the next record
Close the file

All of this happens transparently. The application only sees rows being returned, never file operations or disk access.
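The steps above can be sketched as a plain file scan. The fixed-width record format and the `write_table` and `scan_table` helpers are assumptions for the example; a real engine reads pages through its buffer manager rather than raw records from a file.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<I20s")  # assumed record: 4-byte id + 20-byte padded name


def write_table(path, rows):
    """Create a toy table file holding one fixed-width record per row."""
    with open(path, "wb") as table_file:
        for row_id, name in rows:
            table_file.write(RECORD.pack(row_id, name.encode("utf-8")))


def scan_table(path):
    """Yield every row in file order, mirroring the open/loop/close steps."""
    with open(path, "rb") as table_file:      # open the file (closed on exit)
        table_file.seek(0)                    # position at the start of the table
        while True:
            raw = table_file.read(RECORD.size)
            if len(raw) < RECORD.size:        # no more complete records
                break
            row_id, raw_name = RECORD.unpack(raw)  # read column values
            yield row_id, raw_name.rstrip(b"\x00").decode("utf-8")  # return the row


with tempfile.TemporaryDirectory() as tmp_dir:
    table_path = os.path.join(tmp_dir, "students.tbl")
    write_table(table_path, [(1, "Asha"), (2, "Ravi")])
    rows = list(scan_table(table_path))

print(rows)  # [(1, 'Asha'), (2, 'Ravi')]
```

The caller just iterates over rows, while the file handling stays hidden inside `scan_table`, much as the DBMS hides it from the application.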

Bigger Picture: Databases and Computer Systems

This chapter also reinforces how database systems sit on top of general computer systems:

Hardware provides CPU, memory, disks, and network devices
Operating systems manage these resources and expose file APIs
Processes and threads execute applications
File systems store persistent data, but lack transactional guarantees

Because file operations are not inherently atomic, databases cannot rely solely on the OS.

This is why DBMSs implement their own mechanisms for consistency, recovery, and concurrency.

From Data Modeling to Transactions

To summarize the broader journey so far:

Data is modeled using entities, attributes, and relationships
Relational databases store data as tables (relations)
SQL is used for data definition and manipulation
Indexes (hash and B-tree) are added for performance
Transactions protect data correctness under failures and concurrency

The DBMS abstracts all of this complexity so application developers can focus on business logic rather than low-level data management.

Closing Thoughts

Understanding RDBMS internals makes it clear why databases are far more than “just files with SQL on top.”

Each query triggers a coordinated effort between parsers, optimizers, execution engines, memory managers, and recovery systems.

This layered design is what allows databases to be fast, reliable, and safe even under heavy concurrent workloads.

👉 Check out: FreeDevTools

Feedback and contributions are welcome!

It’s online, open-source, and ready for anyone to use.

⭐ Star it on GitHub: freedevtools
