Devansh Kashyap

Posted on May 22

Building a SQL-like Relational Database Engine in C++ From Scratch

#cpp #database #sql #systems

Most of us use databases every day.

But at some point I started wondering:

How does SQL actually work internally?
How are queries parsed?
How do joins work?
What happens after a SELECT statement?
How does persistence work under the hood?

So instead of only reading about databases, I decided to build one.

That project became Ark, a SQL-like relational database engine written entirely from scratch in C++.

Why I Built It

I wanted to understand the internals of database systems by implementing the pieces myself instead of relying on existing engines or parser generators.

The goal wasn’t to compete with production databases.

The goal was to learn:

parsing
query execution
relational operations
schema management
persistence systems
software architecture

Core Features

Ark currently supports:

Handwritten tokenizer
Recursive descent parser
CRUD operations
INNER / LEFT / RIGHT / FULL joins
Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
ALTER TABLE
LIKE pattern matching
ORDER BY
DISTINCT
File persistence (SAVE / LOAD)
Three-tier diagnostics system with exact line/column reporting

Everything is implemented manually:

no external database libraries
no parser generators
no embedded SQL engines
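
To make the "handwritten tokenizer with exact line/column reporting" idea concrete, here is a minimal sketch of what such a tokenizer can look like. This is not Ark's actual code: the `TokenKind`, `Token`, and `tokenize` names are assumptions made for illustration, and the token rules are deliberately simplified.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds; Ark's real tokenizer may classify differently.
enum class TokenKind { Identifier, Number, String, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
    int line;    // 1-based line where the token starts
    int column;  // 1-based column where the token starts
};

// Splits source text into tokens while tracking the position of each one.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    int line = 1, col = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n') { ++line; col = 1; ++i; continue; }
        if (std::isspace(c)) { ++col; ++i; continue; }

        std::size_t start = i;
        int startCol = col;
        TokenKind kind;
        if (std::isalpha(c) || c == '_') {
            // Keywords and names: letters, digits, underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            kind = TokenKind::Identifier;
        } else if (std::isdigit(c)) {
            // Simplified: accepts digits and dots without validating the format.
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            kind = TokenKind::Number;
        } else if (c == '"') {
            ++i;  // skip the opening quote
            while (i < src.size() && src[i] != '"' && src[i] != '\n') ++i;
            if (i < src.size() && src[i] == '"') ++i;  // consume the closing quote
            kind = TokenKind::String;
        } else {
            ++i;  // any other character becomes a one-character symbol
            kind = TokenKind::Symbol;
        }
        out.push_back({kind, src.substr(start, i - start), line, startCol});
        col += static_cast<int>(i - start);
    }
    return out;
}
```

Because each token records where it starts, later stages such as the parser can point at the exact position of a problem without rescanning the input.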

Architecture

The execution pipeline looks roughly like this:

Query
  ↓
Tokenizer
  ↓
Parser
  ↓
Command Objects
  ↓
Execution Engine
  ↓
Storage Layer
  ↓
Persistence

The project is split into modular components:

tokenizer
parser
execution engine
diagnostics
storage/persistence
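
The parser-to-command-object step of that pipeline can be sketched as a small recursive descent parser. This is an illustration under assumptions, not Ark's implementation: `SelectCommand`, `Condition`, and `Parser` are hypothetical names, the grammar covers only a tiny SELECT subset, keywords are matched case-sensitively, and the input is assumed to be already split into token strings.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command objects produced by the parser.
struct Condition {
    std::string column;
    std::string op;      // one of ">", "<", "="
    double value = 0.0;
};

struct SelectCommand {
    std::vector<std::string> columns;  // {"*"} means all columns
    std::string table;
    std::optional<Condition> where;
};

class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    // select := "SELECT" column_list "FROM" identifier [ "WHERE" condition ]
    SelectCommand parseSelect() {
        expect("SELECT");
        SelectCommand cmd;
        cmd.columns = parseColumnList();
        expect("FROM");
        cmd.table = next();
        if (!atEnd() && peek() == "WHERE") {
            next();
            cmd.where = parseCondition();
        }
        if (!atEnd()) throw std::runtime_error("unexpected token: " + peek());
        return cmd;
    }

private:
    // column_list := "*" | identifier { "," identifier }
    std::vector<std::string> parseColumnList() {
        std::vector<std::string> cols{next()};
        while (!atEnd() && peek() == ",") {
            next();
            cols.push_back(next());
        }
        return cols;
    }

    // condition := identifier operator number
    Condition parseCondition() {
        Condition c;
        c.column = next();
        c.op = next();
        if (c.op != ">" && c.op != "<" && c.op != "=")
            throw std::runtime_error("unknown operator: " + c.op);
        c.value = std::stod(next());  // throws if the token is not numeric
        return c;
    }

    bool atEnd() const { return pos_ >= tokens_.size(); }
    const std::string& peek() const { return tokens_[pos_]; }
    std::string next() {
        if (atEnd()) throw std::runtime_error("unexpected end of input");
        return tokens_[pos_++];
    }
    void expect(const std::string& word) {
        std::string got = next();
        if (got != word) throw std::runtime_error("expected " + word + ", got " + got);
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

Each grammar rule maps to one function, which is the core idea of recursive descent. The resulting `SelectCommand` is plain data that an execution engine can act on without knowing anything about the original query text.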

Example Query

CREATE TABLE employees (
    id INT,
    name STRING,
    salary DOUBLE
);

INSERT INTO employees VALUES
    (1, "Alice", 95000.0),
    (2, "Bob", 72000.0);

SELECT * FROM employees
WHERE salary > 80000.0;
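
For a rough idea of what executing that SELECT involves, the sketch below filters in-memory rows by a numeric column. It assumes a simple row-oriented layout built on `std::variant`. The `Value`, `Row`, `Table`, and `selectGreaterThan` names are hypothetical, and Ark's real storage layer may look quite different.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical cell type covering the three column types used in the example.
using Value = std::variant<int, std::string, double>;
using Row = std::vector<Value>;

struct Table {
    std::vector<std::string> columns;
    std::vector<Row> rows;

    // Returns the index of a column, or -1 if it does not exist.
    int columnIndex(const std::string& name) const {
        for (std::size_t i = 0; i < columns.size(); ++i)
            if (columns[i] == name) return static_cast<int>(i);
        return -1;
    }
};

// Equivalent of: SELECT * FROM table WHERE column > threshold
std::vector<Row> selectGreaterThan(const Table& table, const std::string& column,
                                   double threshold) {
    std::vector<Row> result;
    int idx = table.columnIndex(column);
    if (idx < 0) return result;  // unknown column: empty result in this sketch
    for (const Row& row : table.rows) {
        const Value& cell = row[static_cast<std::size_t>(idx)];
        double number = 0.0;
        if (const int* i = std::get_if<int>(&cell)) number = *i;
        else if (const double* d = std::get_if<double>(&cell)) number = *d;
        else continue;  // non-numeric cells never match a numeric comparison
        if (number > threshold) result.push_back(row);
    }
    return result;
}
```

With the two rows inserted above, only Alice's row has a salary above 80000.0, so it is the only row returned.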

One of the Hardest Parts

One of the most interesting challenges was implementing joins and schema evolution.

Handling:

ALTER TABLE
adding/dropping columns
persistence consistency
join execution

became much more complicated than I initially expected.
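
A small sketch shows why these pieces interact. Adding a column has to backfill every existing row, and a LEFT JOIN has to pad unmatched rows with NULLs to the width of the other table's schema. The code below is an illustration only: it assumes every cell is an optional string (with `std::nullopt` standing in for NULL), it uses a plain nested-loop join, and the `Table`, `addColumn`, `dropColumn`, and `leftJoin` names are hypothetical rather than taken from Ark.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using Cell = std::optional<std::string>;  // std::nullopt represents NULL
using Row = std::vector<Cell>;

struct Table {
    std::vector<std::string> columns;
    std::vector<Row> rows;
};

// ALTER TABLE ... ADD: extend the schema and backfill existing rows with NULL.
void addColumn(Table& t, const std::string& name) {
    t.columns.push_back(name);
    for (Row& r : t.rows) r.push_back(std::nullopt);
}

// ALTER TABLE ... DROP: remove the column from the schema and from every row.
bool dropColumn(Table& t, const std::string& name) {
    auto it = std::find(t.columns.begin(), t.columns.end(), name);
    if (it == t.columns.end()) return false;
    auto idx = it - t.columns.begin();
    t.columns.erase(it);
    for (Row& r : t.rows) r.erase(r.begin() + idx);
    return true;
}

// Nested-loop LEFT JOIN on left[leftKey] == right[rightKey].
Table leftJoin(const Table& left, std::size_t leftKey,
               const Table& right, std::size_t rightKey) {
    Table out;
    out.columns = left.columns;
    out.columns.insert(out.columns.end(), right.columns.begin(), right.columns.end());
    for (const Row& l : left.rows) {
        bool matched = false;
        for (const Row& r : right.rows) {
            // NULL never compares equal to anything, including another NULL.
            if (l[leftKey] && r[rightKey] && *l[leftKey] == *r[rightKey]) {
                Row joined = l;
                joined.insert(joined.end(), r.begin(), r.end());
                out.rows.push_back(std::move(joined));
                matched = true;
            }
        }
        if (!matched) {
            Row padded = l;
            padded.resize(l.size() + right.columns.size());  // new cells are NULL
            out.rows.push_back(std::move(padded));
        }
    }
    return out;
}
```

Keeping the schema and every row the same width at all times is the invariant that both the join padding and any save/load code rely on, which is one reason schema changes ripple into persistence.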

Parser correctness and diagnostics also took a surprising amount of effort.
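
As an illustration of line/column diagnostics, here is a minimal sketch that renders an error with a caret under the offending column. The post does not spell out what the three tiers are, so the `Severity` levels below (note, warning, error) are a guess, and `Diagnostic`, `sourceLine`, and `render` are hypothetical names rather than Ark's API.

```cpp
#include <sstream>
#include <string>

// Assumed severity tiers; the real three tiers in Ark may differ.
enum class Severity { Note, Warning, Error };

struct Diagnostic {
    Severity severity;
    std::string message;
    int line;    // 1-based
    int column;  // 1-based
};

// Returns the requested 1-based line of the source, without its newline,
// or an empty string if the line does not exist.
std::string sourceLine(const std::string& source, int line) {
    std::istringstream in(source);
    std::string text;
    for (int i = 1; std::getline(in, text); ++i)
        if (i == line) return text;
    return "";
}

// Formats a header line, the offending source line, and a caret marker.
std::string render(const Diagnostic& d, const std::string& source) {
    const char* label = d.severity == Severity::Error     ? "error"
                        : d.severity == Severity::Warning ? "warning"
                                                          : "note";
    std::size_t indent = d.column > 0 ? static_cast<std::size_t>(d.column - 1) : 0;
    std::ostringstream out;
    out << label << " at " << d.line << ":" << d.column << ": " << d.message << "\n"
        << sourceLine(source, d.line) << "\n"
        << std::string(indent, ' ') << "^";
    return out.str();
}
```

The line and column would normally come straight from the token that the parser was looking at when it failed, so the diagnostic needs no extra scanning of the query text.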

What I Learned

Building Ark taught me a lot about:

how parsers actually work
query execution pipelines
relational database concepts
software architecture
debugging complex state systems
designing diagnostics/error reporting

It also gave me a much deeper appreciation for real database engines.

GitHub

GitHub Repository:
https://github.com/kashyap-devansh/Ark

I’d genuinely appreciate feedback from people interested in:

databases
systems programming
parsers
compilers
C++

Especially suggestions for improving the architecture or query engine.
