Devansh Kashyap

Building a SQL-like Relational Database Engine in C++ From Scratch

[Animated demo of the Ark SQL database engine executing queries in a terminal]

Most of us use databases every day.

But at some point I started wondering:

  • How does SQL actually work internally?
  • How are queries parsed?
  • How do joins work?
  • What happens after a SELECT statement?
  • How does persistence work under the hood?

So instead of only reading about databases, I decided to build one.

That project became Ark — a SQL-like relational database engine written entirely from scratch in C++.


Why I Built It

I wanted to understand the internals of database systems by implementing the pieces myself instead of relying on existing engines or parser generators.

The goal wasn’t to compete with production databases.

The goal was to learn:

  • parsing
  • query execution
  • relational operations
  • schema management
  • persistence systems
  • software architecture

Core Features

Ark currently supports:

  • Handwritten tokenizer
  • Recursive descent parser
  • CRUD operations
  • INNER / LEFT / RIGHT / FULL joins
  • Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
  • ALTER TABLE
  • LIKE pattern matching
  • ORDER BY
  • DISTINCT
  • File persistence (SAVE / LOAD)
  • Three-tier diagnostics system with exact line/column reporting

Everything is implemented manually:

  • no external database libraries
  • no parser generators
  • no embedded SQL engines
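
To illustrate what a handwritten tokenizer with line/column tracking involves, here is a minimal sketch. It is not Ark's actual code: `TokenKind`, `Token`, and `tokenize` are hypothetical names, and the token categories are an assumption made for this example.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds; a real tokenizer would likely separate keywords too.
enum class TokenKind { Identifier, Number, String, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
    int line;    // 1-based line where the token starts
    int column;  // 1-based column where the token starts
};

// Splits the input into tokens and records where each one starts, so later
// stages can report errors at an exact line and column.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    int line = 1, column = 1;
    std::size_t i = 0;

    // Consume one character and update the position counters.
    auto advance = [&]() {
        if (src[i] == '\n') { ++line; column = 1; } else { ++column; }
        ++i;
    };
    auto isWordChar = [&](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };

    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { advance(); continue; }

        Token tok{TokenKind::Symbol, "", line, column};
        if (std::isalpha(c) || c == '_') {
            tok.kind = TokenKind::Identifier;
            while (i < src.size() && isWordChar(src[i])) { tok.text += src[i]; advance(); }
        } else if (std::isdigit(c)) {
            tok.kind = TokenKind::Number;
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) {
                tok.text += src[i];
                advance();
            }
        } else if (c == '"') {
            tok.kind = TokenKind::String;
            advance();  // skip the opening quote
            while (i < src.size() && src[i] != '"') { tok.text += src[i]; advance(); }
            if (i < src.size()) advance();  // skip the closing quote, if present
        } else {
            tok.text = std::string(1, src[i]);  // single-character symbol
            advance();
        }
        tokens.push_back(tok);
    }
    return tokens;
}
```

Calling `tokenize("SELECT *\nFROM t;")` yields five tokens, with `FROM` recorded at line 2, column 1. Storing the position on every token is what lets a diagnostics layer point at the exact spot in the query.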

Architecture

The execution pipeline looks roughly like this:

Query
  ↓
Tokenizer
  ↓
Parser
  ↓
Command Objects
  ↓
Execution Engine
  ↓
Storage Layer
  ↓
Persistence
Enter fullscreen mode Exit fullscreen mode

The project is split into modular components:

  • tokenizer
  • parser
  • execution engine
  • diagnostics
  • storage/persistence
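
The parser-to-command-object step can be sketched as a small recursive descent parser. This is an illustration under assumptions, not Ark's implementation: `SelectCommand` and `Parser` are hypothetical names, the grammar covers only `SELECT <columns> FROM <table>`, and the input is assumed to be already split into token strings.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command object handed from the parser to the execution engine.
struct SelectCommand {
    std::vector<std::string> columns;  // {"*"} means all columns
    std::string table;
};

// Grammar assumed for this sketch:
//   select      := "SELECT" column_list "FROM" name
//   column_list := "*" | name { "," name }
class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    // One method per grammar rule is the core idea of recursive descent.
    SelectCommand parseSelect() {
        expect("SELECT");
        SelectCommand cmd;
        cmd.columns = parseColumnList();
        expect("FROM");
        cmd.table = next("table name");
        if (pos_ != tokens_.size())
            throw std::runtime_error("unexpected token: " + tokens_[pos_]);
        return cmd;
    }

private:
    std::vector<std::string> parseColumnList() {
        std::vector<std::string> cols;
        cols.push_back(next("column name"));
        if (cols.back() == "*") return cols;
        while (pos_ < tokens_.size() && tokens_[pos_] == ",") {
            ++pos_;  // consume the comma
            cols.push_back(next("column name"));
        }
        return cols;
    }

    // Returns the current token and moves past it, or fails if input ran out.
    std::string next(const std::string& what) {
        if (pos_ >= tokens_.size()) throw std::runtime_error("expected " + what);
        return tokens_[pos_++];
    }

    void expect(const std::string& keyword) {
        if (next(keyword) != keyword) throw std::runtime_error("expected " + keyword);
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

Parsing the tokens `SELECT id , name FROM employees` produces a `SelectCommand` with two columns and the table `employees`. Keeping the result as a plain data object means the execution engine never has to look at raw tokens.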

Example Query

CREATE TABLE employees (
    id INT,
    name STRING,
    salary DOUBLE
);

INSERT INTO employees VALUES
    (1, "Alice", 95000.0),
    (2, "Bob", 72000.0);

SELECT * FROM employees
WHERE salary > 80000.0;
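
As a rough idea of what an execution engine does with the `WHERE salary > 80000.0` clause above, the sketch below filters in-memory rows against a single numeric condition. All names (`Value`, `Table`, `Condition`, `selectWhere`) are hypothetical, and the row layout is an assumption for this example rather than Ark's storage format.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical dynamically typed cell: either a number or a string.
struct Value {
    bool isNumber;
    double number;     // meaningful when isNumber is true
    std::string text;  // meaningful when isNumber is false
};
Value num(double n) { return Value{true, n, ""}; }
Value str(const std::string& s) { return Value{false, 0.0, s}; }

struct Table {
    std::vector<std::string> columns;
    std::vector<std::vector<Value>> rows;
};

// Condition of the form: <column> <op> <numeric literal>.
struct Condition {
    std::string column;
    std::string op;  // one of ">", "<", "="
    double literal;
};

// Resolves a column name to its position in each row.
std::size_t columnIndex(const Table& t, const std::string& name) {
    for (std::size_t i = 0; i < t.columns.size(); ++i)
        if (t.columns[i] == name) return i;
    throw std::runtime_error("unknown column: " + name);
}

bool compare(double lhs, const std::string& op, double rhs) {
    if (op == ">") return lhs > rhs;
    if (op == "<") return lhs < rhs;
    if (op == "=") return lhs == rhs;
    throw std::runtime_error("unsupported operator: " + op);
}

// Scans every row and keeps those that satisfy the condition.
Table selectWhere(const Table& t, const Condition& cond) {
    const std::size_t col = columnIndex(t, cond.column);
    Table result;
    result.columns = t.columns;
    for (const auto& row : t.rows) {
        const Value& v = row[col];
        if (!v.isNumber) throw std::runtime_error("non-numeric comparison");
        if (compare(v.number, cond.op, cond.literal)) result.rows.push_back(row);
    }
    return result;
}
```

With the two rows inserted above, `selectWhere(employees, Condition{"salary", ">", 80000.0})` returns only Alice's row. The column name is resolved to an index once, before the scan, so the per-row work is a single comparison.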

One of the Hardest Parts

One of the most interesting challenges was implementing joins and schema evolution.

Handling:

  • ALTER TABLE
  • adding/dropping columns
  • persistence consistency
  • join execution

became much more complicated than I initially expected.

Parser correctness and diagnostics also took a surprising amount of effort.
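
For a sense of why join execution gets involved, here is a minimal nested-loop join sketch covering the INNER and LEFT cases. It is an illustration only: `Row` and `nestedLoopJoin` are hypothetical names, cells are simplified to strings, and the literal text `"NULL"` stands in for a real null value.

```cpp
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Joins rows where left[leftKey] == right[rightKey] by comparing every pair.
// With keepUnmatchedLeft set, this behaves like LEFT JOIN: a left row with no
// partner is still emitted once, padded with rightWidth "NULL" placeholders.
std::vector<Row> nestedLoopJoin(const std::vector<Row>& left, std::size_t leftKey,
                                const std::vector<Row>& right, std::size_t rightKey,
                                std::size_t rightWidth, bool keepUnmatchedLeft) {
    std::vector<Row> out;
    for (const Row& l : left) {
        bool matched = false;
        for (const Row& r : right) {
            if (l[leftKey] == r[rightKey]) {
                Row joined = l;
                joined.insert(joined.end(), r.begin(), r.end());
                out.push_back(joined);
                matched = true;
            }
        }
        if (!matched && keepUnmatchedLeft) {
            Row joined = l;
            joined.insert(joined.end(), rightWidth, std::string("NULL"));
            out.push_back(joined);
        }
    }
    return out;
}
```

Joining employees `{1, Alice}` and `{2, Bob}` against a single badge row `{1, B-100}` gives one row for an INNER join and two for a LEFT join, the second being Bob padded with placeholders. A FULL join additionally has to emit the right-hand rows that never matched, which means tracking matches on both sides.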


What I Learned

Building Ark taught me a lot about:

  • how parsers actually work
  • query execution pipelines
  • relational database concepts
  • software architecture
  • debugging complex state systems
  • designing diagnostics/error reporting

It also gave me a much deeper appreciation for real database engines.


GitHub

GitHub Repository:
https://github.com/kashyap-devansh/Ark

I’d genuinely appreciate feedback from people interested in:

  • databases
  • systems programming
  • parsers
  • compilers
  • C++

Especially suggestions for improving the architecture or query engine.
