<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Devansh Kashyap</title>
    <description>The latest articles on DEV Community by Devansh Kashyap (@kashyapdevansh).</description>
    <link>https://dev.to/kashyapdevansh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3944570%2Fefa574f4-4a5d-4e23-bb0e-35abd7b5772b.png</url>
      <title>DEV Community: Devansh Kashyap</title>
      <link>https://dev.to/kashyapdevansh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kashyapdevansh"/>
    <language>en</language>
    <item>
      <title>Building a SQL-like Relational Database Engine in C++ From Scratch</title>
      <dc:creator>Devansh Kashyap</dc:creator>
      <pubDate>Fri, 22 May 2026 17:58:19 +0000</pubDate>
      <link>https://dev.to/kashyapdevansh/building-a-sql-like-relational-database-engine-in-c-from-scratch-42j4</link>
      <guid>https://dev.to/kashyapdevansh/building-a-sql-like-relational-database-engine-in-c-from-scratch-42j4</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvmc0knexo1pof70perh.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvmc0knexo1pof70perh.gif" alt="Animated demo of the Ark SQL database engine executing queries in a terminal" width="560" height="266"&gt;&lt;/a&gt;&lt;br&gt;
Most of us use databases every day.&lt;/p&gt;

&lt;p&gt;But at some point I started wondering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How does SQL actually work internally?&lt;/li&gt;
&lt;li&gt;How are queries parsed?&lt;/li&gt;
&lt;li&gt;How do joins work?&lt;/li&gt;
&lt;li&gt;What happens after a &lt;code&gt;SELECT&lt;/code&gt; statement?&lt;/li&gt;
&lt;li&gt;How does persistence work under the hood?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So instead of only reading about databases, I decided to build one.&lt;/p&gt;

&lt;p&gt;That project became &lt;strong&gt;Ark&lt;/strong&gt; — a SQL-like relational database engine written entirely from scratch in C++.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why I Built It
&lt;/h2&gt;

&lt;p&gt;I wanted to understand the internals of database systems by implementing the pieces myself instead of relying on existing engines or parser generators.&lt;/p&gt;

&lt;p&gt;The goal wasn’t to compete with production databases.&lt;/p&gt;

&lt;p&gt;The goal was to learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;parsing&lt;/li&gt;
&lt;li&gt;query execution&lt;/li&gt;
&lt;li&gt;relational operations&lt;/li&gt;
&lt;li&gt;schema management&lt;/li&gt;
&lt;li&gt;persistence systems&lt;/li&gt;
&lt;li&gt;software architecture&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Core Features
&lt;/h2&gt;

&lt;p&gt;Ark currently supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Handwritten tokenizer&lt;/li&gt;
&lt;li&gt;Recursive descent parser&lt;/li&gt;
&lt;li&gt;CRUD operations&lt;/li&gt;
&lt;li&gt;INNER / LEFT / RIGHT / FULL joins&lt;/li&gt;
&lt;li&gt;Aggregate functions (&lt;code&gt;COUNT&lt;/code&gt;, &lt;code&gt;SUM&lt;/code&gt;, &lt;code&gt;AVG&lt;/code&gt;, &lt;code&gt;MIN&lt;/code&gt;, &lt;code&gt;MAX&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ALTER TABLE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LIKE&lt;/code&gt; pattern matching&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ORDER BY&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DISTINCT&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;File persistence (&lt;code&gt;SAVE&lt;/code&gt; / &lt;code&gt;LOAD&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Three-tier diagnostics system with exact line/column reporting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is implemented manually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no external database libraries&lt;/li&gt;
&lt;li&gt;no parser generators&lt;/li&gt;
&lt;li&gt;no embedded SQL engines&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The execution pipeline looks roughly like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query
  ↓
Tokenizer
  ↓
Parser
  ↓
Command Objects
  ↓
Execution Engine
  ↓
Storage Layer
  ↓
Persistence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The project is split into modular components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tokenizer&lt;/li&gt;
&lt;li&gt;parser&lt;/li&gt;
&lt;li&gt;execution engine&lt;/li&gt;
&lt;li&gt;diagnostics&lt;/li&gt;
&lt;li&gt;storage/persistence&lt;/li&gt;
&lt;/ul&gt;
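
&lt;p&gt;To make the pipeline concrete, here is a minimal C++ sketch of the same stages for a single statement shape, &lt;code&gt;SELECT * FROM table_name;&lt;/code&gt;. It is not Ark’s code: &lt;code&gt;Token&lt;/code&gt;, &lt;code&gt;SelectAll&lt;/code&gt;, &lt;code&gt;Storage&lt;/code&gt;, and every function name are hypothetical, keywords are matched case-sensitively, and persistence is left out.&lt;/p&gt;

<![CDATA[
```cpp
// Illustrative only: a toy pipeline for one statement shape,
//   SELECT * FROM table_name;
// All type and function names are hypothetical, not Ark's API.
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Token {
    enum class Kind { Word, Star, Semicolon } kind;
    std::string text;
};

// Tokenizer: split the raw query into words and punctuation.
std::vector<Token> tokenize(const std::string& query) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < query.size()) {
        const unsigned char c = static_cast<unsigned char>(query[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (c == '*') {
            tokens.push_back({Token::Kind::Star, "*"});
            ++i;
        } else if (c == ';') {
            tokens.push_back({Token::Kind::Semicolon, ";"});
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            const std::size_t start = i;
            while (i < query.size() &&
                   (std::isalnum(static_cast<unsigned char>(query[i])) ||
                    query[i] == '_')) {
                ++i;
            }
            tokens.push_back({Token::Kind::Word, query.substr(start, i - start)});
        } else {
            throw std::runtime_error("unexpected character at column " +
                                     std::to_string(i + 1));
        }
    }
    return tokens;
}

// Command object produced by the parser.
struct SelectAll {
    std::string table;
};

// Parser: validate the token sequence and build the command object.
SelectAll parse(const std::vector<Token>& tokens) {
    const bool ok =
        tokens.size() == 5 &&
        tokens[0].kind == Token::Kind::Word && tokens[0].text == "SELECT" &&
        tokens[1].kind == Token::Kind::Star &&
        tokens[2].kind == Token::Kind::Word && tokens[2].text == "FROM" &&
        tokens[3].kind == Token::Kind::Word &&
        tokens[4].kind == Token::Kind::Semicolon;
    if (!ok) throw std::runtime_error("expected: SELECT * FROM table_name;");
    return SelectAll{tokens[3].text};
}

// Storage layer: table name mapped to rows of string cells (an assumption).
using Row = std::vector<std::string>;
using Storage = std::map<std::string, std::vector<Row>>;

// Execution engine: run the command against the storage layer.
std::vector<Row> execute(const SelectAll& cmd, const Storage& storage) {
    const auto it = storage.find(cmd.table);
    if (it == storage.end()) {
        throw std::runtime_error("unknown table: " + cmd.table);
    }
    return it->second;
}

// The whole pipeline: query text in, result rows out.
std::vector<Row> run(const std::string& query, const Storage& storage) {
    return execute(parse(tokenize(query)), storage);
}
```
]]>

&lt;p&gt;Calling &lt;code&gt;run("SELECT * FROM employees;", storage)&lt;/code&gt; on a &lt;code&gt;Storage&lt;/code&gt; that holds an &lt;code&gt;employees&lt;/code&gt; table returns its rows, while a malformed query or an unknown table throws &lt;code&gt;std::runtime_error&lt;/code&gt;. Keeping each stage a plain function means each one can be tested on its own.&lt;/p&gt;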




&lt;h2&gt;
  
  
  Example Query
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;STRING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="nb"&gt;DOUBLE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"Alice"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;95000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"Bob"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;72000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;80000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  One of the Hardest Parts
&lt;/h2&gt;

&lt;p&gt;One of the most interesting challenges was implementing joins and schema evolution.&lt;/p&gt;

&lt;p&gt;Handling each of the following turned out to be much more complicated than I initially expected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ALTER TABLE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;adding/dropping columns&lt;/li&gt;
&lt;li&gt;persistence consistency&lt;/li&gt;
&lt;li&gt;join execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Parser correctness and diagnostics also took a surprising amount of effort.&lt;/p&gt;
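
&lt;p&gt;As a rough illustration of where join execution gets fiddly, here is a minimal nested-loop &lt;code&gt;LEFT JOIN&lt;/code&gt; sketch in C++. It is not taken from Ark: the &lt;code&gt;Cell&lt;/code&gt; and &lt;code&gt;Row&lt;/code&gt; types, the &lt;code&gt;leftJoin&lt;/code&gt; function, and the choice to model NULL as &lt;code&gt;std::nullopt&lt;/code&gt; are all assumptions made for the example.&lt;/p&gt;

<![CDATA[
```cpp
// Illustrative nested-loop LEFT JOIN (requires C++17).
// Names and row layout are assumptions, not Ark's internals.
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// A cell is either a value or NULL (std::nullopt).
using Cell = std::optional<std::string>;
using Row = std::vector<Cell>;

// Joins rows where left[leftKey] equals right[rightKey].
// rightWidth is the column count of the right table; it is passed in so
// unmatched left rows can be padded even when the right table is empty.
std::vector<Row> leftJoin(const std::vector<Row>& left, std::size_t leftKey,
                          const std::vector<Row>& right, std::size_t rightKey,
                          std::size_t rightWidth) {
    std::vector<Row> out;
    for (const Row& l : left) {
        bool matched = false;
        for (const Row& r : right) {
            // SQL semantics: NULL never compares equal, not even to NULL.
            if (l[leftKey].has_value() && l[leftKey] == r[rightKey]) {
                Row joined = l;
                joined.insert(joined.end(), r.begin(), r.end());
                out.push_back(std::move(joined));
                matched = true;
            }
        }
        if (!matched) {
            // Keep the left row and fill the right-hand columns with NULL.
            Row padded = l;
            padded.resize(padded.size() + rightWidth);
            out.push_back(std::move(padded));
        }
    }
    return out;
}
```
]]>

&lt;p&gt;Even in this small version, two details are easy to get wrong: NULL keys must never match, and unmatched rows need padding to the full output width. &lt;code&gt;RIGHT&lt;/code&gt; and &lt;code&gt;FULL&lt;/code&gt; joins add further bookkeeping for unmatched right-hand rows.&lt;/p&gt;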




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building Ark taught me a lot about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how parsers actually work&lt;/li&gt;
&lt;li&gt;query execution pipelines&lt;/li&gt;
&lt;li&gt;relational database concepts&lt;/li&gt;
&lt;li&gt;software architecture&lt;/li&gt;
&lt;li&gt;debugging complex state systems&lt;/li&gt;
&lt;li&gt;designing diagnostics/error reporting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also gave me a much deeper appreciation for real database engines.&lt;/p&gt;




&lt;h2&gt;
  
  
  GitHub
&lt;/h2&gt;

&lt;p&gt;GitHub Repository:&lt;br&gt;
&lt;a href="https://github.com/kashyap-devansh/Ark" rel="noopener noreferrer"&gt;https://github.com/kashyap-devansh/Ark&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’d genuinely appreciate feedback from people interested in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;databases&lt;/li&gt;
&lt;li&gt;systems programming&lt;/li&gt;
&lt;li&gt;parsers&lt;/li&gt;
&lt;li&gt;compilers&lt;/li&gt;
&lt;li&gt;C++&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Suggestions for improving the architecture or query engine are especially welcome.&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>database</category>
      <category>sql</category>
      <category>systems</category>
    </item>
  </channel>
</rss>
