Dhanushka madushan

Posted on Mar 26, 2019 • Edited on Apr 3, 2019

How Database Internally Works

#database #architecture #sqlite

In a computer program, the database is the heart of handling data. It provides an efficient way of storing and retrieving data fast. Learning Database internal architecture in code level is not that so easy since Database has complex architecture. Looking into a complex database is not that easy because of the complexity. But, SQLite is a nice little compact database that used in millions of devices and operating systems. Interesting fact about the SQLite database is that, code is highly readable. All most all of the codes are well commented and architecture is highly layered. This post is to look at the high-level architecture of the SQLite database.
As I early mentioned, SQLite has a highly layered architecture and that can be separated into seven layers. We will go through each of the layers one by one.

TOKENIZER

First Layer is called Tokenizer. Purpose of the tokenizer is to create tokens for a given SQL query. Tokenizer scans SQL query from left to right and generates a list of tokens. Following are some of the types of tokens used in SQL tokenizer.
Operator tokens Operator token used to tokenized arithmetic operators in SQL queries. Here are some of the example set of Operator tokens.
“-” as token MINUS
“(” as token LP
“+” as token PLUS
Keyword tokens Keyword token used to tokenized the keywords used in SQL queries. ADD, SELECT, ROW, INSERT are examples for Keyword tokens.
Whitespace tokens This token used to separate other tokens. Here comments are also identified as white space tokens

PARSER

Paser used to generate a parse tree by reading a stream of tokens. SQLite uses Lemon parser to parse SQL queries. Lemon parser allows SQLite C compiler to generate relevant C code from given sets of language rules. Language rules are defined by using Context Free Grammer(CFG). Example parser tree for the following SQL statement as follows.

SELECT title FROM StartsIn, MovieStar WHERE starName=”Jack” AND birthday like “%1990”

Here below an example, CFG rules to generate a program to evaluate a simple mathematical expression.

   1  program ::= expr(A).   { std::cout << "Result=" << A << std::endl; }
   2
   3  expr(A) ::= expr(B) MINUS  expr(C).   { A = B - C; }
   4  expr(A) ::= expr(B) PLUS  expr(C).   { A = B + C; }
   5  expr(A) ::= expr(B) TIMES  expr(C).   { A = B * C; }
   6  expr(A) ::= expr(B) DIVIDE expr(C).  {

   7           if(C != 0){
   8             A = B / C;
   9            }else{
   10             std::cout << "divide by zero" << std::endl;
   11             }
   12  }  /* end of DIVIDE */

  13  expr(A) ::= INTEGER(B). { A = B; }

Here MINUS, PLUS, TIMES and DIVIDE are the operator token generated by the tokenizer. In this code from line number 3 to 6 defined the CFG for mathematical operation. Lemon parser takes the precedence of these to operate as same as the given order. Think the given expression to evaluate is “3+5*6/3”. Tokenized output will be

3 PLUSE 5 TIMES 6 DIVIDE 3

According to the precedence, Lemon parser evaluate line #6 first and evaluate 6/3. Then it evaluates TIMES operation in line #5 and the result would be 10. Finally, it evaluates line #4 and generates the final result as 13. This result will be printed in line #1 which has the least precedence.
All of the SQL rules are defined in the same kind of structure as I described in the previous example.

CODE GENERATOR AND VIRTUAL MACHINE

While SQL query parsing, SQLite generate a sequence of instruction that needs to runs in order to generate the result. Each of these instructions has some operation to perform and inputs. It same as calling to a function with arguments. These generated codes are running on top of the Virtual machine. The virtual machine read each of this instruction one by one and evaluate it. To understand what are the available operations and how Virtual Machine works, we need to have a good understanding of Btree.

BTREE

Btree is a data structure used to store data as a tree based on its value. The simplest form of the Btree is the Binary tree. Databases use a B-tree data structure to store indexes to improve the performance of the database. Data records are stored in a B+tree structure. If no indexing use, only B+tree used to store the data. A cursor is a special pointer which used to point a record( or row) which given with page id and offset. You can travel through all data records by using the cursor. To find the data in given Btree which used indexing, SQLite start move cursor to the root node first element and perform a B-tree search. If there are no Btree indexing available, SQLite searches the matching record in the B+tree by using linear search. You can find out more details about Btree in this article.
To search for data on a B+tree, the sequence of operation that generated by Code Generator as follows. You can check what was the sequence of instruction that needs to run for a particular SQL query by adding “explain” keyword at the beginning of the query.

explain select * from foo where value = "value";

0 |ColumnCount   |1 |0 |
1 |ColumnName    |0 |0 |value
2 |Open          |0 |3 |foo     // Open cursor and point it to the third page which is root page for foo table(p3 is not important )
3 |VerifyCookie  |46|0 |       // Make sure schema not changed
4 |Rewind        |0 |11|       // Go to first entry
5 |Column        |0 |0 |       // read data and push it on stack
6 |Column        |0 |0 |
7 |Ne            |1 |10|       //Pop the top two elements from the stack.  If they are not equal, then jump to instruction P2.  Otherwise, continue to the next instruction.
8 |Column        |0 |0 |
9 |Callback      |1 |0 |       // Pop P1 values off the stack and form them into an array

10|Next          |0 |5 |       // Move cursor to next record, if data exit goto P2 else go to next line
11|Close         |0 |0 |       // Close the cursor

PAGER

Pages are the smallest unit of a transaction on the file system. Btree and B+tree stored on the page. When the database needs to read data from a file, it requests it as a page. Once the page loaded into the database engine it can store page if it accesses frequently on its cache. Pages are numbered and start from one. The first page is called the root page. Size of a page is constant.

VFS (VIRTUAL FILE SYSTEM)

File access on Unix and Windows are different from each other. VFS provides common API to access files without considering the type of the operating system it runs on. This API includes functions to open, read, write and close a file. Here follows are some of the API used in VFS to read, write data into a file. You can find out more about Pager and VFS in this blog post.

CONCLUSION

In this post, I tried to explain the SQLite database implementation at a high level. If you are interested to know more about the internal architecture of SQLite, I like to share my knowledge with you. So, comment down if there is any clarification/ improvement that needs to add. Also here my backend implementation that used to understand the SQLite architecture. In this repository, I tried to separate each of the layers in SQLite and write some test for each of the layers.

DEV Community