<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: LarmkaartDev</title>
    <description>The latest articles on DEV Community by LarmkaartDev (@larmkaart).</description>
    <link>https://dev.to/larmkaart</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1117465%2F96d93bce-c1f9-4107-a9da-442e7458c09b.jpg</url>
      <title>DEV Community: LarmkaartDev</title>
      <link>https://dev.to/larmkaart</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/larmkaart"/>
    <language>en</language>
    <item>
      <title>The parser (Part 1: Equations)</title>
      <dc:creator>LarmkaartDev</dc:creator>
      <pubDate>Tue, 11 Jul 2023 16:33:56 +0000</pubDate>
      <link>https://dev.to/larmkaart/creating-a-basic-compiler-the-parser-1ko0</link>
      <guid>https://dev.to/larmkaart/creating-a-basic-compiler-the-parser-1ko0</guid>
      <description>&lt;p&gt;The parser is a pretty complex component of the compiler. It will be responsible for converting the tokens into a format the Semantic analyzer can read and execute.&lt;/p&gt;

&lt;p&gt;The parser will 'parse' all of the tokens the lexer generated. Parsing means that the parser processes the tokens and generates an Abstract Syntax Tree (AST) from them, as described previously.&lt;/p&gt;

&lt;p&gt;Let's see how the parser does this.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Equations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; In the following paragraphs I'll be simplifying all tokens to just their Value properties, so if I write &lt;code&gt;1 + 2&lt;/code&gt; I actually mean &lt;code&gt;{"Value":"1","Type":"number"},{"Value":"+","Type":"operator"},{"Value":"2","Type":"number"}&lt;/code&gt;.&lt;br&gt;
This is just to make the examples easier to follow.&lt;/p&gt;

&lt;p&gt;Equations are pretty difficult to parse because you have to take the order of operations into account. Instead of normal mathematical notation, the parser will make use of a clever notation called &lt;a href="https://en.wikipedia.org/wiki/Reverse_Polish_notation"&gt;Postfix notation&lt;/a&gt; (click for more details). Postfix notation is a type of notation that gets rid of our order of operations problem. Instead of putting the operator between the terms, postfix puts it behind them.&lt;/p&gt;

&lt;p&gt;For example, postfix notation will write&lt;/p&gt;

&lt;p&gt;&lt;code&gt;1 + 2&lt;/code&gt; as &lt;code&gt;1 2 +&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;1 + 2 * 3&lt;/code&gt; as &lt;code&gt;1 2 3 * +&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;(1 + 2) * 3&lt;/code&gt; as &lt;code&gt;1 2 + 3 *&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This notation might seem very confusing, but it allows computers to calculate mathematical equations much more easily: the terms and operators can be processed from left to right, without parentheses or precedence rules.&lt;/p&gt;
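
&lt;p&gt;To see why postfix is easier for a computer, here is a minimal sketch of a stack-based evaluator in plain Lua (it should also run in Luau). The function name &lt;code&gt;evaluatePostfix&lt;/code&gt; and the space-separated string input are assumptions made for this illustration; they are not part of the compiler described in this series.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;-- Hypothetical helper: evaluates a space-separated postfix string.
local function evaluatePostfix(text)
    local stack = {}
    for word in string.gmatch(text, "%S+") do
        local number = tonumber(word)
        if number then
            table.insert(stack, number) -- operands wait on the stack
        else
            local right = table.remove(stack) -- most recent operand
            local left = table.remove(stack)
            if word == "+" then
                table.insert(stack, left + right)
            elseif word == "-" then
                table.insert(stack, left - right)
            elseif word == "*" then
                table.insert(stack, left * right)
            elseif word == "/" then
                table.insert(stack, left / right)
            else
                error("Unknown operator " .. word)
            end
        end
    end
    return stack[1]
end

print(evaluatePostfix("1 2 3 * +")) -- 1 + 2 * 3
print(evaluatePostfix("1 2 + 3 *")) -- (1 + 2) * 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The two calls print 7 and 9. Every operator simply uses the two values that were pushed most recently, so no precedence rules are needed.&lt;/p&gt;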

&lt;p&gt;To convert our standard notation into the postfix notation I will make use of the 'Shunting yard algorithm'. I won't go into details of implementation, but if you want to know more details and how to implement the algorithm, you should check out the &lt;a href="https://en.wikipedia.org/wiki/Shunting_yard_algorithm"&gt;Wikipedia page&lt;/a&gt;.&lt;/p&gt;
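
&lt;p&gt;For readers who want something concrete anyway, the following is a minimal sketch of the shunting yard algorithm in plain Lua. It is not the compiler's actual implementation: it only handles numbers, the four left-associative operators and parentheses, and it works on plain strings instead of token tables. The names &lt;code&gt;toPostfix&lt;/code&gt; and &lt;code&gt;precedence&lt;/code&gt; are made up for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;local precedence = { ["+"] = 1, ["-"] = 1, ["*"] = 2, ["/"] = 2 }

-- Hypothetical helper: takes a list of strings such as {"1", "+", "2"}
-- and returns the same equation as a list in postfix order.
local function toPostfix(tokens)
    local output, stack = {}, {}
    for _, token in ipairs(tokens) do
        if tonumber(token) then
            table.insert(output, token) -- numbers go straight to the output
        elseif precedence[token] then
            -- pop operators that bind at least as tightly as the new one
            local top = stack[#stack]
            while top and precedence[top] and precedence[top] &amp;gt;= precedence[token] do
                table.insert(output, table.remove(stack))
                top = stack[#stack]
            end
            table.insert(stack, token)
        elseif token == "(" then
            table.insert(stack, token)
        elseif token == ")" then
            -- pop until the matching opening parenthesis
            while #stack ~= 0 and stack[#stack] ~= "(" do
                table.insert(output, table.remove(stack))
            end
            if #stack == 0 then
                error("Mismatched parentheses")
            end
            table.remove(stack) -- discard the "("
        else
            error("Unexpected token " .. token)
        end
    end
    -- move the remaining operators to the output
    while #stack ~= 0 do
        local operator = table.remove(stack)
        if operator == "(" then
            error("Mismatched parentheses")
        end
        table.insert(output, operator)
    end
    return output
end

print(table.concat(toPostfix({ "1", "+", "2", "*", "3" }), " "))
print(table.concat(toPostfix({ "(", "1", "+", "2", ")", "*", "3" }), " "))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This prints &lt;code&gt;1 2 3 * +&lt;/code&gt; and &lt;code&gt;1 2 + 3 *&lt;/code&gt;, matching the examples above.&lt;/p&gt;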

&lt;p&gt;Now that we have the equation in postfix notation, we need to convert it into an AST so the Semantic analyzer will be able to process and solve the equation. For this I'll be using the following piece of code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generatePostfix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;parsePostfix&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="c1"&gt;-- current location in queue&lt;/span&gt;
    &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;findNextOperator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;-- find next operator&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt; &lt;span class="o"&gt;~=&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="c1"&gt;-- while there still exists an operator&lt;/span&gt;
        &lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="c1"&gt;-- Look at the previous token&lt;/span&gt;
        &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;rhs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;peekQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- right hand side&lt;/span&gt;
        &lt;span class="n"&gt;removeQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- remove the processed token&lt;/span&gt;
        &lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="c1"&gt;-- look at the previous token&lt;/span&gt;
        &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;lhs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;peekQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- left hand side&lt;/span&gt;
        &lt;span class="n"&gt;removeQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- remove the processed token&lt;/span&gt;

        &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;equation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"equation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Left&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lhs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Right&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rhs&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rhs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"number"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;lhs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"number"&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="c1"&gt;-- the equation only has numbers as terms?&lt;/span&gt;
            &lt;span class="n"&gt;insertQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;solveEquation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;equation&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;-- Insert the solved equation in the queue&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;insertQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentPostfixIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;equation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- Insert the unsolved equation in the queue&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
        &lt;span class="c1"&gt;-- This is a quick optimisation so the Semantic analyzer doesn't have to calculate as many equations&lt;/span&gt;

        &lt;span class="n"&gt;operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;findNextOperator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;-- find the next operator&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code might seem a bit overwhelming at first, so let's go through it step by step while parsing the equation&lt;br&gt;
&lt;code&gt;(1 + 2) * (3 + 4)&lt;/code&gt;&lt;br&gt;
The equation in postfix notation will be&lt;br&gt;
&lt;code&gt;1 2 + 3 4 + *&lt;/code&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Look for the next operator. In our case this will be &lt;code&gt;+&lt;/code&gt; at index 3.&lt;/li&gt;
&lt;li&gt;Put the previous token &lt;code&gt;2&lt;/code&gt; in the right side of the equation.&lt;/li&gt;
&lt;li&gt;Put the token before that, &lt;code&gt;1&lt;/code&gt;, in the left side of the equation.&lt;/li&gt;
&lt;li&gt;Check if the equation is already solvable. If so, put the result of the equation into the queue. Otherwise put the unsolved equation in the queue. The queue should now look like &lt;code&gt;3 3 4 + *&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Repeat steps 1-4 until no more operators are left. The entire equation should now be solved and the value in the queue should be &lt;code&gt;21&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;All that is left is to return the queue so the equation AST can be processed further.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's look a bit closer at step 5. After the first iteration the queue should be&lt;br&gt;
&lt;code&gt;3 3 4 + *&lt;/code&gt;&lt;br&gt;
Now we iterate through steps 1-4 again&lt;br&gt;
&lt;code&gt;3 7 *&lt;/code&gt;&lt;br&gt;
and one more time&lt;br&gt;
&lt;code&gt;21&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Now we have successfully solved the equation using the postfix notation!&lt;/p&gt;
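
&lt;p&gt;The same idea can also be written with a stack instead of removing and inserting items in the queue. The sketch below is a self-contained variation in plain Lua, not the code used above: &lt;code&gt;buildTree&lt;/code&gt;, &lt;code&gt;solve&lt;/code&gt;, &lt;code&gt;number&lt;/code&gt; and &lt;code&gt;operator&lt;/code&gt; are names invented for this example. It keeps the same optimisation of solving an equation right away when both sides are numbers.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;-- Hypothetical helper: solves an equation whose sides are both number tokens.
local function solve(equation)
    local left = tonumber(equation.Left.Value)
    local right = tonumber(equation.Right.Value)
    local result
    if equation.Operator == "+" then
        result = left + right
    elseif equation.Operator == "-" then
        result = left - right
    elseif equation.Operator == "*" then
        result = left * right
    else
        result = left / right
    end
    return { Type = "number", Value = tostring(result) }
end

-- Takes a list of tokens in postfix order and returns the root of the AST.
local function buildTree(postfix)
    local stack = {}
    for _, token in ipairs(postfix) do
        if token.Type == "operator" then
            local rhs = table.remove(stack) -- right hand side was pushed last
            local lhs = table.remove(stack)
            local equation = { Type = "equation", Operator = token.Value, Left = lhs, Right = rhs }
            if lhs.Type == "number" and rhs.Type == "number" then
                table.insert(stack, solve(equation)) -- solve it right away
            else
                table.insert(stack, equation) -- keep it for later
            end
        else
            table.insert(stack, token)
        end
    end
    return stack[1]
end

local function number(value) return { Type = "number", Value = value } end
local function operator(value) return { Type = "operator", Value = value } end

-- (1 + 2) * (3 + 4) in postfix order
local root = buildTree({
    number("1"), number("2"), operator("+"),
    number("3"), number("4"), operator("+"),
    operator("*"),
})
print(root.Type, root.Value)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For this input the result is a single number token with the value &lt;code&gt;21&lt;/code&gt;. If one of the terms were a variable instead, the equation could not be solved yet and would stay in the tree as an &lt;code&gt;equation&lt;/code&gt; table.&lt;/p&gt;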

</description>
    </item>
    <item>
      <title>The lexer</title>
      <dc:creator>LarmkaartDev</dc:creator>
      <pubDate>Tue, 11 Jul 2023 16:32:34 +0000</pubDate>
      <link>https://dev.to/larmkaart/creating-a-basic-compiler-the-lexer-2bej</link>
      <guid>https://dev.to/larmkaart/creating-a-basic-compiler-the-lexer-2bej</guid>
      <description>&lt;p&gt;The lexer is a surprisingly simple component of a compiler. It mainly consists of a bunch of if-statements. The lexer will break up the code into separate lines and analyze them one by one.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Tokens
&lt;/h3&gt;

&lt;p&gt;But before looking at the lexer, let's take a closer look at tokens. A token is a very simple table that consists of a Type and a Value:&lt;br&gt;
&lt;code&gt;{ Type = TYPE_NAME, Value = TOKEN_VALUE }&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Type&lt;/code&gt; property describes the type of token like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a statement&lt;/li&gt;
&lt;li&gt;a number&lt;/li&gt;
&lt;li&gt;a variable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;Value&lt;/code&gt; property can have different meanings depending on the type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the statement type&lt;/li&gt;
&lt;li&gt;the number value&lt;/li&gt;
&lt;li&gt;the variable name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A token can have even more properties depending on the type.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. The lexer
&lt;/h3&gt;

&lt;p&gt;Let's see how the lexer will break down this example line into tokens:&lt;br&gt;
&lt;code&gt;var y = x + 2&lt;/code&gt;&lt;br&gt;
I will use the &lt;code&gt;string&lt;/code&gt; variable in the code examples to represent the part we are currently looking at.&lt;/p&gt;

&lt;p&gt;The first part is &lt;code&gt;var&lt;/code&gt;. We will insert a variable declaration token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"var"&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;pushTokens&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"statement"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"var"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="c1"&gt;-- puts the token at the end of the tokens list&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we have created our first token using the lexer!&lt;/p&gt;

&lt;p&gt;Next up is &lt;code&gt;y&lt;/code&gt;. This is a variable, but the lexer doesn't know this. Luckily, it is able to look at the previous token and see that it's a declaration, so it will add the &lt;code&gt;y&lt;/code&gt; variable to the variable list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prevToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"statement"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;prevToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"var"&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="c1"&gt;-- the previous token declares a variable&lt;/span&gt;
    &lt;span class="n"&gt;pushTokens&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"variable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="c1"&gt;-- put new token in the token list&lt;/span&gt;
    &lt;span class="nb"&gt;table.insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;localVariables&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- put new variable in the variable list&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we have &lt;code&gt;=&lt;/code&gt;. This is a simple case of adding a new token with Type &lt;code&gt;assigner&lt;/code&gt; and value &lt;code&gt;=&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="n"&gt;pushTokens&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"assigner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"="&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the variable &lt;code&gt;x&lt;/code&gt;. We will say we defined &lt;code&gt;x&lt;/code&gt; somewhere earlier in the code, so the lexer already knows it's a variable. If it wasn't defined and the previous token is not a variable declaration token, then the lexer should throw an error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;localVariables&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="c1"&gt;-- Does a variable with this name exist?&lt;/span&gt;
    &lt;span class="n"&gt;pushTokens&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"variable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="c1"&gt;-- Let's add it!&lt;/span&gt;
&lt;span class="k"&gt;elseif&lt;/span&gt; &lt;span class="n"&gt;prevToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"statement"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;prevToken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"var"&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;pushTokens&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"variable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="nb"&gt;table.insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;localVariables&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="nb"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Unknown variable "&lt;/span&gt; &lt;span class="o"&gt;..&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All that's left are &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt;. These will be converted into these simple tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"+"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we have fully generated all of our tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"statement"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"var"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"variable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"y"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"assigner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"="&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"variable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"+"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
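
&lt;p&gt;To tie these steps together, here is a minimal, self-contained sketch of a lexer for this single line, written in plain Lua. It assumes that every part of the line is separated by spaces and that &lt;code&gt;x&lt;/code&gt; was declared earlier. The &lt;code&gt;tokenize&lt;/code&gt; function and its arguments are made up for this example, and it leaves out everything a real lexer needs beyond these few cases.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;-- Hypothetical helper: splits a line on spaces and turns each part into a token.
-- localVariables is used as a set: localVariables[name] == true
local function tokenize(line, localVariables)
    local tokens = {}
    for word in string.gmatch(line, "%S+") do
        local prevToken = tokens[#tokens]
        local declaring = prevToken ~= nil
            and prevToken.Type == "statement"
            and prevToken.Value == "var"

        if word == "var" then
            table.insert(tokens, { Type = "statement", Value = "var" })
        elseif word == "=" then
            table.insert(tokens, { Type = "assigner", Value = "=" })
        elseif word == "+" or word == "-" or word == "*" or word == "/" then
            table.insert(tokens, { Type = "operator", Value = word })
        elseif tonumber(word) then
            table.insert(tokens, { Type = "number", Value = word })
        elseif declaring then
            localVariables[word] = true -- remember the new variable
            table.insert(tokens, { Type = "variable", Value = word })
        elseif localVariables[word] then
            table.insert(tokens, { Type = "variable", Value = word })
        else
            error("Unknown variable " .. word)
        end
    end
    return tokens
end

local tokens = tokenize("var y = x + 2", { x = true }) -- x was declared earlier
for _, token in ipairs(tokens) do
    print(token.Type, token.Value)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This prints the same six tokens as the list above. The sketch stores variable names as keys of a table instead of searching a list with &lt;code&gt;table.find&lt;/code&gt;, only so that it also runs outside of Roblox.&lt;/p&gt;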



&lt;p&gt;This is a basic overview of the lexer. The more features you add to your language, the more complex the lexer will become, so be sure to keep your code nice and tidy! &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Creating a basic compiler - introduction</title>
      <dc:creator>LarmkaartDev</dc:creator>
      <pubDate>Tue, 11 Jul 2023 15:10:49 +0000</pubDate>
      <link>https://dev.to/larmkaart/wipcreating-a-basic-compiler-e4h</link>
      <guid>https://dev.to/larmkaart/wipcreating-a-basic-compiler-e4h</guid>
      <description>&lt;p&gt;For the past week I've been working on a game with a little compiler in Roblox Studio. I want to create a tool for beginners to help them get into coding. The compiler will make use of a high-level, Lua-like language. The goal of the game is to steer a little robot and make it do various tasks like navigating through a maze.&lt;/p&gt;

&lt;p&gt;I've chosen Roblox Studio because of its user-friendly nature. It will be easy for beginners to just start up the game and start coding! Roblox also handles most of the user interface by itself, so I will be able to focus more on coding the actual logic. Another reason is that I already have a bunch of experience with Roblox Studio.&lt;/p&gt;

&lt;p&gt;In this blog I want to focus on the journey of me creating my first ever compiler and explain my thought process. But before we start I want to make a few things clear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most of the code shown in this blog will be a simplification from the real code to make it easier to understand&lt;/li&gt;
&lt;li&gt;Most of the code shown will be taken out of context&lt;/li&gt;
&lt;li&gt;All code will be shown in either Luau, Roblox's main coding language, or my own language. Additional comments will be provided whenever necessary&lt;/li&gt;
&lt;li&gt;This compiler is made by me, so it might differ from official methods&lt;/li&gt;
&lt;li&gt;This series is still &lt;strong&gt;WIP&lt;/strong&gt;. I am writing this while still working on the compiler, so some things are bound to change. Feedback is greatly appreciated!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that that's out of the way, let's start!&lt;/p&gt;

&lt;h2&gt;
  
  
  Chapter 1: Main overview
&lt;/h2&gt;

&lt;p&gt;You can make a compiler in a few different ways. For example, an assembler converts assembly language into machine language, which the computer can then execute directly. In my approach I will be making use of an Abstract Syntax Tree (AST for short). I will explain more about this in a later chapter.&lt;/p&gt;

&lt;p&gt;Now let's see a quick overview of my compiler!&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Lexer
&lt;/h3&gt;

&lt;p&gt;The lexer, also known as a tokeniser, will convert the raw code input into tokens. Just like a sentence has words, each with a different function, each word in a line of code will have its own function. &lt;/p&gt;

&lt;p&gt;Let's take the following line as example:&lt;br&gt;
&lt;code&gt;var y = x + 2&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Let's look at the functions of each part:&lt;br&gt;
&lt;code&gt;var&lt;/code&gt; is used to create a variable named y. &lt;br&gt;
&lt;code&gt;=&lt;/code&gt; is used to assign a value to y. &lt;br&gt;
&lt;code&gt;+&lt;/code&gt; is used to add two values together. &lt;br&gt;
&lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; are variables and &lt;code&gt;2&lt;/code&gt; is a number.&lt;/p&gt;

&lt;p&gt;The lexer will go through all of the code like this and generate tokens which define the function of all parts of the code. The parser will be able to process these tokens further.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. The Parser
&lt;/h3&gt;

&lt;p&gt;The parser will take the previously described tokens and create an Abstract Syntax Tree (AST). An AST is an abstract representation of code. An AST representation of the previous example might look like this:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T9ebi3iZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wk1fbf90w3row0i5lg9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T9ebi3iZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wk1fbf90w3row0i5lg9n.png" alt="(forgive my bad drawing skills)" width="728" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using Semantic analysis the computer will be able to run the AST.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. The Semantic analyzer
&lt;/h3&gt;

&lt;p&gt;** WIP **&lt;/p&gt;
&lt;h2&gt;
  
  
  Chapter 2: Before starting
&lt;/h2&gt;

&lt;p&gt;Before you start actually writing a compiler, there are a few things you should take care of. &lt;/p&gt;
&lt;h3&gt;
  
  
  1. Create a language description
&lt;/h3&gt;

&lt;p&gt;Creating a description of the grammar of the language in a text file is an important thing to do before writing the compiler. This will provide a structure to your language and make coding it easier. Each entry should consist of a name, a description and a token representation. I will provide you with a part of my own description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--------------------------] Types
number - any type of number                     { Type = number, Value = NUMBER_VALUE }
string - a string of text                       { Type = string, Value = STRING_VALUE }
variable -  an object with any type of value    { Type = variable, Value = VARIABLE_NAME }
equation - an equation                          { Type = equation, Operator = OPERATOR, Left = LHS, Right = RHS }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
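
&lt;p&gt;As an illustration, the expression &lt;code&gt;x + 2&lt;/code&gt; from the earlier example could be stored using the &lt;code&gt;equation&lt;/code&gt;, &lt;code&gt;variable&lt;/code&gt; and &lt;code&gt;number&lt;/code&gt; entries above. This is a sketch in Lua based only on the part of the description shown here. Storing the number as the string &lt;code&gt;"2"&lt;/code&gt; and the name &lt;code&gt;ast&lt;/code&gt; are assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;-- x + 2 as a nested table that follows the description above
local ast = {
    Type = "equation",
    Operator = "+",
    Left = { Type = "variable", Value = "x" },
    Right = { Type = "number", Value = "2" },
}

print(ast.Left.Value, ast.Operator, ast.Right.Value)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;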



&lt;h3&gt;
  
  
  2. Create a comfortable work environment
&lt;/h3&gt;

&lt;p&gt;Before creating the compiler you should make yourself a comfortable work environment. Create a basic code editor and console. This will make debugging and working with the compiler much easier.&lt;/p&gt;

&lt;p&gt;Now you should be ready to start working!&lt;/p&gt;

</description>
      <category>luau</category>
      <category>compiler</category>
      <category>learning</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
