<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: bingjia</title>
    <description>The latest articles on DEV Community by bingjia (@ccbhj).</description>
    <link>https://dev.to/ccbhj</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1111332%2Fcfae67df-3959-4823-8a90-d4771dd631a0.jpeg</url>
      <title>DEV Community: bingjia</title>
      <link>https://dev.to/ccbhj</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ccbhj"/>
    <language>en</language>
    <item>
      <title>Part I: Implement an expression interpreter for building DSL - Introduce the PEG parser</title>
      <dc:creator>bingjia</dc:creator>
      <pubDate>Sat, 03 Aug 2024 05:36:51 +0000</pubDate>
      <link>https://dev.to/ccbhj/part-i-implement-an-expression-interpreter-for-building-dsl-introduce-the-peg-parser-omf</link>
      <guid>https://dev.to/ccbhj/part-i-implement-an-expression-interpreter-for-building-dsl-introduce-the-peg-parser-omf</guid>
      <description>&lt;p&gt;In the last post I have introduced you the WHY and HOW I start the this project and WHAT the DSL looks like in the end. Starting from this post I will share the implementation of the whole project with you. &lt;/p&gt;

&lt;p&gt;Generally, when we implement a language, the first thing comes in our mind is lexer and then parser. So in this post I will introduce to you how I implement my DSL with specified detail but less conception so that it won't be too confused I hope.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's lexer?
&lt;/h2&gt;

&lt;p&gt;In general, lexer is used for &lt;a href="https://en.wikipedia.org/wiki/Lexical_analysis" rel="noopener noreferrer"&gt;Lexical Analysis&lt;/a&gt; or tokenization if you will. Let's take the clause &lt;em&gt;"We will rock you!"&lt;/em&gt; (the famous rock and roll music from Queen) as an example. After we tokenize it following the grammar rule of English, it shall output a list &lt;code&gt;[Subject("We"), Auxiliary("will"), Verb("rock"), Object("you"), Punctuation("!")]&lt;/code&gt;. So &lt;strong&gt;a lexer is mostly used for classify texts into certain types by its lexical meaning.&lt;/strong&gt; This is significant for us since a syntax is actually consist of lexical elements instead of characters or words. &lt;/p&gt;

&lt;p&gt;Usually we implement a lexer with some code generator that can parse &lt;a href="https://en.wikipedia.org/wiki/Regular_expression" rel="noopener noreferrer"&gt;Regular Express&lt;/a&gt; like &lt;a href="https://en.wikipedia.org/wiki/Ragel" rel="noopener noreferrer"&gt;Ragel&lt;/a&gt;, &lt;a href="https://github.com/blynn/nex" rel="noopener noreferrer"&gt;nex&lt;/a&gt; and so on. But I think you will be surprised how easy it is to implement a lexer after checking &lt;a href="https://www.youtube.com/watch?v=HxaD_trXwRE" rel="noopener noreferrer"&gt;Lexical Scanning in Go&lt;/a&gt; from Rob Pike. He had introduced an interesting pattern to implement a finite-state machine which I think is the core of an lexer.&lt;/p&gt;

&lt;h2&gt;
  
  
  How about the parser?
&lt;/h2&gt;

&lt;p&gt;So how about a parser? What is it used for? Basically, &lt;strong&gt;a parser is used to identify a list of lexical elements with the specified pattern which we also called it grammar.&lt;/strong&gt; Take the example &lt;em&gt;"We will rock you!"&lt;/em&gt; we introduced before, it produces a sequence of &lt;code&gt;[Subject("We"), Auxiliary("will"), Verb("rock"), Object("you"), Punctuation("!")]&lt;/code&gt; which matches the pattern of 'Future tense' grammar in English. So this is exactly what a parser do and is so called '&lt;em&gt;syntax analysis&lt;/em&gt;'. &lt;/p&gt;

&lt;p&gt;Let's take another example in a more computer way, how about an expression like &lt;code&gt;1 + 2 * 3&lt;/code&gt; ? It is obvious that they will be translated into &lt;code&gt;[Number(1), Operator(+), Number(2), Operator(*), Number(3)]&lt;/code&gt; by the lexer, but what this sequence will be translated into by a parser with a common mathematical expression grammar? In general, a lexical elements sequence will got translated into an &lt;em&gt;Abstract Syntax Tree&lt;/em&gt;(or AST in short) by parser like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;      *
     / \
    +   3
   /  \
  1    2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With an abstract syntax tree we can analyze the syntax in the correct order according to the rule of our grammar. &lt;/p&gt;

&lt;h2&gt;
  
  
  Let's implement a parser
&lt;/h2&gt;

&lt;p&gt;Now we are gonna implement an parser on our own. Well not entirely on our own, we still need some tool to help our to generate code for our parser since it is hard to implement it correctly and a hand-writing parser might be difficult to maintain, even if you do, the performance might be poor.&lt;/p&gt;

&lt;p&gt;Luckily, there are many mature parser generative tools that helps us to generate golang code with a grammar definition file. The first choice came in my mind is &lt;a href="https://pkg.go.dev/golang.org/x/tools/cmd/goyacc" rel="noopener noreferrer"&gt;go-yacc&lt;/a&gt; which is the official tool to generate go code for parser. I used to use it to write a SQL analyzer and the it wasn't a pleasure because it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;requires an external lexer. &lt;/li&gt;
&lt;li&gt;lack of documentation.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.gnu.org/software/bison/manual/html_node/Union-Decl.html" rel="noopener noreferrer"&gt;union definition&lt;/a&gt; and &lt;a href="https://www.gnu.org/software/bison/manual/html_node/Structured-Value-Type.html" rel="noopener noreferrer"&gt;value type declaration&lt;/a&gt; are both quite confusing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"Why not try something new?" I think, so there we go, with the amazing tool &lt;a href="https://github.com/pointlander/peg" rel="noopener noreferrer"&gt;peg&lt;/a&gt; I was able to implement both of the lexer and parser in one single grammar file, one single interface. Let's take a closer look at the peg.&lt;/p&gt;

&lt;h3&gt;
  
  
  A closer look at PEG
&lt;/h3&gt;

&lt;p&gt;PEG stands for &lt;a href="https://en.wikipedia.org/wiki/Parsing_expression_grammar" rel="noopener noreferrer"&gt;Parsing Expression Grammar&lt;/a&gt; introduced by Bryan Ford in 2004, which is an alternative to the traditional &lt;a href="https://en.wikipedia.org/wiki/Context-free_grammar" rel="noopener noreferrer"&gt;Context Free Grammar&lt;/a&gt; used for describing and expressing programming language syntax and protocol.&lt;/p&gt;

&lt;p&gt;For the last decades, we have been using parser generative tool like &lt;a href="https://en.wikipedia.org/wiki/Yacc" rel="noopener noreferrer"&gt;yacc&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/GNU_Bison" rel="noopener noreferrer"&gt;bison&lt;/a&gt; that supports CFG to generate parser code, and if you have used them before you might find it difficult to avoid ambiguity and integrate them with the lexer or the regular expression. In fact, the syntax of a programming language is not just patterns of lexical elements but also the rules of those lexical elements which somehow the CFG is missing so when we use tools like yacc &lt;strong&gt;we will have to implement lexer by our self&lt;/strong&gt;. Further more, to avoid ambiguity(like the precedence between plus and multiply, check &lt;a href="https://www.gnu.org/software/bison/manual/html_node/Precedence.html" rel="noopener noreferrer"&gt;this&lt;/a&gt; out) in CFG &lt;strong&gt;we have to define the precedence for each operator&lt;/strong&gt;. All of these crucial fact makes it unnecessarily difficult to develop a parser.&lt;/p&gt;

&lt;p&gt;But thanks to Bryan Ford, now we have another good choice, the PEG that allows us to define the lexical and syntax rule all in one single file with a tight DSL and resolve ambiguity in an elegant and simple way. Let me show you how easy it can be done with peg. &lt;/p&gt;

&lt;h4&gt;
  
  
  Example and code come first
&lt;/h4&gt;

&lt;p&gt;I gonna take examples from my gendsl library which implements a lisp-like syntax(you can check it out &lt;a href="https://github.com/ccbhj/gendsl/blob/main/grammar.peg" rel="noopener noreferrer"&gt;here&lt;/a&gt;). Here is a simple snippet that can parse hex and decimal number literals in the golang style:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package playground

type parser Peg {
}

Script          &amp;lt;- Value EOF

EOF             &amp;lt;- !.

Value           &amp;lt;- IntegerLiteral

IntegerLiteral  &amp;lt;- [+\-]? ('0' ('x' / 'X') HexNumeral 
                           / DecimalNumeral ) [uU]?

HexNumeral      &amp;lt;- HexDigit ([_]* HexDigit)* / '0'

HexDigit        &amp;lt;- [0-9] / [A-F] / [a-f]

DecimalNumeral  &amp;lt;- [1-9] ([_]* [0-9])* / '0'     

# ...                      
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first line &lt;code&gt;package gendsl&lt;/code&gt; is package declaration which decides which package the generated golang file belongs to. The following type declaration &lt;code&gt;type parser Peg {}&lt;/code&gt; used to define the parser type, which we will use it later for evaluation but you can ignore it for now.&lt;/p&gt;

&lt;p&gt;After the parser type declaration we can start to define your syntax rule till the end. This is different from the workflow I used to work with with yacc when I have to define a union type and a lot of token types before I can actually define my grammar, which could be really confusing. Anyway, let's take a quick look at the grammar definition.&lt;/p&gt;

&lt;h4&gt;
  
  
  The very first rule
&lt;/h4&gt;

&lt;p&gt;If you have worked with CFG before you might find the definition DSL syntax quite familiar. The right hand side of the '&amp;lt;-' refers to the pattern of lexical elements, which could be some other patterns or character sequence, and the left hand side is the name of the pattern. Pretty easy, right? &lt;/p&gt;

&lt;p&gt;Let's pay attention to the first pattern rule here since &lt;strong&gt;the first rule is always to entry point of the parser.&lt;/strong&gt; The entry point &lt;code&gt;Script&lt;/code&gt; here is consist of two parts, one is a rule refers to &lt;code&gt;Value&lt;/code&gt; which is consist of a sequence of specified characters(we will get back to this later), the other one &lt;code&gt;EOF&lt;/code&gt; is kind of interesting. Let's jump to the next rule to find the pattern of &lt;code&gt;EOF&lt;/code&gt;. As you can see, &lt;em&gt;EOF&lt;/em&gt; is consist of &lt;code&gt;!.&lt;/code&gt;. What does &lt;code&gt;!.&lt;/code&gt; mean? The &lt;code&gt;!&lt;/code&gt;actually means &lt;code&gt;NOT&lt;/code&gt;, and &lt;code&gt;.&lt;/code&gt; means any character, so &lt;strong&gt;&lt;code&gt;!.&lt;/code&gt; means &lt;code&gt;NOTHING AT ALL&lt;/code&gt; or &lt;code&gt;End Of File&lt;/code&gt; if you will&lt;/strong&gt;. As a result whenever the parser find there is no character to read, it will stop here and treat it as an dummy rule call &lt;code&gt;EOF&lt;/code&gt; which might produces the rule &lt;code&gt;Script&lt;/code&gt;. This is quite a common pattern for PEG. &lt;/p&gt;

&lt;h4&gt;
  
  
  More about the rule syntax
&lt;/h4&gt;

&lt;p&gt;So much like the regular expression(RE), the syntax of PEG is simple: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.&lt;/code&gt; stands for any character.&lt;/li&gt;
&lt;li&gt;character set like &lt;code&gt;[a-z]&lt;/code&gt; is also supported.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X&lt;/code&gt; matches one character if X is a character or a pattern when X is the name of an rule.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X?&lt;/code&gt; matches one character or pattern or none, while &lt;code&gt;X*&lt;/code&gt; matches zero or more and 'X+' matches one or more.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X \ Y&lt;/code&gt; matches X or Y where X and Y could be characters, patterns or rule name. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Take the rule of &lt;code&gt;DecimalNumeral&lt;/code&gt; as an example. The first part &lt;code&gt;[1-9]&lt;/code&gt; means the start of an &lt;code&gt;DecimalNumeral&lt;/code&gt; must be one of a digit ranging from 1 to 9, &lt;code&gt;([_]* [0-9])*&lt;/code&gt; means starting from the second position, all character, if there is any, must all be digit(0-9) that has might have no &lt;code&gt;'_'&lt;/code&gt; or more than one &lt;code&gt;'_'&lt;/code&gt; as its prefix so it could match string like &lt;code&gt;"10_2_3"&lt;/code&gt;. Otherwise, indicated by the operator &lt;code&gt;\&lt;/code&gt;, it could also just be one single character &lt;code&gt;'0'&lt;/code&gt; which means 0 obviously .&lt;/p&gt;

&lt;h4&gt;
  
  
  Resolving ambiguity
&lt;/h4&gt;

&lt;p&gt;I'd like to spend more time to explain the or operator &lt;code&gt;\&lt;/code&gt;, since it is quite important as the solution to the ambiguity. &lt;strong&gt;The PEG will always try to match the first pattern and then the second, the third and so on til it finds one matched, which is considered as earliest-match-first&lt;/strong&gt;. For example, a string &lt;code&gt;"ab"&lt;/code&gt; will never be able to match the grammar &lt;code&gt;G &amp;lt;- 'a' / 'a' 'b'&lt;/code&gt;, since the first character &lt;code&gt;'a'&lt;/code&gt; will be reduced to &lt;code&gt;G&lt;/code&gt; but the &lt;code&gt;'b'&lt;/code&gt; left cannot match anything. By the way, CFG doesn't allow such a rule and will throw the reduce/shift conflict error.&lt;/p&gt;

&lt;p&gt;There is no much syntax left, you can explore them yourself in the &lt;a href="https://github.com/pointlander/peg" rel="noopener noreferrer"&gt;pointlander/peg README&lt;/a&gt; or &lt;a href="https://www.piumarta.com/software/peg/peg.1.html" rel="noopener noreferrer"&gt;peg doc&lt;/a&gt;. &lt;/p&gt;

&lt;h4&gt;
  
  
  Let's have a try
&lt;/h4&gt;

&lt;p&gt;Now that we already have a simple syntax rule prepared above, though it is not the whole grammar for the gendsl project but it can still parse some numbers. Anyway let's generate some code and see if it works as we expect.&lt;/p&gt;

&lt;h5&gt;
  
  
  Preparation
&lt;/h5&gt;

&lt;p&gt;First we have to install the peg binary tool for code generate following this &lt;a href="https://github.com/pointlander/peg?tab=readme-ov-file#installing" rel="noopener noreferrer"&gt;guide&lt;/a&gt;, then we gonna setup our workspace directory for playing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; mkdir peg_playground &amp;amp;&amp;amp; peg_playground 
&amp;gt; go mod init peg_playground 
&amp;gt; touch grammar.peg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste the grammar we have before into the &lt;code&gt;peg_playground/grammar.peg&lt;/code&gt;, now we should be able to genreate the code using the generate tool but why not make a Makefile in &lt;code&gt;peg_playground/makefile&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nv"&gt;GO&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; go

&lt;span class="nl"&gt;.SUFFIXES&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;.peg .go&lt;/span&gt;

&lt;span class="nl"&gt;grammar.go&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;grammar.peg&lt;/span&gt;
    peg &lt;span class="nt"&gt;-switch&lt;/span&gt; &lt;span class="nt"&gt;-inline&lt;/span&gt; &lt;span class="nt"&gt;-strict&lt;/span&gt; &lt;span class="nt"&gt;-output&lt;/span&gt; ./&lt;span class="nv"&gt;$@&lt;/span&gt; &lt;span class="nv"&gt;$&amp;lt;&lt;/span&gt;

&lt;span class="nl"&gt;all&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;grammar.go&lt;/span&gt;

&lt;span class="nl"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;rm &lt;/span&gt;grammar.go 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Generate and test the parser
&lt;/h5&gt;

&lt;p&gt;Now that we have everything ready, let's generate the code for parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make grammar.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the command, you should see a generated grammar.go in the workspace directory. Let's write a function to parse a string and access our parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// peg_playground/parser.go&lt;/span&gt;
&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;playground&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;PrintAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;script&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Pretty&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Init&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrintSyntaxTree&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The snippet here is pretty simple, it initializes the parser, parses the script we pass to it and print the syntax tree in final. Let's write an unit test to see if it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// peg_playground/parser_test.go&lt;/span&gt;
&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;playground&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"testing"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestPrintTree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;PrintAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`0x123`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"-----------------------------------------------------"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;PrintAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`10_2_3`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"-----------------------------------------------------"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The test function &lt;code&gt;TestPrintTree&lt;/code&gt; calls the &lt;code&gt;PrintAST&lt;/code&gt; and check the error. Let's run it now and see what it gonna print:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;go test . -v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we should see the whole syntax tree in the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=== RUN   TestPrintTree
Script "0x123"
 Value "0x123"
  IntegerLiteral "0x123"
   HexNumeral "123"
    HexDigit "1"
    HexDigit "2"
    HexDigit "3"
    parser_test.go:11: -----------------------------------------------------
Script "10_2_3"
 Value "10_2_3"
  IntegerLiteral "10_2_3"
   DecimalNumeral "10_2_3"
    parser_test.go:16: -----------------------------------------------------
--- PASS: TestPrintTree (0.00s)
PASS
ok      playground      0.649s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It looks great, right? Everything works as we expected. No syntax error thrown and it prints every rule matched and the string it matches in a format of tree, which could be really useful when debugging.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap it up
&lt;/h2&gt;

&lt;p&gt;In this post, I have introduced you the two basic but significant parts of interpreter programming language:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lexer, for lexical analysis that transforms a string into a sequence of lexical elements.&lt;/li&gt;
&lt;li&gt;Parser, for syntax analysis that identify the the pattern (so called grammar) in the lexical elements and produces a syntax tree.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And then I introduce the PEG for parser code generating, and address its advantages comparing the traditional CFG:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lexer rule integrated, no standalone lexer need to be implemented.&lt;/li&gt;
&lt;li&gt;Simple, regular expression like syntax to start up fast.&lt;/li&gt;
&lt;li&gt;No ambiguity, no reduce/shift conflict, always earlier-match-first.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally I have a tiny demonstration of how to generate parser with PEG, which is the basis of our interpreter.&lt;br&gt;
In next post, I will walk you through the gendsl grammar in detail.&lt;br&gt;
Thank you for checking this post, hope you enjoy it.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>go</category>
      <category>compiling</category>
    </item>
    <item>
      <title>Intro: Implement an expression interpreter for building DSL</title>
      <dc:creator>bingjia</dc:creator>
      <pubDate>Sat, 20 Jul 2024 09:21:53 +0000</pubDate>
      <link>https://dev.to/ccbhj/implement-an-expression-interpreter-for-building-dsl-3908</link>
      <guid>https://dev.to/ccbhj/implement-an-expression-interpreter-for-building-dsl-3908</guid>
      <description>&lt;p&gt;The other day I ran into a meetup recording video in Youtube presented by William Byrd(a funny and very wise guy) talking about how to write an interpreter in scheme that evaluates scheme itself, which got me fascinated immediately by the simplicity, elegance and power of the meta programing language like common-lisp, scheme and other language in lisp family(you may heard of the parenthesis joke about lisp). I think this video had provided me exactly the right way to introduce a DSL for my rule engine that I have been working on for the last few weeks. So eventually I had came out a DSL framework that enables you to write you DSL in a minutes without dealing with any lexer or parser - &lt;a href="https://github.com/ccbhj/gendsl" rel="noopener noreferrer"&gt;a general DSL&lt;/a&gt;&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/OyfBQmvr2Hc"&gt;
&lt;/iframe&gt;
 &lt;/p&gt;

&lt;p&gt;I will make a series of blog to share my work and some thoughts during my development. But first, let me show you what I think about the beauty of the idea of DSL(Domain Specified Language) and the story about how I start this project.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why DSL?
&lt;/h2&gt;

&lt;p&gt;What do we need DSL? The answer is quite simple: &lt;strong&gt;to have better interaction between our program and users&lt;/strong&gt;.&lt;br&gt;
Think about it!  Most of our programs or apps accepts some key value pairs as an i nput. For example, a command line tool may accept users input from arguments(e.g. "--key value"), or environment variable(e.g. "KEY=VALUE ./cli"). Even for a http server, we may accept url query(e.g. "&lt;a href="http://127.0.0.1:80?key=value%22" rel="noopener noreferrer"&gt;http://127.0.0.1:80?key=value"&lt;/a&gt;), headers(e.g. "Content-Type: application/json"), or body in json(e.g '{"key": "value"}'). But when a program's behavior depend heavily on user input, key value pairs will become insufficient. Take the famous awk for example, how could you tell awk how to query/filter/extract/format strings with key value pair? Even if you could, the user experience will be a disaster since users will have to fill a lot of key to describe his/her need. &lt;/p&gt;

&lt;p&gt;However, with a proper designed DSL as the input of your app, you can have:&lt;br&gt;
    1. &lt;strong&gt;Better user experience&lt;/strong&gt;. Since the DSL is domain specified, it can describe what the user want in a simple and brief way so that users won't have to input too much.&lt;br&gt;
    2. &lt;strong&gt;Decouple with protocol and types&lt;/strong&gt;. Since a DSL expression is just nothing more than just a simple string, you can easily pass it in the command line arguments, environment variable, http query, http body or even socket. Besides, you can read it in any language(since most language support string type) and that is the beauty of SQL which adapts so many language client and different databases. &lt;br&gt;
    3. &lt;strong&gt;Readability&lt;/strong&gt;. Not only it makes it easy for users to describe their need easily, but also frees you from all those checking and validation of users input. The interpreter will do that for you since the grammar is always of certainty.&lt;/p&gt;
&lt;h2&gt;
  
  
  What I need in my DSL?
&lt;/h2&gt;

&lt;p&gt;Now that I have a reason to import DSL in my rule engine, I should consider what it gonna look like.&lt;br&gt;
Months ago I happened to be asked by my leader to write an expression engine that can evaluate expressions into elasticsearch query or into a golang value, and I solved it by introducing the golang's standard library &lt;a href="https://pkg.go.dev/go/parser" rel="noopener noreferrer"&gt;go/parser&lt;/a&gt; to turn golang expressions into an AST(Abstract Syntax Tree) and parse it into different values like elasticsearch query, a simple boolean value or something else. With a few lines of code like this you can parse a golang expression in anyway you want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"go/ast"&lt;/span&gt;
    &lt;span class="s"&gt;"go/parser"&lt;/span&gt;
    &lt;span class="s"&gt;"go/token"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;expr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"true &amp;amp;&amp;amp; true"&lt;/span&gt;
    &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseExpr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parseNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// =&amp;gt; true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;parseNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;any&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BinaryExpr&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LAND&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parseNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;parseNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c"&gt;// ...&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Ident&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
            &lt;span class="c"&gt;// ...&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="c"&gt;// ...&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"unsupported node"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I gotta say that the decision of using golang's standard library was a good choice then because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Only 5 days I was given to finish this job and with the standard library I can have a stable, well-tested compiler without using any lexer or parser.&lt;/li&gt;
&lt;li&gt;The golang expression syntax is pretty enough for me.&lt;/li&gt;
&lt;li&gt;My colleagues are good at golang so the syntax will be easy to learn and use for them. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But there are also some downsides that I have to deal with it when writing the DSL for my rule engine:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You can't customize your own syntax&lt;/strong&gt;. You can only have whatever the golang has, not your own keyword, not your own data types. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The readability is terrible&lt;/strong&gt; as the expression becomes complicated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parsing and debugging a AST is difficult&lt;/strong&gt; to understand for my teammates and sometimes even for me(although I was the one who wrote it).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So I decide to write my own interpreter and I want it to have: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;My own syntax&lt;/strong&gt;, since I am looking for a DSL for a specified domain problem instead of Turing complete programming language.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessiblity&lt;/strong&gt; to values including functions from the host language. I should be able to access value of the host language in the DSL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity and readability.&lt;/strong&gt; I want everyone can pick it up and extends it in five minutes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After checking out the presentation by William Byrd and some digging into the &lt;a href="https://en.wikipedia.org/wiki/Racket_(programming_language)" rel="noopener noreferrer"&gt;Racket&lt;/a&gt;(another dialect of Lisp), I had already been clear about my tiny DSL. It should be:&lt;/p&gt;

&lt;p&gt;1.** It should have the &lt;a href="https://en.wikipedia.org/wiki/S-expression" rel="noopener noreferrer"&gt;S-Expression&lt;/a&gt; syntax.** &lt;strong&gt;A S-Expression is either a atom value of a list of atom values.&lt;/strong&gt; This could make the syntax easy to learn and simple to parse.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;It should allow me to access values and even functions&lt;/strong&gt; from the host language, which is golang for my case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It should have its own data types&lt;/strong&gt; so that accessing values or functions in host language in the DSL or writing extension for our DSL won't be confusing.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;It should be easy to extension the DSL&lt;/strong&gt;, or in another word,  to control the evaluation of an expression. It was confusing for my colleagues when I am explaining how to implement short circuit for a boolean expression.&lt;/p&gt;

&lt;p&gt;At this time, I had realized that &lt;strong&gt;I can build more of a DSL for my rule engine but a general and flexible framework to build DSL with the powerful and beautiful S-Expression&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  So what does it look like finally?
&lt;/h2&gt;

&lt;p&gt;At the first glance, it looks just like lisp, here is a simple example that can produce a json:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight common_lisp"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;json&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;later-than&lt;/span&gt; &lt;span class="nv"&gt;$NOW&lt;/span&gt; &lt;span class="mi"&gt;2012&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                                  &lt;span class="c1"&gt;; condition&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"language"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;array&lt;/span&gt; &lt;span class="s"&gt;"c"&lt;/span&gt; &lt;span class="s"&gt;"c++"&lt;/span&gt; &lt;span class="s"&gt;"javascript"&lt;/span&gt; &lt;span class="s"&gt;"elixir"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;; then&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"language"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;array&lt;/span&gt; &lt;span class="s"&gt;"c"&lt;/span&gt; &lt;span class="s"&gt;"c++"&lt;/span&gt; &lt;span class="s"&gt;"javascript"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;         &lt;span class="c1"&gt;; else&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"typing"&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;dict&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"c"&lt;/span&gt; &lt;span class="s"&gt;"static"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"c++"&lt;/span&gt; &lt;span class="s"&gt;"static"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;kv&lt;/span&gt; &lt;span class="s"&gt;"javascript"&lt;/span&gt; &lt;span class="s"&gt;"dynamic"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
 &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the syntax is quite simple. Everything is basically consist of S-Expression, which is &lt;strong&gt;either an atomic value like integer/string/boolean/... or an expression in the format of '(X Y Z...)' where 'X' is considered as a function that produces an S-Expression and 'Y', 'Z', ... are considered as the arguments of function 'X'.&lt;/strong&gt; All those functions like 'kv', 'dict', 'array' and even 'if' are actually implemented in golang. Take 'kv' as an example, the implementation is quite simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// _kv evaluate a key and its value then assemble them to a KV struct&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;_kv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EvalCtx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ValueTypeString&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"key is not string, but %+v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserData&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;KV&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unwrap&lt;/span&gt;&lt;span class="p"&gt;()}},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The procedure 'kv' takes two arguments which are actually expressions, evaluates them by &lt;code&gt;args[n].Eval()&lt;/code&gt; into values, then constructs a KV struct and return it. With the powerful Eval method, we can even implement 'if' behavior that act more like a macro instead of just simple function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// _if check the condition of the first argument,&lt;/span&gt;
&lt;span class="c"&gt;// if condition returns not nil, evaluate the second argument, otherwise the third argument&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;_if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EvalCtx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;condEnv&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewEnv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithProcedure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"later-than"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Procedure&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Eval&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CheckNArgs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_laterThan&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;WithInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$NOW"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_year&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="n"&gt;cond&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EvalWithEnv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;condEnv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cond&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;gendsl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ValueTypeBool&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;cond&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unwrap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You may have noticed, &lt;strong&gt;if the condition(first argument) returns true, the second expression will be evaluated but not the third expression&lt;/strong&gt;! This can never be done with languages like golang that is without macro because we are actually manipulate the AST. So amazing, right?&lt;/p&gt;

&lt;p&gt;Finally, with a few lines of code to implement functions like 'kv', 'dict', 'array', and 'if', the script above will produce a JSON output like this(full implementation is &lt;a href="https://github.com/ccbhj/gendsl/blob/main/examples/example_json_test.go" rel="noopener noreferrer"&gt;here&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"c++"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"javascript"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"elixir"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"typing"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"static"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"c++"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"static"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"javascript"": "&lt;/span&gt;&lt;span class="err"&gt;dynamic&lt;/span&gt;&lt;span class="s2"&gt;"
   }
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  To be continue...
&lt;/h2&gt;

&lt;p&gt;As you can see above, the DSL engine I came up with is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Highly customizable.&lt;/li&gt;
&lt;li&gt;Easy to learn and extend.&lt;/li&gt;
&lt;li&gt;Typing that is accessible.&lt;/li&gt;
&lt;li&gt;Full control of the syntax tree.
In the following articles I will share with you about how I implement it from scratch which is really fun and great journey for me. 
Hope you enjoy this post ❤️&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>go</category>
      <category>compiling</category>
    </item>
  </channel>
</rss>
