<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joshua Thijssen</title>
    <description>The latest articles on DEV Community by Joshua Thijssen (@jaytaph).</description>
    <link>https://dev.to/jaytaph</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F321917%2F3ba67ef8-4fc9-4f82-92e5-51590b695315.png</url>
      <title>DEV Community: Joshua Thijssen</title>
      <link>https://dev.to/jaytaph</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jaytaph"/>
    <language>en</language>
    <item>
      <title>BitMaelum - e2e encrypted mail</title>
      <dc:creator>Joshua Thijssen</dc:creator>
      <pubDate>Fri, 22 Jan 2021 13:31:37 +0000</pubDate>
      <link>https://dev.to/jaytaph/bitmaelum-e2e-encrypted-mail-28od</link>
      <guid>https://dev.to/jaytaph/bitmaelum-e2e-encrypted-mail-28od</guid>
      <description>&lt;p&gt;Hi all,&lt;/p&gt;

&lt;p&gt;We want to introduce BitMaelum (&lt;a href="https://github.com/bitmaelum"&gt;https://github.com/bitmaelum&lt;/a&gt;): a privacy- and security-focused mail system written from the ground up. Its goal is to combat many of the pain points found in the current email landscape: spam, privacy, mail/account ownership, etc. &lt;/p&gt;

&lt;p&gt;Some of the features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complete end-to-end encryption of mail, including meta-data.&lt;/li&gt;
&lt;li&gt;Combatting spam through a proof-of-work type system.&lt;/li&gt;
&lt;li&gt;Ownership of your bitmaelum address.&lt;/li&gt;
&lt;li&gt;Users, not the list owners, manage mailing list subscriptions.&lt;/li&gt;
&lt;li&gt;Mobile friendly.&lt;/li&gt;
&lt;li&gt;Automation is made possible through events and webhooks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We have developed a system that we think is ready for our first developer release: our "hello world" release. The system consists of a mail server, a mail client, and some supporting tools. The quick start will guide you through creating an account to test the system. We would like to invite developers and everyone interested to try out the system, leave feedback, or even help us implement new features and bugfixes.&lt;/p&gt;

&lt;p&gt;For more information, please take a look at our website: &lt;a href="https://bitmaelum.com"&gt;https://bitmaelum.com&lt;/a&gt;, our GitHub account &lt;a href="https://github.com/bitmaelum/bitmaelum-suite"&gt;https://github.com/bitmaelum/bitmaelum-suite&lt;/a&gt;, or send us an email via &lt;a href="mailto:info@bitmaelum.com"&gt;info@bitmaelum.com&lt;/a&gt; or through bitmaelum: bitmaelum!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BxhPh2Sg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/io9e0unpegzjk951kyl8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BxhPh2Sg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/io9e0unpegzjk951kyl8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>crypto</category>
      <category>email</category>
      <category>privacy</category>
      <category>go</category>
    </item>
    <item>
      <title>A quest to save money on Amazon</title>
      <dc:creator>Joshua Thijssen</dc:creator>
      <pubDate>Tue, 28 Jan 2020 08:20:54 +0000</pubDate>
      <link>https://dev.to/jaytaph/a-quest-to-save-money-on-amazon-2n58</link>
      <guid>https://dev.to/jaytaph/a-quest-to-save-money-on-amazon-2n58</guid>
      <description>&lt;p&gt;One day I've received an email from Amazon Web Services with the subject "AWS Free Tier limit alert". This scared me: even though we use Amazon outside the free tier limit, I had no idea why and what reached the free tier limit. &lt;br&gt;
It turns out we have used 850K SQS requests at the moment while the free tier limit was 1 million. It automatically sends an e-mail at 85% of your usage limit.&lt;/p&gt;

&lt;p&gt;Now things got really scary: we did use SQS, but not in a way that should generate 850 thousand requests. And even though I had multiple budget alarms in place and nothing out of the ordinary was going on, I did worry about this. Were there services behaving badly and trashing things, resulting in a maxed-out credit card by the end of the month?&lt;/p&gt;

&lt;p&gt;So I started digging into the infrastructure, trying to figure out what was going on. Not seeing anything out of the ordinary, I figured the issue might be in the code that was running. After half an hour or so, I identified a piece of code that was running suboptimally, resulting in more SQS requests than necessary. I optimized some of the SQS query parameters on Amazon and refactored parts of the code, resulting in fewer calls. &lt;/p&gt;

&lt;p&gt;I received the email from Amazon on the 11th of the month, so I assumed we would end the month at roughly 850K times 3, or about 2.5 million SQS requests. I figured that our new optimization would save around 1 million requests per month, leaving 1.5 million requests per month. Not good, but at least we shaved off almost half the requests.&lt;/p&gt;

&lt;p&gt;At this point, I was feeling good. Really good. Not only was the code a bit faster and less complex, but we managed to not pay Amazon more money than we needed to. And all it took was around 4 hours of work. Four hours! In this moment of victory, I wanted to know how much money this saved us, realizing we would be saving this huge amount of money EACH month.&lt;/p&gt;

&lt;p&gt;From the &lt;a href="https://aws.amazon.com/sqs/pricing/"&gt;pricing page of amazon&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The first 1 million monthly requests are free. After that, the pricing is as follows for all regions:&lt;br&gt;
Pricing per 1 million Requests after Free tier (Monthly): $0.40 ($0.0000004 per request) &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It took a while for it to land, but I realized that I had spent four whole development hours to save around &lt;strong&gt;40 CENTS&lt;/strong&gt; per month, and that it would take me a bit over &lt;strong&gt;83 YEARS&lt;/strong&gt; to get my money back.&lt;/p&gt;
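
&lt;p&gt;The arithmetic behind these numbers can be sketched in a few lines of PHP. The request counts and the price come from the text above; the hourly rate of $100 is purely an assumption, picked because it reproduces the 83-year figure, and the function name is made up for this example:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php

// Figures quoted from the SQS pricing page.
const FREE_TIER_REQUESTS = 1000000;
const PRICE_PER_MILLION  = 0.40;

// Monthly SQS bill in dollars: only requests above the free tier are charged.
function monthlySqsCost(int $requests): float
{
    $billable = max(0, $requests - FREE_TIER_REQUESTS);

    return $billable / 1000000 * PRICE_PER_MILLION;
}

// Estimated usage before (2.5 million) and after (1.5 million) the optimization.
$saving = monthlySqsCost(2500000) - monthlySqsCost(1500000);

// Assumption (not from the post): a developer hour costs $100.
$assumedHourlyRate = 100;
$breakEvenYears    = 4 * $assumedHourlyRate / $saving / 12;

printf("Saving per month: \$%.2f\n", $saving);
printf("Break-even after %.1f years\n", $breakEvenYears);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Under that assumed rate, this works out to roughly 40 cents per month and a bit over 83 years.&lt;/p&gt;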

&lt;p&gt;And while this by itself is already bad enough, maybe the biggest problem was that I was wrong about my "savings" too:&lt;/p&gt;

&lt;p&gt;The reason why we reached this many SQS requests wasn't the code itself. It was the fact that we had connected SQS to Lambda, so it would trigger once we received messages in the queue. But this isn't a push action within Amazon: in fact, lambdas are continuously polling the queue, resulting in requests being made. And it's not just one single lambda that polls: there are multiple at the same time.&lt;/p&gt;

&lt;p&gt;In the end we did manage to save some requests, but this was nothing compared to the number of polling requests from Lambda itself. Not only did I spend 4 hours to save 40 cents, but I didn't even save those 40 cents in the end.&lt;/p&gt;

&lt;p&gt;It did teach me a good lesson though, one I thought I knew but apparently needed a good wake-up call for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Figure out how much you are actually saving in money or time before spending any time/money on your optimization. These four hours would have been a good deal if they had saved something like $100 each month. We would have gotten our money back quickly and saved even more in the end.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When you want to optimize something, make damn sure you are optimizing the correct thing. Code can be deceiving.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I will never forget SQS pricing for the rest of my life.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>Parsing complex search queries in PHP</title>
      <dc:creator>Joshua Thijssen</dc:creator>
      <pubDate>Wed, 22 Jan 2020 08:46:42 +0000</pubDate>
      <link>https://dev.to/jaytaph/parsing-complex-search-queries-in-php-4pbh</link>
      <guid>https://dev.to/jaytaph/parsing-complex-search-queries-in-php-4pbh</guid>
      <description>&lt;p&gt;An important factor in a system like &lt;a href="https://seams-cms.com"&gt;Seams-CMS&lt;/a&gt;, is the fact we need to be able to let users search for data. Such search queries could be as simple as: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;query all entries that start with "foo".&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But they could also become very complex: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;query all entries that start with "foo" and are tagged with either "news" or "featured" or have at least 10 comments but no more than 25, and were published at most 1 year ago.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Creating a system that can deal with such complex questions sounds like a daunting task, but is, in fact, a relatively simple problem: we treat our search queries as a custom programming language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lexing and parsing
&lt;/h2&gt;

&lt;p&gt;Creating a custom programming language just for search queries sounds like overkill, but it's not. Most querying systems, like SQL or the Elasticsearch query language, deal with this the same way. The magic solution is called lexing and parsing.&lt;/p&gt;

&lt;p&gt;To make things work, we need to do these four steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step 1: Identify tokens from our search query (lexing)&lt;/li&gt;
&lt;li&gt;Step 2: Analyze the tokens and make sure they adhere to our ruleset (parsing)&lt;/li&gt;
&lt;li&gt;Step 3: Create an Abstract Syntax Tree (AST) from these tokens.&lt;/li&gt;
&lt;li&gt;Step 4: Use the AST for building your search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This, in fact, is exactly what pretty much every programming language does too: from your source code, it tries to decipher different tokens (keywords, strings, variables, etc.) and makes sure these tokens follow the rules of the programming language. Next, it will convert these tokens into an Abstract Syntax Tree, which is a representation of your program that can be easily processed or executed.&lt;/p&gt;

&lt;p&gt;Since our search queries are handled by an API written in PHP, we need to find a way to lex and parse in PHP. Fortunately, there is already a common library available for this: the &lt;a href="https://www.doctrine-project.org/projects/doctrine-lexer/en/1.0/index.html"&gt;Doctrine Lexer&lt;/a&gt;. It is mostly used for parsing and generating the DQL language (doctrine query language) but can be easily used for lexing and parsing your custom language. Sweet!&lt;/p&gt;

&lt;h3&gt;
  
  
  Backus normal form
&lt;/h3&gt;

&lt;p&gt;Before we can start with these steps, we must create a blueprint for our query language. This is completely up to you and pretty much the sky is the limit. Do, however, make sure your rules are easy to follow; otherwise you will not only confuse yourself and have to create a complex parser, but you will also confuse your users.&lt;/p&gt;

&lt;p&gt;One way of creating such a blueprint is with the help of the Backus Normal Form (BNF). This is a standardized way of writing your rules. Some lexer/parser systems can generate parsers directly from this, but in our case, we need to create a parser manually. Again a reason to make the language as simple as possible.&lt;/p&gt;

&lt;p&gt;(note that the following is not completely in BNF notation, but readable enough)&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt; query := simple_query_expression |
          query AND simple_query_expression |
          query OR simple_query_expression

 simple_query_expression := identifier operator operand | '(' identifier operator operand ')'

 identifier := [a-zA-Z0-9_.]+

 operator := eq|ne|lt|lte|gt|gte|in|ni|contains

 operand := scalar | array

 array := '[' array_value ']'
 array_value := scalar | scalar ',' array_value

 scalar := qstring | string | int | date
 qstring := '"' [^"]+ '"'
 string := [^\s]+
 int := [0-9]+
 date := ISO8601 datetime
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Our language consists of a &lt;code&gt;query&lt;/code&gt; which can be either a &lt;code&gt;simple_query_expression&lt;/code&gt; OR a query followed by the keyword &lt;code&gt;AND&lt;/code&gt; followed by another simple_query_expression, OR a query followed by the keyword &lt;code&gt;OR&lt;/code&gt; followed by another simple_query_expression. This allows us to create complex AND/OR queries.&lt;/p&gt;

&lt;p&gt;A simple_query_expression is nothing more than an identifier followed by an operator followed by an operand, OR the same thing but within parentheses. This means that &lt;code&gt;(foo eq "bar")&lt;/code&gt; and &lt;code&gt;foo eq "bar"&lt;/code&gt; are the same thing.&lt;/p&gt;

&lt;p&gt;An identifier is nothing more than a word that matches the &lt;code&gt;[a-zA-Z0-9_.]+&lt;/code&gt; regex. Some languages require identifiers to start with a letter, for instance, but in our case, it doesn't matter.&lt;/p&gt;

&lt;p&gt;Operators are words: &lt;code&gt;eq&lt;/code&gt;, &lt;code&gt;ne&lt;/code&gt;, &lt;code&gt;gte&lt;/code&gt;, &lt;code&gt;gt&lt;/code&gt;, etc. We could have opted for operators like &lt;code&gt;==&lt;/code&gt;, &lt;code&gt;!=&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, etc., but since we are sending these search queries mostly through URLs, words are easier to read: symbols like &lt;code&gt;==&lt;/code&gt; would be URL-escaped.&lt;/p&gt;

&lt;p&gt;Scalars, strings, qstrings, int, and dates are also relatively simple tokens.&lt;/p&gt;

&lt;p&gt;Arrays are anything between brackets: &lt;code&gt;[ "foo", 4 ]&lt;/code&gt;, &lt;code&gt;[ "foo", "bar", "baz" ]&lt;/code&gt; etc.&lt;/p&gt;
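
&lt;p&gt;To get a feeling for how a query in this language falls apart into tokens, here is a deliberately naive tokenizer built on a single regular expression. It is only an illustration: the function name &lt;code&gt;tokenize&lt;/code&gt; and the pattern are made up for this example, it ignores unquoted dates, and it is not the Doctrine-based lexer:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php

// Hypothetical helper: split a query into quoted strings, bare words
// (identifiers, operators, numbers) and single symbols. Whitespace is skipped.
function tokenize(string $query): array
{
    preg_match_all('/"[^"]+"|[a-zA-Z0-9_.]+|[()\[\],]/', $query, $matches);

    return $matches[0];
}

print_r(tokenize('title eq "foo bar" AND tags in ["news", "featured"]'));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The quoted string &lt;code&gt;"foo bar"&lt;/code&gt; stays together as one token, while the brackets and the comma of the array each become a token of their own.&lt;/p&gt;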

&lt;h2&gt;
  
  
  Lexing
&lt;/h2&gt;

&lt;p&gt;Now that we have created our language, it's time to start with the lexer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;SeamsCMS\Query\Filter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;Doctrine\Common\Lexer\AbstractLexer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Lexer&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;AbstractLexer&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// All tokens that are not valid identifiers must be &amp;lt; 100&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_NONE&lt;/span&gt;              &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_NUMBER&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_STRING&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_OPEN_PARENTHESIS&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_CLOSE_PARENTHESIS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_OPEN_BLOCK&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_CLOSE_BLOCK&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_COMMA&lt;/span&gt;             &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_DATE&lt;/span&gt;              &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// All keyword tokens should be &amp;gt;= 200&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_AND&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_OR&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_EQ&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;202&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_NE&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;203&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_LT&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;204&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_LTE&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;205&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_GT&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;206&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_GTE&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;207&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_IN&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;208&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_NI&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;209&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="no"&gt;T_CONTAINS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;210&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * {@inheritdoc}
     */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getCatchablePatterns&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'[a-z0-9_.]*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// Identifiers&lt;/span&gt;
            &lt;span class="s1"&gt;'(?:[0-9]+)?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// numbers&lt;/span&gt;
            &lt;span class="s1"&gt;'(?:"[^"]+")'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// Double quoted strings&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * {@inheritdoc}
     */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getNonCatchablePatterns&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// AbstractLexer also requires this method. The patterns here are an assumption:&lt;/span&gt;
        &lt;span class="c1"&gt;// skip whitespace, and capture any other single character (parentheses, brackets, commas)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'\s+'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'(.)'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * {@inheritdoc}
     */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_NONE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Recognize numeric values&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;is_numeric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_NUMBER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;// Recognize quoted strings&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str_replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;is_numeric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// Quoted numbers are still numbers &lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_NUMBER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;

                &lt;span class="c1"&gt;// See if we are a quoted date&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nv"&gt;$ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;\DateTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;\DateTimeZone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"UTC"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$ret&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nx"&gt;\DateTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_DATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;\Exception&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// It's ok when it's not a date.&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;

                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_STRING&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;// Recognize identifiers, aliased or qualified names&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ctype_alpha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="nv"&gt;$name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'SeamsCMS\\Query\\Filter\\Lexer::T_'&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;strtoupper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defined&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nv"&gt;$type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;constant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$type&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;

                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_STRING&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;// Recognize symbols&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'('&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_OPEN_PARENTHESIS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;')'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_CLOSE_PARENTHESIS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'['&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_OPEN_BLOCK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;']'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_CLOSE_BLOCK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_COMMA&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="c1"&gt;// Default&lt;/span&gt;
            &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;// Do nothing&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$type&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We define symbols (constants) for each token we want to identify. This means that whenever we detect a &lt;code&gt;,&lt;/code&gt; in our language, the lexer will return a &lt;code&gt;T_COMMA&lt;/code&gt; (unless the comma was inside a quoted string, in which case it would return a &lt;code&gt;T_STRING&lt;/code&gt;). &lt;/p&gt;

&lt;p&gt;There is a little bit of magic going on when detecting the operators: when we discover the string "eq", the lexer checks whether there is a constant called &lt;code&gt;T_EQ&lt;/code&gt;. If so, it uses that token; otherwise the string is just a string. This is here so we don't need a separate case for each operator.&lt;/p&gt;
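
&lt;p&gt;The constant lookup could look roughly like this. This is only a minimal sketch: the &lt;code&gt;SketchLexer&lt;/code&gt; class, its constant values and the &lt;code&gt;getOperatorType&lt;/code&gt; method are assumptions made for illustration, not the actual lexer code:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Hypothetical, trimmed-down lexer: only the constant lookup is shown.
class SketchLexer
{
    const T_STRING = 1;
    const T_EQ = 10;
    const T_NE = 11;

    public function getOperatorType(string $value): int
    {
        // Build the constant name, e.g. "eq" becomes "SketchLexer::T_EQ".
        $name = self::class . '::T_' . strtoupper($value);

        // defined() and constant() also work with class constants.
        return defined($name) ? constant($name) : self::T_STRING;
    }
}

$lexer = new SketchLexer();
var_dump($lexer-&amp;gt;getOperatorType('eq') === SketchLexer::T_EQ);         // bool(true)
var_dump($lexer-&amp;gt;getOperatorType('foobar') === SketchLexer::T_STRING); // bool(true)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Adding a new operator then only requires adding a new constant.&lt;/p&gt;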

&lt;p&gt;Another tricky thing is that we check quoted strings for dates. If the string can be parsed as a DateTime, we treat it as a &lt;code&gt;T_DATE&lt;/code&gt;; otherwise, it's just a &lt;code&gt;T_STRING&lt;/code&gt;.&lt;/p&gt;
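
&lt;p&gt;A minimal sketch of such a date check, assuming we simply try to construct a &lt;code&gt;DateTime&lt;/code&gt; and fall back to a string when that fails. The &lt;code&gt;classifyQuotedString&lt;/code&gt; helper is hypothetical and returns the token names as plain strings for readability:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Hypothetical helper; the real lexer returns token constants instead of names.
function classifyQuotedString(string $value): string
{
    try {
        // The DateTime constructor throws an exception when it cannot parse the string.
        new DateTime($value);
        return 'T_DATE';
    } catch (Exception $e) {
        return 'T_STRING';
    }
}

var_dump(classifyQuotedString('2021-01-22')); // string(6) "T_DATE"
var_dump(classifyQuotedString('foobar'));     // string(8) "T_STRING"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Keep in mind that PHP's date parser is lenient: relative formats such as "tomorrow" are also accepted as dates with this approach.&lt;/p&gt;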

&lt;h2&gt;
  
  
  Parsing the tokens
&lt;/h2&gt;

&lt;p&gt;Now that we can lex our language into tokens, it's time to make sense of these tokens. According to our BNF, a &lt;code&gt;simple_query_expression&lt;/code&gt; must start with a &lt;code&gt;T_IDENTIFIER&lt;/code&gt; followed by a &lt;code&gt;T_OPERATOR&lt;/code&gt;. The parser is where these syntax checks occur and where errors describing the problem are returned (for instance, it can return "operator expected" when a &lt;code&gt;T_IDENTIFIER&lt;/code&gt; is followed by a &lt;code&gt;T_DATE&lt;/code&gt; instead of a &lt;code&gt;T_OPERATOR&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Here is a snippet of our parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;    &lt;span class="cd"&gt;/**
     *  simpleQueryFactor:= identifier operator operand | '(' QueryExpression ')'
     *
     * @return Node
     * @throws ParseException
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;simpleQueryFactor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$lookaheadType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;lexer&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;lookahead&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'type'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$lookaheadType&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;Lexer&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_OPEN_PARENTHESIS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;lexer&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;moveNext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

            &lt;span class="nv"&gt;$expr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;QueryExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;mustMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Lexer&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_CLOSE_PARENTHESIS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$expr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$identifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;identifierExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nv"&gt;$operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operatorExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$operator&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;getOperator&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'in'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'ni'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;// The operand of "in" and "ni" must be an array&lt;/span&gt;
                &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;mustMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Lexer&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_OPEN_BLOCK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$operand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;arrayExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;mustMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Lexer&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_CLOSE_BLOCK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'gt'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'gte'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'lt'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'lte'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;// These can only be numbers or dates, not strings&lt;/span&gt;
                &lt;span class="nv"&gt;$operand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;numberOrDateExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'contains'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'eq'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s1"&gt;'ne'&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="nv"&gt;$operand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operandExpression&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;SimpleQueryNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$operand&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * identifier := [A-Z0-9_.]+
     *
     * @return IdentifierNode
     * @throws ParseException
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;identifierExpression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IdentifierNode&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;mustMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Lexer&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="na"&gt;T_STRING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'identifier'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$identifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;lexer&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'value'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;IdentifierNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$identifier&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;(Note that the BNF here is a bit different from the one we discussed.)&lt;br&gt;
Inside the &lt;code&gt;simpleQueryFactor&lt;/code&gt; function we check the syntax for a &lt;code&gt;simple_query_expression&lt;/code&gt; and generate an AST node. We first check for an opening parenthesis and, if found, parse the expression in between. The &lt;code&gt;mustMatch&lt;/code&gt; function checks that the next token is the expected one (here a &lt;code&gt;T_CLOSE_PARENTHESIS&lt;/code&gt;) and throws a syntax error otherwise, for instance when we forget a closing &lt;code&gt;)&lt;/code&gt; or have unbalanced parentheses.&lt;/p&gt;
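
&lt;p&gt;The &lt;code&gt;mustMatch&lt;/code&gt; function itself is not shown in this article. A minimal sketch of the idea, assuming a hypothetical &lt;code&gt;TokenStream&lt;/code&gt; class in place of the real lexer and token types given as plain strings instead of constants, could look like this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
class ParseException extends Exception {}

// Hypothetical token stream standing in for the real lexer.
class TokenStream
{
    public $token = null;     // last consumed token
    public $lookahead = null; // next token, or null at the end of the input
    private $tokens;
    private $position = 0;

    public function __construct(array $tokens)
    {
        $this-&amp;gt;tokens = $tokens;
        $this-&amp;gt;lookahead = $tokens[0] ?? null;
    }

    public function moveNext(): void
    {
        $this-&amp;gt;token = $this-&amp;gt;lookahead;
        $this-&amp;gt;position++;
        $this-&amp;gt;lookahead = $this-&amp;gt;tokens[$this-&amp;gt;position] ?? null;
    }
}

// Consume the next token if it has the expected type, otherwise fail.
function mustMatch(TokenStream $lexer, string $type, ?string $label = null): void
{
    $found = $lexer-&amp;gt;lookahead['type'] ?? 'end of input';
    if ($found !== $type) {
        throw new ParseException(sprintf('%s expected, found %s', $label ?? $type, $found));
    }
    $lexer-&amp;gt;moveNext();
}

$lexer = new TokenStream([
    ['type' =&amp;gt; 'T_OPEN_PARENTHESIS', 'value' =&amp;gt; '('],
    ['type' =&amp;gt; 'T_STRING', 'value' =&amp;gt; 'meta.size'],
]);

mustMatch($lexer, 'T_OPEN_PARENTHESIS');
try {
    mustMatch($lexer, 'T_CLOSE_PARENTHESIS');
} catch (ParseException $e) {
    echo $e-&amp;gt;getMessage(), PHP_EOL; // T_CLOSE_PARENTHESIS expected, found T_STRING
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The optional label mirrors the second argument seen in &lt;code&gt;identifierExpression&lt;/code&gt;, where a &lt;code&gt;T_STRING&lt;/code&gt; is reported to the user as an "identifier".&lt;/p&gt;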

&lt;p&gt;The syntax rules for a query depend on the operator: the &lt;code&gt;in&lt;/code&gt; and &lt;code&gt;ni&lt;/code&gt; operators MUST have an array as an operand, while &lt;code&gt;gt&lt;/code&gt;, &lt;code&gt;gte&lt;/code&gt;, etc. must have either a number or a date.&lt;/p&gt;

&lt;p&gt;If everything follows our rules, we return a &lt;code&gt;SimpleQueryNode&lt;/code&gt;, which is an AST node with an identifier, an operator, and an operand.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;identifierExpression&lt;/code&gt; function is a bit simpler: it checks that the next token is a string (we don't have separate tokens for identifiers) and returns an identifier AST node.&lt;/p&gt;
&lt;h2&gt;
  
  
  AST nodes
&lt;/h2&gt;

&lt;p&gt;The AST nodes themselves are simple data objects without any logic inside. They exist only to represent the abstract syntax tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IdentifierNode&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cd"&gt;/** @var string */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * IdentifierNode constructor.
     * @param string $value
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * @return string
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getValue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SimpleQueryNode&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cd"&gt;/** @var IdentifierNode */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$identifier&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cd"&gt;/** @var OperatorNode */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$operator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cd"&gt;/** @var Node */&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$operand&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * SimpleQueryNode constructor.
     *
     * @param IdentifierNode $identifier
     * @param OperatorNode $operator
     * @param Node $operand
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;IdentifierNode&lt;/span&gt; &lt;span class="nv"&gt;$identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OperatorNode&lt;/span&gt; &lt;span class="nv"&gt;$operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt; &lt;span class="nv"&gt;$operand&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;identifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$identifier&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$operator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$operand&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * @return IdentifierNode
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getIdentifier&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IdentifierNode&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * @return OperatorNode
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getOperator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;OperatorNode&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * @return Node
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getOperand&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="na"&gt;operand&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Walking the abstract syntax tree
&lt;/h3&gt;

&lt;p&gt;So the parsing process does two steps at once: checking the syntax of our language and generating an abstract syntax tree.&lt;/p&gt;

&lt;p&gt;A query like this&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;content.full_name.value ne "John Doe" and (meta.size lte 12000 or meta.blaat eq "foobar")
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;would give us the following AST:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;'AST\\QueryExpressionNode(
    "terms": array(
        "0" =&amp;gt; AST\\SimpleQueryNode(
            "identifier": AST\\IdentifierNode(
                "value": \'content.full_name.value\',
            ),
            "operator": AST\\OperatorNode(
                "operator": \'ne\',
            ),
            "operand": AST\\LiteralNode(
                "value": \'John Doe\',
            ),
        ),
        "1" =&amp;gt; AST\\QueryTermNode(
            "factors": array(
                "0" =&amp;gt; AST\\SimpleQueryNode(
                    "identifier": AST\\IdentifierNode(
                        "value": \'meta.size\',
                    ),
                    "operator": AST\\OperatorNode(
                        "operator": \'lte\',
                    ),
                    "operand": AST\\LiteralNode(
                        "value": 12000,
                    ),
                ),
                "1" =&amp;gt; AST\\SimpleQueryNode(
                    "identifier": AST\\IdentifierNode(
                        "value": \'meta.blaat\',
                    ),
                    "operator": AST\\OperatorNode(
                        "operator": \'eq\',
                    ),
                    "operand": AST\\LiteralNode(
                        "value": \'foobar\',
                    ),
                ),
            ),
        ),
    ),
)'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This is now a structure that can easily be traversed. We know that a &lt;code&gt;QueryExpressionNode&lt;/code&gt; consists of one or more terms that have to be 'AND'ed with each other. A &lt;code&gt;SimpleQueryNode&lt;/code&gt; consists of an identifier, an operator, and an operand, which could be a literal, an array, etc.&lt;/p&gt;

&lt;p&gt;At this point, we don't care about syntax and rules because that all has been taken care of by the lexer and parser.&lt;/p&gt;

&lt;p&gt;In our case, we use this AST to generate filters for our MongoDB data cluster, but we could just as easily use the same AST to output an ElasticSearch query or even an SQL query to fetch data from another data source. The possibilities are endless!&lt;/p&gt;
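
&lt;p&gt;Our actual filter generator is not shown here, but the idea of walking the tree can be sketched as follows. Everything in this example is an assumption made for illustration: the node classes are simplified stand-ins with public properties, the &lt;code&gt;toMongoFilter&lt;/code&gt; function is hypothetical, and the factors of a &lt;code&gt;QueryTermNode&lt;/code&gt; are treated as 'OR'ed, as the example query above suggests:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Simplified stand-ins for the AST nodes: the real ones wrap their values in
// IdentifierNode, OperatorNode and LiteralNode objects with getters.
class SimpleQueryNode
{
    public $identifier;
    public $operator;
    public $operand;

    public function __construct(string $identifier, string $operator, $operand)
    {
        $this-&amp;gt;identifier = $identifier;
        $this-&amp;gt;operator = $operator;
        $this-&amp;gt;operand = $operand;
    }
}

class QueryTermNode
{
    public $factors;

    public function __construct(array $factors)
    {
        $this-&amp;gt;factors = $factors;
    }
}

class QueryExpressionNode
{
    public $terms;

    public function __construct(array $terms)
    {
        $this-&amp;gt;terms = $terms;
    }
}

// Recursively turn a node into a MongoDB-style filter array.
function toMongoFilter(object $node): array
{
    if ($node instanceof QueryExpressionNode) {
        // Terms are AND'ed with each other.
        return ['$and' =&amp;gt; array_map('toMongoFilter', $node-&amp;gt;terms)];
    }
    if ($node instanceof QueryTermNode) {
        // Factors are OR'ed (assumption based on the example query).
        return ['$or' =&amp;gt; array_map('toMongoFilter', $node-&amp;gt;factors)];
    }
    if ($node instanceof SimpleQueryNode) {
        // Only operators with a direct MongoDB counterpart are mapped here.
        $map = ['eq' =&amp;gt; '$eq', 'ne' =&amp;gt; '$ne', 'gt' =&amp;gt; '$gt', 'gte' =&amp;gt; '$gte',
                'lt' =&amp;gt; '$lt', 'lte' =&amp;gt; '$lte', 'in' =&amp;gt; '$in', 'ni' =&amp;gt; '$nin'];
        if (!isset($map[$node-&amp;gt;operator])) {
            throw new InvalidArgumentException('Unsupported operator: ' . $node-&amp;gt;operator);
        }
        return [$node-&amp;gt;identifier =&amp;gt; [$map[$node-&amp;gt;operator] =&amp;gt; $node-&amp;gt;operand]];
    }
    throw new InvalidArgumentException('Unsupported node: ' . get_class($node));
}

$ast = new QueryExpressionNode([
    new SimpleQueryNode('content.full_name.value', 'ne', 'John Doe'),
    new QueryTermNode([
        new SimpleQueryNode('meta.size', 'lte', 12000),
        new SimpleQueryNode('meta.blaat', 'eq', 'foobar'),
    ]),
]);

echo json_encode(toMongoFilter($ast)), PHP_EOL;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For the example query this prints &lt;code&gt;{"$and":[{"content.full_name.value":{"$ne":"John Doe"}},{"$or":[{"meta.size":{"$lte":12000}},{"meta.blaat":{"$eq":"foobar"}}]}]}&lt;/code&gt;. Targeting another data source would only require a different walker; the lexer, parser and AST stay the same.&lt;/p&gt;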

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dealing with complex queries doesn't have to be that hard. It's easy to fall into the trap of creating your own ad-hoc parser, but more often than not you end up with all kinds of edge cases that you may or may not handle, and you run into issues with the complexities of subqueries, or with and/or queries that may or may not be in parentheses, etc.&lt;/p&gt;

&lt;p&gt;Instead of writing your own parser system, use a proven and well-documented system like lexing and parsing: it's for a good reason that it's used in every (query) language out there and although it may look scary at first, you'll get the hang of it quickly.&lt;/p&gt;

&lt;p&gt;More information about Seams-CMS, or other articles about development, PHP and CMSes in general can be found at &lt;a href="https://seams-cms.com"&gt;https://blog.seams-cms.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>php</category>
    </item>
  </channel>
</rss>
