<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vicente Maldonado</title>
    <description>The latest articles on DEV Community by Vicente Maldonado (@vicentemaldonado).</description>
    <link>https://dev.to/vicentemaldonado</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F144206%2F43bc984e-e9f7-469b-bc7b-f2ff0948b370.jpg</url>
      <title>DEV Community: Vicente Maldonado</title>
      <link>https://dev.to/vicentemaldonado</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vicentemaldonado"/>
    <language>en</language>
    <item>
      <title>Beautiful Soup Hello World</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Fri, 02 Aug 2019 09:08:11 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/beautiful-soup-hello-world-5b1k</link>
      <guid>https://dev.to/vicentemaldonado/beautiful-soup-hello-world-5b1k</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F450%2F1%2A_pugNfCqCKDV4utB85rJtw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F450%2F1%2A_pugNfCqCKDV4utB85rJtw.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Beautiful Soup is a Python library for working with HTML and XML files. You can use it to navigate an HTML document, search it, extract data from it and even change the document structure. Let’s see how it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from bs4 import BeautifulSoup

html = '''
    &amp;lt;html&amp;gt;
    &amp;lt;head&amp;gt;
        &amp;lt;title&amp;gt;Beautifulsoup Hello World&amp;lt;/title&amp;gt;
    &amp;lt;/head&amp;gt;
    &amp;lt;body&amp;gt;
        &amp;lt;h1&amp;gt;Header&amp;lt;/h1&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 1&amp;lt;/p&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 2&amp;lt;/p&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 3&amp;lt;/p&amp;gt;
    &amp;lt;/body&amp;gt;
    &amp;lt;/html&amp;gt;
'''

soup = BeautifulSoup(html, 'html.parser')

print(soup.title)
print(soup.title.name)
print(soup.title.text)

print(soup.p.text)

for paragraph in soup.find_all('p'):
    print(paragraph.text)

print(soup.get_text())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s a really basic example, but before you can run it you first need to install Beautiful Soup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install beautifulsoup4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While you’re at it, install another library as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install lxml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



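&lt;p&gt;lxml is an HTML/XML parser that Beautiful Soup can use. Don’t worry about it: the script here uses Python’s built-in html.parser, so lxml is optional.&lt;/p&gt;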

&lt;p&gt;Let’s start — import Beautiful Soup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from bs4 import BeautifulSoup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we need some HTML to work with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;html = '''
    &amp;lt;html&amp;gt;
    &amp;lt;head&amp;gt;
        &amp;lt;title&amp;gt;Beautifulsoup Hello World&amp;lt;/title&amp;gt;
    &amp;lt;/head&amp;gt;
    &amp;lt;body&amp;gt;
        &amp;lt;h1&amp;gt;Header&amp;lt;/h1&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 1&amp;lt;/p&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 2&amp;lt;/p&amp;gt;
        &amp;lt;p&amp;gt;Paragraph 3&amp;lt;/p&amp;gt;
    &amp;lt;/body&amp;gt;
    &amp;lt;/html&amp;gt;
'''
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is a basic HTML document stored in a Python string. Of course, working with HTML stored in a Python script is not very exciting, but this is a Hello, World, so hey.&lt;/p&gt;

&lt;p&gt;Create a BeautifulSoup object, specifying the HTML document and the parser to be used (I said don’t worry about it):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;soup = BeautifulSoup(html, 'html.parser')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we have our HTML parsed and stored in a variable named soup and we can play with it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(soup.title)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use soup.title to access the HTML document’s &amp;lt;title&amp;gt; element. This prints:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;title&amp;gt;Beautifulsoup Hello World&amp;lt;/title&amp;gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Sometimes you don’t want the HTML tag:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(soup.title.text)
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;and get just the element text:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Beautifulsoup Hello World
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Our document has just one &amp;lt;title&amp;gt; element, so Beautiful Soup returns it when we use soup.title. But the document has three &amp;lt;p&amp;gt; elements (paragraphs), so what happens when we try to pull the same trick?&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(soup.p.text)
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;It returns the first &amp;lt;p&amp;gt; element in the document:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Paragraph 1
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;If you want to get all paragraphs in a document, well, just use find_all():&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for paragraph in soup.find_all('p'):
    print(paragraph.text)
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;find_all() returns all paragraphs in the document and you can iterate over them using a simple for loop.&lt;/p&gt;

&lt;p&gt;This is just scratching the surface of Beautiful Soup. To finish, let’s see how simple it is to get all text (and only text) in the document:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(soup.get_text())
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;As expected, this prints&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Beautifulsoup Hello World

Header
Paragraph 1
Paragraph 2
Paragraph 3
&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;You can find the full script on my &lt;a href="https://github.com/Wurdlack/Medium/blob/master/BeautifulSoup/soup001.py" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. ttfn.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>html</category>
      <category>beautifulsoup</category>
      <category>parsing</category>
    </item>
    <item>
      <title>Meet the CHICKEN</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Fri, 21 Jun 2019 17:00:38 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/meet-the-chicken-2gh4</link>
      <guid>https://dev.to/vicentemaldonado/meet-the-chicken-2gh4</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2ArWhFhPhZd4qCsNBZQJ-0ig.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2ArWhFhPhZd4qCsNBZQJ-0ig.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No, not him. You see, &lt;a href="http://wiki.call-cc.org/eggs-chronological" rel="noopener noreferrer"&gt;in the beginning, there was the CHICKEN. And the Felix said, “Let there be eggs!”&lt;/a&gt; and there were eggs.&lt;/p&gt;

&lt;p&gt;CHICKEN is an implementation of the programming language Scheme. Yes, the one with lots of silly parentheses. No, not that one, that’s Lisp: &lt;strong&gt;L&lt;/strong&gt;ost &lt;strong&gt;i&lt;/strong&gt;n &lt;strong&gt;S&lt;/strong&gt;tupid &lt;strong&gt;P&lt;/strong&gt;arentheses. Scheme is even more alien and, believe it or not, even less usable. Or is it?&lt;/p&gt;

&lt;p&gt;If you visit the CHICKEN web site you’ll learn that it strives to be simple, portable, extensible, well documented and actively supported. Hmm, let’s see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple&lt;/strong&gt;: simple it is. I was able to install CHICKEN on my Linux box with a single command. On Windows I had to download the source code and (gasp!) compile it to get the CHICKEN, which amounted to six or seven mouse clicks and about as many keystrokes. Compiling CHICKEN from source is pretty easy if you follow the readme. Also, CHICKEN has recently been upgraded to version 5 and there are three-click installers available for the previous version (4), so it’s fairly safe to assume that a few will pop up for the current version as well. &lt;strong&gt;JUST BE PATIENT&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The editor. Ah, Emacs, the cause of and solution to all of life’s problems. Yes, Emacs is still the best choice for the paren language family. You can compromise and use VS Code, Eclipse, jEdit or a number of other text editors that support syntax highlighting for Scheme. A bit of a minus, not that there is much syntax anyway.&lt;/p&gt;

&lt;p&gt;CHICKEN comes with several executables. ‘csi’ is the CHICKEN Scheme interpreter, which starts the REPL, a loop that all Lispers love though few can explain why. ‘csc’ is the CHICKEN Scheme compiler, which produces native executables. Yes, you can use CHICKEN to make standalone executable programs. &lt;strong&gt;THIS IS BIG&lt;/strong&gt;. ‘chicken-install’ installs external CHICKEN libraries (they are called “Eggs” of course): it pulls an egg’s source code from CHICKEN’s central repository, compiles it and makes it available for you to use. This works great most of the time, except that some eggs have external dependencies (the SDL graphics library for instance) that you have to install yourself before installing the eggs in question. Also note that ‘csc’ clashes with ‘csc’, the C Sharp compiler, so on some systems ‘csi’ and ‘csc’ may be installed as ‘chicken-csi’ and ‘chicken-csc’.&lt;/p&gt;

&lt;p&gt;All in all CHICKEN is simple enough as long as you put in some work, just like any other programming environment I guess.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extensible&lt;/strong&gt;. The eggs. The best part of CHICKEN.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Need a GUI? &lt;a href="http://eggs.call-cc.org/5/#ui" rel="noopener noreferrer"&gt;CHICKEN’s got you (almost) covered&lt;/a&gt;. The favorite CHICKEN GUI library seems to be IUP (made by the Lua guys) but I haven’t been able to install it on my Linux box due to the dependencies I mentioned earlier. On Windows, CHICKEN did come pre-packaged with IUP in its previous release (4). On the other hand the Tk (Tkinter, Python, hello!) bindings work very well and if you find that too limiting you can always use Java Swing (and probably JavaFX but I haven’t gotten around to trying that one).&lt;/p&gt;

&lt;p&gt;Want to do some Web dev? &lt;a href="http://eggs.call-cc.org/5/#web" rel="noopener noreferrer"&gt;That’s awful&lt;/a&gt;. Haven’t done more than the Hello World with Awful, but it’s there and it works.&lt;/p&gt;

&lt;p&gt;Graphics programming? &lt;a href="http://eggs.call-cc.org/5/#graphics" rel="noopener noreferrer"&gt;No problem&lt;/a&gt;! Databases? &lt;a href="http://eggs.call-cc.org/5/#db" rel="noopener noreferrer"&gt;Still no problem&lt;/a&gt;! Networking?… you get the idea. Heck, &lt;a href="http://eggs.call-cc.org/5/" rel="noopener noreferrer"&gt;just look at the list for yourself&lt;/a&gt;: it’s impressive.&lt;/p&gt;

&lt;p&gt;Well, CHICKEN &lt;strong&gt;is&lt;/strong&gt; extensible and extended. Just keep in mind it’s a small community that builds libraries for their own needs, so the one thing you are looking for might just be the one that’s missing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Actively supported&lt;/strong&gt;. In that case just ask the guys themselves. There is a &lt;a href="http://wiki.call-cc.org/discussion-groups" rel="noopener noreferrer"&gt;mailing list called chicken users&lt;/a&gt;. The list is not crazy active but it is very responsive: I posted a couple of questions there recently and received an answer directly from the man who hatched the egg, so to speak. There’s also an IRC channel, but I haven’t visited. Also, don’t be surprised if you run into the people from the CHICKEN team on Stack Overflow and Reddit (over on r/lisp and elsewhere).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Well documented&lt;/strong&gt;. That’s an understatement. There is:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="http://wiki.call-cc.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;A Wiki&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://wiki.call-cc.org/chicken-for-programmers-of-other-languages" rel="noopener noreferrer"&gt;&lt;strong&gt;A getting started guide&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://wiki.call-cc.org/tutorials" rel="noopener noreferrer"&gt;&lt;strong&gt;Tutorials&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://wiki.call-cc.org/man/5/The%20User%27s%20Manual" rel="noopener noreferrer"&gt;&lt;strong&gt;User’s Manual&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://wiki.call-cc.org/tips%20and%20tricks" rel="noopener noreferrer"&gt;&lt;strong&gt;Tips and tricks&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://eggs.call-cc.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;Eggs&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;…and more.&lt;/p&gt;

&lt;p&gt;All in all CHICKEN is highly recommended. Just look at the little fella:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F335%2F1%2AE02BxkAHshSB2ucrm850GA.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F335%2F1%2AE02BxkAHshSB2ucrm850GA.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ttfn!&lt;/p&gt;

</description>
      <category>lisp</category>
      <category>scheme</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Python Lark Parser introduction</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Tue, 09 Apr 2019 11:46:09 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/python-lark-parser-introduction-2g4e</link>
      <guid>https://dev.to/vicentemaldonado/python-lark-parser-introduction-2g4e</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq0iuvrwhlfujzjd85xr.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqq0iuvrwhlfujzjd85xr.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://lark-parser.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;Lark&lt;/a&gt; is a Python parsing library. Unlike parser generators like Yacc it doesn’t generate a source code file from a grammar: the parser is generated dynamically. Let’s see how it works. You import Lark:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

from lark import Lark


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;then specify the grammar:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

grammar = """
start: WORD "," WORD "!"
%import common.WORD
%ignore " "
"""


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The grammar can be a Python string or read from a separate file. After that, just create a Lark class instance, initializing it with the grammar:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

parser = Lark(grammar)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and you are ready to parse:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

def main():
    print(parser.parse("Hello, world!"))
    print(parser.parse("Adios, amigo!"))

if __name__ == '__main__':
    main()


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;parser.parse returns a &lt;a href="https://lark-parser.readthedocs.io/en/latest/classes/#tree" rel="noopener noreferrer"&gt;Tree&lt;/a&gt; instance containing the parse tree:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Tree(start, [Token(WORD, 'Hello'), Token(WORD, 'world')])
Tree(start, [Token(WORD, 'Adios'), Token(WORD, 'amigo')])


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That’s it, clean and simple. It’s up to you to decide what to do with the parsed string. Let’s see where we can go from there. Here is an example of a simple arithmetic expression parser:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

from lark import Lark

grammar = """
start: add_expr
     | sub_expr

add_expr: NUMBER "+" NUMBER

sub_expr: NUMBER "-" NUMBER

%import common.NUMBER
%ignore " "
"""


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The grammar ignores spaces. Also note that the grammar terminals are written in uppercase letters (NUMBER) while the grammar rules are written in lowercase letters (start, add_expr and sub_expr). %import and %ignore are directives. You can find the &lt;a href="https://lark-parser.readthedocs.io/en/latest/grammar/" rel="noopener noreferrer"&gt;grammar reference&lt;/a&gt; in the Lark documentation. We can import definitions from other grammars, in this case &lt;a href="https://github.com/lark-parser/lark/blob/master/lark/grammars/common.lark" rel="noopener noreferrer"&gt;common.lark&lt;/a&gt; (which just contains some useful definitions). The above grammar will successfully parse addition and subtraction expressions, like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

1+1
2-1
3 - 2


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and nothing else. Next, create the Lark object:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

parser = Lark(grammar)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and we are ready to parse:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

def main():
    print(parser.parse("1+1"))
    print(parser.parse("2-1"))
    print(parser.parse("3 - 2"))    

if __name__ == '__main__':
    main()


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output is as expected:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Tree(start, [Tree(add_expr, [Token(NUMBER, '1'), Token(NUMBER, '1')])])
Tree(start, [Tree(sub_expr, [Token(NUMBER, '2'), Token(NUMBER, '1')])])
Tree(start, [Tree(sub_expr, [Token(NUMBER, '3'), Token(NUMBER, '2')])])


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that this example just prints the parse tree as before. Let’s &lt;a href="https://lark-parser.readthedocs.io/en/latest/classes/#transformers-visitors" rel="noopener noreferrer"&gt;transform&lt;/a&gt; it to something more useful:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

from lark import Lark, Transformer

grammar = """
start: add_expr
     | sub_expr

add_expr: NUMBER "+" NUMBER -&amp;gt; add_expr

sub_expr: NUMBER "-" NUMBER -&amp;gt; sub_expr

%import common.NUMBER
%ignore " "
"""


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The names add_expr and sub_expr after -&amp;gt; on the right-hand side of the grammar rules are aliases; the transformer calls the method with the matching name when a rule is successfully parsed. Let’s write them:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

class CalcTransformer(Transformer):

    def add_expr(self, args):
        return int(args[0]) + int(args[1])

    def sub_expr(self, args):
        return int(args[0]) - int(args[1])


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For instance, when parsing&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

2-1


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;args[0] will contain "2" and args[1] will contain "1". In our transformer functions we convert both to integers and add or subtract them, returning the result. Now create the Lark object:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

parser = Lark(grammar, parser='lalr', 
    transformer=CalcTransformer())


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

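&lt;p&gt;To accept a transformer in its constructor the parser needs to be an LALR parser (with the default Earley parser you can instead call the transformer’s transform method on the parsed tree afterwards). We are finally ready to parse:&lt;/p&gt;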

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

def main():
    print(parser.parse("1+1"))
    print(parser.parse("2-1"))
    print(parser.parse("3 - 2"))

if __name__ == '__main__':
    main()


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output is now:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Tree(start, [2])
Tree(start, [1])
Tree(start, [1])


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Better? 1+1 is 2, 2-1 is 1 and 3-2 is also 1.&lt;/p&gt;
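
&lt;p&gt;To see why the result is still wrapped in a start node, here is a minimal standard-library sketch of the idea behind a transformer: walk the tree bottom-up and call the method named after each rule. It is not Lark’s implementation; Node, Calc and transform are hypothetical names made up for this illustration, and the tree is built by hand instead of being parsed.&lt;/p&gt;

```python
# Stdlib-only sketch of name-based dispatch; this is NOT Lark's implementation.
# Node, Calc and transform are hypothetical names used for illustration only.
from collections import namedtuple

Node = namedtuple("Node", ["rule", "children"])


class Calc:
    def add_expr(self, args):
        return int(args[0]) + int(args[1])

    def sub_expr(self, args):
        return int(args[0]) - int(args[1])


def transform(node, handler):
    """Rewrite a tree bottom-up, calling the handler method named after each rule."""
    if not isinstance(node, Node):
        return node  # a leaf (token text) is passed through unchanged
    children = [transform(child, handler) for child in node.children]
    method = getattr(handler, node.rule, None)
    if method is None:
        # No method for this rule: keep the node, with its children rewritten.
        return Node(node.rule, children)
    return method(children)


# Hand-built stand-in for the parse tree of "2-1".
tree = Node("start", [Node("sub_expr", ["2", "1"])])
result = transform(tree, Calc())
print(result)
```

&lt;p&gt;Since Calc has no method called start, that node is kept as it is, which mirrors the Tree(start, [...]) wrapper in the output above.&lt;/p&gt;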

&lt;p&gt;Of course this is just scratching the surface. If you are interested, you can find the &lt;a href="https://github.com/Wurdlack/Medium/tree/master/lark_examples" rel="noopener noreferrer"&gt;full examples&lt;/a&gt; on GitHub.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>parsing</category>
    </item>
    <item>
      <title>Visitor Pattern in Java</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Sat, 23 Mar 2019 19:04:24 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/visitor-pattern-in-java-3lh1</link>
      <guid>https://dev.to/vicentemaldonado/visitor-pattern-in-java-3lh1</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2As3CV9XEYJ--O0Pg02hPwig.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F800%2F1%2As3CV9XEYJ--O0Pg02hPwig.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I find a concept difficult to understand I try to strip it to bare essentials. This happened to me recently with the &lt;a href="https://en.wikipedia.org/wiki/Visitor_pattern" rel="noopener noreferrer"&gt;visitor pattern&lt;/a&gt; so here is my take on it. Of course I will be grateful for any corrections. Here goes.&lt;/p&gt;

&lt;p&gt;Let’s say we have three classes derived from a common parent, called A&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;abstract class A
{
    public String name;
    abstract void accept(Visitor v);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;class B that has two objects as components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class B extends A
{
    public A child1;
    public A child2;

    public B(String name)
    {
        this.name = name;
    }

    @Override
    void accept(Visitor v)
    {
        v.visitB(this);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;class C that has one component:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class C extends A
{
    public A child;

    public C(String name)
    {
        this.name = name;
    }

    @Override
    void accept(Visitor v)
    {
        v.visitC(this);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and class D that has no components&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class D extends A
{
    public D(String name)
    {
        this.name = name;
    }

    @Override
    void accept(Visitor v)
    {
        v.visitD(this);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All three classes expose a property, name, that lets us distinguish their instances, and a method named accept that allows visitors to visit them. The classes don’t care and don’t need to know what their visitors do. Visitor is an interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;interface Visitor
{
    public void visitB(B b);
    public void visitC(C c);
    public void visitD(D d);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is a method for each class it visits. Let’s try this out with a visitor implementation that just prints out the name of objects it visited:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class PrintVisitor implements Visitor
{
    public void visitB(B b)
    {
        b.child1.accept(this);
        System.out.println(b.name + " visited.");
        b.child2.accept(this);
    }

    public void visitC(C c)
    {
        System.out.println(c.name + " visited.");
        c.child.accept(this);
    }

    public void visitD(D d)
    {
        System.out.println(d.name + " visited.");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The visitor is recursive: when it visits a tree node it also visits the node’s children. Now let’s build a tree from instances of the classes B, C and D:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        /*
                F
              /   \
            B       G
          /   \       \
        A       D       H
              /   \       \
            C       E       I
        */
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are nine objects and eight relations. First, create the objects:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        B f = new B("F");
        B b = new B("B");
        B d = new B("D");

        C g = new C("G");
        C h = new C("H");

        D a = new D("A");
        D c = new D("C");
        D e = new D("E");
        D i = new D("I");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, the relations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        f.child1 = b;
        f.child2 = g;

        b.child1 = a;
        b.child2 = d;

        d.child1 = c;
        d.child2 = e;

        g.child = h;
        h.child = i;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally start visiting our tree by visiting its root node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        PrintVisitor v = new PrintVisitor();
        f.accept(v);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A visited.
B visited.
C visited.
D visited.
E visited.
F visited.
G visited.
H visited.
I visited.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As &lt;a href="https://en.wikipedia.org/wiki/Tree_traversal" rel="noopener noreferrer"&gt;this article&lt;/a&gt; describes, the above code performs a tree traversal, specifically what is called an &lt;a href="https://en.wikipedia.org/wiki/Tree_traversal#In-order_(LNR)" rel="noopener noreferrer"&gt;in-order traversal&lt;/a&gt;. Let’s change our visitor class to do a pre-order traversal, where the visitor first displays the node name and then visits its children:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class PrintVisitor implements Visitor
{
    public void visitB(B b)
    {
        System.out.println(b.name + " visited.");
        b.child1.accept(this);
        b.child2.accept(this);
    }

    public void visitC(C c)
    {
        System.out.println(c.name + " visited.");
        c.child.accept(this);
    }

    public void visitD(D d)
    {
        System.out.println(d.name + " visited.");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the output is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;F visited.
B visited.
A visited.
D visited.
C visited.
E visited.
G visited.
H visited.
I visited.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;a href="https://en.wikipedia.org/wiki/Tree_traversal#Post-order_(LRN)" rel="noopener noreferrer"&gt;post-order traversal&lt;/a&gt; the visitor first visits a node’s children and only then displays its name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class PrintVisitor implements Visitor
{
    public void visitB(B b)
    {
        b.child1.accept(this);
        b.child2.accept(this);
        System.out.println(b.name + " visited.");
    }

    public void visitC(C c)
    {
        c.child.accept(this);
        System.out.println(c.name + " visited.");
    }

    public void visitD(D d)
    {
        System.out.println(d.name + " visited.");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A visited.
C visited.
E visited.
D visited.
B visited.
I visited.
H visited.
G visited.
F visited.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Besides the Wikipedia article I linked at the beginning, there is a nice description of the visitor pattern &lt;a href="http://www.craftinginterpreters.com/representing-code.html#the-visitor-pattern" rel="noopener noreferrer"&gt;here&lt;/a&gt;. In short:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The visited objects don’t need to know what their visitors do; they just need to accept them.&lt;/li&gt;
&lt;li&gt;There needs to be a protocol that lets visited objects and visitors communicate; in our case it is the Visitor interface.&lt;/li&gt;
&lt;li&gt;A visitor uses a separate method for each visited class (i.e. visitB, visitC and visitD).&lt;/li&gt;
&lt;/ul&gt;
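
&lt;p&gt;To illustrate the first point, here is a minimal, self-contained sketch in which the same accept methods serve a second visitor that counts nodes instead of printing them. The Node interface, the constructors, CountVisitor and the VisitorSketch wrapper class are assumptions made for this example; only Visitor, B, C, D, their fields and the visit method names come from the code above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class VisitorSketch
{
    interface Visitor
    {
        void visitB(B b);
        void visitC(C c);
        void visitD(D d);
    }

    // Hypothetical common type of the visited classes.
    interface Node
    {
        void accept(Visitor v);
    }

    static class B implements Node
    {
        String name;
        Node child1;
        Node child2;

        B(String name, Node child1, Node child2)
        {
            this.name = name;
            this.child1 = child1;
            this.child2 = child2;
        }

        public void accept(Visitor v) { v.visitB(this); }
    }

    static class C implements Node
    {
        String name;
        Node child;

        C(String name, Node child)
        {
            this.name = name;
            this.child = child;
        }

        public void accept(Visitor v) { v.visitC(this); }
    }

    static class D implements Node
    {
        String name;

        D(String name) { this.name = name; }

        public void accept(Visitor v) { v.visitD(this); }
    }

    // A second visitor: it counts nodes instead of printing their names.
    static class CountVisitor implements Visitor
    {
        int count = 0;

        public void visitB(B b)
        {
            count++;
            b.child1.accept(this);
            b.child2.accept(this);
        }

        public void visitC(C c)
        {
            count++;
            c.child.accept(this);
        }

        public void visitD(D d)
        {
            count++;
        }
    }

    public static void main(String[] args)
    {
        // A small made-up tree: B has two children, C has one, D is a leaf.
        Node tree = new B("F", new D("A"), new C("G", new D("H")));

        CountVisitor counter = new CountVisitor();
        tree.accept(counter);
        System.out.println(counter.count + " nodes visited."); // prints "4 nodes visited."
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that B, C and D did not change at all to support the new visitor.&lt;/p&gt;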

&lt;p&gt;(You can find the code on &lt;a href="https://github.com/Wurdlack/Medium/blob/master/visitor_test/Main.java" rel="noopener noreferrer"&gt;Github&lt;/a&gt;.)&lt;/p&gt;

</description>
      <category>datastructures</category>
      <category>java</category>
      <category>designpatterns</category>
      <category>programming</category>
    </item>
    <item>
      <title>A Context-Free Grammar Tutorial</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Wed, 13 Mar 2019 10:05:17 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/a-context-free-grammar-tutorial-38b5</link>
      <guid>https://dev.to/vicentemaldonado/a-context-free-grammar-tutorial-38b5</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2AX769PvIY7YQN4lTge1Zt6A.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1024%2F1%2AX769PvIY7YQN4lTge1Zt6A.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I recently came across a tutorial on context-free grammars with several examples of common patterns you can find in those grammars and I thought to myself — why not implement some of those examples as an exercise?&lt;/p&gt;

&lt;p&gt;I first had to choose the language (Java) and the tools to implement the examples with: &lt;a href="https://www.jflex.de/" rel="noopener noreferrer"&gt;JFlex&lt;/a&gt; and &lt;a href="http://web.cecs.pdx.edu/~mpj/jacc/" rel="noopener noreferrer"&gt;Jacc&lt;/a&gt;. The pair is similar enough to Flex and Bison that the grammars hopefully won’t get obscured by the implementation.&lt;/p&gt;

&lt;p&gt;I got myself familiar with JFlex and Jacc and made three notes about working with them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/wurdlack/meet-jflex-59gd"&gt;Meet JFlex&lt;/a&gt; — an intro to JFlex with a small example of a standalone lexer,&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/wurdlack/use-jflex-to-count-words-5ci4"&gt;Use JFlex to Count Words&lt;/a&gt; — another standalone lexer example, this time a bit more involved,&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/wurdlack/use-jflex-and-jacc-together-3nnd"&gt;Use JFlex and Jacc Together&lt;/a&gt; — how to make JFlex and Jacc cooperate, also with a rudimentary example.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are interested in CFGs and parsing, the tutorial I mentioned is not a bad place to get some practical experience implementing them. You can find it &lt;a href="http://marvin.cs.uidaho.edu/Handouts/" rel="noopener noreferrer"&gt;here&lt;/a&gt; and I also uploaded it to &lt;a href="https://github.com/Wurdlack/Medium/tree/master/cfg_tutorial" rel="noopener noreferrer"&gt;Github&lt;/a&gt; (hoping not to have broken any copyright laws).&lt;/p&gt;

&lt;p&gt;The grammar I used to demonstrate how JFlex and Jacc work together is the grammar from section 2.1 of the tutorial (“A grammar for a language that allows a list of X’s”), so I’ve already started going through the exercises. Yay.&lt;/p&gt;

&lt;p&gt;Using &lt;a href="https://www.graphviz.org/" rel="noopener noreferrer"&gt;Graphviz&lt;/a&gt; you can get a visual representation of your grammars. First export the grammar as a dot file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jacc -d Parser.jacc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates Parser.dot. Then&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dot -Tjpg Parser.dot -o Parser.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and you end up with Parser.jpg. It’s very simple:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F307%2F1%2AbFCKrWMJPtYDgJIIml8Agg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F307%2F1%2AbFCKrWMJPtYDgJIIml8Agg.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is a state machine represented as a directed graph. Visualizing a grammar is just one of the tools Jacc provides to help you debug it. You can also export a grammar to a text file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jacc -v Parser.jacc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates Parser.output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Output created by jacc on Wed Mar 13 09:15:29 CET 2019

state 0 (entry on sentence)
    $accept : _sentence $end

    X shift 2
    . error

    sentence goto 1

state 1 (entry on sentence)
    $accept : sentence_$end
    sentence : sentence_X (2)

    $end accept
    X shift 3
    . error

state 2 (entry on X)
    sentence : X_ (1)

    $end reduce 1
    X reduce 1
    . error

state 3 (entry on X)
    sentence : sentence X_ (2)

    $end reduce 2
    X reduce 2
    . error

4 terminals, 1 nonterminals;
2 grammar rules, 4 states;
0 shift/reduce and 0 reduce/reduce conflicts reported.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or to an HTML file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jacc -h Parser.jacc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates ParserMachine.html:&lt;/p&gt;

&lt;h4&gt;Generated machine for Parser&lt;/h4&gt;

&lt;pre&gt;
// Output created by jacc on Wed Mar 13 09:20:03 CET 2019

&lt;a&gt;&lt;b&gt;state 0 (entry on sentence)&lt;/b&gt;&lt;/a&gt;
    $accept : _sentence $end

    X shift 2
    . error

    sentence goto 1

&lt;a&gt;&lt;b&gt;state 1 (entry on sentence)&lt;/b&gt;&lt;/a&gt;
    $accept : sentence_$end
    sentence : sentence_X    (2)

    $end accept
    X shift 3
    . error

&lt;a&gt;&lt;b&gt;state 2 (entry on X)&lt;/b&gt;&lt;/a&gt;
    sentence : X_    (1)

    $end reduce 1
    X reduce 1
    . error

&lt;a&gt;&lt;b&gt;state 3 (entry on X)&lt;/b&gt;&lt;/a&gt;
    sentence : sentence X_    (2)

    $end reduce 2
    X reduce 2
    . error

4 terminals, 1 nonterminals;
2 grammar rules, 4 states;
0 shift/reduce and 0 reduce/reduce conflicts reported.
&lt;/pre&gt;

&lt;p&gt;You can actually go through the machine states by clicking the links.&lt;/p&gt;
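
&lt;p&gt;To get a feel for what these four states do, here is a hand-coded Java sketch that walks through them. It is not what Jacc generates; the SentenceMachine class, the accepts method and the idea of passing the tokens as a string of X characters are assumptions made for this example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Hand-coded walk through the four states listed above.
// Assumption: each 'X' character is one X token and the end of the
// string plays the role of $end.
class SentenceMachine
{
    static boolean accepts(String input)
    {
        int[] stack = new int[3]; // this grammar never needs more than three states
        int top = 0;              // stack[top] is the current state, initially state 0
        int pos = 0;              // index of the lookahead token

        while (true)
        {
            boolean atEnd = (pos == input.length());
            if (!atEnd)
            {
                if (input.charAt(pos) != 'X')
                {
                    return false;          // ". error": neither X nor $end
                }
            }

            switch (stack[top])
            {
                case 0:
                    if (atEnd) return false;   // . error
                    top++; stack[top] = 2;     // X shift 2
                    pos++;
                    break;
                case 1:
                    if (atEnd) return true;    // $end accept
                    top++; stack[top] = 3;     // X shift 3
                    pos++;
                    break;
                case 2:
                    top = top - 1;             // reduce 1 (sentence : X) pops one state
                    top++; stack[top] = 1;     // back in state 0: sentence goto 1
                    break;
                default:
                    top = top - 2;             // state 3: reduce 2 (sentence : sentence X) pops two
                    top++; stack[top] = 1;     // back in state 0: sentence goto 1
                    break;
            }
        }
    }

    public static void main(String[] args)
    {
        System.out.println(accepts("X"));   // true
        System.out.println(accepts("XXX")); // true
        System.out.println(accepts(""));    // false
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each case corresponds to one state in the listing, and the comments quote the table entry being applied.&lt;/p&gt;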

&lt;p&gt;Jacc also supports tracing your grammar on sample inputs and embedding custom error productions in grammars, but that’s it for now. ttfn!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>interpreters</category>
      <category>parsing</category>
      <category>java</category>
    </item>
    <item>
      <title>Use JFlex and Jacc Together</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Mon, 11 Mar 2019 11:02:45 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/use-jflex-and-jacc-together-3nnd</link>
      <guid>https://dev.to/vicentemaldonado/use-jflex-and-jacc-together-3nnd</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F500%2F1%2AXCb6SyEhRx5JoxfLvd0QUw.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F500%2F1%2AXCb6SyEhRx5JoxfLvd0QUw.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just as &lt;a href="https://www.jflex.de/" rel="noopener noreferrer"&gt;JFlex&lt;/a&gt; generates lexers, &lt;a href="http://web.cecs.pdx.edu/~mpj/jacc/" rel="noopener noreferrer"&gt;Jacc&lt;/a&gt; generates parsers. But what’s the difference? A lexer recognizes words and a parser recognizes whole sentences. More formally, lexers work with &lt;a href="https://en.wikipedia.org/wiki/Regular_grammar" rel="noopener noreferrer"&gt;regular grammars&lt;/a&gt; and parsers work with &lt;a href="https://en.wikipedia.org/wiki/Context-free_grammar" rel="noopener noreferrer"&gt;context-free grammars&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is the reason the two are often used together — you first use a lexer to recognize words and pass those words to a parser which is able to determine if the words form a valid sentence.&lt;/p&gt;

&lt;p&gt;Now, Java being Java, there is a choice of parser generators out there: &lt;a href="https://www.antlr.org/" rel="noopener noreferrer"&gt;Antlr&lt;/a&gt; is ubiquitous, but there are also &lt;a href="http://www.ssw.uni-linz.ac.at/Coco/" rel="noopener noreferrer"&gt;CoCo/R&lt;/a&gt;, &lt;a href="https://javacc.org/" rel="noopener noreferrer"&gt;JavaCC&lt;/a&gt;, &lt;a href="http://sablecc.org/" rel="noopener noreferrer"&gt;SableCC&lt;/a&gt;, &lt;a href="http://www2.cs.tum.edu/projects/cup/" rel="noopener noreferrer"&gt;Cup&lt;/a&gt;, &lt;a href="http://byaccj.sourceforge.net/" rel="noopener noreferrer"&gt;Byacc/J&lt;/a&gt; and probably many others. Even the venerable &lt;a href="https://www.gnu.org/software/bison/manual/html_node/Bison-Parser.html" rel="noopener noreferrer"&gt;Bison&lt;/a&gt; is capable of generating Java parsers. Some parser generators, like Antlr, CoCo/R and JavaCC, don’t even need a separate lexer to feed them words, because they can generate one of their own!&lt;/p&gt;

&lt;p&gt;So why Jacc? Why not?&lt;/p&gt;

&lt;p&gt;OK, so here is how JFlex and Jacc work together:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F552%2F1%2AFxPEreHwpiRs_PbzaXBBFQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F552%2F1%2AFxPEreHwpiRs_PbzaXBBFQ.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;JFlex reads the input as a stream of characters and produces a token for Jacc whenever Jacc asks for one. &lt;a href="https://en.wikipedia.org/wiki/Lexical_analysis#Token" rel="noopener noreferrer"&gt;A token is a string with a meaning&lt;/a&gt;. For instance, +, true and 3.14 are all tokens. Some of them don’t really need a value aside from their type (true is a Boolean literal), but some of them do: 3.14 is a floating-point literal with the value 3.14.&lt;/p&gt;

&lt;p&gt;As with JFlex, a Jacc file has three distinct sections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;directives section
%%
rules section
%%
additional code section
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Jacc creates a list of all the tokens it expects in a separate file. You specify the file and the token list like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%interface ParserTokens
%token X NL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s look at the generated file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Output created by jacc on Mon Mar 11 09:54:05 CET 2019

interface ParserTokens {
    int ENDINPUT = 0;
    int NL = 1;
    int X = 2;
    int error = 3;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is an interface that doubles as an enumeration. Apart from the two token types we asked for, NL and X, two more are created: one for the end of input and one for errors. Back in the JFlex file you “implement” this “interface”:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%class Lexer
%implements ParserTokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;just so our lexer can see the ENDINPUT, NL, X and error constants. There are a few more things Jacc expects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A function that returns integer values that represent token types (0, 1, 2 or 3 in our example). Naming that function yylex is a tradition, so let’s do that:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%function yylex
%int
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Three more functions: getToken to get the current token code, nextToken to read the next token code and getSemantic to get the current token value:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%{

    private int token;
    private String semantic;

    public int getToken()
    {
        return token;
    }

    public String getSemantic()
    {
        return semantic;
    }

    public int nextToken()
    {
        try
        {
            token = yylex();
        }
        catch (java.io.IOException e)
        {
            System.out.println(
                "IO exception occurred:\n" + e);
        }
        return token;
    }

%}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You may notice that we decided to make the token semantic value a String so we also need to indicate that in the parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%semantic String
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For our example we’ll just have the lexer recognize the following words: word, Word, wOrd, worD, … , wORD and WORD and new lines. We’ll ignore whitespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x = [wW][oO][rR][dD]
nl = \n | \r | \r\n
space = [ \t]

%%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the lexer finds a word it returns the X token (i.e. 2 from the ParserTokens interface) with the matched text (word, Word, …) as its value; this is what semantic = yytext(); does. When it encounters a new line it returns ENDINPUT (i.e. 0):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{x} { semantic = yytext(); return X; }
{space} { /* Ignore space */ }
{nl} { return ENDINPUT; }
[^] { System.out.println("Error?"); }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
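


&lt;p&gt;To make the getToken / nextToken / getSemantic protocol more concrete, here is a hand-written stand-in for the generated lexer. It is only a sketch: the HandLexer class, its String constructor and the decision to return the error token (instead of just printing a message, as the JFlex rule above does) are assumptions made for this example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// A hand-written stand-in for the generated Lexer (hypothetical class).
class HandLexer
{
    // Same values as in the generated ParserTokens interface.
    static final int ENDINPUT = 0;
    static final int X = 2;
    static final int error = 3;

    private final String input;
    private int pos = 0;
    private int token;
    private String semantic;

    HandLexer(String input)
    {
        this.input = input;
    }

    public int getToken()
    {
        return token;
    }

    public String getSemantic()
    {
        return semantic;
    }

    public int nextToken()
    {
        // {space}: skip blanks and tabs
        while (pos != input.length())
        {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\t') pos++; else break;
        }

        if (pos == input.length())
        {
            token = ENDINPUT;       // nothing left to read
        }
        else if (input.charAt(pos) == '\n' || input.charAt(pos) == '\r')
        {
            token = ENDINPUT;       // {nl}: a new line ends the sentence
        }
        else if (input.regionMatches(true, pos, "word", 0, 4))
        {
            // {x}: case-insensitive match, keep the text as typed
            semantic = input.substring(pos, pos + 4);
            pos += 4;
            token = X;
        }
        else
        {
            pos++;                  // skip the offending character
            token = error;
        }
        return token;
    }

    public static void main(String[] args)
    {
        HandLexer lexer = new HandLexer("word WORD\n");
        while (lexer.nextToken() == X)
        {
            System.out.println("X found: " + lexer.getSemantic());
        }
        // prints "X found: word" and then "X found: WORD"
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The parser only ever sees the integer codes and, when it needs one, the semantic value; how the lexer finds the tokens is entirely up to the lexer.&lt;/p&gt;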



&lt;p&gt;That’s it. Now the parser “grammar” in all its glory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sentence : X { System.out.println("X found: " + $1); }
    | sentence X { System.out.println("X found: " + $2); }
    ;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The grammar is &lt;a href="https://en.wikipedia.org/wiki/Left_recursion" rel="noopener noreferrer"&gt;left-recursive&lt;/a&gt;, which allows us to have one or more X words in a sentence. $1 and $2 will hold the X semantic value, passed there by the lexer, and we’ll simply print it out.&lt;/p&gt;

&lt;p&gt;In the main method we create Lexer and Parser instances and start parsing. One thing to note here is that we have to “prime” the lexer with&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;parser.lexer.nextToken();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;before the parser can use it. For reference, here is the full source of the lexer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.*;

%%

%class Lexer
%implements ParserTokens

%function yylex
%int

%{

    private int token;
    private String semantic;

    public int getToken()
    {
        return token;
    }

    public String getSemantic()
    {
        return semantic;
    }

    public int nextToken()
    {
        try
        {
            token = yylex();
        }
        catch (java.io.IOException e)
        {
            System.out.println(
                "IO exception occurred:\n" + e);
        }
        return token;
    }

%}

x = [wW][oO][rR][dD]
nl = \n | \r | \r\n
space = [ \t]

%%

{x} { semantic = yytext(); return X; }
{space} { /* Ignore space */ }
{nl} { return ENDINPUT; }
[^] { System.out.println("Error?"); }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and that of the parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%{

import java.io.*;

%}

%class Parser
%interface ParserTokens

%semantic String

%token X NL

%%

sentence : X { System.out.println("X found: " + $1); }
    | sentence X { System.out.println("X found: " + $2); }
    ;

%%

    private Lexer lexer;

    public Parser(Reader reader)
    {
        lexer = new Lexer(reader);
    }

    public void yyerror(String error)
    {
        System.err.println("Error: " + error);
    }

    public static void main(String args[]) throws IOException
    {
        System.out.println("Interactive evaluation:");

        Parser parser = new Parser(
            new InputStreamReader(System.in));

        parser.lexer.nextToken();
        parser.parse();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You need to compile Lexer.flex&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jflex Lexer.flex
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and Parser.jacc&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jacc Parser.jacc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and the three generated Java files (Lexer.java, Parser.java and ParserTokens.java):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;javac *.java
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to finally be able to run the parser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java Parser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is an example terminal session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Interactive evaluation:
word Word wOrD WORD
X found: word
X found: Word
X found: wOrD
X found: WORD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can find the &lt;a href="https://github.com/Wurdlack/Medium/tree/master/jflex_jacc" rel="noopener noreferrer"&gt;full source code on Github&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Actually this was pretty boring, but we are now free to play with context-free grammars!&lt;/p&gt;

</description>
      <category>parsing</category>
      <category>programming</category>
      <category>interpreters</category>
      <category>java</category>
    </item>
    <item>
      <title>Use JFlex to Count Words</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Sun, 10 Mar 2019 19:06:27 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/use-jflex-to-count-words-5ci4</link>
      <guid>https://dev.to/vicentemaldonado/use-jflex-to-count-words-5ci4</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F612%2F1%2An6Ziy9s7A21I1JyCPSX4Jg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F612%2F1%2An6Ziy9s7A21I1JyCPSX4Jg.jpeg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://dev.to/wurdlack/meet-jflex-20de-temp-slug-2207783"&gt;previous story&lt;/a&gt; we got to meet JFlex, a tool for generating lexers in Java. The example lexer was contrived, banal and not all that useful so let’s show that JFlex can be put to good use with a (bit) more useful example: we’ll count words, lines and characters the user enters.&lt;/p&gt;

&lt;p&gt;The first part of the JFlex file is the same as in the first example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.*;

%%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We just import all the java.io classes. To start with, we’ll need a way to store our word, line and char count:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%{

public int chars = 0;
public int words = 0;
public int lines = 0;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;chars, words and lines will become public members of the generated class, accessible to the rest of the code. The main method is much the same as in the first example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are two differences though:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is no infinite loop. The lexer reads from System.in until it reaches the end of input (Ctrl-D on Linux, Ctrl-Z followed by Enter on Windows). This allows the user to enter several lines of text in the terminal.&lt;/li&gt;
&lt;li&gt;We don’t use the value yylex() returns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next part:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%class Lexer
%type Integer

%%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generated Java class will be named Lexer and yylex() will return a Java Integer value. This is only because yylex() needs to return something and by default it returns an object of type Yytoken. javac would complain that the Yytoken type doesn’t exist, because it doesn’t (unless you create it yourself).&lt;/p&gt;

&lt;p&gt;Finally, in the lexical rules part:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you type a word, recognized by the [a-zA-Z]+ regex, the word count is incremented and the char count is increased by the length of the word. If you press Enter, i.e. \n, both the char count and the line count are incremented. And if you enter any other character, like a space, * or &amp;amp;, only the char count is incremented.&lt;/p&gt;
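
&lt;p&gt;For comparison, here is roughly what the three rules amount to in plain Java. This is a sketch, not the code JFlex generates; the WordCountSketch class, the LETTERS constant and the count method are made up for this example, and the input is assumed to use \n line endings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Plain-Java equivalent of the three lexical rules (hypothetical names).
class WordCountSketch
{
    static final String LETTERS =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Returns an array holding chars, words and lines, in that order.
    static int[] count(String text)
    {
        int chars = 0;
        int words = 0;
        int lines = 0;
        int i = 0;

        while (i != text.length())
        {
            char c = text.charAt(i);
            if (LETTERS.indexOf(c) != -1)
            {
                // [a-zA-Z]+ : the longest run of letters is one word
                words++;
                while (i != text.length())
                {
                    if (LETTERS.indexOf(text.charAt(i)) == -1) break;
                    chars++;
                    i++;
                }
            }
            else
            {
                // \n and . : one character either way, \n also ends a line
                chars++;
                if (c == '\n') lines++;
                i++;
            }
        }
        return new int[] { chars, words, lines };
    }

    public static void main(String[] args)
    {
        int[] result = count("The quick brown fox\njumps over the lazy dog.\n");
        System.out.format(
            "Chars: %d\nWords: %d\nLines: %d\n",
            result[0], result[1], result[2]);
        // prints Chars: 45, Words: 9 and Lines: 2
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The inner loop is the part the regex gives you for free: JFlex always takes the longest match, so a run of letters is counted as a single word.&lt;/p&gt;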

&lt;p&gt;This allows us to print out the final count of chars, words and lines (back in main):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;System.out.format(
    "Chars: %d\nWords: %d\nLines: %d\n",
    lexer.chars, lexer.words, lexer.lines);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the complete file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.*;

%%

%{

public int chars = 0;
public int words = 0;
public int lines = 0;

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    lexer.yylex();

    System.out.format(
        "Chars: %d\nWords: %d\nLines: %d\n",
        lexer.chars, lexer.words, lexer.lines);
}

%}

%class Lexer
%type Integer

%%

[a-zA-Z]+ { words++; chars += yytext().length(); }
\n { chars++; lines++; }
. { chars++; }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As in the previous example you need to compile both the JFlex file and the generated Java file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[johnny@test example1]$ jflex Lexer.flex
[johnny@test example1]$ javac Lexer.java
[johnny@test example1]$ java Lexer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a simple demo terminal session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The quick brown fox
jumps over the lazy dog.
Chars: 45
Words: 9
Lines: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can download the full code from &lt;a href="https://github.com/Wurdlack/Medium/blob/master/word_count/Lexer.flex" rel="noopener noreferrer"&gt;Github&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>compilers</category>
      <category>programming</category>
      <category>parsing</category>
      <category>java</category>
    </item>
    <item>
      <title>Meet JFlex</title>
      <dc:creator>Vicente Maldonado</dc:creator>
      <pubDate>Sun, 10 Mar 2019 16:16:16 +0000</pubDate>
      <link>https://dev.to/vicentemaldonado/meet-jflex-59gd</link>
      <guid>https://dev.to/vicentemaldonado/meet-jflex-59gd</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftkt7v7x185sfb8pyd20i.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftkt7v7x185sfb8pyd20i.jpeg" width="570" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.jflex.de/" rel="noopener noreferrer"&gt;JFlex&lt;/a&gt; is a scanner generator for Java. A &lt;a href="https://en.wikipedia.org/wiki/Comparison_of_parser_generators#Regular_languages" rel="noopener noreferrer"&gt;scanner generator&lt;/a&gt; will generate a &lt;a href="https://en.wikipedia.org/wiki/Lexical_analysis" rel="noopener noreferrer"&gt;scanner (a.k.a. lexer)&lt;/a&gt; for you instead of you having to write one yourself. JFlex is modeled after &lt;a href="https://en.wikipedia.org/wiki/Flex_(lexical_analyser_generator)" rel="noopener noreferrer"&gt;(f)&lt;/a&gt;&lt;a href="https://en.wikipedia.org/wiki/Lex_(software)" rel="noopener noreferrer"&gt;lex&lt;/a&gt;, but unlike the two older tools it is written in Java and generates Java lexers.&lt;/p&gt;

&lt;p&gt;What is the JFlex workflow?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a JFlex source file (*.flex)&lt;/li&gt;
&lt;li&gt;Use the JFlex command-line tool to compile the file into a Java file&lt;/li&gt;
&lt;li&gt;Use javac to compile the Java file&lt;/li&gt;
&lt;li&gt;Run the resulting *.class file with java&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;et voilà, you have a working scanner/lexer. You can use it as a standalone tool or in combination with other programs — tools like Yacc/Bison commonly expect a scanner to feed them input to work with.&lt;/p&gt;

&lt;p&gt;A JFlex source file is made up of three parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.jflex.de/manual.html#ExampleUserCode" rel="noopener noreferrer"&gt;usercode&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.jflex.de/manual.html#ExampleOptions" rel="noopener noreferrer"&gt;options and declarations&lt;/a&gt; and&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.jflex.de/manual.html#ExampleLexRules" rel="noopener noreferrer"&gt;lexical rules&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;separated by a double percent sign (%%). Here is a simple example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.*;

%%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it for the first part. Oddly enough, if you want to add code to the generated Java class you’ll have to include that code in the middle section of the JFlex file (options and declarations). There’s no magic to it: JFlex creates the lexer based on a template, and the code you put in the first section does not end up as a part of the generated class, which is why you put your import statements here. (You can go wild and put full Java classes there too, but that’s not a very good idea.)&lt;/p&gt;

&lt;p&gt;The code you do put in the middle section of your JFlex file, on the other hand, does end up as a part of the lexer. Let’s add the main method to the class and make it self-contained:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%{

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    System.out.println("Start lexing");

    while (true)
    {
        System.out.println(lexer.yylex());
    }
}

%}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A lexer that JFlex generates needs to be initialized with a Java Reader. In this case we will accept input from System.in, i.e. standard input. By default JFlex generates a class named Yylex with a method named yylex(). Let’s change that (we are still in the middle section of our JFlex file):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%class Lexer
%type String

%%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will make JFlex change the class name to Lexer and yylex() will return a Java String instead of a Yytoken — a class we won’t bother creating.&lt;/p&gt;

&lt;p&gt;The plan here is to make yylex() return any character we type on our keyboard — this is why we specify its %type as String. Then we’ll just use an infinite loop ( while (true) ) to accept characters and immediately print them out.&lt;/p&gt;

&lt;p&gt;Let’s finish with the third section (lexical rules):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[^] { return yytext(); }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;[^] matches any character, yytext() returns the matched character as a string, and the return statement inside the braces makes that string the return value of yylex(), so we are done.&lt;/p&gt;
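
&lt;p&gt;A rough hand-written equivalent of this one-rule lexer might look like the sketch below. It is not what JFlex generates, and EchoSketch and next() are made-up names. It returns null at the end of input, which, as far as I know, is also what a generated yylex() does when %type is an object type, so the loop can stop there instead of running forever:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hand-written stand-in for a lexer with the single rule [^] (hypothetical class).
class EchoSketch
{
    private final Reader reader;

    EchoSketch(Reader reader)
    {
        this.reader = reader;
    }

    // Returns the next character as a String, or null at the end of input.
    String next() throws IOException
    {
        int c = reader.read();
        if (c == -1)
        {
            return null;
        }
        return String.valueOf((char) c);
    }

    public static void main(String[] args) throws IOException
    {
        EchoSketch echo = new EchoSketch(new StringReader("abc"));

        String s = echo.next();
        while (s != null)
        {
            System.out.println(s); // prints a, b and c on separate lines
            s = echo.next();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A StringReader stands in for System.in here just to keep the example repeatable.&lt;/p&gt;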

&lt;p&gt;Here is the complete file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import java.io.*;

%%

%{

public static void main(String[] args) throws IOException
{
    InputStreamReader reader =
        new InputStreamReader(System.in);

    Lexer lexer = new Lexer(reader);

    System.out.println("Start lexing");

    while (true)
    {
        System.out.println(lexer.yylex());
    }
}

%}

%class Lexer
%type String

%%

[^] { return yytext(); }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You need to compile it (watch for jflex error messages in output):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[johnny@test example1]$ jflex Lexer.flex
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;compile the generated Java file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[johnny@test example1]$ javac Lexer.java
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[johnny@test example1]$ java Lexer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s an example terminal session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Start lexing
123
1
2
3

abc 
a
b
c

^C[johnny@test example1]$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use Ctrl-C to stop the program. Of course this is not very exciting, and you don’t need a 570-line Java file (yes, that’s how long the generated lexer is) just to echo characters.&lt;/p&gt;

&lt;p&gt;You can download the source code from &lt;a href="https://github.com/Wurdlack/Medium/blob/master/meet_jflex/Lexer.flex" rel="noopener noreferrer"&gt;Github&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>parsing</category>
      <category>java</category>
      <category>compilers</category>
    </item>
  </channel>
</rss>
