The CPS transformer from Chapter 6 of EOPL3 reuses the LETREC language from Chapter 3 with the addition of multiargument procedures and multideclaration letrec expressions. I already wrote a parser for it when I solved exercise 3.33 so I decided to start writing my implementation of the CPS transformer by basing my implementation on LETREC-3.33.
Before starting on the main task I decided to refactor its parser. I extracted a lexer, Lexer.hs, which simplified the parser alot but then I was left with parsing expressions that worked but felt a bit icky.
For e.g.
ifExpr :: Parser Expr
ifExpr = If <$> (rIf *> expr) <*> (rThen *> expr) <*> (rElse *> expr)
In particular, I didn't like the placement of the parentheses and how it grouped the terminals with the expression to the right of it. It didn't match how I thought of the parser doing a left-to-right parse of the string. This made me wonder if thinking in terms of keep and skip from elm/parser would improve the situation.
keep and skip from elm/parser
keep is implemented by the function (|=) : Parser (a -> b) -> Parser a -> Parser b and it is used to keep values in a parser pipeline.
skip is implemented by the function (|.) : Parser keep -> Parser ignore -> Parser keep and it is used to skip values in a parser pipeline.
You typically use them in a pipeline along with succeed as follows:
ifExpr : Parser Expr
ifExpr =
P.succeed If
|. rIf
|. leftParen
|= P.lazy (\_ -> expr)
|. rightParen
|= block
|= optional
( P.succeed identity
|. rElse
|= block
)
N.B. This example was taken from Monkey/Parser.elm.
I personally find that to be easier to read and understand. It's also easier to write. You just have to think about what you want to keep and what you want to skip. And, you don't have to worry about the proper placement of parentheses.
Finding equivalents in Haskell
pure :: a -> f a corresponds with succeed.
(<*>) :: f (a -> b) -> f a -> f b corresponds with (|=).
(<*) :: f a -> f b -> f a corresponds with (|.).
Hence,
If <$> (rIf *> expr) <*> (rThen *> expr) <*> (rElse *> expr)
can be rewritten as:
pure If <* rIf <*> expr <* rThen <*> expr <* rElse <*> expr
Since (<*) and (<*>) are both left associative and have the same precendence the above is interpreted as:
(((((((pure If) <* rIf) <*> expr) <* rThen) <*> expr) <* rElse) <*> expr)
Starting with the innermost parenthesized expression and moving outward, it says:
- Lift
Ifinto the applicative - Parse
"if"but skip its value - Parse
exprand keep its value - Parse
"then"but skip its value - Parse
exprand keep its value - Parse
"else"but skip its value - Parse
exprand keep its value
The key takeaway for me is that you write, read, and understand it from left-to-right.
About (<$)
And then I learned about (<$) :: a -> f b -> f a and noticed that:
pure x <* y
is equivalent to
x <$ y
So now:
pure If <* rIf <*> expr <* rThen <*> expr <* rElse <*> expr
can be rewritten as:
If <$ rIf <*> expr <* rThen <*> expr <* rElse <*> expr
Conclusion
Thinking in terms of keep and skip helped me to improve the readability of all my parsers. An unexpected consequence is that they are also easier to write now. I simply look at the non-terminal for which I want to write the parser and determine what I want to keep and what I want to skip.
Grammar:
Program ::= Expr
Expr ::= Const | Var | Diff | Zero | If | Let | Proc | Call | Letrec
Const ::= Number
Var ::= Id
Diff ::= '-' '(' Expr ',' Expr ')'
Zero ::= 'zero?' '(' Expr ')'
If ::= 'if' Expr 'then' Expr 'else' Expr
Let ::= 'let' Id '=' Expr 'in' Expr
Proc ::= 'proc' '(' Id ')' Expr
Call ::= '(' Expr Expr* ')'
Letrec ::= 'letrec' (Id '(' (Id (',' Id)*)? ')' '=' Expr)* 'in' Expr
Number ::= [0-9]+
Id ::= [a-z]+
Parser:
program :: Parser Program
program = Program <$ whiteSpace <*> expr <* eof
expr :: Parser Expr
expr
= constExpr
<|> diffExpr
<|> zeroExpr
<|> ifExpr
<|> letrecExpr
<|> letExpr
<|> procExpr
<|> callExpr
<|> varExpr
constExpr :: Parser Expr
constExpr = Const <$> number
diffExpr :: Parser Expr
diffExpr = hyphen *> parens (Diff <$> expr <* comma <*> expr)
zeroExpr :: Parser Expr
zeroExpr = Zero <$ rZero <*> parens expr
ifExpr :: Parser Expr
ifExpr = If <$ rIf <*> expr <* rThen <*> expr <* rElse <*> expr
letrecExpr :: Parser Expr
letrecExpr = Letrec <$ rLetrec <*> many recProc <* rIn <*> expr
where
recProc = (,,) <$> identifier <*> parens (commaSep identifier) <* equal <*> expr
letExpr :: Parser Expr
letExpr = Let <$ rLet <*> identifier <* equal <*> expr <* rIn <*> expr
procExpr :: Parser Expr
procExpr = Proc <$ rProc <*> parens identifier <*> expr
callExpr :: Parser Expr
callExpr = parens (Call <$> expr <*> many expr)
varExpr :: Parser Expr
varExpr = Var <$> identifier
N.B. See Letrec/Parser.hs for the full source code.
P.S. It seems I discovered this improvement 4 years ago, when I implemented Monkey in Haskell, but forgot about it. Hopefully, writing about it will help me remember to use it in the future.
Subscribe to my newsletter
If you're interested in improving your skills with Elm then I invite you to subscribe to my newsletter, Elm with Dwayne. To learn more about it, please read this announcement.
Top comments (0)