You can keep reading here or jump to my blog to get the full experience, including the wonderful pink, blue and white palette.
In the previous post we've created the basic building blocks of a parser combinator library. Now it's time to add some more fancy stuff.
Choice and Failure
Up until now, each parser either parsed successfully or failed. We'd like to have a parser that can try different parsers before giving up and failing.
Luckily, math already did the dirty job for us. In fact, we can use a couple of typeclasses: Alt and Plus.
Alt captures the intuition of a choice between alternatives. It has only one member alt or infix <|>:
instance altParser :: (Functor Parser) => Alt Parser where
    -- alt :: forall a. f a -> f a -> f a
    alt p1 p2 = Parser (\s -> case runParser p1 s of
                                Just (Tuple v s') -> Just $ Tuple v s'
                                Nothing           -> runParser p2 s)
In other words, it tries p1 and falls back to p2 in case of failure. For example
main :: Effect Unit
main = do
  logShow $ runParser (char 'Z' <|> char 's') "string"
  -- (Just (Tuple 's' "tring"))
Plusrepresents failure:
instance plusParser :: (Alt Parser) => Plus Parser where
    -- empty :: forall a. f a
    empty = Parser (\_ -> Nothing)
In the previous post, we represented failure in the fail parser. Turns out we could have used empty in its place:
main :: Effect Unit
main = do
  -- `:: Parser Unit` is needed to make the compiler happy
  logShow $ runParser (empty :: Parser Unit) "string"
  -- Nothing
Putting it together
main :: Effect Unit
main = do
  logShow $ runParser (empty <|> anyChar) "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (anyChar <|> empty) "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (char 'Z' <|> char 's') "string"
  -- (Just (Tuple 's' "tring"))
Attempt Parsing Multiple Times
It would be awesome to be able to use one parser as many times as it can succeed. Let's say we wanted to match any number of spaces. Then we could write
spaces :: Parser Unit
spaces = do
  c <- anyChar
  if c == ' '
    then spaces
    else Parser (\s -> Just (Tuple unit $ fromChars [c] <> s))
In other words, we parse any character, if it's a space we recursively call spaces. Otherwise, we noop by putting back the character in the string.
Or similarly
spaces' :: Parser Unit
spaces' = (char ' ' >>= \_ -> spaces') <|> pure unit
we either succeed at parsing a space and recursively call spaces' or we get out successfully.
Those work
main :: Effect Unit
main = do
  logShow $ runParser spaces "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces "string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces' "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces' "string"
  -- (Just (Tuple Unit "string"))
However, they look ugly. But guess what? Math has our back once again. As a matter of fact, we can use Data.List.many.
To use many, we need to satisfy a couple of constraints. In fact, the signature is
many
  :: forall f a. Alternative f
  => Lazy (f (List a))
  => f a
  -> f (List a)
If we substitute Parser in place of f then we get
many
  :: forall a. Alternative Parser
  => Lazy (Parser (List a))
  => Parser a
  -> Parser (List a)
That means we need the following instances: Alternative Parser and Lazy (Parser (List a)).
Alternative is easy pease because it has no members to implement. In particular, for a type to be an Alternative, it only needs to be both Applicative and Plus. Since Parser implements those, we can just write
instance alternativeParser :: (Applicative Parser, Plus Parser) => Alternative Parser
Attempting to parse an indefinite amount of times requires laziness for the same reason we need laziness to have an infinite list (check the mutual recursive definition of many and some to dig deeper).
In Haskell that is not a problem because the language is lazy by default. In PureScript we need to use Lazy:
instance lazyParser :: Lazy (Parser (List a)) where
    -- defer :: (Unit -> l) -> l
    defer g = Parser (\s -> runParser (g unit) s)
That means now we can write
spaces0 :: Parser Unit
spaces0 = do
  _ <- many $ char ' '
  pure unit
We discard the spaces (_ <- many $ char ' ') because we are not interested in a list of spaces, we just want to consume them:
main :: Effect Unit
main = do
  logShow $ runParser spaces0 "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces0 "string"
  -- (Just (Tuple Unit "string"))
The best part is that now we can also use some to have a parser that needs at least one match to succeed:
spaces1 :: Parser Unit
spaces1 = do
  _ <- some $ char ' '
  pure unit
main :: Effect Unit
main = do
  logShow $ runParser spaces0 "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces0 "string"
  -- Nothing
  
  
  The int Parser
Just for fun, let's put everything together into a parser that parses Ints.
At the top we define int as
int :: Parser Int
int =
  neg <|> nat
In other words, an int is either a negative or a natural. nat is implemented as follows
nat :: Parser Int
nat = do
  xs <- some digit
  case fromString $ fromChars xs of
    Just n -> pure n
    Nothing -> empty 
And neg as follows
neg :: Parser Int
neg = do
  _ <- char '-'
  negate <$> nat
Et voilà
main :: Effect Unit
main = do
  logShow $ runParser int "123 string"
  -- (Just (Tuple 123 " string"))
  logShow $ runParser int "-123 string"
  -- (Just (Tuple -123 " string"))
  logShow $ runParser int "string"
  -- Nothing
The Whole Code
module Main where
import Prelude
import Effect (Effect)
import Effect.Console (logShow)
import Data.Maybe
import Data.Tuple
import Data.String.Yarn hiding (fromString)
import Data.List
import Data.Char.Unicode
import Data.String.CodeUnits
import Control.Alt
import Control.Plus
import Control.Alternative
import Control.Lazy (class Lazy)
import Data.Int
newtype Parser a = Parser (String -> Maybe (Tuple a String))
runParser :: forall a. Parser a -> String -> Maybe (Tuple a String)
runParser (Parser g) s = g s
instance functorParser :: Functor Parser where
    -- map :: forall a b. (a -> b) -> f a -> f b
    map g f = Parser (\s -> case runParser f s of
                              Just (Tuple v s') -> Just $ Tuple (g v) s'
                              Nothing           -> Nothing)
instance applyParser :: (Functor Parser) => Apply Parser where
    -- apply :: forall a b. f (a -> b) -> f a -> f b
    apply fg f = Parser (\s -> case runParser fg s of
                                 Just (Tuple g s') ->
                                   case runParser f s' of
                                     Just (Tuple v s'') -> Just $ Tuple (g v) s''
                                     Nothing            -> Nothing
                                 Nothing           -> Nothing)
instance applicativeParser :: (Apply Parser) => Applicative Parser where
    -- pure :: forall a. a -> f a
    pure x = Parser (\s -> Just $ Tuple x s)
instance bindParser :: (Apply Parser) => Bind Parser where
    -- bind :: forall a b. m a -> (a -> m b) -> m b
    bind m g = Parser (\s -> case runParser m s of
                               Just (Tuple v s') -> runParser (g v) s'
                               Nothing           -> Nothing)
instance monadParser :: (Bind Parser) => Monad Parser
instance altParser :: (Functor Parser) => Alt Parser where
    -- alt :: forall a. f a -> f a -> f a
    alt p1 p2 = Parser (\s -> case runParser p1 s of
                                Just (Tuple v s') -> Just $ Tuple v s'
                                Nothing           -> runParser p2 s)
instance plusParser :: (Alt Parser) => Plus Parser where
    -- empty :: forall a. f a
    empty = Parser (\_ -> Nothing)
instance alternativeParser :: (Applicative Parser, Plus Parser) => Alternative Parser
instance lazyParser :: Lazy (Parser (List a)) where
    -- defer :: (Unit -> l) -> l
    defer g = Parser (\s -> runParser (g unit) s)
fail :: forall a. Parser a
fail = Parser (\_ -> Nothing)
anyChar :: Parser Char
anyChar = Parser (\s -> case toChars s :: List Char of
                          Nil       -> Nothing
                          Cons x xs -> Just $ Tuple x $ fromChars xs)
sat :: (Char -> Boolean) -> Parser Char
sat pred = do
    c <- anyChar
    if pred c then pure c else fail
digit :: Parser Char
digit = sat isDigit
lower :: Parser Char
lower = sat isLower
upper :: Parser Char
upper = sat isUpper
letter :: Parser Char
letter = sat isAlpha
alphanum :: Parser Char
alphanum = sat isAlphaNum
char :: Char -> Parser Char
char c = sat ((==) c)
string :: String -> Parser String
string s =
  map fromChars $ case toChars s :: List Char of
       Nil       -> pure Nil
       Cons x xs -> do
          _ <- char x
          _ <- string $ fromChars xs
          pure $ Cons x xs
spaces :: Parser Unit
spaces = do
  c <- anyChar
  if c == ' '
    then spaces
    else Parser (\s -> Just (Tuple unit $ fromChars [c] <> s))
spaces' :: Parser Unit
spaces' = (char ' ' >>= \_ -> spaces') <|> pure unit
spaces0 :: Parser Unit
spaces0 = do
  _ <- many $ char ' '
  pure unit
spaces1 :: Parser Unit
spaces1 = do
  _ <- some $ char ' '
  pure unit
int :: Parser Int
int =
  neg <|> nat
nat :: Parser Int
nat = do
  xs <- some digit
  case fromString $ fromChars xs of
    Just n -> pure n
    Nothing -> empty
neg :: Parser Int
neg = do
  _ <- char '-'
  negate <$> nat
main :: Effect Unit
main = do
  logShow $ runParser anyChar "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (char 's') "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (char 'Z') "string"
  -- Nothing
  logShow $ runParser digit "3tring"
  -- (Just (Tuple '3' "tring"))
  logShow $ runParser (string "stri") "string"
  -- (Just (Tuple "stri" "ng"))
  logShow $ runParser (string "ZZZ") "string"
  -- Nothing
  logShow $ runParser (char 'Z' <|> char 's') "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (empty :: Parser Unit) "string"
  -- Nothing
  logShow $ runParser (empty <|> anyChar) "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser (anyChar <|> empty) "string"
  -- (Just (Tuple 's' "tring"))
  logShow $ runParser spaces "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces "string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces' "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces' "string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces0 "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces0 "string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces1 "   string"
  -- (Just (Tuple Unit "string"))
  logShow $ runParser spaces1 "string"
  -- Nothing
  logShow $ runParser int "123 string"
  -- (Just (Tuple 123 " string"))
  logShow $ runParser int "-123 string"
  -- (Just (Tuple -123 " string"))
  logShow $ runParser int "string"
  -- Nothing
Outro
Special thanks to Tom for writing "Fantas, Eel, and Specification 10: Alt, Plus, and Alternative" and to Typeclassopedia. Those two provide clear explanations of the intuitions behind Alternative. Also, thanks a bunch to @monoidmusician and @kadblas from fpchat who helped me understand Lazy.
Get the latest content via email from me personally. Reply with your thoughts. Let's learn from each other. Subscribe to my PinkLetter!
 
 
              
 
    
Top comments (0)