In the previous post, I introduced an implementation of the parser class. This post is about some basic parsers.
If we break the parsing process down into its simplest components, a pattern emerges: these components represent the simplest operations a parser can perform, and they can be combined to form larger and more complicated patterns.
First, we need the most basic and most important one of all: a parser that matches one character.
The char parser
It matches one character in the input string.
We need to define a parser using our Parser class from before.
const char = char =>
  new Parser(state => {
    // logic goes here
  });
Then we need to match the current character against the given one. Here I'll use a RegExp to do the matching.
const char = char =>
  new Parser(state => {
    const match = new RegExp(`^${char}$`).test(state.charAt(state.index));
  });
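To see how the anchored pattern behaves on a single character, here is a minimal sketch. It assumes that `state.charAt` simply delegates to the input string's own `charAt`, and `matchesAt` is a hypothetical helper used only for illustration, not part of the library.

```javascript
// Hypothetical helper: the same anchored-regex check, applied to a plain string.
// Assumption: state.charAt(i) behaves like String.prototype.charAt(i).
const matchesAt = (pattern, input, index) =>
  new RegExp(`^${pattern}$`).test(input.charAt(index));

const digitAtOne = matchesAt('[0-9]', 'a1', 1);  // '1' is a digit
const digitAtZero = matchesAt('[0-9]', 'a1', 0); // 'a' is not a digit
const pastTheEnd = matchesAt('[0-9]', 'a1', 2);  // charAt past the end returns ''

console.log(digitAtOne, digitAtZero, pastTheEnd); // true false false
```

Because `charAt` returns an empty string past the end of the input, a pattern that can match the empty string (for example `x*`) would also report a match there, so it seems safest to pass patterns that match exactly one character.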
After that, the parser returns a new state with an updated position and status.
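const char = char =>
  new Parser(state => {
    // `match` is a boolean; the arithmetic below coerces it to 0 or 1.
    const match = new RegExp(`^${char}$`).test(state.charAt(state.index));
    return state
      // 1 << 0 = 1 on a match, 1 << 2 = 4 on a mismatch
      .withStatus(1 << (!match + !match))
      // advance by one character only on a match
      .withIndex(state.index + match);
  });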
char takes a "regex" that represents one character as input, matches the current character against it, then returns a new state based on the result.
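To try the parser end to end, here is a self-contained sketch. The `State` and `Parser` classes below are hypothetical stand-ins written only for this example (the real ones come from the previous post and may differ), and the `run` method and the initial status of `1` are assumptions.

```javascript
// Hypothetical stand-in for the state object: immutable, with the
// methods the char parser relies on (charAt, withStatus, withIndex).
class State {
  constructor(input, index = 0, status = 1) {
    this.input = input;
    this.index = index;
    this.status = status; // assumption: 1 is the initial "ok" status
  }
  charAt(i) { return this.input.charAt(i); }
  withStatus(status) { return new State(this.input, this.index, status); }
  withIndex(index) { return new State(this.input, index, this.status); }
}

// Hypothetical stand-in for the Parser class: wraps a state => state function.
class Parser {
  constructor(transform) { this.transform = transform; }
  run(input) { return this.transform(new State(input)); }
}

// Same logic as the char parser above; the parameter is renamed
// to `pattern` only to avoid shadowing the function name.
const char = pattern =>
  new Parser(state => {
    const match = new RegExp(`^${pattern}$`).test(state.charAt(state.index));
    return state
      .withStatus(1 << (!match + !match))
      .withIndex(state.index + match);
  });

const digit = char('[0-9]');
const ok = digit.run('7up');   // first character is a digit
const fail = digit.run('up7'); // first character is not a digit

console.log(ok.index, ok.status);     // 1 1
console.log(fail.index, fail.status); // 0 4
```

On a match the index advances by one and the status is `1`; on a mismatch the index stays put and the status is `4`.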
In the coming posts, I'll discuss what exactly the state is and implement more complex parsers.
For the full code, take a look at pari.
pari
More than a simple parser combinator.
Install with npm.
npm i pari
Usage and basic parsers
You can read the source in src/. It's self-documenting and easy to read.
Here is a simple overview.
import {
  char,
  firstOf,
  sequence,
  zeroOrOne,
  oneOrMore,
  zeroOrMore
} from 'pari';
// the `char` parser matches one char.
// it takes a `regex` that matches exactly one char.
const digit = char('[0-9]');
// `firstOf` parser returns the first match in a list of parsers.
const lowerCase = char('[a-z]');
const digitOrLwcase = firstOf([digit, lowerCase]);
// `sequence` parser matches a list of parsers in sequence.
const hex = char('[0-9a-fA-F]');
const byteHex = sequence([char('0'), char('x'), hex, hex]);
…thanks for reading 😄.