DEV Community

0xc0Der

Building a simple word counting parser using pari.

In this post I'll implement a simple parser that counts the words and lines in its input.

First, we need to define what we consider whitespace.

// The backslashes are doubled because this string
// will be passed to the `char` parser later.
const whitespace = ' \\r\\n\\t\\f\\v';
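To see what the doubled backslashes do, here is a minimal sketch that uses the built-in `RegExp` directly. It assumes `char` turns its argument into a regular expression character class, which is a guess based on how it is called later in this post, not something confirmed here.

```javascript
const whitespace = ' \\r\\n\\t\\f\\v';

// The string holds literal backslashes: one space, then `\r`, `\n`, `\t`,
// `\f` and `\v`, each stored as two characters.
console.log(whitespace.length); // 11

// Assumption: the string ends up inside a regex character class.
const wsRe = new RegExp(`[${whitespace}]`);
const wordRe = new RegExp(`[^${whitespace}]`);

console.log(wsRe.test('\n'));  // true
console.log(wsRe.test('a'));   // false
console.log(wordRe.test('a')); // true
```

The regex engine, not the JavaScript string literal, is what interprets `\n` as a newline here.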

Then we define what a word is:

a word is a sequence of non-whitespace characters.

The definitions of the word and space parsers follow from that.

import { char, oneOrMore } from 'pari';

// ...

const wordChar = char(`[^${whitespace}]`);
const wsChar = char(`[${whitespace}]`);

const word = oneOrMore(wordChar);
const space = oneOrMore(wsChar);
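As a rough illustration of what these two parsers accept, the same definitions can be written with a plain regular expression, without pari. `tokenize` is a hypothetical helper that exists only in this sketch.

```javascript
// A word is a run of non-whitespace characters;
// a space is a run of whitespace characters.
const tokenRe = /[^ \r\n\t\f\v]+|[ \r\n\t\f\v]+/g;

// Hypothetical helper: split the input into alternating word and space runs.
function tokenize(input) {
    return input.match(tokenRe) ?? [];
}

console.log(tokenize('hello  world\n'));
// [ 'hello', '  ', 'world', '\n' ]
```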

So, how do we keep count? We need to define a custom parser State.

import { State, ... } from 'pari';

class CounterState extends State {
    #wordsCount = 0;
    #linesCount = 0;

    // State must have a `clone` method.
    clone() {
        const state = new CounterState(
            this.input,
            this.index,
            this.status
        );

        state.#wordsCount = this.#wordsCount;
        state.#linesCount = this.#linesCount;

        return state;
    }

    get wordsCount() {
        return this.#wordsCount;
    }

    get linesCount() {
        return this.#linesCount;
    }

    withIncWords() {
        this.#wordsCount += 1;
        return this;
    }

    withIncLines() {
        this.#linesCount += 1;
        return this;
    }
}

// ...
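One likely reason `clone` has to copy the counters: a parser that tries alternatives needs an independent copy of the state, so that counts from one attempt do not leak into another. The sketch below shows that independence with a hypothetical `Counter` class that has no pari dependency. It is a stand-in for `CounterState`, not the real thing.

```javascript
// Hypothetical stand-in for CounterState, without the pari base class.
class Counter {
    #wordsCount = 0;

    clone() {
        const copy = new Counter();
        // Private fields are accessible on other instances of the same class.
        copy.#wordsCount = this.#wordsCount;
        return copy;
    }

    get wordsCount() {
        return this.#wordsCount;
    }

    withIncWords() {
        this.#wordsCount += 1;
        return this;
    }
}

const before = new Counter().withIncWords();
const attempt = before.clone().withIncWords();

console.log(before.wordsCount);  // 1
console.log(attempt.wordsCount); // 2
```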

In the space parser, we increase the line count by one whenever we encounter a newline character, and increase the word count by one after the last space of a run (there may be several consecutive spaces).

// ...

const space = oneOrMore(wsChar.ok(state => 
    state.charAt(state.index - 1) === '\n'
        ? state.withIncLines()
        : state
)).ok(state => state.withIncWords());

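In the word parser, we need to handle one edge case: the end of input. A word that ends the input is not followed by any whitespace, so it has to be counted here, along with the final line.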

//...

const word = oneOrMore(wordChar.ok(state =>
    state.charAt(state.index) === ''
        ? state.withIncWords().withIncLines()
        : state
));
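To check the counting rules by hand, here is a plain JavaScript sketch that mirrors them without pari. It assumes that an `ok` callback runs after a successful match, that `state.index` then points just past the consumed character, and that `charAt` returns an empty string past the end of the input. `countWordsAndLines` is a hypothetical helper, not part of pari.

```javascript
const isSpace = ch => ' \r\n\t\f\v'.includes(ch);

function countWordsAndLines(input) {
    let words = 0;
    let lines = 0;
    let i = 0;

    while (i < input.length) {
        if (isSpace(input[i])) {
            // Space rule: count each newline, then one word for the whole run.
            while (i < input.length && isSpace(input[i])) {
                if (input[i] === '\n') lines += 1;
                i += 1;
            }
            words += 1;
        } else {
            // Word rule: only a word that ends the input is counted here.
            while (i < input.length && !isSpace(input[i])) i += 1;
            if (i === input.length) {
                words += 1;
                lines += 1;
            }
        }
    }
    return { words, lines };
}

console.log(countWordsAndLines('hello world\nfoo')); // { words: 3, lines: 2 }
console.log(countWordsAndLines('a b\n'));            // { words: 2, lines: 1 }
```

In this sketch a leading run of whitespace also adds one to the word count, so the rules only give the expected numbers for input that starts with a word.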

Finally, we define our word counter parser and pass it a state that holds the input.

// ...
import { firstOf, ... } from 'pari';

const wc = oneOrMore(firstOf([word, space]));

const input = ...;

const result = wc.process(
    new CounterState(input)
);

// print word and line counts
console.log(
    result.wordsCount,
    'words',
    result.linesCount,
    'lines'
);

Thank you for reading 😄. If you have any questions, do not hesitate to leave a comment.
