DEV Community

Iven Marquardt
Iven Marquardt

Posted on

How to fold a string in Javascript?

A naive functional implementation would look like this:

const strFold = f => acc => ([head, ...tail]) =>
  head === undefined
    ? acc
    : strFold(f) (f(acc) (head)) (tail);

const countVowels = n => x =>
  "aoiue".search(x) === -1
    ? n
    : n + 1;

strFold(countVowels) (0) ("hello"); // 2

run code

However, this is neither stack-safe nor does it utilize the special trait of Javascript strings: They are different from arrays.

We don't want to consume each character one by one but chunks with any number of characters. Hence our string fold should rely on regular expressions. Let's do this:

const strFoldChunk = rx => f => acc => s => {
  const ry = new RegExp( // clone
    rx.source,
    rx.flags[0] !== "g"
      ? "g" + rx.flags
      : rx.flags);

  let r, acc_ = acc;

  while (r = ry.exec(s)) { // implicit local mutation by exec
    acc_ = f(acc_) (r[0]);
  }

  return acc_;
};

const strMatchAll = rx =>
  strFoldChunk(rx)
    (x => y => x + y)
      ("");

const strMatchLast = rx => s =>
  strFoldChunk(rx)
    (_ => x => x)
      ("")
        (s);

strMatchAll(/a\d/) ("aaa1a2a3a4a5"); // "a1a2a3a4a5"
strMatchLast(/a\d/) ("aaa1a2a3a4a5"); // "a5"

run code

We managed to consume chunks of characters and additionally abstracted from recursion with stack-safe folds. This is promising.

However, on second thoughts we cannot derive strMatch or strMatchNth from strFoldChunk efficiently, because this would need the folding to stop at the 1st or nth element. A fold has run to completion semantics, though. It traverses the entire structure.

Let's take another step and introduce lazy evaluated right folds. Please note that the thunk function creates a value that is not evaluated until it is needed by a computation. If thunk is evaluated once, the result is stored and reused for further accesses. Here is the implementation:

const strict1 = thunk =>
  thunk && thunk[THUNK]
    ? thunk.valueOf()
    : thunk;

const thunk = f =>
  new Proxy(f, new ThunkProxy());

// simplyfied version

class ThunkProxy {
  constructor() {
    this.memo = undefined;
  }

  get(g, k) {
    if (this.memo === undefined) {
      this.memo = g();

      while (this.memo && this.memo[THUNK])
        this.memo = this.memo.valueOf();
    }

    if (k === THUNK)
      return true;

    else if (k === Symbol.toPrimitive)
      return this.memo[Symbol.toPrimitive];

    else if (k === "valueOf")
      return () => this.memo;

    else return this.memo[k];
  }
}

const THUNK = "thunk";

const union = type => (tag, o) =>
  (o[type] = type, o.tag = tag.name || tag, o);

const match = (tx, o) =>
  o[tx.tag] (tx);List = union("List");

const Nil = List("Nil", {});

const Cons = head => tail =>
  List(Cons, {head, tail});

const strFoldChunkr = rx => f => acc => s => {
  const ry = new RegExp( // clone
    rx.source,
    rx.flags[0] !== "g"
      ? "g" + rx.flags
      : rx.flags);

  const go = r =>
    r === null
      ? Cons(acc) (NIL)
      : f(r[0]) (thunk(() => go(ry.exec(s))));

  return go(ry.exec(s));
};

const listFoldr = f => acc => xs => {
  const go = (xs, i) =>
    match(xs, {
      Nil: _ => acc,
      Cons: ({head, tail}) => f(head, i) (thunk(() => go(tail, i + 1)))
    });

  return go(xs, 0);
};

const strMatch = rx => s =>
  strFoldChunkr(rx)
    (Cons)
      ("")
        (s).head;

const strMatchNth = rx => n => s =>
  listFoldr((head, i) => tail =>
    i === n
      ? head
      : strict1(tail)) // explicitly encforce evaluation
        ("")
          (strFoldChunkr(rx)
            (Cons)
              ([])
                (s));

strMatch(/a\d/) ("aaa1a2a3a4a5"); // "a1"
strMatchNth(/a\d/) (2) ("aaa1a2a3a4a5"); // "a3"

run code

I know, this implementation is a bit more involved. Lazy evaluation is hard to understand, if you are unfamiliar with the concept. Basically the algorithm stops as soon as the criterion (evaluated the 1st and nth element respectively) is reached. The lazy behavior resembles short circuiting in a eagerly evaluated languages.

We successfully managed to push the limits of Javascript and this is only the beginning. Much more is possible.

Top comments (0)