A naive functional implementation would look like this:
const strFold = f => acc => ([head, ...tail]) =>
head === undefined
? acc
: strFold(f) (f(acc) (head)) (tail);
const countVowels = n => x =>
"aoiue".search(x) === -1
? n
: n + 1;
strFold(countVowels) (0) ("hello"); // 2
However, this is neither stack-safe nor does it utilize the special trait of Javascript strings: They are different from arrays.
We don't want to consume each character one by one but chunks with any number of characters. Hence our string fold should rely on regular expressions. Let's do this:
const strFoldChunk = rx => f => acc => s => {
const ry = new RegExp( // clone
rx.source,
rx.flags[0] !== "g"
? "g" + rx.flags
: rx.flags);
let r, acc_ = acc;
while (r = ry.exec(s)) { // implicit local mutation by exec
acc_ = f(acc_) (r[0]);
}
return acc_;
};
const strMatchAll = rx =>
strFoldChunk(rx)
(x => y => x + y)
("");
const strMatchLast = rx => s =>
strFoldChunk(rx)
(_ => x => x)
("")
(s);
strMatchAll(/a\d/) ("aaa1a2a3a4a5"); // "a1a2a3a4a5"
strMatchLast(/a\d/) ("aaa1a2a3a4a5"); // "a5"
We managed to consume chunks of characters and additionally abstracted from recursion with stack-safe folds. This is promising.
However, on second thoughts we cannot derive strMatch
or strMatchNth
from strFoldChunk
efficiently, because this would need the folding to stop at the 1st or nth element. A fold has run to completion semantics, though. It traverses the entire structure.
Let's take another step and introduce lazy evaluated right folds. Please note that the thunk
function creates a value that is not evaluated until it is needed by a computation. If thunk
is evaluated once, the result is stored and reused for further accesses. Here is the implementation:
const strict1 = thunk =>
thunk && thunk[THUNK]
? thunk.valueOf()
: thunk;
const thunk = f =>
new Proxy(f, new ThunkProxy());
// simplyfied version
class ThunkProxy {
constructor() {
this.memo = undefined;
}
get(g, k) {
if (this.memo === undefined) {
this.memo = g();
while (this.memo && this.memo[THUNK])
this.memo = this.memo.valueOf();
}
if (k === THUNK)
return true;
else if (k === Symbol.toPrimitive)
return this.memo[Symbol.toPrimitive];
else if (k === "valueOf")
return () => this.memo;
else return this.memo[k];
}
}
const THUNK = "thunk";
const union = type => (tag, o) =>
(o[type] = type, o.tag = tag.name || tag, o);
const match = (tx, o) =>
o[tx.tag] (tx);List = union("List");
const Nil = List("Nil", {});
const Cons = head => tail =>
List(Cons, {head, tail});
const strFoldChunkr = rx => f => acc => s => {
const ry = new RegExp( // clone
rx.source,
rx.flags[0] !== "g"
? "g" + rx.flags
: rx.flags);
const go = r =>
r === null
? Cons(acc) (NIL)
: f(r[0]) (thunk(() => go(ry.exec(s))));
return go(ry.exec(s));
};
const listFoldr = f => acc => xs => {
const go = (xs, i) =>
match(xs, {
Nil: _ => acc,
Cons: ({head, tail}) => f(head, i) (thunk(() => go(tail, i + 1)))
});
return go(xs, 0);
};
const strMatch = rx => s =>
strFoldChunkr(rx)
(Cons)
("")
(s).head;
const strMatchNth = rx => n => s =>
listFoldr((head, i) => tail =>
i === n
? head
: strict1(tail)) // explicitly encforce evaluation
("")
(strFoldChunkr(rx)
(Cons)
([])
(s));
strMatch(/a\d/) ("aaa1a2a3a4a5"); // "a1"
strMatchNth(/a\d/) (2) ("aaa1a2a3a4a5"); // "a3"
I know, this implementation is a bit more involved. Lazy evaluation is hard to understand, if you are unfamiliar with the concept. Basically the algorithm stops as soon as the criterion (evaluated the 1st and nth element respectively) is reached. The lazy behavior resembles short circuiting in a eagerly evaluated languages.
We successfully managed to push the limits of Javascript and this is only the beginning. Much more is possible.
Top comments (0)