
Vitaliy Akimov

Originally published at codeburst.io

Simpler Transducers for JavaScript

Developers often want to split a computation into several separate stages. The smaller each stage is, the easier it is to reason about, develop, and maintain. For example, we might split some computation into three stages implemented by functions f, g, and k, with the result being input => f(g(k(input))), or R.compose(f, g, k) using Ramda, or any other library with a function-composition operator.

The problem with this approach is the intermediate data passed between the functions: each stage must finish its computation completely before passing its result to the next. The data may be large, or even infinite if it comes from some stream of server requests; with unbounded data, k never returns control. Since this is a common task, there are many solutions, such as Node.js streams, whose .pipe() operation adds a stream transformer to the chain.

A worse approach would be to pass a single value between the stages and mutate it. Shared-data mutation is very difficult to reason about, especially for a recursive structure such as a programming language's abstract syntax tree.
The transducers described in this post can be seen as a simpler solution to the problem: the stages work simultaneously, with no intermediate data and no data mutation.

Transducers are easy to compose. In fact, they are just functions, and function composition is enough: the expressions above (input => f(g(k(input))) and R.compose(f, g, k)) work the same way for transducers. The resulting transducer is a pipeline of computations that receives data from a producer and passes it to a consumer. Producers and consumers may do many things: read or write network data, files, databases, or just in-memory arrays.

The term transducer became popular after its introduction in Clojure in the "Transducers are coming" blog post, and it was ported to JavaScript by a few libraries, including Ramda. Clojure-style transducers are different from the ones described in this post: they transform consumers (called reducers in Clojure), whereas these transducers transform producers. This distinction makes them much simpler to define and use in ES6, thanks to generator functions.

The Clojure transducer types from the original blog post are:

;;reducing function signature
whatever, input -> whatever

;;transducer signature
(whatever, input -> whatever) -> (whatever, input -> whatever)

There is an earlier paper with an example of transducers that transform producers instead of consumers: "Lazy v. Yield: Incremental, Linear Pretty-printing in Haskell". Its data types are:

type GenT e m = ReaderT (e -> m()) m
type Producer m e = GenT e m ()
type Consumer m e = e -> m ()
type Transducer m1 m2 e1 e2 = Producer m1 e1 -> Producer m2 e2

To see that the Consumer there is a reducer from Clojure, substitute the state monad State s a = s -> (a, s) into the Consumer definition:

Consumer (State whatever) input
= input -> State whatever ()
= input -> whatever -> ((), whatever)
= whatever, input -> whatever

The Producer in the paper has a more complex type, because Haskell doesn't have built-in generators.

Fortunately, JavaScript now does. A producer is just any Iterable value: an in-memory array, say, or any generator function. A consumer is a function that takes an Iterable value and interprets it somehow, for example by saving results to a file; the standard Array.from function stores the result in an in-memory Array. The approach works even if the sequence is infinite.
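A minimal sketch of this pairing, with a generator function as the producer and Array.from as the consumer (names here are illustrative, not from the original article):

```javascript
// Producer: any iterable value, here a generator function
function* naturalsUpTo(n) {
  for (let i = 1; i <= n; i++)
    yield i;
}

// Consumer: Array.from drains any iterable into an in-memory array
const result = Array.from(naturalsUpTo(3));
// result is [1, 2, 3]
```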

Transducers take an input producer (an iterator), along with other optional parameters, and return another producer-iterator with another computation stacked on top of it.

A typical pattern is:

function* myFun(parameter, input) {
  // local variable holding this transducer's internal state,
  // with its initial value
  let sum = 0;
  for (const i of input) {
    // use the current value `i` along with the current
    // state `sum` to compute the next output value `o`
    sum += i;
    const o = sum * 10;
    yield o;
  }
}
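Running this pattern with Array.from as the consumer looks like the following sketch (myFun is restated so the example is self-contained):

```javascript
// Restated from the pattern above: a running sum, scaled by 10
function* myFun(parameter, input) {
  let sum = 0;
  for (const i of input) {
    sum += i;
    yield sum * 10;
  }
}

const out = Array.from(myFun(null, [1, 2, 3]));
// out is [10, 30, 60]
```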

For example, the map function, which applies a function to each element, is:

function* map(fun, input) {
  for(const i of input) {
     yield fun(i); 
  }
}

Or filter, which passes on only the elements satisfying some predicate:

function* filter(pred, input) {
  for(const i of input) {
    if (pred(i))
      yield i;  
  }
}
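Because these transducers are plain functions of iterables, composing them is just nesting calls, exactly like f(g(k(input))) earlier. A sketch, with map and filter restated so it is self-contained:

```javascript
function* map(fun, input) {
  for (const i of input) {
    yield fun(i);
  }
}

function* filter(pred, input) {
  for (const i of input) {
    if (pred(i))
      yield i;
  }
}

// Keep the odd numbers, then square them -- plain nested calls
const squaredOdds = Array.from(
  map(x => x * x, filter(x => x % 2 === 1, [1, 2, 3, 4, 5]))
);
// squaredOdds is [1, 9, 25]
```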

Taking the first num elements:

const take = num => function*(input) {
  let count = 0
  for(const i of input) {
    yield i
    if (++count === num)
      return
  }
}
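Since take stops pulling from its input after num elements, it is safe to feed it an infinite producer. A sketch (take restated so the example is self-contained; the naturals generator is illustrative):

```javascript
const take = num => function* (input) {
  let count = 0;
  for (const i of input) {
    yield i;
    if (++count === num)
      return;
  }
};

// An infinite producer -- never exhausted, but take stops early
function* naturals() {
  for (let i = 1; ; i++)
    yield i;
}

const firstThree = Array.from(take(3)(naturals()));
// firstThree is [1, 2, 3]
```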

Next is a more complex chunk function. It receives a stream of arrays of arbitrary lengths and partitions them into arrays of fixed size:

const chunk = size => function*(input) {
  const buf = []
  for(const i of input) {
    buf.push(...i)
    while (buf.length >= size)
      yield buf.splice(0, size)
  }
  if (buf.length)
    yield buf
}
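A quick check of chunk's behavior on ragged input (chunk restated here so the sketch is self-contained):

```javascript
const chunk = size => function* (input) {
  const buf = [];
  for (const i of input) {
    buf.push(...i);
    // emit full chunks as soon as enough elements accumulate
    while (buf.length >= size)
      yield buf.splice(0, size);
  }
  // flush whatever is left as a final, possibly shorter chunk
  if (buf.length)
    yield buf;
};

const chunks = Array.from(chunk(2)([[1, 2, 3], [4], [5, 6, 7]]));
// chunks is [[1, 2], [3, 4], [5, 6], [7]]
```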

Hierarchical data may be handled as well; an example is @effectful/transducer project for JavaScript syntax transformation.

But things are not so simple when some stage contains asynchronous code. This is most likely needed in the producer, because I/O in JavaScript is typically asynchronous. It is possible to call an iterator's next from an asynchronous callback, but not to yield there.

Recently ECMAScript got async generators and the for await-of syntax extension for them. Everything in this story works for async generators too, except for-of is replaced by for await-of. There is a more detailed case study of async generators as transducers in the "Decouple Business Logic Using Async Generators" article.
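An async version of the map transducer from earlier follows the same shape; only the generator and the loop change. A sketch, where mapAsync and fetchNumbers are illustrative names (fetchNumbers stands in for a real asynchronous source such as network reads):

```javascript
// Same map transducer, lifted to async iterables
async function* mapAsync(fun, input) {
  for await (const i of input) {
    yield fun(i);
  }
}

// A stand-in async producer; a real one would await I/O here
async function* fetchNumbers() {
  for (const n of [1, 2, 3])
    yield n;
}

// Draining the async pipeline with for await-of
(async () => {
  const out = [];
  for await (const v of mapAsync(n => n * 2, fetchNumbers())) {
    out.push(v);
  }
  // out is [2, 4, 6]
})();
```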
