UI vs. Data
Reactivity has recently become a dominant paradigm for UI development. It was popularized by UI frameworks such as React, Vue, and Angular. Later, the approach to UI reactivity was revised by frameworks such as Solid.js, Preact, and others, with a focus on fine-grained reactivity.
Today these frameworks are highly refined and polished, yet they have barely scratched the surface of what is possible with reactivity. In this post, I invite you to dive deeper into reactivity beyond the UI - that is, reactivity in data.
Most of the ideas discussed below have been implemented in the reactive library called ChronoGraph and battle-tested in the implementation of a Gantt engine. When the text says “in practice,” it should be read as “in the practice of implementing a Gantt project scheduling engine.”
Reactivity in the data layer uses the same primitives, but it introduces new requirements and, with them, a different set of engineering challenges. Let’s go through the most important ones.
Transactionality
When you calculate data, you will most likely have business logic and rules to enforce. If you modify some signal deep in the reactive graph and the changes propagate to other nodes, some of those rules may be violated, and you may need to revert the whole change.
This is a major difference from UI reactivity, where this requirement is usually absent, and it influences the implementation significantly.
Once this requirement appears, the notions of commit and reject appear as well. It is also no longer possible to mutate dependencies in place during a transaction; all transactional data needs to be tracked separately from the “stable” state.
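A minimal sketch of this commit/reject flow, using a hypothetical API (not ChronoGraph's): writes made during a transaction land in a separate overlay map, and only a successful rule check folds them into the stable state.

```typescript
// Toy transactional signals. Writes during a transaction go into a `pending`
// overlay, tracked separately from the stable state; commit folds the overlay
// in, reject simply discards it.

type Rule = () => boolean

class TxContext {
  private stable = new Map<string, number>()
  private pending: Map<string, number> | null = null
  private rules: Rule[] = []

  signal(name: string, initial: number) {
    this.stable.set(name, initial)
    return {
      // reads see the transactional overlay first, then the stable state
      read: () => this.pending?.get(name) ?? this.stable.get(name)!,
      write: (v: number) => {
        if (!this.pending) throw new Error("write outside a transaction")
        this.pending.set(name, v)
      },
    }
  }

  addRule(rule: Rule) { this.rules.push(rule) }

  // Runs fn transactionally: commit if all business rules hold, reject otherwise.
  transaction(fn: () => void): "commit" | "reject" {
    const pending = new Map<string, number>()
    this.pending = pending
    fn()
    const ok = this.rules.every(r => r())
    if (ok) for (const [k, v] of pending) this.stable.set(k, v)
    this.pending = null
    return ok ? "commit" : "reject"
  }
}

const ctx = new TxContext()
const start = ctx.signal("start", 1)
const end = ctx.signal("end", 5)
ctx.addRule(() => start.read() <= end.read())   // business rule: start <= end

const ok = ctx.transaction(() => end.write(10))   // "commit"
const bad = ctx.transaction(() => end.write(0))   // violates the rule -> "reject"
const endAfter = end.read()                       // 10: the rejected write left no trace
```

Note that the rejected transaction never touches the stable map at all, which is exactly why in-place mutation of dependencies stops being an option.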
The closest concept in current libraries is probably batching, which, if I understand it correctly, applies a group of signal writes as a single update, deferring downstream recomputation until the batch ends.
Explicit reactive context
This requirement follows from the previous one. With reactive data, you still want a reactive UI, but you do not want to mix data and UI: a transaction rejected in the data layer should not reject anything in the UI.
Because of that, you need to be able to explicitly specify which reactive context a given primitive operates in. Most existing systems assume a single global reactive context shared by all primitives.
I’m talking about something like:
const ctx1 = new Reactivity()
const ctx2 = new Reactivity()
const signal1 = new ctx1.Signal(1)
const signal2 = new ctx2.Signal(1)
It remains an open question how dependencies between signals from different reactive contexts should be tracked. Various edge cases are possible, such as one context being deleted while another is still active.
The closest concept in current libraries is probably the createRoot(fn) call, which, if I understand it correctly, defines a reactive context active during the execution of fn. The difference is that you may still need to add signals to the context even after that function has finished executing.
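To make the idea concrete, here is a toy version of the snippet above (the class name and shape are my own, mirroring the hypothetical API rather than any real library): each context keeps its own invalidation bookkeeping, so writing a signal in one context cannot dirty anything in another.

```typescript
// Each Reactivity instance is an isolated reactive context:
// signals register with it, and invalidation stays inside it.
class Reactivity {
  readonly dirty = new Set<string>()

  Signal<T>(name: string, initial: T) {
    let value = initial
    const ctx = this
    return {
      get value() { return value },
      set value(v: T) {
        value = v
        ctx.dirty.add(name)   // invalidation is scoped to this context only
      },
    }
  }
}

const ctx1 = new Reactivity()   // e.g. the data layer
const ctx2 = new Reactivity()   // e.g. the UI layer

const signal1 = ctx1.Signal("s1", 1)
const signal2 = ctx2.Signal("s2", 1)

signal1.value = 42

const dataDirty = Array.from(ctx1.dirty)   // ["s1"]
const uiDirty = Array.from(ctx2.dirty)     // []: the UI context is untouched
```

With transactions layered on top, rejecting ctx1's pending changes would leave ctx2 entirely alone, which is the separation the data/UI split requires.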
Configurable laziness
Most existing libraries make an architectural choice about the laziness of their primitives upfront. For UI, the best choice is usually to make them all lazy, so that reading from a top-level node automatically refreshes all the child nodes it uses.
For data, the situation is more subtle. You need to be able to configure the laziness of primitives.
Imagine the business rules you want to maintain. The signals that compute those rules need to be strict: they all should be evaluated during a transaction to ensure that no business rules are violated. Everything else should remain lazy to avoid unnecessary calculations.
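A sketch of what per-node laziness could look like (an illustrative toy, not ChronoGraph's actual API): nodes flagged as strict are forced at the end of every transaction, while lazy nodes recompute only when somebody reads them.

```typescript
// Toy graph with configurable laziness: strict nodes are evaluated on commit
// (so rule violations surface immediately), lazy nodes evaluate on demand.

type GraphNode<T> = { strict: boolean; dirty: boolean; evaluations: number; read(): T }

class Graph {
  private nodes: GraphNode<unknown>[] = []

  calc<T>(fn: () => T, opts: { strict: boolean }): GraphNode<T> {
    let cached: T | undefined
    const node: GraphNode<T> = {
      strict: opts.strict,
      dirty: true,
      evaluations: 0,
      read() {
        if (node.dirty) { cached = fn(); node.evaluations++; node.dirty = false }
        return cached as T
      },
    }
    this.nodes.push(node)
    return node
  }

  invalidateAll() { for (const n of this.nodes) n.dirty = true }

  // End of transaction: force every strict node; leave lazy ones dirty.
  commit() { for (const n of this.nodes) if (n.strict) n.read() }
}

const g = new Graph()
let start = 1, duration = 4

// business rule: checked eagerly on every commit
const ruleOk = g.calc(() => duration >= 0, { strict: true })
// derived value: computed only if somebody asks for it
const end = g.calc(() => start + duration, { strict: false })

g.invalidateAll()
g.commit()

const ruleEvaluated = ruleOk.evaluations   // 1: forced by commit
const endEvaluated = end.evaluations       // 0: nobody has read it yet
const endValue = end.read()                // 5: computed on demand
```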
The closest concept in current libraries is probably the createEffect call, which, if I understand it correctly, defines a computation that always runs at the end of the current batch and cannot be read by any other node because it does not produce an output.
Duality of primitives
Most existing reactive systems have two kinds of primitives. The first is a signal, a mutable value boxed into a reactive context. The second is a calculation, a lambda in a reactive context. Signals can be writable; calculations cannot.
In practice, this is not enough. Very often, a node needs to be both writable and calculable. Imagine a task in a calendar. It has reactive fields such as startDate, endDate, and duration. The business logic is: (1) all of these fields should be writable; (2) when one field is updated by the user, the others should adapt.
Technically, this can be implemented by using two reactive nodes for every field: one for user input and one for calculation. However, this doubles the number of nodes and may be impractical.
ChronoGraph solves this by implementing a special effect, INPUT_VALUE, for nodes. Conceptually, all nodes become computable, but signals are simply computable nodes that always return their input value:
const writeable = ctx.Calculation(Y => Y.INPUT_VALUE())
const calculable = ctx.Calculation(Y => writeable.value + 1)
const dual = ctx.Calculation(Y => {
return writeable.value === 42 ? Y.INPUT_VALUE() : calculable.value
})
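Here is a runnable toy version of the same idea (the real ChronoGraph API differs; the `lastEdited` flag is an assumption I'm adding to keep the example small): every node is a calculation, and a node becomes "writable" simply by returning its last written input from inside that calculation.

```typescript
// A dual node is a calculation whose body may return the node's own
// written input (the INPUT_VALUE effect) instead of a derived value.

type Ctx<T> = { INPUT_VALUE(): T }

class DualNode<T> {
  private input!: T
  constructor(private calc: (Y: Ctx<T>) => T) {}

  write(v: T) { this.input = v }

  get value(): T {
    return this.calc({ INPUT_VALUE: () => this.input })
  }
}

// startDate and duration are plain inputs; endDate is dual.
const startDate = new DualNode<number>(Y => Y.INPUT_VALUE())
const duration = new DualNode<number>(Y => Y.INPUT_VALUE())

// Hypothetical bookkeeping: which field did the user touch last?
let lastEdited: "endDate" | "other" = "other"

const endDate = new DualNode<number>(Y =>
  lastEdited === "endDate" ? Y.INPUT_VALUE() : startDate.value + duration.value
)

startDate.write(10)
duration.write(5)
const computedEnd = endDate.value   // 15: calculated from the other fields

endDate.write(20)
lastEdited = "endDate"
const writtenEnd = endDate.value    // 20: now behaves as a plain signal
```

One node thus serves both roles, instead of the two-nodes-per-field workaround.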
I do not think there is a corresponding concept in current reactive libraries, largely because this is not relevant to the UI rendering use case. In the data layer, however, this feature can halve the number of signals you need - say, from 2M to 1M - which can significantly reduce pressure on the garbage collector.
Cyclic computations
The duality described above, as demonstrated by the task scheduling example, very easily creates cycles in computations: duration is calculated from startDate and endDate, while endDate is in turn calculated from startDate and duration.
Such cycles need to be handled carefully: all nodes in a cycle must follow the same calculation path, so that the cycle is resolved consistently rather than each node picking its own formula.
This turned out to be a major problem for reactive data, at least in regular business use cases, where forms often contain many interconnected fields and all of those fields must remain editable.
ChronoGraph solves this problem for simple cycles with a fixed number of variables by using a special CycleResolver. Solving the more general case remains a research topic.
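To show the shape of the problem, here is a toy resolver for the start/end/duration cycle (ChronoGraph's actual CycleResolver is more general; the fallback policy of "keep duration fixed" below is my own assumption for illustration). Given which fields the user edited in the transaction, it picks one consistent set of formulas for the whole cycle.

```typescript
// Toy resolver for a fixed three-variable cycle: start + duration = end.
type Fields = { start?: number; end?: number; duration?: number }

// edited: the fields the user wrote in this transaction
// previous: the last stable values of all fields
function resolveCycle(edited: Fields, previous: Required<Fields>): Required<Fields> {
  const has = (k: keyof Fields) => edited[k] !== undefined
  const start = edited.start ?? previous.start
  const end = edited.end ?? previous.end
  const duration = edited.duration ?? previous.duration

  // Two fields edited: the third is derived from them.
  if (has("start") && has("end")) return { start, end, duration: end - start }
  if (has("start") && has("duration")) return { start, duration, end: start + duration }
  if (has("end") && has("duration")) return { end, duration, start: end - duration }
  // One field edited: keep duration fixed and move the interval (assumed policy).
  if (has("start")) return { start, duration, end: start + duration }
  if (has("end")) return { end, duration, start: end - duration }
  if (has("duration")) return { start, duration, end: start + duration }
  return { start, end, duration }
}

const stable = { start: 0, end: 5, duration: 5 }
const moved = resolveCycle({ start: 10 }, stable)   // { start: 10, end: 15, duration: 5 }
const resized = resolveCycle({ end: 3 }, stable)    // { start: -2, end: 3, duration: 5 }
```

The key point is that the choice of formulas is made once per cycle, not per node, which is what "following the same calculation path" amounts to.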
As with dual primitives, I have not seen much discussion of cyclic computations in current reactive libraries. Meanwhile, I believe this problem is one of the main blockers preventing wider adoption of reactive calculations in general.
Turing completeness
Here I mean the ability to modify some signals from inside calculation functions, with the result of that modification being observable within the same transaction.
This requirement may be controversial. But in practice it can be very useful and can greatly simplify the code. During development, we had a situation where this feature reduced the solution to a one-liner; the alternative would have required a major refactoring. Since the implementation is not especially complex, I believe it is a must-have feature.
To implement this feature, one needs to introduce an extra layer inside transactions - called iterations in ChronoGraph. When, during a transaction, a write to an already calculated signal occurs, a new iteration simply begins. Eventually, no new iterations start and the transaction completes.
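A sketch of the iteration loop, following the description above (signals are modeled as a plain record here, and the convergence check is a naive snapshot comparison; real implementations would track individual writes):

```typescript
// Calculations run in rounds ("iterations"). If a calculation writes a signal
// that was already calculated, the changed state simply triggers another
// iteration; the transaction completes once an iteration changes nothing.

const MAX_ITERATIONS = 100   // guard against the endless loops this feature allows

type State = Record<string, number>
type Calc = (s: State) => void

function transaction(initial: State, calcs: Calc[]): { state: State; iterations: number } {
  const state: State = { ...initial }
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const before = JSON.stringify(state)
    for (const c of calcs) c(state)
    if (JSON.stringify(state) === before) return { state, iterations: i }  // fixed point
  }
  throw new Error("transaction did not converge")
}

// 'total' is recomputed from 'price' and 'qty'; a second calculation then caps
// 'qty' based on 'total' -- a write whose effect is observable in the same
// transaction, via the next iteration.
const result = transaction({ price: 10, qty: 7, total: 0 }, [
  s => { s.total = s.price * s.qty },
  s => { if (s.total > 50) s.qty = 5 },
])
// converges with total === 50, qty === 5, after 3 iterations
```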
This feature can easily lead to endless loops and de-optimizations, so it should not be abused.
Node leveling
When modeling the data domain, one of the basic requirements is to support belongs-to and has-many relations in data. For example, an order belongs to a customer, and a customer has many orders.
The belongs-to side is simply a signal. When its value is written, it should send a message to the has-many side. The has-many relation is then a collection of all such signals coming from the appropriate belongs-to signals.
An interesting problem appears here: when processing the has-many side, all messages from the belongs-to side should already have been processed. One solution is to add a concept of level to nodes, with the following rule: before calculating any node of level n, all nodes of level n - 1 must already have been calculated.
This establishes an interesting - and, I would say, natural - ordering of calculations:
- signals at level 0 (directly writable nodes)
- computables at level 1 - they can read only level 0 nodes
- computables at level 2 - they can read only levels 0 and 1
- ...
- finally, a fixed point: computables at level ∞ - they can read values from levels 0, 1, ..., ∞
In ChronoGraph, only three levels are used: 0, 1, and ∞. In a UI system, I imagine more levels could be added, such as ∞ + 1, ∞ + 2, and so on. This could correspond to progressive enhancement of a page, where some parts should be rendered as soon as possible while others can be delayed.
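The ordering rule can be sketched in a few lines (illustrative only; node names and the scheduling strategy are my own): evaluation simply proceeds level by level, with Infinity standing in for the fixed-point level ∞, so by the time the has-many side runs, every belongs-to write has already been delivered.

```typescript
// Level-ordered evaluation: before any node of level n runs,
// all nodes of lower levels have already been calculated.

type LevelNode = { name: string; level: number; run: () => void }

function evaluate(nodes: LevelNode[]): string[] {
  const order: string[] = []
  // Stable sort by level; Infinity (the fixed-point level) naturally sorts last.
  for (const n of [...nodes].sort((a, b) => a.level - b.level)) {
    n.run()
    order.push(n.name)
  }
  return order
}

// belongs-to writes live at level 0; the has-many aggregation at level Infinity.
const messages: string[] = []
const order = evaluate([
  { name: "customer.orders", level: Infinity, run: () => messages.push(`got ${messages.length} messages`) },
  { name: "order1.customer", level: 0, run: () => messages.push("order1 -> customer") },
  { name: "order2.customer", level: 0, run: () => messages.push("order2 -> customer") },
])
// order: ["order1.customer", "order2.customer", "customer.orders"]
```

Despite being declared first, the has-many node runs last and sees both messages.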
Stack inversion
When processing data, one can easily encounter a linked-list data structure. Once such a structure appears, its maximum length in current reactive systems is limited by the language's call stack size. In the best case, the limit is the maximum stack depth divided by 3-5, where 3-5 is the number of intermediate calls a reactive system makes between recursive calculation calls.
In practice, that means 2000-3000 elements, which is often not enough for real-world data.
The solution is to invert the stack and move it into user space. In JavaScript, this can be done with generators. This introduces a new type of signal - generator-based signals. They yield the signals they read, so stack frames do not get exhausted.
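A sketch of the technique (a simplified toy, not ChronoGraph's real implementation): a calculation yields the id of the node it wants to read instead of calling into it, and a flat driver loop resolves the dependency, keeping its own stack in user space. Depth is then bounded by memory rather than by the native call stack.

```typescript
// Generator-based calculations over a linked list: node i depends on node i + 1.
// A recursive implementation would overflow at a few thousand elements.

type NodeId = number
const LENGTH = 100_000

function* calc(id: NodeId): Generator<NodeId, number, number> {
  if (id === LENGTH - 1) return 1
  const next = yield id + 1   // "read" the next node without a recursive call
  return next + 1
}

function compute(root: NodeId): number {
  const cache = new Map<NodeId, number>()
  // The stack lives on the heap, as plain data.
  const stack: Array<{ id: NodeId; gen: Generator<NodeId, number, number> }> = [
    { id: root, gen: calc(root) },
  ]
  let sent: number | undefined
  while (stack.length > 0) {
    const top = stack[stack.length - 1]
    const step = sent === undefined ? top.gen.next() : top.gen.next(sent)
    sent = undefined
    if (step.done) {
      cache.set(top.id, step.value)
      stack.pop()
      sent = step.value                  // feed the result back to the parent
    } else {
      const cached = cache.get(step.value)
      if (cached !== undefined) sent = cached
      else stack.push({ id: step.value, gen: calc(step.value) })
    }
  }
  return cache.get(root)!
}

const depth = compute(0)   // 100000: far beyond the native stack limit
```

The cache doubles as memoization, so shared dependencies are calculated once, just as in a normal reactive graph.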
Conclusion
As you may have noticed, reactivity in the data layer is a much broader topic than reactivity in the UI. Since even UI reactivity has not yet settled into a common standard, it is no surprise that reactivity in data remains largely unexplored.
ChronoGraph is a first attempt to apply the reactivity paradigm to regular business applications, and it has been successful enough to power a real-world business system.
At the same time, during implementation we encountered certain challenges that suggest limits on the kinds of tasks reactive data can solve. But that is a topic for another blog post.
If you have a system or a use case that could be solved with reactivity in the data layer, please let me know in the comments. And if you’ve worked with reactivity in data in your own applications, I’d love to hear about your experience.
Thanks for reading!