Ariane

Posted on May 31, 2020 • Edited on Jun 1, 2020

To Understand React Fiber, You Need to Know About Threads

#react #javascript #computerscience

A not-so-brief introduction to (some) React Fiber fundamentals, and the CS concepts that it's modelled on.

A Little Background

It's important to understand that I'm approaching this topic as a total beginner. When I worked on my first React project, I felt this very strong desire to understand how React works. I think intuitively I could feel how powerful React was, but intellectually I wanted to understand WHY we need React for modern development and what powers it "under the hood". So, this article aims to make sense of that.

I relied heavily on the following sources to write this article:

Lin Clark's A Cartoon Intro to Fiber

Philip Roberts's What The Heck Is the Event Loop Anyway?

Max Koretskyi's The how and why on React’s usage of linked list in Fiber to walk the component’s tree

Andrew Clark's React Fiber Architecture

Understanding Fiber hinges on the following concepts: (Cooperative) Scheduling, Threads, and Linked Lists. I've added those and a couple of other terms to an appendix, and you can refer to them when you need!

So starting at the beginning, what is React, and what is React Fiber?

React is a JavaScript library that helps developers build complex, modern UIs.

Fiber refers to React's data structure/architecture. Fiber made it possible for React to implement a new reconciliation algorithm. It improves perceived performance for complex React applications.

What?

Ok yeah, that was a mouthful.

What is a reconciliation algorithm?

When we talk about reconciliation in the context of the browser, we're trying to reconcile what's currently rendered on the page, and what should/will be rendered next.

The DOM - the Document Object Model - is a browser interface that allows programs and scripts to manipulate what's rendered on a web page. The DOM can be manipulated using vanilla JS, but libraries like React aim to make that manipulation easier.

As UIs have become more complex, rendering and the data that's required for it has been broken into smaller and smaller components. On a modern web app (say Facebook) if you click on a button, it's not likely that as a user you expect to navigate to a whole other page. It's more likely that when you click a button to like a post you expect to see the number of likes go up, or as you type a new post, you expect to see your words appear in that input.

Rendering your words live as you type them is actually easily done without any JS at all. The problem is that, as the user, you expect more than that. When you submit that post, you expect to see it on that same page along with all the other posts that were already there. You expect to see when someone else likes a different post or another user posts to your timeline, and when you hover over a post you want to see a list of emoji reactions you can click, and so on. Suddenly, using the DOM to keep track of those small components and the state of their data becomes very complicated.

So how did React make it easier to render these smaller components?

Instead of having to tell the browser HOW to get from one render to the next, React made it so developers could simply declare what they wanted the next render to look like, and React would make it so!
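Here is a rough sketch of that difference in plain JavaScript, with no React and no real DOM involved. The `fakeDom` object and both function names are made up for illustration, and the object returned by `LikeCounter` only loosely resembles what a React component describes.

```javascript
// Hypothetical stand-in for a DOM node, so the sketch runs anywhere.
const fakeDom = { likeCount: { textContent: "0" } };

// Imperative: we spell out HOW to get from the old UI to the new one.
function likeImperatively() {
  const current = Number(fakeDom.likeCount.textContent);
  fakeDom.likeCount.textContent = String(current + 1);
}

// Declarative: we only describe WHAT the UI should look like for a given
// state. Working out which DOM calls that requires is left to the library.
function LikeCounter(state) {
  return { type: "span", props: { children: String(state.likes) } };
}

likeImperatively();
console.log(fakeDom.likeCount.textContent); // "1"
console.log(LikeCounter({ likes: 1 })); // { type: 'span', props: { children: '1' } }
```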

To do this, React created a component tree, and when it was notified that a change needed to be rendered, React would traverse the tree telling the DOM to render specific nodes that needed to be added or updated. What's important to understand here is how React was traversing the component tree and updating the DOM before Fiber.

Figure: A Component Tree (Image Source)

"React implemented a recursive algorithm that would call mount component or update component on the DOM until it got to the bottom of the tree." - Lin Clark

Before Fiber, React didn't separate the process of reconciliation from rendering to the DOM. As a result, the "main thread" (JavaScript runs on a single thread) would get stuck at the bottom of the call stack. In other words, React was calling the DOM to render synchronously, and it couldn't pause this traversal partway through to go handle a different render, so frames in the browser would get dropped.

This first version of React's reconciliation algorithm was retroactively termed the 'Stack Reconciler', which illustrates how it operated.

What did it mean for the main thread to get stuck at the bottom of the call stack?

It meant that if, for instance, a component needed to be changed but React hadn't completed traversing the tree from a previous call to render, then it wouldn't be able to handle that change until it had completed traversal.

Without the option to interrupt reconciliation, no new changes could be "inserted" into the stack, effectively blocking any other (potentially higher priority) changes from being made until the stack was cleared.
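To make the blocking concrete, here is a minimal sketch of a recursive walk. It is not React's code: `mountOrUpdate` and the shape of the tree are invented for the example, and the only point is that the position in the tree lives in the call stack.

```javascript
// A made-up component tree: each node has a name and an array of children.
const tree = {
  name: "App",
  children: [
    { name: "Header", children: [] },
    { name: "Feed", children: [{ name: "Post", children: [] }] },
  ],
};

const visited = [];

// Hypothetical stand-in for "mount component or update component".
function mountOrUpdate(node) {
  visited.push(node.name);
  // Each child is handled by a nested function call, so our position in the
  // tree exists only as frames on the JavaScript call stack. There is no way
  // to stop halfway, let the browser handle a click, and resume afterwards.
  for (const child of node.children) {
    mountOrUpdate(child);
  }
}

mountOrUpdate(tree);
console.log(visited); // [ 'App', 'Header', 'Feed', 'Post' ]
```

Once `mountOrUpdate(tree)` starts, nothing else can run on the main thread until the outermost call returns.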

Enter Fiber.

The Fiber architecture can solve blocking (and a host of other problems) because Fiber made it possible to split reconciliation and rendering to the DOM into two separate phases.

Phase 1 is called Reconciliation/Render.
Phase 2 is called Commit.

Admittedly, it's a bit confusing that rendering is referred to in phase one, but let's iron that out.

In phase one, React is called to render new and/or updated components (it can also perform other types of work that I won't get into). React will schedule the work to be done (changes to be rendered) by creating a list of changes (called an effect list) that will be executed in the Commit phase. React will fully compute this list of changes before the second phase is executed.

In the second, Commit phase, React actually tells the DOM to render the effect list created in phase one.

What's really important to understand here is that the Reconciliation/Render phase can be interrupted, but the Commit phase cannot, and it's only in the Commit phase that React will actually render to the DOM.
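The two phases can be sketched roughly like this. It is a simplified illustration under my own assumptions, not React's implementation: the UI is reduced to a flat map of ids to text, and `render`, `commit`, and the `"PLACEMENT"` / `"UPDATE"` labels are names chosen for the example.

```javascript
// Hypothetical flat "trees": a map from node id to the text it should show.
const currentUi = { title: "Hello", likes: "3" };
const nextUi = { title: "Hello", likes: "4", comment: "Nice!" };

// Phase 1 (render/reconciliation): work out what changed. Nothing is written
// to the "DOM" here, so this phase could be paused or thrown away safely.
function render(current, next) {
  const effectList = [];
  for (const id of Object.keys(next)) {
    if (!(id in current)) {
      effectList.push({ type: "PLACEMENT", id, text: next[id] });
    } else if (current[id] !== next[id]) {
      effectList.push({ type: "UPDATE", id, text: next[id] });
    }
  }
  return effectList;
}

// Phase 2 (commit): apply every effect in one uninterrupted pass.
function commit(dom, effectList) {
  for (const effect of effectList) {
    dom[effect.id] = effect.text;
  }
}

const fakeDom = { ...currentUi };
const effects = render(currentUi, nextUi);
commit(fakeDom, effects);
console.log(effects.map((e) => `${e.type}:${e.id}`)); // [ 'UPDATE:likes', 'PLACEMENT:comment' ]
```

Because `render` never touches `fakeDom`, abandoning it halfway leaves the visible UI untouched, which is why only that phase is safe to interrupt.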

Fiber makes it possible for the reconciliation algorithm to walk the component tree using a singly linked list tree traversal algorithm (see appendix). The Fiber architecture was created because a linked list traversal algorithm can run asynchronously, using pointers to return to the node where it paused its work.
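A small sketch of that kind of traversal, assuming a four-node tree. The `child`, `sibling`, and `return` pointer names mirror the ones React's fibers use; `makeFiber`, `nextFiber`, and the tree itself are invented for the example.

```javascript
// Build a tiny fiber-like tree. Everything except the three pointer names
// is made up for this sketch.
function makeFiber(name) {
  return { name, child: null, sibling: null, return: null };
}

const app = makeFiber("App");
const header = makeFiber("Header");
const feed = makeFiber("Feed");
const post = makeFiber("Post");

app.child = header;
header.return = app;
header.sibling = feed;
feed.return = app;
feed.child = post;
post.return = feed;

// Given the fiber we just worked on, find the next one to visit: first the
// child, otherwise a sibling, otherwise climb back up via `return` until an
// ancestor has a sibling.
function nextFiber(fiber, root) {
  if (fiber.child) return fiber.child;
  let node = fiber;
  while (node !== root) {
    if (node.sibling) return node.sibling;
    node = node.return;
  }
  return null; // back at the root: the walk is finished
}

// Because the position is just a variable, we can stop after any step...
let cursor = app;
const firstHalf = [];
for (let i = 0; i < 2 && cursor; i++) {
  firstHalf.push(cursor.name);
  cursor = nextFiber(cursor, app);
}

// ...and resume later from the same pointer.
const secondHalf = [];
while (cursor) {
  secondHalf.push(cursor.name);
  cursor = nextFiber(cursor, app);
}

console.log(firstHalf, secondHalf); // [ 'App', 'Header' ] [ 'Feed', 'Post' ]
```

Unlike a recursive walk, nothing about the traversal's progress is stored on the call stack, so pausing only requires holding on to `cursor`.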

Figure: Visualization of a Traversal (Image Source)

How does Fiber help to break down reconciliation?

Ok, now we're getting to the good stuff.

Basically, a Fiber is a node that represents a unit of work. Fiber is React's version of a thread, which is "the smallest sequence of programmed instructions that can be managed independently by a scheduler."

Figure: A Multi-Thread Process (Image Source)

JavaScript is single-threaded, but Fiber helps fake a multi-threaded process because it enables asynchronous behaviour.

React creates two Fiber tree instances, the current instance, and the workInProgress instance. The current instance is built on first render, and has a one-to-one relationship with the React component tree. When a new render is called, React will start work on the new workInProgress instance using the reconciliation algorithm to walk the component tree and find where changes must be made.
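The relationship between the two instances can be sketched with a single node standing in for each tree. The `alternate` pointer name matches the field React uses to link a fiber to its counterpart; `createFiber` and `createWorkInProgress` here are simplified stand-ins written for this example (as far as I understand, React reuses the existing alternate where it can instead of always creating a new object).

```javascript
// Hypothetical single-node "trees" to keep the sketch short.
function createFiber(props) {
  return { props, alternate: null };
}

let current = createFiber({ likes: 3 }); // what is on screen now

// Starting a new render: create a work-in-progress copy and link the two
// together so each can find its counterpart.
function createWorkInProgress(currentFiber, pendingProps) {
  const wip = createFiber(pendingProps);
  wip.alternate = currentFiber;
  currentFiber.alternate = wip;
  return wip;
}

const workInProgress = createWorkInProgress(current, { likes: 4 });

// While work is in progress, the screen still reflects `current`.
console.log(current.props.likes); // 3

// Once the work-in-progress tree is complete and committed, it becomes current.
current = workInProgress;
console.log(current.props.likes); // 4
console.log(current.alternate.props.likes); // 3
```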

Figure: Fiber Tree Instances (Image Source)

React leverages the asynchronous model of "cooperative scheduling" (see appendix) in order to build the workInProgress tree.

Modern browsers (like Chrome) have an API called requestIdleCallback that allows web apps to schedule work when there's free time at the end of a frame, or when the user is inactive (React uses a polyfill when browsers don't offer this API).

When React is called to render and start reconciliation, it checks in with the main thread to know how much time it has to do its work. React does a unit of work, then checks in with the main thread again, and repeats this process until it has completed the workInProgress tree - which means traversing all the child and sibling nodes, and then returning to their parent, eventually reaching the root node and completing the tree.

As I understand it, Chrome's implementation of the requestIdleCallback API will grant as much as 50 ms to React to do its work, but React will check in with the main thread after it's done work for each Fiber.
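The "do a unit of work, then check in" loop might look roughly like the sketch below. The `deadline` object has the same shape as the one `requestIdleCallback` passes to its callback (a `timeRemaining()` method), but `fakeDeadline`, `performUnitOfWork`, and the work queue are assumptions made so the example can run without a browser.

```javascript
// A queue of hypothetical units of work (one per fiber).
const pending = ["App", "Header", "Feed", "Post", "Footer"];
const done = [];

function performUnitOfWork(name) {
  done.push(name);
}

// Do units of work until the queue is empty or the time budget runs out.
function workLoop(deadline) {
  while (pending.length > 0 && deadline.timeRemaining() > 0) {
    performUnitOfWork(pending.shift());
  }
  return pending.length === 0; // true when the whole tree is finished
}

// Fake deadline for the sketch: pretend each check costs one "tick" and we
// were granted `ticks` of them. A real deadline is based on the clock.
function fakeDeadline(ticks) {
  return { timeRemaining: () => ticks-- };
}

console.log(workLoop(fakeDeadline(3)), done); // false [ 'App', 'Header', 'Feed' ]
console.log(workLoop(fakeDeadline(3)), done); // true [ 'App', 'Header', 'Feed', 'Post', 'Footer' ]
```

The first call runs out of budget after three units and returns control; the second call picks up from where the queue was left.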

If at some point React checks in and the main thread has new work to be done (maybe the user clicked a button), React will complete whatever work it can in the remaining time it was originally given, but then yield to the main thread and drop the process it was doing to pick up the new work from the browser. Once it completes that new work, React will restart the work it was trying to complete before.

Here is where things become a little fuzzy for me. Concurrent React is still in the experimental phase. As I understand, the implementation of the Fiber architecture makes it possible for the React team to create features like Time-Slicing and Suspense that would be built on this cooperative scheduling model, but it's not entirely clear to me how well developed React scheduling is right now. I'd seek to answer this question next in my research.

Concurrent React

So What did we Learn?

React Fiber is not, as I had originally understood, the React reconciliation algorithm itself. Fiber is a single unit of the React data structure that enables more complex reconciliation algorithms and cooperative scheduling in React. The reconciliation algorithm implemented with Fiber uses a singly linked list tree traversal model to flatten the component tree into a linked list of Fiber nodes to be committed to the DOM.

A Final Note

I welcome corrections to this article because I'm well aware my understanding is in no way complete, and probably totally wrong in some cases.

Appendix

Scheduling

In computing, scheduling is the method by which work is assigned to resources that complete the work. The work may be virtual computation elements such as threads, processes or data flows, which are in turn scheduled onto hardware resources such as processors, network links or expansion cards.

A scheduler is what carries out the scheduling activity. Schedulers are often implemented so they keep all computer resources busy (as in load balancing), allow multiple users to share system resources effectively, or to achieve a target quality of service. Scheduling is fundamental to computation itself, and an intrinsic part of the execution model of a computer system; the concept of scheduling makes it possible to have computer multitasking with a single central processing unit (CPU).
Terms: workers, threads, single or multi-threads
Source: Wikipedia

Threads

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. The implementation of threads and processes differs between operating systems, but in most cases a thread is a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time.

Source: Wikipedia
See Also: Specific to React - Fiber Principles

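What is Heap vs Stack? The heap is the region of memory where objects are allocated; the call stack holds the frames of the functions that are currently executing.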

Cooperative Scheduling

Cooperative multitasking, also known as non-preemptive multitasking, is a style of computer multitasking in which the operating system never initiates a context switch from a running process to another process. Instead, processes voluntarily yield control periodically or when idle or logically blocked in order to enable multiple applications to be run concurrently.

This type of multitasking is called "cooperative" because all programs must cooperate for the entire scheduling scheme to work. In this scheme, the process scheduler of an operating system is known as a cooperative scheduler, having its role reduced down to starting the processes and letting them return control back to it voluntarily.

Source: Wikipedia

Another Source: Cooperative and Pre-emptive Scheduling Algorithms
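A minimal sketch of cooperative scheduling in JavaScript, using generators so that each `yield` marks the point where a task voluntarily gives up control. The `task` and `runCooperatively` names are invented for the example, and this is a toy round-robin scheduler rather than anything an operating system or React actually uses.

```javascript
// Each task is a generator: `yield` is where it hands control back.
function* task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}:${i}`);
    yield;
  }
}

// A tiny round-robin cooperative scheduler. It never interrupts a task;
// it only regains control when the task itself yields or finishes.
function runCooperatively(tasks) {
  const queue = [...tasks];
  while (queue.length > 0) {
    const current = queue.shift();
    const { done } = current.next();
    if (!done) queue.push(current);
  }
}

const log = [];
runCooperatively([task("A", 2, log), task("B", 3, log)]);
console.log(log); // [ 'A:1', 'B:1', 'A:2', 'B:2', 'B:3' ]
```

If one task looped forever without yielding, the other would never run, which is exactly why every participant has to cooperate.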

Linked Lists

A linked list is a linear data structure where each element is a separate object.

Source: Linked Lists
Another Source: Wikipedia
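As a quick illustration, a singly linked list can be built from plain objects where each node holds a value and a reference to the next node. The `toArray` helper is just a name picked for this sketch.

```javascript
// Each node holds a value and a reference to the next node (null at the end).
const third = { value: "c", next: null };
const second = { value: "b", next: third };
const first = { value: "a", next: second };

// Walking the list means following `next` pointers until we run out.
function toArray(head) {
  const values = [];
  for (let node = head; node !== null; node = node.next) {
    values.push(node.value);
  }
  return values;
}

console.log(toArray(first)); // [ 'a', 'b', 'c' ]
```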

requestIdleCallback()

The requestIdleCallback method queues a function to be called during a browser's idle periods. This enables developers to perform background and low priority work on the main event loop, without impacting latency-critical events such as animation and input response.

Without requestIdleCallback, if you append elements to the DOM while the user happens to be tapping on a button, your web app can become unresponsive, resulting in a poor user experience. In the same way that requestAnimationFrame allowed apps to schedule animations properly and maximize the chances of hitting 60fps, requestIdleCallback schedules work when there is free time at the end of a frame, or when the user is inactive. This means that there’s an opportunity to do your work without getting in the user’s way.

Source: MDN

Source: Google developers resource
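A hedged usage sketch: the function below processes low-priority items a few at a time and re-schedules itself when the idle time runs out. `requestIdleCallback` and `deadline.timeRemaining()` are the real browser API; `processInIdleTime`, `scheduleIdle`, and the `setTimeout` fallback are assumptions added so the example also runs outside a browser, and the fallback is not an official polyfill.

```javascript
// Use the browser's requestIdleCallback when it exists. Otherwise fall back
// to a rough setTimeout-based stand-in that pretends we always get up to
// 50 ms of idle time.
const scheduleIdle =
  typeof requestIdleCallback === "function"
    ? (callback) => requestIdleCallback(callback)
    : (callback) => {
        const start = Date.now();
        return setTimeout(() => {
          callback({
            didTimeout: false,
            timeRemaining: () => Math.max(0, 50 - (Date.now() - start)),
          });
        }, 1);
      };

// Handle items while there is idle time left, then ask to be called again
// in the next idle period if anything remains.
function processInIdleTime(items, handle, schedule = scheduleIdle) {
  schedule(function step(deadline) {
    while (items.length > 0 && deadline.timeRemaining() > 0) {
      handle(items.shift());
    }
    if (items.length > 0) schedule(step);
  });
}

processInIdleTime(["log view", "prefetch avatar"], (item) => console.log(item));
```

Passing the scheduler in as a parameter is only a convenience for trying the function out with a fake scheduler; in a browser the default is enough.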
