Developers often pretend to know what they're doing, especially when they're insecure newer developers like myself! Sometimes we happen to stumble upon interesting patterns, think they're elegant, and get attached to them rather than using the solution that performs better. In the course of building a file directory, I gleaned some interesting insights into recursion, search, memoization, virtualization, and generator functions. The path getting there exposed me to concepts that I haven't really dealt with since my algorithms course in college. Fortunately, my first slow-but-elegant solution, a recursive react component, was supplanted by the use of generator functions in react-vtree
, an equally interesting technology. Dealing with folder-based file systems has been one of the more rewarding small features I've had the opportunity to work in my short career.
The idea of a folder-based file system is a ubiquitous abstraction in software. A folder-based filesystem exists as a tree structure. Each folder either contains files which can be thought of as leaf nodes in the tree structure or folders that have the aforementioned folder as a parent.
A Glossary for Terms in this Post:
- Tree ← A set of elements where each element has only one parent, which may be itself (called a root node). All paths to a root node are unique → Directory
- Node ← Any element in the tree → Folder or File
- Leaf ← Any node in the tree without children → *File
Viewing a set of folders in a directory reveals a clear hierarchy in that we can conditionally render children based on some a folder's particular ‘hide/show’ icon that handles click and keypress events.
In the course of building a new product for my employer, Meshify, we worked on building a Directory that could:
- Search by folder or filename and highlight matched text from the search
- Highlight a selected folder based on a url
folderId
parameter - Show and hide folder contents from click events
- Be able to handle ~10K+ folders without crashing or being overly laggy.
I wish I could say that I knew what I was doing when I started working on this problem. The first two insights I had regarded how to store and pass folder data and how to search recursively across folders.
Each folder in the list contains a parent folder id. Using this relationship, the list can be iterated over to return a set of children belonging to that folder. We should only have to do this once, invalidating data only on changes to the list of folders. This is the perfect case for a lookup table and memoization. In my case, I decided on a Map
data structure and the useMemo
hook. It’s worth noting that the use of object and memoization tools from another framework can work as well.
While being sure to write meaningful tests on different mocked folder lists, I built out the functionality for creating a memoized map that recomputes data associated with
The code that I ended up setting on looks like the folder provider in this example Folder Provider.
If you want to take anything away from the code above, the most useful part in my mind me was this code snippet.
const childrenMatch = annotatedRoot.children
.map(childContainsMatch)
.some(Boolean); // same as .some(item => item == true)
A child of a folder can contain a match to search text such that if any folder matches the search text somewhere deep in the tree, all of the folders in the path between the root folders and that folder have the information requisite to display their contents. The folder might need to be open even when a folder does not match the search text provided. In the case that a folder contains other folders, we need to use recursion to search those child folders for any elements that match independent of that folder's depth.
By knowing that we are guaranteed a return when we reach a folder without any children (you can think of this as a file if that helps), we should avoid potential stack overflow errors. The array method Array.prototype.some
in this context will exit as soon as it finds one true return from childContainsMatch
.
Given this map, we are able to build out a Directory component that handles most of the work we need to do (in theory, more to be revealed).
Initially, the component I built look something like this:
Control Flow for Folder Component
- Get Folder Info from Map given folder id
-
If the folder has children:
-
If search text is present:
-
If this folder name matches search:
- Render Name with highlighted search text, show/hide icon with event handlers
-
Else:
-
If this folder contains children that match or this folder is set to open:
- Map across this folders children, return new Folder component for each
-
If this folder contains children that match or this folder is set to open:
-
If this folder name matches search:
-
Else:
- Render name & show/hide icon with event handlers
-
If the folder is set to open:
- Map across children, return new Folder component for each
-
If search text is present:
-
Else (is a leaf node):
-
If search text is present:
- If name matches search:
- Render file name with search highlighting
- If name matches search:
-
Else:
- Render file name
-
If search text is present:
As you can see, in the case that a folder has children, the Folder component renders itself recursively! Some of you may not think that’s cool, but it’s the first time I've had a compelling need to use recursion with a React component and I think it’s damn cool.
Unfortunately, this scheme doesn't perform amazingly with large lists of folders. After some investigation, it was pretty clear that there weren't unnecessary re-renders or obviously slow performance issues in the FolderProvider
component. The unfortunate truth was that, in some cases, we were simply rendering too many things at once. Without changing any backend APIs, the best solution seemed to be virtualization. After using Twitter to ask what the current state of virtualization was, I was made aware of react-window. Scrolling through the readme of react-window led me to react-vtree. The npm package "provides a lightweight and flexible solution for rendering large tree structures", just what I was looking for.
Would it surprise you if I told you that this added even more complexity to the problem?
react-vtree
is a quick and practical introduction to the utility of generator functions, as well as virtualization. The core functionality of react-vtree lies in a treeWalker
generator function that is taken as a prop.
// In the component enclosing the scope of the tree walker funciton
const { annotatedFolderMap, searchText } = useContext(FolderContext)
function * treeWalker(refresh) {
const stack = []
rootFolders.forEach(folder => {
const data = annotatedFolderMap.get(folder.id)
if (searchText !== "" && isVisible) {
stack.push(data);
} else {
stack.push(folder)
}
})
while (stack.length !== 0) {
const currentFolder = stack.pop()
const isOpen = yield refresh ? { currentFolderData } : id
if (currentFolder.children.length > 0 && isOpen) {
children.map(child => {
const data = annotatedFolderMap.get(currentFolder.id)
if (searchText !== "" && isVisible) {
stack.push(data);
} else {
if (searchText === "") {
stack.push(data);
}
}
})
}
}
}
The function treeWalker
here is an example of lazily computed values. The tree that consumes the treeWalker function, looks up the default state for whether the particular folder is open, call this variable defaultIsOpen
. The tree then sends that data back to the treeWalker
function through the line const {value, done} = iter.next(defaultIsOpen)
. The const isOpen
in the while loop is set through that call to iter.next
. No data is collected unless we are sure it is a member of an open directory or it is a root folder. It is worth noting that the tree walker function is not as lazy as it could be, in that data that is not being rendered can still be collected as a result of calling this generator. This generator function is called any time a node's is open state is changed via the provided toggle
function.
react-vtree
is build on top of react-window. react-window
is a virtualization tool, which means that it only renders items that are viewable within your window. The savings are two-fold, less unneccessary data is saved and no unnecessary nodes are rendered. Of course, there is no longer the interesting use of recursion; one can take solace in the fact that this solution uses some of the most modern features of Javascript and the react ecosystem to suitably render thousands of folders blazingly fast.
Heres a gif of the final product:
In retrospect, the process of building this component mirrored the adage "make it work, make it pretty, and then make it fast". I wish I could say I knew what I was doing, but I happened to luckily stumble on a convenient separation of concerns. By separating the data concerns from the actual rendered view, the process of refactoring this work to go from using a bespoke, recursive tree component to a virtualized tree with react-vtree
was remarkably painless.
Top comments (0)