The Problem
So I recently came across an issue in a React app that I'm building (for funsies):
I have an array of objects that can potentially get huge. Each one of those objects has an id, so using Array.find to get the item I want should work.
const nodes = [
  { id: "abc", content: "Lorem ipsum" },
  { id: "def", content: "Dolor sit" },
  // ...
];
const getNode = id => nodes.find(n => n.id === id);
console.log(getNode('abc'));
// => { id:"abc", content:"Lorem ipsum" }
However, when nodes gets big, Array.find has to scan the array item by item until it hits a match, which can get expensive. So we can implement a 'cache' of sorts to help out.
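const nodes = [
  { id: "abc", content: "Lorem ipsum" },
  { id: "def", content: "Dolor sit" },
  // ...
];
// Cache of nodes we've already looked up, keyed by id.
const keyedNodes = {};
const getNode = id => {
  // Cache miss: fall back to Array.find and remember the result.
  if (!keyedNodes[id]) {
    keyedNodes[id] = nodes.find(n => n.id === id);
  }
  return keyedNodes[id];
};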
console.log(getNode('abc'));
// => { id:"abc", content:"Lorem ipsum" }
console.log(getNode('abc'));
// This time we are coming from keyedNodes!
// => { id:"abc", content:"Lorem ipsum" }
Seems simple enough!
React and data
Being a relative React newbie, I had it drilled into my head where the data in an app should live: either props or state. props holds data that the component receives (and that the component shouldn't update itself), and state holds the current state of the component, which that same component has complete control over (via setState, of course!).
Armed with this info, I went to implement this memoization tactic using the component's state, and it got super messy given setState's asynchronous nature.
Check out the demo on CodeSandbox
Look at that nasty getNode function! We have to wait for the state to resolve before actually changing the node, or else we risk overwriting the state at the wrong time. (The state in changeNodes, which doesn't have the keyedNodes update from getNode, would overwrite the keyedNodes object to be blank! No help at all!)
I lived with this for a while, then I watched Kent C. Dodds's video on using class fields (which are useful for getting around those pesky bind calls). This reminded me that class fields exist (sort of... the default Babel config for Create React App does allow for their use). So not only could I put state in as a class field (along with arrow functions to create properly bound functions for component callbacks), but anything else can go here too!
Note: You don't actually need to use class fields for this, either! Assigning this.keyedNodes in the constructor will do the same thing.
So, putting keyedNodes in a class field gives us something similar, but much easier to read:
Check out the demo on CodeSandbox
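In case the embedded demo isn't visible, here is a minimal sketch of the class-field approach. It uses a plain class instead of a real React component so it can run on its own. NodeViewer and its constructor argument are hypothetical names, and in the actual component the nodes would live in this.state.

```javascript
// Hypothetical stand-in for the component: a plain class, not a React.Component.
class NodeViewer {
  // The cache lives in a class field, outside of state.
  keyedNodes = {};

  constructor(nodes) {
    // In the real component this would be this.state.nodes.
    this.nodes = nodes;
  }

  // Arrow function in a class field, so `this` stays bound in callbacks.
  getNode = id => {
    if (!this.keyedNodes[id]) {
      // Cache miss: scan the array once, then remember the result.
      this.keyedNodes[id] = this.nodes.find(n => n.id === id);
    }
    return this.keyedNodes[id];
  };
}

const viewer = new NodeViewer([
  { id: "abc", content: "Lorem ipsum" },
  { id: "def", content: "Dolor sit" },
]);

console.log(viewer.getNode("abc").content); // found via Array.find, then cached
console.log(viewer.getNode("abc").content); // served from viewer.keyedNodes
```

No setState call is involved, so there is nothing asynchronous to wait on: the cache is updated synchronously inside getNode.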
Downsides?
The main downside to this is that React doesn't look at class fields other than state and props to control the rendering of updates. So if for whatever reason you need this cache to be tied to the render loop, you are stuck with the first way: keeping the cache in state.
I believe that in most cases, however, the cache doesn't need to trigger or get updated by React itself. The cache should follow any updates to the component, not preempt them.
To that end, perhaps we can add a check in componentDidUpdate to clear the cache if this.state.nodes just went through an update, so we aren't potentially dealing with old data. But this goes to show that data in class fields needs to be treated with care.
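A rough sketch of that cache-clearing step, again with a plain class standing in for the component so it runs without React. The componentDidUpdate(prevProps, prevState) signature is React's. NodeViewer, the replaceNodes helper, and the manual lifecycle call are assumptions for illustration only, since React itself would call componentDidUpdate after a setState.

```javascript
// Hypothetical stand-in; a real component would extend React.Component
// and React would invoke componentDidUpdate for us.
class NodeViewer {
  state = { nodes: [] };
  keyedNodes = {};

  getNode = id => {
    if (!this.keyedNodes[id]) {
      this.keyedNodes[id] = this.state.nodes.find(n => n.id === id);
    }
    return this.keyedNodes[id];
  };

  componentDidUpdate(prevProps, prevState) {
    // A new nodes array means cached entries may be stale: start over.
    if (prevState.nodes !== this.state.nodes) {
      this.keyedNodes = {};
    }
  }

  // Illustration only: mimics "state changes, then the lifecycle hook fires".
  replaceNodes(nodes) {
    const prevState = this.state;
    this.state = { ...prevState, nodes };
    this.componentDidUpdate({}, prevState);
  }
}

const viewer = new NodeViewer();
viewer.replaceNodes([{ id: "abc", content: "Lorem ipsum" }]);
const before = viewer.getNode("abc").content;

viewer.replaceNodes([{ id: "abc", content: "Updated!" }]);
const after = viewer.getNode("abc").content;

console.log(before, "->", after); // Lorem ipsum -> Updated!
```

The reference comparison only works if nodes is replaced with a new array on every update. Mutating the existing array in place would leave stale entries in the cache.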
One other side effect is that these class fields are bound to the instance and not the prototype, meaning that another component on the page that is using the same set of data has to build its own cache and can't borrow it. This can be fixed by putting the cache in state, lifting the cache to a parent component, or using a render prop (or HOC) with a Cache component (or withCache HOC).
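To make the instance-versus-prototype point concrete, here is a small sketch using plain classes (no React). NodeViewer and SharedViewer are hypothetical names. The second half only approximates "lifting" the cache: in a real app the shared object would be owned by a parent component and passed down as a prop.

```javascript
// Hypothetical stand-in for a component with a class-field cache.
class NodeViewer {
  keyedNodes = {}; // created per instance, not on NodeViewer.prototype
}

const first = new NodeViewer();
const second = new NodeViewer();
console.log(first.keyedNodes === second.keyedNodes); // false: two separate caches
console.log("keyedNodes" in NodeViewer.prototype); // false

// "Lifting" the cache: whoever owns the data creates one cache object
// and hands the same reference to every consumer.
class SharedViewer {
  constructor(nodes, keyedNodes) {
    this.nodes = nodes;
    this.keyedNodes = keyedNodes;
  }

  getNode(id) {
    if (!this.keyedNodes[id]) {
      this.keyedNodes[id] = this.nodes.find(n => n.id === id);
    }
    return this.keyedNodes[id];
  }
}

const nodes = [{ id: "abc", content: "Lorem ipsum" }];
const sharedCache = {};
const left = new SharedViewer(nodes, sharedCache);
const right = new SharedViewer(nodes, sharedCache);

left.getNode("abc"); // fills the shared cache
console.log(right.keyedNodes.abc === nodes[0]); // true: right never had to search
```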
Conclusion - Use with caution!
Holding on to component-specific (and even instance-specific) data within a class or object field can be very useful for some quick optimizations, or just for holding data that doesn't necessarily need to be ensnared in the React render loop, where the async nature of setState can cause strange problems and race conditions that lead to less-than-readable code. However, because the class field is outside of the render loop, updates to that data won't be managed by React, and can cause problems along the way if used improperly. A simple cache for storing data that needs to be readily accessible is a great use for this, as a cache naturally falls back onto the React state for a miss, and should 'follow the leader' in taking the source of truth from state.
Top comments (4)
Why not just use an object rather than an array and use id as sub object key?
So, our cache object keyedNodes is an object with id being used as the key. nodes is data from whatever data source you need to pull from, and you may not have control over what format it is in (think a list of records from MongoDB).
Depending on the size and how many nodes you are accessing, you could pre-process the nodes array to build the entire cache up-front. That would be good if you know you are going to visit a majority of the nodes.
Sidenote: In my specific case, I was looking to represent a sort of graph, and I chose a flat array structure for it instead of potentially incredibly deep nesting:
vs.
The second option seemed much better for me while building out the data.
I also chose a sort of on-demand caching strategy because it's possible to skip several nodes or loop back while a user is going through the data (which is for a sort of choose-your-adventure web game).
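For comparison, the up-front pre-processing mentioned earlier in this reply could look something like the sketch below. buildKeyedNodes is a hypothetical helper name, and the sketch assumes every node has a unique id.

```javascript
// Hypothetical helper: one pass over the array to index every node by id.
const buildKeyedNodes = nodes => {
  const keyed = {};
  for (const node of nodes) {
    keyed[node.id] = node;
  }
  return keyed;
};

const nodes = [
  { id: "abc", content: "Lorem ipsum" },
  { id: "def", content: "Dolor sit" },
];

// Pay the full cost once, then every lookup is a plain property access.
const keyedNodes = buildKeyedNodes(nodes);
const getNode = id => keyedNodes[id];

console.log(getNode("def").content); // => Dolor sit
```

The trade-off is that every node gets indexed even if the user only ever visits a handful of them, which is why the on-demand version suited my case better.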
Steve, is the combination of caching, pagination, and search a better option in your case? All the checked nodes go to the cache, and search might help bring in fewer nodes to navigate?
Why not just use a functional component and the useMemo hook? Or, if you're passing the result as props to another component, the React.memo() function?