React DevTools tells you what rendered. Linters tell you what looks wrong in your code. But neither tells you why your state architecture is broken.
I spent six months thinking about this problem, and I ended up in a strange place: treating React state updates as discrete-time signals and using linear algebra heuristics to find architectural debt at runtime.
This article explains the technique. There's an open-source tool at the end, but the idea is what matters.
The Problem Nobody Talks About
Every React developer has written something like this:
```jsx
const [isLoading, setLoading] = useState(false);
const [isSuccess, setSuccess] = useState(false);
const [hasError, setError] = useState(false);
```
Three booleans. They always toggle together. You've created an implicit state machine with impossible states (isLoading: true AND isSuccess: true). A linter won't catch this. A profiler won't catch this. Code review might catch it, if someone's paying attention.
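As a sketch of the conventional refactor (this is not output from the tool described in this article), collapsing the booleans into a single discriminated status makes the impossible combinations unrepresentable, and it also collapses three update signals into one:

```typescript
// Sketch only: one discriminated status instead of three booleans.
// The combination (loading AND success) can no longer exist.
type RequestStatus = "idle" | "loading" | "success" | "error";

// One variable means exactly one state update per transition.
function nextStatus(
  current: RequestStatus,
  event: "start" | "resolve" | "reject"
): RequestStatus {
  switch (event) {
    case "start":
      return "loading";
    case "resolve":
      return current === "loading" ? "success" : current;
    case "reject":
      return current === "loading" ? "error" : current;
  }
}
```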
Or this:
```jsx
const { user } = useContext(AuthContext);
const [localUser, setLocalUser] = useState(user);

useEffect(() => {
  setLocalUser(user); // manual sync
}, [user]);
```
You've copied a global truth into local state. Now you have two sources of truth. One day they'll drift, and you'll spend three hours debugging it.
These patterns are invisible to static analysis. They only reveal themselves through behavior at runtime. So the question is: can we detect behavior mathematically?
Stop Looking at Values. Start Looking at Timing.
Here's the key insight: you don't need to know what a state variable holds. You only need to know when it changes.
Imagine you're watching your app run. Every time a component re-renders, you write down which state variables changed during that frame. After 50 frames, you have something like this:
```
Frame:      1  2  3  4  5  6  7  8  9  10 ...
isLoading:  0  1  0  0  1  0  0  1  0  0
isSuccess:  0  1  0  0  1  0  0  1  0  0
userName:   0  0  0  0  0  0  1  0  0  0
```
Look at isLoading and isSuccess. They update in the exact same frames. Every time. That pattern alone tells you they're redundant, without ever looking at their values.
Now look at userName. Completely independent rhythm. It belongs in the architecture. The other two don't.
This is the core idea: each state variable is a binary vector in {0, 1}⁵⁰, where each dimension is a browser frame. We can use math on these vectors to find relationships.
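The recording side of this can be sketched in a few lines (hypothetical names, not the library's actual API; something external, such as a requestAnimationFrame loop, decides when a frame closes):

```typescript
// Minimal sketch of turning state updates into binary frame vectors.
class SignalRecorder {
  private frames: Set<string>[] = [];
  private current = new Set<string>();

  // Called whenever a state variable updates during the current frame.
  recordUpdate(variableId: string): void {
    this.current.add(variableId);
  }

  // Called once per frame: seal the frame and start a fresh one.
  closeFrame(): void {
    this.frames.push(this.current);
    this.current = new Set();
  }

  // The binary vector in {0,1}^N: 1 if the variable updated in frame i.
  vectorFor(variableId: string): number[] {
    return this.frames.map(frame => (frame.has(variableId) ? 1 : 0));
  }
}
```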
Cosine Similarity: Measuring How "Together" Two Variables Move
The dot product of two binary vectors counts the frames where both variables updated:

A · B = Σᵢ Aᵢ · Bᵢ
But raw counts are misleading. A variable that updates 40 times out of 50 frames will overlap with almost anything. We need to normalize.
Cosine Similarity solves this:

sim(A, B) = (A · B) / (‖A‖ · ‖B‖)
This gives us a score between 0 and 1:
- 1.0 means the variables update in perfect sync.
- 0.0 means they never update in the same frame.
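For binary update vectors this is a short function. A sketch (illustrative, assuming plain number arrays for the vectors):

```typescript
// Cosine similarity for binary update vectors. For vectors in {0,1}^N,
// the dot product counts shared-update frames and ||A|| is the square
// root of A's total update count.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A variable that never updated has no direction; score it 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```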
Why 0.88?
Through testing on production codebases (including Excalidraw), I found that 0.88 is the threshold that best filters browser scheduling noise while still catching real architectural problems. Geometrically, that's an angle of about 28° in a 50-dimensional space. Below that, you get false positives from the browser's non-deterministic task scheduling. Above that, you miss real issues.
This is a heuristic, not a formal proof. But it works in practice.
But Wait: Correlation Isn't Causation
Two variables updating in the same frame doesn't mean one caused the other. Maybe they're both responding to the same click event. That's not a bug. That's normal.
To figure out direction, we check the similarity at a time offset.
Instead of comparing frame-to-frame (offset 0), we shift one signal by one frame and compare again:
- Offset 0 (sync): A and B update in the same frame. Possibly redundant.
- Offset +1: A updates, then B updates one frame later. A might be causing B.
- Offset -1: B updates, then A updates one frame later. B might be causing A.
```js
// Normalize the circular offset once, outside the loop
const baseOffset = ((headB - headA + offset) % L + L) % L;

for (let i = 0; i < L; i++) {
  let iB = i + baseOffset;
  if (iB >= L) iB -= L;
  // ... dot product
}
```
If the highest similarity is at offset +1, that means A consistently updates one frame before B. That's a causal sync leak: A changes, triggers an effect, and that effect sets B. You're forcing a double render.
If the highest similarity is at offset 0, they're just redundant. Two variables storing the same information.
This distinction matters because the fix is different. Redundant state means "delete one." Causal leaks mean "derive instead of sync."
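The whole check can be sketched end-to-end (illustrative names; this reuses cosine similarity but circularly shifts B by a frame offset, and the offset with the highest score suggests the direction):

```typescript
// Similarity of A against B shifted by `offset` frames (circular).
function shiftedSimilarity(a: number[], b: number[], offset: number): number {
  const L = a.length;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < L; i++) {
    const iB = (((i + offset) % L) + L) % L; // circular shift, as above
    dot += a[i] * b[iB];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Classify a correlated pair: redundant (sync) or a causal leak.
function classifyPair(a: number[], b: number[]): "b-leads-a" | "sync" | "a-leads-b" {
  const scores = [-1, 0, 1].map(offset => shiftedSimilarity(a, b, offset));
  const best = scores.indexOf(Math.max(...scores));
  return (["b-leads-a", "sync", "a-leads-b"] as const)[best];
}
```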
The Subspace Problem: Local vs. Global
There's another pattern that pure correlation misses: Context Mirroring.
When you copy a context value into local state, the local variable will be perfectly correlated with the context. But they have different roles. The context is the source of truth. The local state is a shadow.
To handle this, we assign every signal a role:
- Local Basis (U): `useState`, `useReducer`. These should be independent.
- Global Subspace (W): `createContext`. These are the anchors.
The ideal architecture is a Direct Sum: V ≈ U ⊕ W, meaning local and global state don't overlap. If a local variable is found inside the global subspace (high similarity with a context value), it's flagged as Context Mirroring.
This catches a class of bugs that pure similarity analysis misses: the local state isn't "redundant" with another local state. It's redundant with a context.
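A sketch of the role-based check (hypothetical data shapes, not the real API): a local signal that is near-parallel to any context signal is flagged, even when no other local signal matches it.

```typescript
type Role = "local" | "context";

interface Signal {
  id: string;
  role: Role;
  vector: number[]; // the binary update vector from earlier
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Returns [localId, contextId] pairs above the similarity threshold.
function findContextMirrors(signals: Signal[], threshold = 0.88): [string, string][] {
  const locals = signals.filter(s => s.role === "local");
  const contexts = signals.filter(s => s.role === "context");
  const mirrors: [string, string][] = [];
  for (const local of locals) {
    for (const ctx of contexts) {
      if (cosine(local.vector, ctx.vector) >= threshold) {
        mirrors.push([local.id, ctx.id]);
      }
    }
  }
  return mirrors;
}
```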
Finding the Root Cause: Who Do I Fix First?
In a complex app, this analysis can produce dozens of findings. Three boolean explosions, two context mirrors, five sync leaks. Where do you start?
This is where it gets interesting. We can model all these relationships as a directed graph.
- Nodes: State variables, effects, and events.
- Edges: "A triggered B" (from the lead-lag analysis).
But here's the problem: when two variables update from the same click handler, neither caused the other. To handle this, we create a virtual event node for every external trigger:
```
Event_click → {setUser, setTheme, setTimestamp}
```
Now we can tell the difference between a dependency chain (A → effect → B) and siblings responding to the same trigger (Event → A, Event → B). Without this, every pair of simultaneous updates looks like a causal leak.
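The construction can be sketched as follows (illustrative names): updates that share an external trigger hang off a virtual event node instead of getting edges between each other.

```typescript
// Adjacency-list graph: "from" triggered "to".
type Graph = Map<string, Set<string>>;

function addEdge(graph: Graph, from: string, to: string): void {
  if (!graph.has(from)) graph.set(from, new Set());
  graph.get(from)!.add(to);
}

function buildGraph(
  eventFanout: Record<string, string[]>, // virtual event node -> updates it caused
  causalPairs: [string, string][] // "A led B" pairs from the lead-lag analysis
): Graph {
  const graph: Graph = new Map();
  for (const [eventNode, targets] of Object.entries(eventFanout)) {
    for (const target of targets) addEdge(graph, eventNode, target);
  }
  for (const [cause, effect] of causalPairs) addEdge(graph, cause, effect);
  return graph;
}
```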
Eigenvector Centrality (PageRank for React Hooks)
Once we have the graph, we need to rank nodes by importance. A simple "count the edges" approach doesn't work because not all downstream nodes are equally important.
Instead, we calculate Eigenvector Centrality. The idea is the same as PageRank: a node is important if it triggers other important nodes.
We solve this iteratively (Power Iteration, typically converges in under 20 iterations). The node with the highest score is the Prime Mover: the single root cause driving the most downstream activity.
In practice, this means: instead of fixing 12 individual issues, you fix the one event handler or effect that's causing all of them.
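A compact power-iteration sketch (illustrative, not the library's exact algorithm): scores flow backwards along edges, so a trigger inherits importance from what it triggers, with a small PageRank-style damping term so scores on an acyclic graph do not decay to zero.

```typescript
// Rank nodes by "importance of what they trigger"; return the top node.
function primeMover(edges: [string, string][], iterations = 20): string {
  const nodes = [...new Set(edges.flat())];
  let score = new Map(nodes.map((n): [string, number] => [n, 1]));
  for (let k = 0; k < iterations; k++) {
    const next = new Map(nodes.map((n): [string, number] => [n, 0.15])); // damping base
    for (const [from, to] of edges) {
      // "A node is important if it triggers other important nodes."
      next.set(from, next.get(from)! + 0.85 * score.get(to)!);
    }
    // Normalize so scores stay bounded across iterations.
    const norm = Math.sqrt([...next.values()].reduce((s, v) => s + v * v, 0)) || 1;
    score = new Map([...next].map(([n, v]): [string, number] => [n, v / norm]));
  }
  return nodes.reduce((best, n) => (score.get(n)! > score.get(best)! ? n : best), nodes[0]);
}
```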
Does It Actually Work?
I tested this on two real codebases:
Excalidraw (114k stars): The engine identified a sequential sync leak in theme management. A context update was triggering a local state sync via useEffect, causing a double render on every theme change. The lead-lag analysis at τ = +1 caught it.
shadcn-admin: Found a Global Event Fragmentation pattern where a single user action was updating three separate context providers. The spectral ranking correctly identified the event handler as the Prime Mover, not any of the individual state variables.
Both findings led to PRs that got merged.
Performance
The whole thing runs in the browser without blocking the UI:
- Signal recording: O(1) per update (ring buffer overwrite)
- Correlation analysis: O(D × N) where D = dirty variables (typically 2-3)
- Spectral ranking: ~0.3ms for a typical app (18 nodes), ~27ms for 30,000 nodes
- Memory: 0 bytes heap growth over a 20-minute endurance test
The signal recording uses requestAnimationFrame for sampling and requestIdleCallback for analysis. The math only runs when the browser is idle.
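The O(1) recording path can be sketched as a ring buffer (illustrative; the frame boundaries come from the requestAnimationFrame loop described above): a fixed window of the last 50 frames per signal, overwritten in place so the heap never grows.

```typescript
const WINDOW = 50; // matches the 50-frame vectors used in the analysis

class RingSignal {
  private buf = new Uint8Array(WINDOW); // one slot per frame: updated or not
  private head = 0;

  // O(1) per frame: overwrite the oldest slot.
  markFrame(updated: boolean): void {
    this.buf[this.head] = updated ? 1 : 0;
    this.head = (this.head + 1) % WINDOW;
  }

  // Unroll into oldest-to-newest order for the similarity math.
  toVector(): number[] {
    return Array.from({ length: WINDOW }, (_, i) => this.buf[(this.head + i) % WINDOW]);
  }
}
```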
The Technique, Summarized
- Represent each state variable as a binary vector: did it update this frame (1) or not (0)?
- Use Cosine Similarity to find variables that update together (redundancy).
- Use Lead-Lag Correlation (offset ±1) to find causal chains (double renders).
- Assign roles (local vs. context) to detect context mirroring.
- Build a directed graph of causal relationships.
- Run Eigenvector Centrality to find the root cause.
None of this looks at actual state values. None of this reads your source code. It's purely behavioral, purely based on timing.
Try It
The technique is implemented as an open-source library called react-state-basis. It wraps your React hooks at build time via a Babel plugin, runs the analysis in development, and compiles down to no-op passthrough hooks in production.
If the math interests you, the wiki goes deeper into the signal processing model, the spectral ranking algorithm, and the performance benchmarks.
If you try it on your codebase and find something interesting, I'd genuinely love to hear about it.