Cell 0 uses df. Cell 1 defines df.
The notebook works for you because your kernel ran the cells in some other order and the variable is still in memory. You commit. Someone clones the repo, hits Restart and Run All, and dies on cell 0.
Standard Python linters can't catch this. ruff, flake8, mypy operate on one source file at a time. A notebook is N cells whose execution order in your kernel may have nothing to do with their order on disk. The bug isn't inside any single cell. It's in the relationship between cells.
nborder is a static linter for that relationship.
## Rules
| Code | Flags |
|---|---|
| NB101 | `execution_count` decreases in source order |
| NB201 | Name used in cell N, only defined in cell M where M > N |
| NB102 | Name used somewhere, never defined anywhere |
| NB103 | Stochastic call (numpy, torch, tensorflow, stdlib random) before any seed |
## How the cross-cell analysis works
Each cell gets parsed with libCST. A visitor extracts symbol definitions (assignments, function defs, class defs, imports) and symbol uses (name references, attribute roots) per cell. Connect them across cells in source order and you get a dataflow graph at notebook scope.
NB201 findings are uses whose nearest matching definition lives in a later cell. NB102 findings are uses with no matching definition anywhere.
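As a rough illustration of that def/use pass, here is a minimal sketch using the stdlib `ast` module in place of libCST. The function names, the tuple output, and the intra-cell ordering shortcut are all simplifications of mine, not nborder's API:

```python
import ast
import builtins

def defs_and_uses(source):
    """Collect the names a cell defines and the names it reads."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            (defined if isinstance(node.ctx, ast.Store) else used).add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used

def analyze(cells):
    """Walk cells in source order and classify each unresolved use.

    A use whose only definition lives in a later cell is NB201; a use
    with no definition anywhere is NB102. (Ordering *inside* a cell is
    ignored here -- a real linter tracks it.)
    """
    info = [defs_and_uses(src) for src in cells]
    all_defined = set().union(*(d for d, _ in info))
    findings, defined_so_far = [], set()
    for i, (defined, used) in enumerate(info):
        for name in sorted(used - defined - defined_so_far - set(dir(builtins))):
            code = "NB201" if name in all_defined else "NB102"
            findings.append((i, name, code))
        defined_so_far |= defined
    return findings
```

On the two-cell example from the intro, this flags `df` in cell 0 as NB201, since its only definition sits in cell 1.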
The graph also makes the auto-fix safe. When NB201 fires, the fixer runs a topological sort over cell dependency edges. Sort succeeds, cells get reordered to respect dataflow and execution counts get cleared. Cycle detected, fixer bails with an explicit message naming the cycle.
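The reorder-or-bail step can be sketched with the stdlib `graphlib`. The function name and the precomputed per-cell `defs`/`uses` sets are hypothetical; nborder's actual fixer also clears execution counts and preserves cell metadata:

```python
from graphlib import CycleError, TopologicalSorter

def reorder_cells(cells, defs, uses):
    """Reorder cells so every name is defined before it is used.

    `cells` is a list of sources; `defs`/`uses` are per-cell name sets
    (assumed precomputed by the def/use pass). Returns the reordered
    sources, or None when the dependencies are cyclic and no safe
    reorder exists.
    """
    deps = {i: set() for i in range(len(cells))}
    for i, used in enumerate(uses):
        for name in used:
            for j in range(len(cells)):  # first defining cell wins
                if j != i and name in defs[j]:
                    deps[i].add(j)
                    break
    try:
        order = list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None  # the real fixer bails with a message naming the cycle
    return [cells[i] for i in order]
```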
A walkthrough of an NB201 fix. Cell IDs are preserved; execution counts are cleared. Input:
```python
# cell 0
result = df.head()

# cell 1
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
```
Running `nborder check --fix notebook.ipynb`:

```
notebook.ipynb:cell_0:1:10: NB201 Variable `df` used in cell 0 is only defined in cell 1. The notebook will fail on Restart-and-Run-All. [*]
```
Fix outcomes:

```
reorder: applied (reordered 2 cells and cleared execution counts)
```
```python
# cell 0
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})

# cell 1
result = df.head()
```

`nborder check` now exits 0.
## NB103 and seed injection
NB103 walks the same graph for stochastic calls (`np.random.rand`, `torch.rand`, `tf.random.normal`, `random.random`) that fire before any matching seed. The fix injects a single seed cell at the right position. Multi-library notebooks get one combined cell:
```python
import numpy as np
np.random.seed(42)
rng = np.random.default_rng(42)
import torch
torch.manual_seed(42)
```
Alias-aware: `import numpy as numpy_lib` produces a seed line using `numpy_lib`, not a redundant fresh import. After fixing a NumPy notebook, computed cell outputs are byte-identical across consecutive `jupyter nbconvert --execute` runs.
JAX and scikit-learn get diagnostic-only handling. JAX needs PRNGKey threading through call signatures. sklearn random_state=None needs a value chosen against your testing strategy. Neither is a single line you can inject.
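A minimal sketch of the NB103 scan, assuming flat dotted-name matching. nborder's real rule is alias-aware and matches seeds per library; here any seed call covers everything, and granularity is per cell, which are simplifications of mine:

```python
import ast

# Flat dotted-name lists; these assume the canonical `np` / `torch` /
# `tf` / `random` spellings rather than resolving import aliases.
SEED_CALLS = {"np.random.seed", "np.random.default_rng",
              "torch.manual_seed", "random.seed"}
STOCHASTIC_CALLS = {"np.random.rand", "torch.rand",
                    "tf.random.normal", "random.random"}

def dotted_name(node):
    """Rebuild `a.b.c` from a Call's func node, or None if dynamic."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def check_nb103(cells):
    """Flag stochastic calls in cells that run before any seed call.

    A seed anywhere in the same or an earlier cell counts, and one seed
    covers all libraries -- a deliberate simplification.
    """
    seeded, findings = False, []
    for i, src in enumerate(cells):
        calls = [dotted_name(n.func) for n in ast.walk(ast.parse(src))
                 if isinstance(n, ast.Call)]
        if any(c in SEED_CALLS for c in calls):
            seeded = True
        if not seeded:
            findings.extend((i, c) for c in calls if c in STOCHASTIC_CALLS)
    return findings
```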
## Byte-stable writer
Parse a notebook, modify nothing, write it back, bytes match exactly. Verified against nbformat v4.0, v4.4, v4.5 fixtures plus a real-world notebook corpus. When the writer does mutate during a fix, only the cells that actually changed get rewritten. Cell IDs, metadata, and unrelated cells stay verbatim.
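One way to get that round-trip property, assuming the file was last serialized by nbformat (which writes JSON with an indent of 1 and sorted keys): re-serialize with the same layout and compare bytes. The function names here are illustrative, not nborder's API, and hand-edited notebooks can break the assumption:

```python
import json

def writes_like_nbformat(nb: dict) -> bytes:
    # nbformat serializes notebooks as JSON with indent=1 and sorted
    # keys; matching that layout (plus a trailing newline) is one way a
    # no-op round-trip can come out byte-identical. Assumption: the
    # file was last written by nbformat itself.
    text = json.dumps(nb, indent=1, sort_keys=True, ensure_ascii=False)
    return (text + "\n").encode("utf-8")

def roundtrip_is_stable(raw: bytes) -> bool:
    """Parse, change nothing, write back -- do the bytes match?"""
    return writes_like_nbformat(json.loads(raw)) == raw
```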
## Outputs

Four reporters:
- `text`: ruff-style `path:cell:line:col: NB### message`
- `json`: machine-readable
- `github`: `::error file=...,line=...,title=NB201::` annotations for PR inline comments
- `sarif`: SARIF 2.1.0, schema-validated
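The text reporter's line format, as seen in the diagnostic above, can be sketched like this (the `Finding` shape is a guess for illustration, not nborder's internal type):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    # Illustrative field names -- not nborder's internal type.
    path: str
    cell: int
    line: int
    col: int
    code: str
    message: str
    fixable: bool = False

def render_text(f: Finding) -> str:
    """ruff-style line: path:cell:line:col: CODE message, plus [*] if fixable."""
    suffix = " [*]" if f.fixable else ""
    return f"{f.path}:cell_{f.cell}:{f.line}:{f.col}: {f.code} {f.message}{suffix}"
```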
A pre-commit hook and a composite GitHub Action are included:

```yaml
- uses: moonrunnerkc/nborder@v0.1.4
  with:
    path: notebooks/
    select: NB201,NB103
```
## What it doesn't do
- Doesn't execute notebooks. Pair with nbval or papermill for kernel-level validation.
- Doesn't lint cell-internal style. That's nbqa.
- Dynamic name resolution (`exec`, `getattr`, `**kwargs`, monkey-patching) is invisible. Same limitation as any static analyzer.
- Cell magics are stripped before analysis. Names introduced by `%%capture` get tracked. Anything magic-internal does not.
## Install
# nborder
A fast, opinionated linter and auto-fixer for Jupyter notebook hidden-state and execution-order bugs.
## What this catches
| Code | Name | One-line example |
|---|---|---|
| NB101 | Non-monotonic execution counts | Cell 1 ran with `In [3]:` after cell 0 ran with `In [5]:`. |
| NB102 | Won't survive Restart-and-Run-All | `print(df)` references a name no cell in the notebook defines. |
| NB201 | Use-before-assign across cells | Cell 0 uses `df`; `df = ...` only appears in cell 1. |
| NB103 | Stochastic library used without seed | `np.random.rand(3)` runs with no seed call before it. |
Each rule has a docs page under docs/rules/ explaining the bug class, a bad and good example, and the auto-fix behaviour. The four sections below walk through each rule with the diagnostic nborder actually emits.
## NB101: out-of-order execution
The `execution_count` field on each cell records the order Jupyter actually ran cells in, not the order they appear in the file. When those orders disagree, the recorded…
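A monotonicity check over those recorded counts might look like this sketch (extraction of the field from the notebook JSON is omitted; the function name is mine):

```python
def check_nb101(execution_counts):
    """Flag cells whose recorded execution_count is lower than one seen
    in an earlier cell. None (a never-run cell) is skipped."""
    findings, highest = [], None
    for i, count in enumerate(execution_counts):
        if count is None:
            continue
        if highest is not None and count < highest:
            # (cell index, its count, the higher count seen earlier)
            findings.append((i, count, highest))
        highest = count if highest is None else max(highest, count)
    return findings
```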
