Meta released TRIBE v2 last week - a foundation model that predicts fMRI brain activation from video, audio, and text. The question I kept coming back to was:
How do we actually compare AI models to the brain in a rigorous, statistical way?
So I built CortexLab - an open-source toolkit that adds the missing analysis layer on top of TRIBE v2.
## The core idea
Take any model (CLIP, DINOv2, V-JEPA2, LLaMA) and ask:
- Do its internal features align with predicted brain activity patterns?
- Which brain regions does it match?
- Is that alignment statistically significant?
## What you can do with it

### Compare models against the brain
- RSA, CKA, Procrustes similarity scoring
- Permutation testing, bootstrap CIs, FDR correction per ROI
- Noise ceiling estimation (upper bound on achievable alignment)
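To make the comparison pipeline concrete, here is a minimal sketch of RSA with a permutation test (illustrative only; the function names and array shapes are my assumptions, not CortexLab's actual API):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(model_feats, brain_preds):
    """Spearman correlation between the upper triangles of the two
    representational dissimilarity matrices (one RDM row per stimulus)."""
    rdm_model = pdist(model_feats, metric="correlation")
    rdm_brain = pdist(brain_preds, metric="correlation")
    rho, _ = spearmanr(rdm_model, rdm_brain)
    return rho

def permutation_p(model_feats, brain_preds, n_perm=1000, seed=0):
    """One-sided p-value: how often does shuffling the stimulus order
    match or beat the observed RSA score?"""
    rng = np.random.default_rng(seed)
    observed = rsa_score(model_feats, brain_preds)
    null = [rsa_score(model_feats, brain_preds[rng.permutation(len(brain_preds))])
            for _ in range(n_perm)]
    return observed, (1 + sum(s >= observed for s in null)) / (1 + n_perm)

# Toy usage: 50 stimuli, 128-d model features vs 200 predicted vertices
rng = np.random.default_rng(42)
feats = rng.standard_normal((50, 128))
brain = feats @ rng.standard_normal((128, 200)) + rng.standard_normal((50, 200))
score, p = permutation_p(feats, brain, n_perm=200)
```

Shuffling rows rather than parametric testing keeps the null distribution honest, since RDM entries are not independent samples.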
### Analyze brain responses
- Cognitive load scoring across 4 dimensions (visual, auditory, language, executive)
- Peak response latency per ROI (reveals cortical processing hierarchy)
- Lag correlations and sustained vs transient response decomposition
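A sketch of how per-ROI peak latency can be estimated via lag correlation (hypothetical names and TR; not the toolkit's implementation):

```python
import numpy as np

def peak_lag(stimulus, roi_ts, max_lag, tr=1.5):
    """Return the lag (in seconds) at which the ROI time series
    correlates most strongly with the stimulus regressor,
    plus the correlation at that lag."""
    lags = range(0, max_lag + 1)
    corrs = [np.corrcoef(stimulus[:len(stimulus) - lag],
                         roi_ts[lag:])[0, 1] for lag in lags]
    best = int(np.argmax(corrs))
    return best * tr, corrs[best]

# Toy usage: an ROI that echoes the stimulus 3 TRs later
rng = np.random.default_rng(0)
stim = rng.standard_normal(200)
roi = np.roll(stim, 3) + 0.1 * rng.standard_normal(200)
lag_s, r = peak_lag(stim, roi, max_lag=6)
```

Sorting ROIs by this latency is one simple way to expose the cortical processing hierarchy mentioned above.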
### Study brain networks
- ROI connectivity matrices with partial correlation
- Network clustering, modularity, degree/betweenness centrality
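Partial correlation distinguishes direct ROI-to-ROI edges from correlations that merely pass through a third region. A minimal precision-matrix sketch (illustrative, not CortexLab's code):

```python
import numpy as np

def partial_correlation(ts):
    """Partial correlation between ROIs via the inverse covariance
    (precision) matrix: r_ij = -P_ij / sqrt(P_ii * P_jj).
    `ts` has shape (timepoints, rois)."""
    prec = np.linalg.pinv(np.cov(ts, rowvar=False))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Toy usage: A and B are independent, but C = A + B; conditioning on
# the common downstream node C induces a strong negative partial edge.
rng = np.random.default_rng(1)
a = rng.standard_normal(500)
b = rng.standard_normal(500)
c = a + b + 0.1 * rng.standard_normal(500)
pc = partial_correlation(np.column_stack([a, b, c]))
```

The resulting matrix can be thresholded and fed to standard graph metrics (modularity, degree, betweenness).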
### Real-time inference
- Sliding-window streaming predictions for BCI-style pipelines
- Cross-subject adaptation with minimal calibration data
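The streaming pattern is easy to sketch: buffer incoming feature frames and re-predict on each new frame once the window fills. A minimal version (the predictor interface here is hypothetical):

```python
from collections import deque
import numpy as np

class SlidingWindowPredictor:
    """Buffer incoming feature frames; emit a prediction once the
    window is full, then re-predict on every subsequent frame."""
    def __init__(self, predict_fn, window=8):
        self.predict_fn = predict_fn
        self.buffer = deque(maxlen=window)

    def push(self, frame):
        self.buffer.append(frame)
        if len(self.buffer) == self.buffer.maxlen:
            return self.predict_fn(np.stack(self.buffer))
        return None  # still warming up

# Toy usage: the "prediction" is just the per-feature window mean
pred = SlidingWindowPredictor(lambda w: w.mean(axis=0), window=4)
outputs = [pred.push(np.full(3, t, dtype=float)) for t in range(6)]
```

In a real BCI-style pipeline, `predict_fn` would wrap the brain-encoding model and the window length would be tied to the HRF delay.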
## Example results

Benchmark output comparing four models (synthetic data, so the scores reflect properties of the alignment methods, not real claims about the brain):
```
clip-vit-b32:
  rsa: +0.0407 (p=0.104, CI=[0.011, 0.203])
  cka: +0.8561 (p=0.174, CI=[0.903, 0.937])
dinov2-vit-s:
  rsa: -0.0052 (p=0.542, CI=[-0.042, 0.164])
  cka: +0.8434 (p=0.403, CI=[0.895, 0.932])
vjepa2-vit-g:
  rsa: +0.0121 (p=0.333, CI=[-0.010, 0.166])
  cka: +0.8731 (p=0.438, CI=[0.915, 0.944])
llama-3.2-3b:
  rsa: -0.0075 (p=0.642, CI=[-0.026, 0.145])
  cka: +0.8848 (p=0.731, CI=[0.922, 0.949])
```
## Why this isn't just TRIBE v2
TRIBE v2 gives raw vertex-level brain predictions. CortexLab adds:
- Statistical testing (is this score meaningful?)
- Interpretability (which ROIs, which modality, how does it evolve over time?)
- Model comparison framework (is model A significantly better than model B?)
Without that, you have predictions. With this, you can draw conclusions.
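The "is model A significantly better than model B?" question comes down to a paired test over ROIs. A sketch of a percentile bootstrap on per-ROI score differences (names and numbers are illustrative):

```python
import numpy as np

def bootstrap_diff_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-ROI score difference
    between two models. If the CI excludes 0, the models are
    distinguishable at the chosen level."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_a) - np.asarray(scores_b)
    boots = [diffs[rng.integers(0, len(diffs), len(diffs))].mean()
             for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return diffs.mean(), (lo, hi)

# Toy usage: model A is ~0.05 better than model B across 30 ROIs
rng = np.random.default_rng(7)
a = 0.30 + 0.02 * rng.standard_normal(30)
b = 0.25 + 0.02 * rng.standard_normal(30)
mean_diff, (lo, hi) = bootstrap_diff_ci(a, b)
```

Resampling ROIs as pairs keeps the comparison matched: both models are always scored on the same resampled set of regions.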
## Interactive demo (no GPU needed)
There's a Streamlit dashboard with biologically realistic synthetic data (HRF convolution, modality-specific activation, spatial smoothing). You can explore all analysis tools interactively.
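For readers unfamiliar with HRF convolution, here is a sketch of how a synthetic BOLD signal can be generated from a neural regressor using a common double-gamma parameterization (shapes 6 and 16, undershoot ratio 1/6); this is illustrative, not the dashboard's exact generator:

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(tr=1.5, duration=30.0):
    """Double-gamma hemodynamic response function sampled at the TR:
    a positive lobe peaking around 5 s minus a late undershoot."""
    t = np.arange(0, duration, tr)
    peak = gamma.pdf(t, 6)
    undershoot = gamma.pdf(t, 16)
    hrf = peak - undershoot / 6.0
    return hrf / hrf.max()

def convolve_design(neural, tr=1.5):
    """Convolve a neural regressor with the HRF, trimming the
    result back to the original length."""
    hrf = double_gamma_hrf(tr=tr)
    return np.convolve(neural, hrf)[: len(neural)]

# Toy usage: a single event at t=0 yields a delayed, smoothed response
bold = convolve_design(np.eye(1, 40).ravel())
```

The same convolution, applied per modality with different regressors and then spatially smoothed, gives "biologically realistic" synthetic data without any scanner in the loop.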
Links:
- GitHub: https://github.com/siddhant-rajhans/cortexlab
- Live demo: https://huggingface.co/spaces/SID2000/cortexlab-dashboard
- HuggingFace: https://huggingface.co/SID2000/cortexlab
76 tests, a CC BY-NC 4.0 license, and 3 external contributors already.
## Looking for feedback
Especially interested in:
- Better alignment metrics beyond RSA/CKA/Procrustes
- Neuroscience validity of the ROI-to-cognitive-dimension mapping
- Ideas for real-world benchmarks (datasets, model comparisons)
Happy to answer questions about the implementation or methodology.