<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sid</title>
    <description>The latest articles on DEV Community by Sid (@sid_r).</description>
    <link>https://dev.to/sid_r</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3859951%2Feeda33cf-0f35-4e56-8b33-c8dfddc657e3.png</url>
      <title>DEV Community: Sid</title>
      <link>https://dev.to/sid_r</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sid_r"/>
    <language>en</language>
    <item>
      <title>Which AI models are actually "brain-like"? I built an open-source benchmark to measure it</title>
      <dc:creator>Sid</dc:creator>
      <pubDate>Fri, 03 Apr 2026 18:44:11 +0000</pubDate>
      <link>https://dev.to/sid_r/which-ai-models-are-actually-brain-like-i-built-an-open-source-benchmark-to-measure-it-fja</link>
      <guid>https://dev.to/sid_r/which-ai-models-are-actually-brain-like-i-built-an-open-source-benchmark-to-measure-it-fja</guid>
      <description>&lt;p&gt;Meta released TRIBE v2 last week - a foundation model that predicts fMRI brain activation from video, audio, and text. The question I kept coming back to was:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do we actually compare AI models to the brain in a rigorous, statistical way?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So I built CortexLab - an open-source toolkit that adds the missing analysis layer on top of TRIBE v2.&lt;/p&gt;

&lt;h2&gt;The core idea&lt;/h2&gt;

&lt;p&gt;Take any model (CLIP, DINOv2, V-JEPA2, LLaMA) and ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do its internal features align with predicted brain activity patterns?&lt;/li&gt;
&lt;li&gt;Which brain regions does it match?&lt;/li&gt;
&lt;li&gt;Is that alignment statistically significant?&lt;/li&gt;
&lt;/ul&gt;
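&lt;p&gt;The first question is usually made concrete with representational similarity analysis (RSA): build a stimulus-by-stimulus dissimilarity matrix from each side, then correlate their upper triangles. A minimal NumPy/SciPy sketch of that idea (array shapes and function names here are illustrative, not CortexLab's API):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of stimulus vectors (rows)."""
    return 1.0 - np.corrcoef(features)

def rsa_score(model_features, brain_responses):
    """Spearman correlation between the upper triangles of the two RDMs."""
    m, b = rdm(model_features), rdm(brain_responses)
    iu = np.triu_indices_from(m, k=1)
    return spearmanr(m[iu], b[iu]).correlation

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 128))       # 20 stimuli x 128 model features
Y = X @ rng.normal(size=(128, 50))   # toy "brain" responses driven by X
print(round(rsa_score(X, Y), 3))     # high, since Y inherits X's geometry
```

&lt;p&gt;The score is bounded in [-1, 1]; a random-projection "brain" like the toy one above scores high because pairwise geometry is roughly preserved.&lt;/p&gt;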

&lt;h2&gt;What you can do with it&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Compare models against the brain&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RSA, CKA, Procrustes similarity scoring&lt;/li&gt;
&lt;li&gt;Permutation testing, bootstrap CIs, FDR correction per ROI&lt;/li&gt;
&lt;li&gt;Noise ceiling estimation (upper bound on achievable alignment)&lt;/li&gt;
&lt;/ul&gt;
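&lt;p&gt;The permutation test in that list works by shuffling the stimulus correspondence on one side and rescoring, which yields a null distribution to compare the observed score against. A rough sketch of the general recipe (not CortexLab's exact routine):&lt;/p&gt;

```python
import numpy as np

def permutation_pvalue(score_fn, X, Y, n_perm=1000, seed=0):
    """One-sided p-value: fraction of row-shuffled scores that
    meet or exceed the observed score."""
    rng = np.random.default_rng(seed)
    observed = score_fn(X, Y)
    null = np.empty(n_perm)
    for i in range(n_perm):
        null[i] = score_fn(X, Y[rng.permutation(len(Y))])
    # +1 in numerator and denominator keeps p away from exactly zero
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# toy score: correlation between the first feature of each side
score = lambda a, b: np.corrcoef(a[:, 0], b[:, 0])[0, 1]
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
Y = X + 0.1 * rng.normal(size=(30, 4))   # strongly aligned with X
print(permutation_pvalue(score, X, Y, n_perm=500))
```

&lt;p&gt;With 500 permutations the smallest reportable p-value is 1/501, which is one reason the p-values in benchmark output bottom out rather than reaching zero.&lt;/p&gt;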

&lt;p&gt;&lt;strong&gt;Analyze brain responses&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cognitive load scoring across 4 dimensions (visual, auditory, language, executive)&lt;/li&gt;
&lt;li&gt;Peak response latency per ROI (reveals cortical processing hierarchy)&lt;/li&gt;
&lt;li&gt;Lag correlations and sustained vs transient response decomposition&lt;/li&gt;
&lt;/ul&gt;
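&lt;p&gt;Peak response latency can be estimated by correlating each ROI time series against the stimulus regressor at a range of lags and taking the argmax. A simplified version, assuming evenly sampled time series (this is the general technique, not the toolkit's implementation):&lt;/p&gt;

```python
import numpy as np

def peak_lag(stimulus, roi_ts, max_lag=8):
    """Return the lag (in samples) at which the ROI time series
    correlates most strongly with the stimulus regressor."""
    scores = []
    for lag in range(max_lag + 1):
        n = len(stimulus) - lag
        scores.append(np.corrcoef(stimulus[:n], roi_ts[lag:])[0, 1])
    return int(np.argmax(scores))

t = np.arange(200)
stim = np.sin(t / 7.0)
early_roi = np.roll(stim, 2)   # responds 2 samples after the stimulus
late_roi = np.roll(stim, 6)    # responds 6 samples after
print(peak_lag(stim, early_roi), peak_lag(stim, late_roi))
```

&lt;p&gt;Sorting ROIs by this lag is what exposes the early-sensory-to-associative processing hierarchy mentioned above.&lt;/p&gt;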

&lt;p&gt;&lt;strong&gt;Study brain networks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ROI connectivity matrices with partial correlation&lt;/li&gt;
&lt;li&gt;Network clustering, modularity, degree/betweenness centrality&lt;/li&gt;
&lt;/ul&gt;
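&lt;p&gt;Partial correlation between two ROIs conditions out all the others, which is why it separates direct links from ones mediated by a third region. One standard route is normalizing the inverse covariance (precision) matrix; a compact sketch (the ridge value is my choice for numerical stability, not CortexLab's):&lt;/p&gt;

```python
import numpy as np

def partial_correlation(ts, ridge=1e-3):
    """ROI-by-ROI partial correlation from time series of shape
    (timepoints, rois), via the regularized precision matrix."""
    cov = np.cov(ts, rowvar=False) + ridge * np.eye(ts.shape[1])
    prec = np.linalg.inv(cov)
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

rng = np.random.default_rng(2)
a = rng.normal(size=500)
b = a + 0.3 * rng.normal(size=500)   # b driven by a
c = b + 0.3 * rng.normal(size=500)   # c driven by b, not directly by a
P = partial_correlation(np.column_stack([a, b, c]))
# the a-c link shrinks toward zero once b is conditioned out
print(round(abs(P[0, 2]), 3), round(abs(P[0, 1]), 3))
```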

&lt;p&gt;&lt;strong&gt;Real-time inference&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sliding-window streaming predictions for BCI-style pipelines&lt;/li&gt;
&lt;li&gt;Cross-subject adaptation with minimal calibration data&lt;/li&gt;
&lt;/ul&gt;
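&lt;p&gt;The streaming mode boils down to keeping a fixed-length buffer of recent frames and running prediction each time the window advances by a step. A generic sketch using a deque (the predict argument is a stand-in for the real model call, not CortexLab's API):&lt;/p&gt;

```python
from collections import deque

def stream_predictions(frames, window=4, step=2, predict=sum):
    """Yield a prediction over each sliding window of recent frames."""
    buf = deque(maxlen=window)
    for i, frame in enumerate(frames):
        buf.append(frame)
        # fire once the buffer is full, then every `step` frames
        if len(buf) == window and (i - window + 1) % step == 0:
            yield predict(list(buf))

print(list(stream_predictions(range(10), window=4, step=2)))  # → [6, 14, 22, 30]
```

&lt;p&gt;The window length trades latency against context: BCI-style pipelines want it short, alignment analyses can afford it long.&lt;/p&gt;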

&lt;h2&gt;Example results&lt;/h2&gt;

&lt;p&gt;Benchmark output comparing 4 models (synthetic data, so scores reflect properties of the alignment methods, not real brain claims):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  clip-vit-b32:
       rsa: +0.0407  (p=0.104, CI=[0.011, 0.203])
       cka: +0.8561  (p=0.174, CI=[0.903, 0.937])

  dinov2-vit-s:
       rsa: -0.0052  (p=0.542, CI=[-0.042, 0.164])
       cka: +0.8434  (p=0.403, CI=[0.895, 0.932])

  vjepa2-vit-g:
       rsa: +0.0121  (p=0.333, CI=[-0.010, 0.166])
       cka: +0.8731  (p=0.438, CI=[0.915, 0.944])

  llama-3.2-3b:
       rsa: -0.0075  (p=0.642, CI=[-0.026, 0.145])
       cka: +0.8848  (p=0.731, CI=[0.922, 0.949])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Why this isn't just TRIBE v2&lt;/h2&gt;

&lt;p&gt;TRIBE v2 gives raw vertex-level brain predictions. CortexLab adds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Statistical testing (is this score meaningful?)&lt;/li&gt;
&lt;li&gt;Interpretability (which ROIs, which modality, how does it evolve over time?)&lt;/li&gt;
&lt;li&gt;Model comparison framework (is model A significantly better than model B?)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without that layer, you have predictions. With it, you can draw conclusions.&lt;/p&gt;

&lt;h2&gt;Interactive demo (no GPU needed)&lt;/h2&gt;

&lt;p&gt;There's a Streamlit dashboard with biologically realistic synthetic data (HRF convolution, modality-specific activation, spatial smoothing). You can explore all analysis tools interactively.&lt;/p&gt;
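&lt;p&gt;"HRF convolution" here means the synthetic neural signal is convolved with a haemodynamic response function before being treated as fMRI-like data, so events peak seconds after they occur. A toy version using a single-gamma HRF (parameter values are typical textbook choices, not necessarily the dashboard's):&lt;/p&gt;

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(tr=1.0, duration=30.0):
    """Simple single-gamma HRF sampled at the scan repetition time."""
    t = np.arange(0, duration, tr)
    h = gamma.pdf(t, a=6)   # peaks around 5 s post-stimulus
    return h / h.sum()

def to_bold(neural, tr=1.0):
    """Convolve a neural time series with the HRF (causal part only)."""
    h = canonical_hrf(tr)
    return np.convolve(neural, h)[: len(neural)]

neural = np.zeros(60)
neural[10] = 1.0            # brief event at t = 10 s
bold = to_bold(neural)
print(int(np.argmax(bold))) # BOLD peak lands several seconds after the event
```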

&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/siddhant-rajhans/cortexlab" rel="noopener noreferrer"&gt;https://github.com/siddhant-rajhans/cortexlab&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Live demo: &lt;a href="https://huggingface.co/spaces/SID2000/cortexlab-dashboard" rel="noopener noreferrer"&gt;https://huggingface.co/spaces/SID2000/cortexlab-dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;HuggingFace: &lt;a href="https://huggingface.co/SID2000/cortexlab" rel="noopener noreferrer"&gt;https://huggingface.co/SID2000/cortexlab&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The project ships with 76 tests, is licensed under CC BY-NC 4.0, and already has 3 external contributors.&lt;/p&gt;

&lt;h2&gt;Looking for feedback&lt;/h2&gt;

&lt;p&gt;Especially interested in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better alignment metrics beyond RSA/CKA/Procrustes&lt;/li&gt;
&lt;li&gt;Neuroscience validity of the ROI-to-cognitive-dimension mapping&lt;/li&gt;
&lt;li&gt;Ideas for real-world benchmarks (datasets, model comparisons)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy to answer questions about the implementation or methodology.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
