I built a tool that makes any PyTorch model inspectable with one line of code. No retraining, no architecture changes, no extra memory. Here's how it works.
The Problem
You train a model. It works. But why does it work? Which layers matter? Are any neurons dead? What are the attention heads actually doing?
Most interpretability tools try to answer these questions after the fact -- approximations bolted onto a black box. I wanted something different: exact traces of what actually happened inside the model during inference.
The Solution: 3 Lines
pip install hdna-workbench[pytorch]
import workbench
model = workbench.inspect(model) # swap layers for inspectable versions
output = model(input) # same math, same output
traces = workbench.trace(model) # see what every layer did
That's it. workbench.inspect() walks your model and replaces each layer with a subclass that records what happens during forward passes. nn.Linear becomes InspectableLinear, nn.MultiheadAttention becomes InspectableMultiheadAttention, etc.
Because they're subclasses:
- isinstance(layer, nn.Linear) is still True
- model.state_dict() works unchanged
- torch.save(model) works unchanged
- Output is numerically identical
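The subclass-swap trick can be sketched in plain Python. This is not the library's actual code: `Linear`, `InspectableLinear`, and `swap` here are hypothetical stand-ins that only illustrate why `isinstance` checks and outputs survive the swap.

```python
class Linear:
    """Stand-in for nn.Linear: a toy layer with a weight vector."""
    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        return [w * x for w in self.weight]

class InspectableLinear(Linear):
    """Same math as the parent class, but records every forward call."""
    def __init__(self, weight):
        super().__init__(weight)
        self.calls = 0

    def forward(self, x):
        self.calls += 1                  # record, then defer to parent math
        return super().forward(x)

def swap(layer):
    """Rebuild a layer as its inspectable subclass (illustrative only)."""
    return InspectableLinear(layer.weight)

layer = swap(Linear([1.0, 2.0]))
out = layer.forward(3.0)                 # -> [3.0, 6.0], same as before the swap
assert isinstance(layer, Linear)         # isinstance still passes
assert layer.calls == 1                  # ...but calls are now recorded
```

Because the wrapper is a subclass rather than a proxy object, anything that type-checks or serializes the original class keeps working.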
What You Get
Here's real output from a small transformer:
embedding calls=1 shape=(2, 32, 128) time=0.08ms
layers.0.self_attn calls=1 shape=(2, 32, 128) time=2.29ms
Head entropy: ['2.922', '2.987', '2.984', '2.970']
Head sharpness: ['0.173', '0.159', '0.152', '0.156']
Head redundancy: 0.5618
layers.0.linear1 calls=1 shape=(2, 32, 256) time=0.07ms
Weights: mean=0.0001 std=0.0511 sparsity=0.0%
layers.1.self_attn calls=1 shape=(2, 32, 128) time=0.66ms
Head redundancy: 0.8213
norm calls=1 shape=(2, 32, 128) time=0.03ms
head calls=1 shape=(2, 32, 1000) time=0.12ms
Per-layer timing. Attention head entropy (how spread out the attention is). Head redundancy (how similar heads are to each other). Weight statistics. All automatic.
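The metrics themselves are standard quantities. As a sketch (not the library's internals), head entropy is the Shannon entropy of one head's attention distribution, and redundancy can be measured as similarity between heads, here via cosine similarity:

```python
import math

def entropy(p):
    """Shannon entropy (nats) of one attention distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def cosine(a, b):
    """Cosine similarity between two heads' attention rows."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Two heads attending over 4 positions
head_a = [0.25, 0.25, 0.25, 0.25]   # maximally spread: entropy = ln(4) ~= 1.386
head_b = [0.97, 0.01, 0.01, 0.01]   # sharp: entropy close to 0

print(entropy(head_a))              # high entropy: attention is diffuse
print(entropy(head_b))              # low entropy: attention is focused
print(cosine(head_a, head_b))       # redundancy: how alike the two heads are
```

A uniform head over 32 positions would score ln(32) ~= 3.47, which is why the entropies in the trace above hover just under 3.0.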
Going Deeper
Find Dead Neurons and Anomalies
anomalies = workbench.anomalies(model)
for a in anomalies:
    print(f"WARNING: {a['layer']} -- {a['issue']}")
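Under the hood, a dead-neuron check reduces to scanning recorded activations for units that never fire. A minimal stdlib-only sketch of that idea (hypothetical helper, not the library's `anomalies` implementation):

```python
def dead_neurons(activations, eps=1e-8):
    """Return indices of units that stayed at zero across all recorded calls.

    activations: list of per-call output vectors for one layer.
    """
    n = len(activations[0])
    alive = [False] * n
    for out in activations:
        for i, v in enumerate(out):
            if abs(v) > eps:
                alive[i] = True
    return [i for i, a in enumerate(alive) if not a]

# Unit 2 outputs exactly 0.0 on every recorded call -> flagged as dead
recorded = [[0.3, 0.0, 0.0], [1.2, 0.5, 0.0]]
print(dead_neurons(recorded))  # [2]
```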
Inspect Attention Patterns
for name, module in model.named_modules():
    if hasattr(module, 'attention_weights'):
        heads = module.head_summary()
        for h in heads:
            print(f"Head {h['head']}: entropy={h['entropy']:.3f}")
Track Embedding Usage
for name, module in model.named_modules():
    if hasattr(module, 'most_accessed'):
        print(f"Top tokens: {module.most_accessed(10)}")
        print(f"Never accessed: {len(module.never_accessed())} tokens")
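Token-usage tracking is essentially a counter over embedding lookups. A sketch of the idea with a hypothetical `EmbeddingTracker` (the real `InspectableEmbedding` may differ):

```python
from collections import Counter

class EmbeddingTracker:
    """Counts which token ids are looked up during inference."""
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.counts = Counter()

    def lookup(self, token_ids):
        self.counts.update(token_ids)    # record every lookup

    def most_accessed(self, k):
        return self.counts.most_common(k)

    def never_accessed(self):
        return [t for t in range(self.vocab_size) if t not in self.counts]

tracker = EmbeddingTracker(vocab_size=10)
tracker.lookup([1, 1, 4, 7])
tracker.lookup([1, 4])
print(tracker.most_accessed(2))       # [(1, 3), (4, 2)]
print(len(tracker.never_accessed()))  # 7
```

Tokens that are never looked up are a cheap signal of wasted embedding capacity or vocabulary mismatch.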
Set Breakpoints
# Halt when output magnitude exceeds threshold
layer.add_breakpoint(lambda l, inp, out: out.abs().max() > 100)
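Conceptually, a breakpoint is just a predicate checked after each forward pass. A toy sketch of the mechanism (hypothetical `Layer` and `BreakpointHit`, stdlib only):

```python
class BreakpointHit(Exception):
    """Raised when a registered predicate returns True."""

class Layer:
    """Toy layer with add_breakpoint semantics like the snippet above."""
    def __init__(self):
        self.breakpoints = []

    def add_breakpoint(self, predicate):
        self.breakpoints.append(predicate)

    def forward(self, x):
        out = x * 2.0                     # stand-in for the real computation
        for pred in self.breakpoints:
            if pred(self, x, out):        # (layer, input, output), as above
                raise BreakpointHit(f"breakpoint fired: out={out}")
        return out

layer = Layer()
layer.add_breakpoint(lambda l, inp, out: abs(out) > 100)
layer.forward(3.0)       # fine: out = 6.0
try:
    layer.forward(75.0)  # out = 150.0 -> predicate trips
except BreakpointHit as e:
    print(e)
```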
Control Trace Depth
from workbench import TraceDepth
workbench.set_depth(model, TraceDepth.FULL) # activations + gradients + history
workbench.set_depth(model, TraceDepth.STATS) # running statistics only
workbench.set_depth(model, TraceDepth.OFF) # disable for benchmarking
Revert When Done
model = workbench.revert(model) # back to standard PyTorch
torch.save(model, "clean.pt") # no workbench dependency in saved model
It's More Than a Wrapper
The inspection wrapper is one part of a larger platform called HDNA Workbench. HDNA stands for Highly Dynamic Neural Architecture -- it includes:
- An open-box AI engine where every neuron has persistent memory, mutable routing tables, and semantic tags. Not a black box with explanations bolted on -- transparent by design. Core runs on numpy alone.
- Universal adapters that connect any model (PyTorch, HuggingFace, ONNX, or API) to the same research tools
- 6 research tools: Inspector, Decision Replay, Daemon Studio, Experiment Forge, Model Comparison, and Exporter
- 3 built-in curricula: Math (14 phases), Language (sentiment/topic/emotion/intent), Spatial (grid pattern recognition)
- Compliance mapping to EU AI Act, NIST AI RMF, and ISO/IEC 42001
If you just want the PyTorch inspection, pip install hdna-workbench[pytorch] and use the 3 lines above. If you want to study how AI learns from the ground up, the HDNA core is there too.
14 Supported Layer Types
| Category | Layers |
|---|---|
| Core | Linear, Embedding, Sequential |
| Transformer | MultiheadAttention, TransformerEncoderLayer, TransformerDecoderLayer |
| Normalization | LayerNorm, BatchNorm1d, BatchNorm2d |
| Convolution | Conv1d, Conv2d |
| Activation | ReLU, GELU, Softmax |
Custom layers: workbench.register(MyLayer, InspectableMyLayer)
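The registration call suggests a type-to-wrapper registry that the model walker consults, resolving each module to the most specific registered subclass. A plausible sketch (hypothetical internals, not the actual `workbench.register`):

```python
REGISTRY = {}

def register(layer_type, inspectable_type):
    """Map a layer class to its inspectable subclass."""
    REGISTRY[layer_type] = inspectable_type

def resolve(layer):
    """Walk the MRO so subclasses of registered layers resolve too."""
    for cls in type(layer).__mro__:
        if cls in REGISTRY:
            return REGISTRY[cls]
    return None  # unknown layer type: leave it un-instrumented

class MyLayer: ...
class InspectableMyLayer(MyLayer): ...

register(MyLayer, InspectableMyLayer)
print(resolve(MyLayer()))  # resolves to InspectableMyLayer
```

Walking the MRO means a user-defined subclass of a supported layer still gets instrumented without a separate registration.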
Links
- GitHub: github.com/staffman76/HDNA-Workbench
- PyPI: pip install hdna-workbench
- Docs: Wiki
- License: BSL 1.1 (free for research, education, individuals, and orgs under $1M revenue)
Feedback welcome -- especially from anyone working on model interpretability or AI compliance.