
BN


I built a token-level debugger for comparing two LLMs

Same prompt, two models, different outputs. None of the tooling I tried showed me where they actually diverged.
So I built tokenflame, which gives you entropy heatmaps, tokenizer diffs, divergence markers, and token-by-token replay. One command, one HTML file.
pip install tokenflame
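
To make "entropy" and "divergence marker" concrete, here is a minimal sketch of the two underlying ideas in plain Python. It is not tokenflame's code or API: the function names `token_entropy` and `first_divergence` and the sample token lists are made up for illustration, and it assumes you already have each model's output tokens and next-token probabilities.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in bits) of one next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def first_divergence(tokens_a, tokens_b):
    """Index of the first position where two token sequences differ, or None."""
    for i, (a, b) in enumerate(zip(tokens_a, tokens_b)):
        if a != b:
            return i
    # One sequence is a strict prefix of the other: they diverge where it ends.
    if len(tokens_a) != len(tokens_b):
        return min(len(tokens_a), len(tokens_b))
    return None


# Hypothetical outputs from two models for the same prompt.
model_a = ["The", " cat", " sat", " down"]
model_b = ["The", " cat", " slept"]

print(first_divergence(model_a, model_b))  # 2
print(token_entropy([0.5, 0.5]))           # 1.0
```

High entropy at a position means the model was unsure what to emit next, which is often where two models start to part ways.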
