DEV Community

MxGuru

Posted on May 20

Two Localizers, Both Wrong: Bounding a Quantization Cost That Wouldn't Close

#quantization #hsaq #methodology #granite

Part 2 of the quantization series. I spent two days and $12 hunting for the right localizer after Part 1 showed that the per-layer drift metric lies. Both candidates (token-level logit divergence at wrong tokens, and AWQ clipping on the surfaced layers) came back empty. The honest finding: an 8B model on a 12GB card costs ~12.7% PPL on wikitext-2, the gap is diffuse and proportional, and no clever subset-targeted fix closes it. One process habit (a no-op control that reproduces the baseline to 4 decimals) caught a silent bug that would have shipped a wrong 'AWQ-clipping wins' claim.
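The no-op control habit can be sketched in a few lines. This is a toy illustration, not the harness used in the series: `patched_pipeline`, `buggy_pipeline`, the `shift` parameter, and the sample log-probabilities are all hypothetical stand-ins. The idea is to run the full "apply fix, then evaluate" path with the fix disabled and require that it reproduces the baseline perplexity to 4 decimals before trusting any comparison.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """PPL = exp(-mean log p(token)) over the evaluated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def noop_control_passes(baseline_ppl: float, control_ppl: float, decimals: int = 4) -> bool:
    """True when the no-op run agrees with the baseline to `decimals` places."""
    return abs(baseline_ppl - control_ppl) < 0.5 * 10 ** (-decimals)


def patched_pipeline(logprobs: Sequence[float], shift: float = 0.0) -> float:
    # Hypothetical stand-in for "apply a fix, then re-evaluate".
    # shift=0.0 is the no-op: it must leave the result unchanged.
    return perplexity([lp + shift for lp in logprobs])


def buggy_pipeline(logprobs: Sequence[float], shift: float = 0.0) -> float:
    # Same path, but it silently drops the first token. A bug like this
    # changes PPL even when the "fix" itself does nothing.
    return perplexity([lp + shift for lp in logprobs[1:]])


# Made-up per-token log-probabilities, for illustration only.
logprobs = [-2.0, -1.0, -3.0, -0.5]
baseline = perplexity(logprobs)

print("clean no-op matches baseline:", noop_control_passes(baseline, patched_pipeline(logprobs)))
print("buggy no-op matches baseline:", noop_control_passes(baseline, buggy_pipeline(logprobs)))
```

If the no-op run fails this check, any improvement measured through the same path is suspect, because the evaluation path itself is moving the number.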


