Aman Sachan

I Compressed GPT-2 to Run on an Arduino

The Impossible Problem

GPT-2 Small: 124M parameters ≈ 500 MB in FP32

Arduino Uno: 2KB RAM, 32KB Flash

Gap: ~250,000x between model size and available RAM
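
The back-of-the-envelope math behind that number (assuming 4 bytes per FP32 weight; the figures are approximations):

params = 124_000_000        # GPT-2 Small parameter count
fp32_bytes = params * 4     # 4 bytes per FP32 weight
uno_ram_bytes = 2 * 1024    # Arduino Uno SRAM

print(fp32_bytes / 1e6)            # ~496 MB
print(fp32_bytes / uno_ram_bytes)  # ~242,000x larger than the Uno's RAM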

The Solution

I built BitForge, an aggressive LLM quantization toolkit for microcontrollers.

What It Does

  • 1-bit to 8-bit quantization
  • Adaptive per-layer bit width (see the sketch after this list)
  • Pure C99 output
  • No dependencies
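
As a rough illustration of how adaptive per-layer quantization can work, here is a minimal Python sketch. The function names, candidate bit widths, and error tolerance are assumptions for illustration, not BitForge's actual API.

import numpy as np

def quantize_layer(weights, bits):
    # Symmetric uniform quantization: one FP32 scale per layer,
    # integer codes in [-(2**(bits-1) - 1), 2**(bits-1) - 1].
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

def choose_bits(weights, candidates=(2, 4, 8), tolerance=0.05):
    # Adaptive bit width: use the smallest width whose relative
    # reconstruction error stays under the tolerance.
    # (1-bit, sign-only quantization would need a special case.)
    for bits in candidates:
        codes, scale = quantize_layer(weights, bits)
        err = np.mean(np.abs(weights - codes * scale)) / np.mean(np.abs(weights))
        if err <= tolerance:
            return bits
    return candidates[-1]

layer = np.random.randn(768, 768).astype(np.float32)  # a GPT-2-sized weight matrix
bits = choose_bits(layer)
codes, scale = quantize_layer(layer, bits)
print(bits, codes.nbytes, scale)

Note that the sketch stores one int8 per weight for simplicity; packing the 2- or 4-bit codes tightly is what actually delivers the storage savings.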

Results

  • 8x compression achieved
  • 99.3% correlation with the full-precision model preserved
  • Tested on ESP32, Arduino, STM32 targets
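
For context on the correlation figure: one common way to measure it (a sketch, not necessarily how BitForge computes it) is the Pearson correlation between the logits of the full-precision model and the quantized model on the same inputs:

import numpy as np

def logit_correlation(reference_logits, quantized_logits):
    # Pearson correlation between flattened logits of the FP32 reference
    # model and the quantized model, evaluated on the same prompts.
    return np.corrcoef(reference_logits.ravel(), quantized_logits.ravel())[0, 1]

# Hypothetical logits for 8 prompts (GPT-2's vocabulary size is 50257).
reference = np.random.randn(8, 50257).astype(np.float32)
quantized = reference + 0.05 * np.random.randn(8, 50257)  # stand-in for quantization noise
print(f"correlation: {logit_correlation(reference, quantized):.3f}")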

Try It

pip install bitforge
bitforge compress gpt2 --target esp32-s3 --bits 4

GitHub: https://github.com/AmSach/bitforge
