The Impossible Problem
GPT-2 Small: 124M parameters = ~500MB
Arduino Uno: 2KB RAM, 32KB Flash
Gap: roughly 250,000x (500 MB of weights vs. 2 KB of RAM)
The Solution
I built BitForge, an aggressive LLM quantization toolkit for microcontrollers.
What It Does
- 1-bit to 8-bit quantization
- Adaptive per-layer bit width
- Pure C99 output
- No dependencies
Results
- 8x compression achieved (FP32 weights to 4-bit)
- 99.3% correlation preserved
- Tested on ESP32, Arduino, STM32 targets
Try It
pip install bitforge
bitforge compress gpt2 --target esp32-s3 --bits 4