LoRA vs Adapter vs Prefix Tuning: PEFT Memory Comparison

#lora #peft #adapter #prefixtuning

Why Full Fine-Tuning Became Unaffordable

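Fully fine-tuning GPT-3 175B means updating all 175 billion parameters. Adam keeps two moment estimates per parameter, so in fp32 the optimizer states alone take about 1.4 TB (2 × 4 bytes × 175 billion), on top of roughly 700 GB each for the weights and the gradients. Most teams can't afford that.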
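Here is a rough sketch of the memory arithmetic for full fine-tuning with Adam. It assumes every tensor (weights, gradients, both Adam moments) is stored at the same precision and ignores activations. The function name and the `bytes_per_value` parameter are illustrative, not from any library.

```python
# Back-of-envelope memory for full fine-tuning with Adam.
# Assumption: weights, gradients, and both Adam moments use the same precision.
# Activations and framework overhead are ignored.

N_PARAMS = 175_000_000_000  # GPT-3 175B


def training_memory_gb(n_params, bytes_per_value=4):
    """Return a per-component memory estimate in GB (1 GB = 1e9 bytes)."""
    per_copy_gb = n_params * bytes_per_value / 1e9
    return {
        "weights": per_copy_gb,
        "gradients": per_copy_gb,
        "adam_states": 2 * per_copy_gb,  # first and second moment estimates
        "total": 4 * per_copy_gb,
    }


fp32 = training_memory_gb(N_PARAMS, bytes_per_value=4)
half = training_memory_gb(N_PARAMS, bytes_per_value=2)

for name, gb in fp32.items():
    print(f"fp32 {name:12s} {gb:8.0f} GB")
```

Changing `bytes_per_value` shows how sensitive the estimate is to precision: with 2-byte values the two Adam moments together come to 700 GB, while in fp32 they come to 1,400 GB.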

Parameter-Efficient Fine-Tuning (PEFT) methods solve this by freezing the base model and training only a small set of parameters, usually newly added ones. LoRA, Adapter layers, and Prefix Tuning are the three most cited approaches. They all claim "competitive performance with <1% trainable parameters," but they achieve it in completely different ways.
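To make "a small set of parameters" concrete, here is a minimal sketch that counts trainable parameters for each method using the standard textbook formulas. The model size, rank, bottleneck width, and prefix length are hypothetical values chosen for illustration, not settings from any of the papers. The prefix count ignores the reparameterization MLP that Prefix Tuning uses only during training.

```python
# Trainable-parameter counts for three PEFT methods (simplified formulas).
# All configuration values below are hypothetical, chosen for illustration.


def lora_params(d_in, d_out, rank, n_matrices):
    # Each adapted weight matrix gets A (rank x d_in) and B (d_out x rank).
    return n_matrices * rank * (d_in + d_out)


def adapter_params(d_model, bottleneck, n_adapters):
    # Down-projection and up-projection, each with a bias vector.
    return n_adapters * (2 * d_model * bottleneck + d_model + bottleneck)


def prefix_params(d_model, prefix_len, n_layers):
    # One key vector and one value vector per prefix position per layer.
    return n_layers * prefix_len * 2 * d_model


D_MODEL, N_LAYERS = 1024, 24  # hypothetical mid-sized transformer

counts = {
    # rank 8 on two attention matrices per layer
    "lora": lora_params(D_MODEL, D_MODEL, rank=8, n_matrices=2 * N_LAYERS),
    # bottleneck 64, two adapters per layer (after attention and after the MLP)
    "adapter": adapter_params(D_MODEL, bottleneck=64, n_adapters=2 * N_LAYERS),
    # 20 prefix positions at every layer
    "prefix": prefix_params(D_MODEL, prefix_len=20, n_layers=N_LAYERS),
}

for method, n in counts.items():
    print(f"{method:8s} {n:>10,d} trainable parameters")
```

The counts depend entirely on the chosen hyperparameters, so treat the comparison as a way to see which dimensions each method scales with rather than as a ranking.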

This post compares the three methods mechanically: where the new parameters live, what the forward pass looks like, and which one actually saves you money on your next fine-tuning job. The original papers are LoRA (Hu et al., 2021), Adapters (Houlsby et al., 2019), and Prefix Tuning (Li and Liang, 2021).