Alyonkka for Qdrant

How many layers to fine-tune?

Model fine-tuning allows you to improve the quality of a pre-trained model with just a fraction of the resources spent on training the original model.

But there is a trade-off between the number of layers you tune and the precision you get.

Tuning fewer layers allows for faster training with larger batch sizes, while tuning more layers increases the model's capacity.

The Qdrant team ran experiments so you can make a more educated choice.


Here are some highlights:

  • Training only the head of a model (5% of the weights) gives a 2x boost on metrics, while full training gives only 3x.

  • Training only the head layer allows using larger models with bigger batch sizes, compensating for the lost precision.

  • If you only have a small dataset, tuning the full model adds little over tuning the head alone.
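The head-only setup above can be sketched in a few lines of PyTorch: freeze every parameter, then re-enable gradients for the head so the optimizer updates only that small fraction of the weights. This is a minimal illustration with a hypothetical toy model, not the architecture from the article.

```python
import torch
import torch.nn as nn

class EncoderWithHead(nn.Module):
    """Hypothetical model: a frozen pre-trained body plus a small task head."""

    def __init__(self, dim: int = 128, n_classes: int = 10):
        super().__init__()
        # Stands in for a large pre-trained encoder.
        self.encoder = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, dim), nn.ReLU(),
        )
        # The small task-specific layer we actually fine-tune.
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

model = EncoderWithHead()

# Freeze everything, then re-enable gradients for the head only.
for p in model.parameters():
    p.requires_grad = False
for p in model.head.parameters():
    p.requires_grad = True

# The optimizer only receives the trainable (head) parameters.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")
```

Because the frozen encoder needs no gradient buffers, the memory saved can go toward a bigger batch or a larger base model.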

Read more in the article by Yusuf Sarıgöz.
