Julien Simon

Posted on • Originally published at julsimon.Medium on

Deep dive: model merging, part 2

Model merging is an increasingly popular technique for adding capabilities to transformer models, or removing them, without additional training. In a previous video, we introduced model merging and studied several merging algorithms implemented in the mergekit library (https://github.com/arcee-ai): model soups, SLERP, Task Arithmetic, TIES, DARE, and Franken-merging.
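To make "without additional training" concrete, here is a minimal sketch of the two simplest methods from that list, model soups (uniform weight averaging) and Task Arithmetic (adding scaled task vectors to a base model). This is a toy illustration rather than mergekit code: the `Weights` type, the function names, and the tiny two-parameter "models" are all assumptions made for the example, with plain Python lists standing in for real tensors.

```python
from typing import Dict, List

# Hypothetical toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def model_soup(models: List[Weights]) -> Weights:
    """Uniformly average each parameter across models that share the same architecture."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def task_arithmetic(base: Weights, finetuned: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of task vectors (fine-tuned minus base) to the base weights."""
    return {
        name: [
            b + scale * sum(ft[name][i] - b for ft in finetuned)
            for i, b in enumerate(base_values)
        ]
        for name, base_values in base.items()
    }


# Toy weights, chosen only for illustration.
base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [2.0, 2.0]}
model_b = {"layer.weight": [1.0, 4.0]}

soup = model_soup([model_a, model_b])
merged = task_arithmetic(base, [model_a, model_b], scale=1.0)
# A negative scale subtracts a task vector, which is how a capability can be removed.
removed = task_arithmetic(base, [model_a], scale=-1.0)
```

No gradient step appears anywhere: merging is arithmetic on existing weights, which is why it needs no training data or GPU time beyond loading the models.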

This new video builds on the previous one and explores three more merging methods: model breadcrumbs, model stock, and DELLA. We also take a quick look at model merging in Arcee Cloud, which you can try at no cost on the free tier!
