DEV Community

Julien Simon
Julien Simon

Posted on • Originally published at julsimon.Medium on

Deep dive: model merging, part2

Model merging is an increasingly popular technique for adding or removing capabilities from transformer models without additional training. In a previous video, we introduced model merging and studied several merging algorithms implemented in the mergekit library (https://github.com/arcee-ai): model soups, SLERP, Task Arithmetic, TIES, DARE, and Franken-merging.

This new video builds upon the previous one and explores new merging methods: model breadcrumbs, model stock, and DELLA. We also quickly look at model merging in Arcee Cloud, which you can run for free as part of the free tier!

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

Top comments (0)

Heroku

Simplify your DevOps and maximize your time.

Since 2007, Heroku has been the go-to platform for developers as it monitors uptime, performance, and infrastructure concerns, allowing you to focus on writing code.

Learn More

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay