
Note: some of the audio in the second half is a little wonky, but the general voice was upgraded, so hopefully it's a little less "poppy" until then!
I'm trying to fix little pronunciation problems on a weekly basis. Thanks to my early fans! It'll keep improving. E.g. some of the months were wonky.
When what seems like pure LLM black magic is actually supported by the literature.
This is AI-generated audio, made with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/model-merging
00:00 Model merging lessons in The Waifu Research Department
02:21 How and why does model merging work?
07:13 Aside: merging vs. ensembles vs. mixture of experts
08:21 Why are people doing this?
11:22 Tools & Links
11:51 Brief (visual) literature review
12:07 Full model merging and recent methods
15:55 Weight averaging during pretraining
17:18 LoRA merging
17:53 More background
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png
By Nathan Lambert
