
Sign up to save your podcasts
Or
Note: some of the audio in the second half is a little wonky, but the general voice was upgraded so hopefully it's a little less "poppy" until then!
I'm trying to fix little pronunciation problems on a weekly basis. Thanks to my early fans! It'll keep improving. E.g. some of the months were wonky.
When what seems like pure LLM black magic is actually supported by the literature.
This is AI generated audio with Python and 11Labs
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/model-merging
00:00 Model merging lessons in The Waifu Research Department
02:21 How and why does model merging work?
07:13 Aside: merging vs. ensembles vs. mixture of experts
08:21 Why are people doing this?
11:22 Tools & Links
11:51 Brief (visual) literature review
12:07 Full model merging and recent methods
15:55 Weight averaging during pretraining
17:18 LoRA merging
17:53 More background
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png
4.1
99 ratings
Note: some of the audio in the second half is a little wonky, but the general voice was upgraded so hopefully it's a little less "poppy" until then!
I'm trying to fix little pronunciation problems on a weekly basis. Thanks to my early fans! It'll keep improving. E.g. some of the months were wonky.
When what seems like pure LLM black magic is actually supported by the literature.
This is AI generated audio with Python and 11Labs
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/model-merging
00:00 Model merging lessons in The Waifu Research Department
02:21 How and why does model merging work?
07:13 Aside: merging vs. ensembles vs. mixture of experts
08:21 Why are people doing this?
11:22 Tools & Links
11:51 Brief (visual) literature review
12:07 Full model merging and recent methods
15:55 Weight averaging during pretraining
17:18 LoRA merging
17:53 More background
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png
1,002 Listeners
509 Listeners
270 Listeners
193 Listeners
201 Listeners
281 Listeners
88 Listeners
347 Listeners
124 Listeners
190 Listeners
61 Listeners
137 Listeners
445 Listeners
29 Listeners
31 Listeners