


This study explores layer pruning in pretrained LLMs, finding only a minimal performance drop until roughly half of the layers are removed. The optimal block of layers to prune is identified from the similarity of representations across layers, and the pruned model is then lightly finetuned with PEFT methods, yielding a smaller memory footprint and faster inference.
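A minimal sketch of the similarity-based pruning idea summarized above, assuming per-layer hidden states are already available (e.g. from a Hugging Face model called with output_hidden_states=True). The function names and the toy data are illustrative, not taken from the paper's code; the paper's own criterion (angular distance between layer inputs) is approximated here on dummy arrays.

```python
import numpy as np

def angular_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean angular distance between matching rows of two [tokens, dim] arrays."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    cos = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)
    return float(np.mean(np.arccos(cos) / np.pi))

def best_block_to_prune(hidden_states: list, n: int) -> int:
    """Return the start index of the n-layer block whose input and output
    representations are most similar, i.e. the block that changes the residual
    stream the least and is therefore the best candidate to drop."""
    distances = [
        angular_distance(hidden_states[ell], hidden_states[ell + n])
        for ell in range(len(hidden_states) - n)
    ]
    return int(np.argmin(distances))

if __name__ == "__main__":
    # Toy stand-in: hidden states of a hypothetical 12-layer model, 8 tokens, dim 16.
    rng = np.random.default_rng(0)
    states = [rng.normal(size=(8, 16)) for _ in range(13)]  # embeddings + 12 layers
    start = best_block_to_prune(states, n=4)
    print(f"Prune layers {start}..{start + 3}, then heal with a PEFT method (e.g. LoRA).")
```

After dropping the chosen block, the remaining layers are stitched together and a short round of parameter-efficient finetuning recovers most of the lost quality.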
https://arxiv.org/abs/2403.17887
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
By Igor Melnyk
