

This study explores layer pruning in pretrained LLMs, finding only a minimal drop in downstream performance until up to roughly half of the layers are removed. The optimal block of consecutive layers to prune is identified, and the pruned model is then finetuned with parameter-efficient finetuning (PEFT) methods, reducing memory use and speeding up inference.
https://arxiv.org/abs/2403.17887
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
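The summary above describes the pruning recipe only at a high level, so here is a minimal illustrative sketch of the layer-selection step, assuming the block to drop is the one that changes the hidden representation the least (scored here by angular distance between the block's input and output activations). The function names, the specific metric, and the hidden_states layout (one tensor per layer boundary, as returned with output_hidden_states=True in Hugging Face transformers) are assumptions for illustration, not details taken from the episode.

```python
# Illustrative sketch (not the authors' code): score every block of `n`
# consecutive transformer layers by how little it rotates the hidden
# states, then report the most redundant block as the pruning candidate.
# `hidden_states` is assumed to be the tuple returned by a Hugging Face
# model called with output_hidden_states=True, i.e. one tensor of shape
# (batch, seq_len, d_model) per layer boundary.
import torch
import torch.nn.functional as F

def block_angular_distance(hidden_states, start: int, n: int) -> float:
    """Mean angular distance (normalized to [0, 1]) between activations
    entering layer `start` and those leaving layer `start + n - 1`.
    Smaller values mean the block barely changes the representation."""
    h_in, h_out = hidden_states[start], hidden_states[start + n]
    cos = F.cosine_similarity(h_in, h_out, dim=-1)
    return (torch.arccos(cos.clamp(-1.0, 1.0)) / torch.pi).mean().item()

def most_prunable_block(hidden_states, n: int) -> int:
    """Start index of the n-layer block with the smallest distance score."""
    num_layers = len(hidden_states) - 1  # L+1 boundaries for L layers
    scores = [block_angular_distance(hidden_states, s, n)
              for s in range(num_layers - n + 1)]
    return min(range(len(scores)), key=scores.__getitem__)
```

After the selected layers are removed, the episode's summary notes the model is finetuned with PEFT methods (for example LoRA-style adapters) to recover most of the lost quality before deployment.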
By Igor Melnyk
33 ratings
