

This study explores layer pruning in pretrained LLMs, finding minimal performance degradation until up to roughly half of the layers are removed. The most redundant contiguous block of layers is identified for pruning based on the similarity of its input and output representations, and the pruned model is then finetuned with PEFT methods, yielding a smaller memory footprint and faster inference.
https://arxiv.org/abs/2403.17887
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
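
For listeners who want a concrete picture of the pruning heuristic summarized above, here is a minimal PyTorch sketch. It assumes a LLaMA-style module layout (model.model.layers, config.num_hidden_layers) and that residual-stream activations have already been collected on a small calibration set; names such as find_block_to_prune are illustrative, not taken from the paper's code. The block whose input and output representations are most similar (smallest angular distance) is dropped, after which the pruned model would be healed with a PEFT method such as QLoRA.

```python
import torch
import torch.nn.functional as F

def angular_distance(x, y, eps=1e-6):
    """Mean angular distance (in units of pi) between two activation
    tensors of shape (num_tokens, d_model)."""
    cos = F.cosine_similarity(x, y, dim=-1).clamp(-1 + eps, 1 - eps)
    return torch.arccos(cos).mean() / torch.pi

def find_block_to_prune(hidden_states, n):
    """hidden_states: list of length num_layers + 1 holding the residual-stream
    activations entering each layer (plus the output of the last layer).
    Returns the start index of the n-layer block whose input and output are
    most similar, i.e. the block that appears most redundant."""
    num_layers = len(hidden_states) - 1
    best_start, best_dist = 0, float("inf")
    for start in range(num_layers - n + 1):
        d = angular_distance(hidden_states[start], hidden_states[start + n]).item()
        if d < best_dist:
            best_start, best_dist = start, d
    return best_start, best_dist

def drop_block(model, start, n):
    """Remove the chosen contiguous block of decoder layers.
    Assumes a LLaMA-style layout; adjust the attribute path for other models."""
    kept = torch.nn.ModuleList(
        layer for i, layer in enumerate(model.model.layers)
        if not (start <= i < start + n)
    )
    model.model.layers = kept
    model.config.num_hidden_layers = len(kept)
    return model
```

This is a sketch of the selection-and-removal step only; the paper's full recipe also includes the post-pruning PEFT finetuning ("healing") pass.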
By Igor Melnyk
