AI: post transformers

Movement Pruning: Adaptive Sparsity by Fine-Tuning



This academic paper introduces movement pruning, a novel method for shrinking large pre-trained language models such as BERT during fine-tuning. Unlike traditional magnitude pruning, which removes the weights with the smallest absolute values, movement pruning retains the weights that are moving away from zero as fine-tuning proceeds, and it performs markedly better at high sparsity. The authors provide mathematical foundations for their approach, compare it empirically against existing zeroth- and first-order pruning techniques, and show that it is especially effective when combined with distillation. The work highlights the potential for resource reduction, enabling complex models to run on less powerful hardware and broadening access to natural language processing.
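To make the idea concrete, here is a minimal PyTorch sketch, not the paper's implementation: the helper names (update_movement_scores, topk_mask) and the toy loss are invented for illustration, and the actual method learns the scores jointly through a straight-through estimator inside the forward pass. What the sketch does reflect is the score gradient the paper derives, dL/dS = (dL/dW) * W, so descending on S rewards weights whose value and gradient have opposite signs, i.e. weights moving away from zero.

```python
import torch

def update_movement_scores(scores, weight, weight_grad, lr=1e-2):
    # One descent step on the importance scores S. Under the
    # straight-through estimator, dL/dS_ij = (dL/dW_ij) * W_ij,
    # so scores grow for weights moving away from zero.
    return scores - lr * weight_grad * weight

def topk_mask(scores, sparsity):
    # Keep the (1 - sparsity) fraction of weights with the highest scores.
    k = max(1, int(round(scores.numel() * (1.0 - sparsity))))
    threshold = torch.topk(scores.flatten(), k).values.min()
    return (scores >= threshold).to(scores.dtype)

# Toy usage: prune a single linear layer to 90% sparsity.
torch.manual_seed(0)
weight = torch.randn(64, 64, requires_grad=True)
scores = torch.zeros_like(weight)

for _ in range(100):                        # stand-in for fine-tuning steps
    x = torch.randn(8, 64)
    loss = (x @ weight.t()).pow(2).mean()   # dummy task loss
    loss.backward()
    with torch.no_grad():
        scores = update_movement_scores(scores, weight, weight.grad)
        weight -= 1e-3 * weight.grad        # ordinary fine-tuning update
        weight.grad = None

mask = topk_mask(scores, sparsity=0.9)
pruned = weight.detach() * mask
print(f"kept {int(mask.sum())} of {mask.numel()} weights")
```

Swapping the score update for `scores = weight.abs()` recovers magnitude pruning, which makes the contrast the episode draws easy to see: magnitude pruning ranks weights by where they are, movement pruning by where they are going.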


Source: Sanh, Wolf & Rush, "Movement Pruning: Adaptive Sparsity by Fine-Tuning", NeurIPS 2020

https://papers.neurips.cc/paper_files/paper/2020/file/eae15aabaa768ae4a5993a8a4f4fa6e4-Paper.pdf

