
The researchers introduce DroPE, a novel method for extending the context length of large language models by removing positional embeddings after pretraining. While explicit positional encodings such as RoPE are essential for fast training convergence, they create a bottleneck that prevents models from processing sequences longer than those seen during training. The authors demonstrate that these embeddings act as a temporary scaffold: they can be discarded and replaced with a brief recalibration phase at the original context length. This approach lets models achieve zero-shot context extension far beyond their initial training limits, without the performance degradation typical of traditional scaling methods. Empirically, DroPE maintains high accuracy on long-range retrieval tasks across model sizes, outperforming specialized architectures and complex frequency-scaling techniques. Ultimately, the work suggests that the inductive bias of explicit positions is only necessary during early learning and can be removed to unlock robust, scalable inference.
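To make the idea concrete, here is a minimal sketch of what "removing positional embeddings" means at the attention level. This is an illustrative toy implementation, not the paper's code: `rope_rotate` and `attention_scores` are hypothetical helper names, and the setup assumes a single attention head with standard RoPE applied to queries and keys. Dropping RoPE simply means computing attention scores from content alone (the NoPE setting that DroPE recalibrates into):

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Apply rotary position embedding (RoPE) to a vector x at position pos.
    Pairs of dimensions are rotated by position-dependent angles."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-np.arange(half) * 2.0 / d)  # per-pair rotation frequencies
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    # 2D rotation applied independently to each (x1_i, x2_i) pair
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def attention_scores(q, k, positions, use_rope=True):
    """Scaled dot-product attention scores for one head.
    With use_rope=False the scores are position-agnostic: this is the
    setting a model lands in once positional embeddings are dropped."""
    d = q.shape[-1]
    if use_rope:
        q = np.stack([rope_rotate(q[i], p) for i, p in enumerate(positions)])
        k = np.stack([rope_rotate(k[i], p) for i, p in enumerate(positions)])
    return q @ k.T / np.sqrt(d)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))  # 4 tokens, head dimension 8
k = rng.normal(size=(4, 8))

with_pos = attention_scores(q, k, positions=np.arange(4), use_rope=True)
without_pos = attention_scores(q, k, positions=np.arange(4), use_rope=False)
# Without RoPE the scores depend only on token content, so there is no
# position-dependent pattern to extrapolate (or break) at longer lengths.
```

The sketch highlights why position-free attention extrapolates trivially: the score between two tokens no longer depends on how far apart they are, so nothing changes when the sequence grows past the training length. What the summary calls "recalibration" is then a short continued-training phase that lets the model adapt to this position-free regime.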
By Enoch H. Kang