


This paper introduces In-Place Test-Time Training (In-Place TTT), a framework that lets Large Language Models (LLMs) update their knowledge dynamically during inference. Traditional models remain static after deployment; this approach instead repurposes existing MLP blocks as "fast weights" that adapt to new information in real time. A chunk-wise update mechanism and a learning objective aligned with Next-Token Prediction keep the method computationally efficient on modern hardware. Experiments show that this "drop-in" enhancement significantly improves performance on long-context tasks of up to 128k tokens without expensive retraining from scratch. Ultimately, the research offers a scalable path toward continual learning, allowing models to internalize evolving contextual data more effectively than standard attention mechanisms.
By Enoch H. Kang
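
To make the chunk-wise "fast weights" idea from the summary more concrete, here is a toy sketch in Python with NumPy. It is not the paper's implementation. Everything specific in it is an assumption made for illustration: a single linear map `W` stands in for an MLP block, the loss is a plain squared error against a "next-step" target vector, and the function name `chunkwise_fast_weight_update`, the chunk size, and the learning rate are all hypothetical. The one point it tries to show is the ordering: each chunk is first processed with the current weights, and only afterwards do the weights take one gradient step on that chunk.

```python
import numpy as np


def chunkwise_fast_weight_update(inputs, targets, W, chunk_size=8, lr=0.1):
    """Toy chunk-wise test-time update of a linear 'fast weight' matrix.

    inputs, targets: arrays of shape (seq_len, dim)
    W: array of shape (dim, dim), the initial fast weights (left unmodified)
    Returns (outputs, W_final), where outputs has shape (seq_len, dim).
    """
    W = W.copy()  # do not mutate the caller's weights
    outputs = np.empty_like(targets, dtype=float)
    for start in range(0, len(inputs), chunk_size):
        x = inputs[start:start + chunk_size]   # (c, dim)
        y = targets[start:start + chunk_size]  # (c, dim)

        # 1) Predict the whole chunk with the weights as they are now.
        pred = x @ W.T
        outputs[start:start + chunk_size] = pred

        # 2) One gradient step on the chunk's mean squared error:
        #    d/dW of 0.5 * mean_t ||W x_t - y_t||^2 = mean_t (W x_t - y_t) x_t^T
        grad = (pred - y).T @ x / len(x)
        W -= lr * grad
    return outputs, W


# Illustrative usage on synthetic data (assumed setup, not from the paper):
# the targets come from a fixed hidden linear map that the fast weights
# gradually absorb as more chunks are seen.
rng = np.random.default_rng(0)
hidden_map = rng.normal(size=(8, 8))
xs = rng.normal(size=(256, 8))
ys = xs @ hidden_map.T

outs, W_final = chunkwise_fast_weight_update(xs, ys, np.zeros((8, 8)))
print("mean squared error, first chunk:", np.mean((outs[:8] - ys[:8]) ** 2))
print("mean squared error, last chunk: ", np.mean((outs[-8:] - ys[-8:]) ** 2))
```

Updating once per chunk rather than once per token keeps each step a pair of dense matrix products, which is the kind of work that maps well onto modern accelerators; how the actual method sets up its objective and update rule is described in the paper itself.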