
The researchers introduce Engram, a novel conditional memory module that enhances Large Language Models by integrating a scalable lookup mechanism for static knowledge. While modern models rely on Mixture-of-Experts (MoE) for sparse computation, Engram uses N-gram embeddings to retrieve formulaic or factual information in constant time. This architectural shift creates a U-shaped scaling law that balances neural processing with static memory, allowing the model to offload simple retrieval tasks to early layers. By delegating local patterns to these lookups, the transformer's attention capacity is preserved for complex reasoning and long-context processing. Experiments show that an Engram-augmented 27B model significantly outperforms standard MoE baselines in math, coding, and general reasoning. Furthermore, the system supports offloading massive parameter tables to host memory, ensuring high efficiency with minimal computational overhead.
By Enoch H. Kang
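The constant-time N-gram lookup described above can be illustrated with a minimal sketch: token N-grams are hashed into a fixed-size embedding table, so retrieving static or formulaic knowledge costs O(1) per position regardless of table size. This is a hypothetical illustration, not the paper's implementation; the class name `EngramSketch`, the table size, and the hashing scheme are all assumptions made for the example.

```python
import numpy as np

class EngramSketch:
    """Hypothetical sketch of an N-gram lookup memory (not the paper's code).

    Trailing token N-grams are hashed into a fixed embedding table, giving
    constant-time retrieval of static knowledge that could then be mixed
    into the hidden states of early transformer layers.
    """

    def __init__(self, table_size=2**16, dim=64, n=2, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed-size parameter table; in the paper's setting such tables can
        # be large and offloaded to host memory, since access is sparse.
        self.table = rng.standard_normal((table_size, dim)).astype(np.float32)
        self.table_size = table_size
        self.n = n

    def _bucket(self, ngram):
        # Hash the N-gram of token ids into a table index: O(1) per lookup.
        return hash(ngram) % self.table_size

    def lookup(self, token_ids):
        # For each position, retrieve the embedding of its trailing N-gram.
        out = np.zeros((len(token_ids), self.table.shape[1]), dtype=np.float32)
        for i in range(len(token_ids)):
            ngram = tuple(token_ids[max(0, i - self.n + 1): i + 1])
            out[i] = self.table[self._bucket(ngram)]
        return out

mem = EngramSketch()
emb = mem.lookup([5, 17, 5, 17])
print(emb.shape)  # one retrieved vector per token position
```

Because identical N-grams always hash to the same bucket, repeated local patterns map to the same stored vector, which is the sense in which the memory handles formulaic content and frees attention layers for longer-range reasoning.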