Researchers from Meituan's LongCat Team introduced LongCat-Flash-Lite in January 2026, demonstrating that scaling embeddings via N-gram layers outperforms increasing Mixture-of-Experts parameters in high-sparsity regimes. The architecture also employs system-level optimizations and speculative decoding to boost inference speed.

Source: "Scaling Embeddings Outperforms Scaling Experts in Language Models," Meituan LongCat Team (Hong Liu, Jiaqi Zhang, Chao Wang, Xing Hu, Linkun Lyu, Jiaqi Sun, Xurui Yang, Bo Wang, Fengcun Li, Yulei Qian, Lingtong Si, Yerui Sun, Rumei Li, Peng Pei, Yuchen Xie, Xunliang Cai), January 2026. https://arxiv.org/pdf/2601.21204
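To make the "scaling embeddings via N-gram layers" idea concrete, here is a minimal sketch of one common way such a layer can work: each token's embedding is augmented with an embedding for the n-gram ending at that position, looked up by hashing the n-gram into a fixed-size table. All names, table sizes, and the hashing scheme below are illustrative assumptions, not the paper's actual implementation.

```python
import hashlib

VOCAB = 100   # toy vocabulary size (assumption)
DIM = 4       # toy embedding dimension (assumption)
TABLE = 16    # hashed n-gram table size (assumption)

def ngram_bucket(ngram, table_size=TABLE):
    """Hash a tuple of token ids to a bucket in the n-gram embedding table."""
    key = ",".join(map(str, ngram)).encode()
    return int.from_bytes(hashlib.md5(key).digest()[:8], "big") % table_size

def embed(tokens, tok_emb, ngram_emb, n=2):
    """Return per-position vectors: token embedding plus hashed n-gram embedding.

    Positions earlier than n-1 have no full n-gram, so they keep the plain
    token embedding. Growing TABLE scales embedding parameters without adding
    any per-token compute beyond one extra lookup and add.
    """
    out = []
    for i, t in enumerate(tokens):
        vec = list(tok_emb[t])
        if i >= n - 1:  # a full n-gram ends at position i
            b = ngram_bucket(tuple(tokens[i - n + 1 : i + 1]))
            vec = [a + c for a, c in zip(vec, ngram_emb[b])]
        out.append(vec)
    return out

# Toy deterministic embedding tables (in practice these are learned).
tok_emb = [[(t + d) % 7 * 0.1 for d in range(DIM)] for t in range(VOCAB)]
ngram_emb = [[(b * d) % 5 * 0.01 for d in range(DIM)] for b in range(TABLE)]

vecs = embed([3, 7, 7, 2], tok_emb, ngram_emb, n=2)
print(len(vecs), len(vecs[0]))  # one DIM-dim vector per input position
```

The design point this sketch illustrates is why embedding scaling suits high-sparsity regimes: the n-gram table adds parameters that are fetched by cheap lookups, whereas adding experts to an MoE increases routed compute and communication.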