This technical report, dated October 23, 2025, from the Ling Team introduces the Ring-linear model series, specifically Ring-mini-linear-2.0 and Ring-flash-linear-2.0, which use a hybrid attention architecture combining linear and softmax attention to make long-context reasoning more efficient. The paper explains how this architecture, which also incorporates a Mixture-of-Experts (MoE) design and FP8 training optimized through custom kernels such as LingHe, significantly reduces inference cost and improves training throughput. A major focus is systematic training-inference alignment for stable reinforcement learning (RL): the authors address numerical disparities in components such as the KV cache and RMSNorm that often cause RL collapse in long-context models. Finally, the report presents benchmark results showing that the Ring-linear models maintain state-of-the-art performance on complex reasoning tasks relative to models of similar scale.

Source: https://arxiv.org/pdf/2510.19338
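To make the hybrid design concrete, here is a minimal PyTorch sketch of a layer stack that interleaves O(n) kernelized linear attention with periodic full softmax attention. This is a generic illustration of the hybrid-attention idea, not the Ring-linear architecture itself: the layer ratio, dimensions, the elu-based feature map, and all names (`HybridBlock`, `build_hybrid_stack`, etc.) are assumptions, and the MoE feed-forward layers and LingHe FP8 kernels are omitted.

```python
# Illustrative sketch of a hybrid attention stack: most layers use
# linear (kernelized) attention, with periodic softmax-attention layers.
# All sizes, ratios, and names are assumptions, not the Ring-linear config.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """O(n) attention via the kernel trick: softmax(QK^T)V is approximated
    by phi(Q)(phi(K)^T V) with phi(x) = elu(x) + 1. Non-causal for brevity;
    a causal version uses cumulative sums (see the recurrent form below)."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.heads, self.dh = heads, dim // heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (batch, heads, seq, head_dim)
        q, k, v = (t.view(b, n, self.heads, self.dh).transpose(1, 2)
                   for t in (q, k, v))
        q, k = F.elu(q) + 1, F.elu(k) + 1            # positive feature map
        kv = torch.einsum("bhnd,bhne->bhde", k, v)   # sum_n phi(k_n) v_n^T
        z = 1 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
        return self.out(out.transpose(1, 2).reshape(b, n, d))

class SoftmaxAttention(nn.Module):
    """Standard O(n^2) multi-head softmax attention."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        return self.attn(x, x, x, need_weights=False)[0]

class HybridBlock(nn.Module):
    def __init__(self, dim: int, use_softmax: bool):
        super().__init__()
        self.norm = nn.RMSNorm(dim)  # requires PyTorch >= 2.4
        self.attn = SoftmaxAttention(dim) if use_softmax else LinearAttention(dim)

    def forward(self, x):
        return x + self.attn(self.norm(x))  # pre-norm residual

def build_hybrid_stack(dim=512, depth=8, softmax_every=4):
    # e.g. with depth=8, layers 4 and 8 use softmax attention, the rest linear
    return nn.Sequential(*[
        HybridBlock(dim, use_softmax=((i + 1) % softmax_every == 0))
        for i in range(depth)
    ])

if __name__ == "__main__":
    model = build_hybrid_stack()
    x = torch.randn(2, 128, 512)
    print(model(x).shape)  # torch.Size([2, 128, 512])
```

The design intuition is that the cheap linear layers carry most of the depth while the few softmax layers retain exact all-pairs retrieval, which is why the hybrid keeps quality at a fraction of the long-context cost.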
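The training-inference alignment theme can also be illustrated. A linear-attention layer has a parallel (training-style) form and a recurrent (decoding-style) form whose outputs should agree, with the recurrent state playing the role that the KV cache plays for softmax attention; mismatches between such paired implementations are the kind of disparity the report's alignment work targets. The sketch below, with an assumed elu feature map and hypothetical function names, checks that the two forms match in fp64; low-precision kernels are precisely where such forms can drift apart.

```python
# Illustrative check that the parallel (training) and recurrent (decoding)
# forms of linear attention agree. Feature map and names are assumptions.
import torch
import torch.nn.functional as F

def phi(x):  # positive feature map, phi(x) = elu(x) + 1
    return F.elu(x) + 1

def parallel_causal(q, q_len_dim_note=None, *, k, v):
    """Training-style form: causal mask over the full sequence.
    Written O(n^2) for clarity; real kernels use chunked O(n) algorithms."""
    qf, kf = phi(q), phi(k)
    scores = (qf @ kf.transpose(-2, -1)).tril()       # causal mask
    num = scores @ v
    den = scores.sum(dim=-1, keepdim=True) + 1e-6
    return num / den

def recurrent_decode(q, k, v):
    """Inference-style form: a running state replaces the KV cache."""
    n, d = q.shape
    S = torch.zeros(d, v.shape[-1], dtype=q.dtype)    # sum of phi(k) v^T
    z = torch.zeros(d, dtype=q.dtype)                 # sum of phi(k)
    out = []
    for t in range(n):
        kf, qf = phi(k[t]), phi(q[t])
        S = S + kf[:, None] * v[t][None, :]
        z = z + kf
        out.append((qf @ S) / (qf @ z + 1e-6))
    return torch.stack(out)

q, k, v = (torch.randn(16, 32, dtype=torch.float64) for _ in range(3))
a = parallel_causal(q, k=k, v=v)
b = recurrent_decode(q, k, v)
print(torch.allclose(a, b, atol=1e-10))  # True in fp64; FP8/bf16 can drift
```

Under this reading, the report's contribution is to audit and close exactly these gaps (KV-cache handling, RMSNorm, and similar components) so that the policy the RL loop optimizes is numerically the same model that generates its rollouts.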