The February 12, 2025 paper from KuaiShou Inc introduces **ELASTIC** (Efficient Linear Attention for SequenTial Interest Compression), a framework designed to address the **scalability issues** of traditional transformer-based sequential recommender systems, which suffer from quadratic complexity in sequence length. ELASTIC proposes a **Linear Dispatcher Attention (LDA) layer** that compresses long user behavior sequences into a compact representation, yielding **linear time complexity**, significantly lower GPU memory usage, and faster inference. The framework also incorporates an **Interest Memory Retrieval (IMR) technique** that uses a large, sparsely retrieved interest memory bank to expand the model's capacity and **maintain recommendation accuracy** despite the computational optimizations. Experiments on datasets such as ML-1M and XLong show that ELASTIC **outperforms baseline methods** while offering superior computational efficiency, especially when modeling long user sequences.
Source:
https://arxiv.org/pdf/2408.09380
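
For a concrete picture of the two mechanisms summarized above, here is a minimal PyTorch sketch: a dispatcher-style attention that routes a length-L sequence through a small set of k learnable slots (so the cost scales as O(L·k) rather than O(L²)), plus a large interest memory bank activated by sparse top-k retrieval. The class names, slot count, bank size, and top-k values are illustrative assumptions, not the paper's exact architecture; see the arXiv link for the authors' actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearDispatcherAttention(nn.Module):
    """Sketch of dispatcher-style linear attention (illustrative, not the
    paper's exact LDA layer): k learnable slots cross-attend to the length-L
    sequence (O(L*k)), then the sequence attends back to the k compressed
    slots (again O(L*k)), so the total cost is linear in L."""
    def __init__(self, d_model: int, num_slots: int = 32):
        super().__init__()
        # d_model must be divisible by num_heads.
        self.slots = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.compress = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.dispatch = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, L, d_model)
        b = seq.size(0)
        slots = self.slots.unsqueeze(0).expand(b, -1, -1)    # (batch, k, d)
        # Compress: slots attend over the full behavior sequence.
        compressed, _ = self.compress(slots, seq, seq)        # (batch, k, d)
        # Dispatch: each position attends over the k compressed slots.
        out, _ = self.dispatch(seq, compressed, compressed)   # (batch, L, d)
        return out

class InterestMemoryRetrieval(nn.Module):
    """Sketch of sparse interest memory retrieval (illustrative): keep a
    large bank of interest embeddings but activate only top-k per user."""
    def __init__(self, d_model: int, bank_size: int = 4096, top_k: int = 8):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(bank_size, d_model) * 0.02)
        self.top_k = top_k

    def forward(self, user_repr: torch.Tensor) -> torch.Tensor:
        # user_repr: (batch, d_model); score every memory slot, keep top-k.
        scores = user_repr @ self.bank.t()                    # (batch, bank_size)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)   # sparse retrieval
        weights = F.softmax(top_vals, dim=-1)                 # (batch, top_k)
        retrieved = self.bank[top_idx]                        # (batch, top_k, d)
        return (weights.unsqueeze(-1) * retrieved).sum(dim=1)  # (batch, d)
```

Because only the k slots mediate all pairwise interactions and only top-k memory entries are touched per user, both memory and compute grow linearly with sequence length, which matches the efficiency claims in the summary.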
By mcgrof