

Linear Transformers address the computational limitations of standard Transformer models, whose self-attention has quadratic complexity, O(n^2), in the input sequence length. Linear Transformers aim for linear complexity, O(n), making them suitable for much longer sequences. They achieve this through methods such as low-rank approximations, local attention, or kernelized attention; examples include Linformer (low-rank projections of keys and values), Longformer (sliding-window attention), and Performer (kernelized attention). Efficient attention, one form of linear attention, interprets keys as template attention maps and aggregates the values into a small set of global context vectors, rather than synthesizing a pixel-wise attention map for every position as dot-product attention does. This allows more efficient use of compute and memory in domains with large inputs or tight resource constraints.
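
As a rough illustration (not taken from the episode), here is a minimal NumPy sketch of the efficient-attention idea described above: keys are normalized with a softmax over the sequence dimension so each key column acts as a template attention map, the values are aggregated into a small global context matrix, and queries then read from that context. This avoids ever forming an n x n attention matrix, so the cost grows as O(n) in sequence length rather than O(n^2). Function names and shapes here are illustrative assumptions, not any library's API.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(Q, K, V):
    """Standard attention: builds an (n, n) attention map, so cost is O(n^2)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n, n)
    return softmax(scores, axis=-1) @ V       # (n, d_v)

def efficient_attention(Q, K, V):
    """Efficient (linear) attention sketch: normalize queries over the feature
    dimension and keys over the sequence dimension, then aggregate values into
    a (d_k, d_v) global context before the queries attend to it."""
    q = softmax(Q, axis=-1)   # each query normalized over features
    k = softmax(K, axis=0)    # each key column acts as a template attention map
    context = k.T @ V         # (d_k, d_v) global context vectors, O(n) to build
    return q @ context        # (n, d_v), no n x n matrix ever materialized

# Tiny demo: both paths produce the same output shape, but the second
# scales linearly with sequence length n.
n, d_k, d_v = 1024, 64, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for d in (d_k, d_k, d_v))
print(dot_product_attention(Q, K, V).shape)  # (1024, 64)
print(efficient_attention(Q, K, V).shape)    # (1024, 64)
```

The key design point is the order of multiplication: computing K-transpose times V first yields a fixed-size context whose cost does not depend on forming pairwise scores between all positions, which is what gives linear attention its advantage on long inputs.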
By AI-Talk4
