Longformer: The Long-Document Transformer


Ref: https://arxiv.org/abs/2004.05150


The paper introduces Longformer, a Transformer model designed to process long sequences efficiently. It addresses the quadratic complexity of standard self-attention with an attention mechanism that scales linearly with sequence length, combining local windowed attention with task-motivated global attention. The authors demonstrate Longformer's effectiveness on character-level language modeling (text8 and enwik8) and on a range of downstream tasks, achieving state-of-the-art results. They also introduce Longformer-Encoder-Decoder (LED), a variant for sequence-to-sequence tasks, and show its success on long-document summarization. The gains in efficiency and performance come from these architectural modifications together with a staged training procedure.
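To make the attention pattern concrete, here is a minimal NumPy sketch of the combined mask the summary describes: a sliding local window plus full (global) attention at a few designated positions. This is an illustration under my own naming (the function, its parameters, and the dense-mask formulation are assumptions for clarity), not the authors' implementation.

```python
import numpy as np

def longformer_mask(seq_len: int, window: int, global_positions: list[int]) -> np.ndarray:
    """Boolean mask: mask[i, j] is True if query i may attend to key j."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    # Local windowed attention: each token attends to `window` neighbors
    # on each side, so each row has O(window) nonzeros and the whole
    # pattern costs O(seq_len * window) rather than O(seq_len ** 2).
    for i in range(seq_len):
        mask[i, max(0, i - window): i + window + 1] = True
    # Task-motivated global attention is symmetric: designated tokens
    # (e.g. [CLS] for classification, question tokens for QA) attend to
    # all positions, and all positions attend back to them.
    for g in global_positions:
        mask[g, :] = True
        mask[:, g] = True
    return mask

if __name__ == "__main__":
    m = longformer_mask(seq_len=12, window=2, global_positions=[0])
    print(m.astype(int))  # 1 = attended, 0 = masked out
```

Note that materializing a dense mask like this still costs O(n²) memory; the paper's actual implementation instead computes the banded windowed attention with a custom CUDA kernel so that both time and memory scale linearly with sequence length.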
