Review of the 2019 pun-titled paper "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension" by the folks at Facebook AI.
This research introduces BART, a denoising autoencoder for pre-training sequence-to-sequence models that proves effective across a range of natural language processing tasks, including generation, translation, and comprehension. BART is trained by corrupting text with arbitrary noising functions and learning to reconstruct the original input; architecturally it can be seen as combining a BERT-style bidirectional encoder with a GPT-style left-to-right decoder. The study evaluates several noising strategies and finds that randomly shuffling the order of sentences, combined with a text-infilling scheme in which spans of text are replaced by a single mask token, yields the best performance. In the experiments, BART performs comparably to state-of-the-art discriminative models such as RoBERTa on classification tasks while achieving new state-of-the-art results on abstractive dialogue, question answering, and summarization benchmarks. The paper also demonstrates BART's utility for machine translation, where using it as a pre-trained target-side model gives a 1.1 BLEU improvement over a back-translation baseline, rounding out a flexible and robust pre-training framework.
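To make the two best-performing corruptions concrete, here is a minimal, self-contained sketch of sentence permutation and text infilling (span lengths drawn from a Poisson distribution with λ = 3, each span replaced by a single mask token). This is not the authors' fairseq implementation: the function names, the naive period-based sentence splitting, the whitespace tokenization, and the termination bookkeeping are illustrative assumptions; the ~30% masking ratio and λ = 3 follow the setup reported in the paper.

```python
import random

import numpy as np

MASK = "<mask>"


def permute_sentences(text: str, rng: random.Random) -> str:
    """Sentence permutation: split a document into sentences and shuffle their order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    rng.shuffle(sentences)
    return ". ".join(sentences) + "."


def text_infilling(tokens: list[str], rng: np.random.Generator,
                   mask_ratio: float = 0.3, poisson_lam: float = 3.0) -> list[str]:
    """Text infilling: sample span lengths from Poisson(lambda) and replace each
    span with a single mask token until roughly mask_ratio of tokens are covered."""
    tokens = list(tokens)
    budget = int(round(mask_ratio * len(tokens)))
    while budget > 0 and len(tokens) > 1:
        span = int(rng.poisson(poisson_lam))
        start = int(rng.integers(0, len(tokens)))
        if span == 0:
            # Zero-length spans insert a mask token without removing any text
            # (a case the paper notes explicitly); counting it against the
            # budget here is just bookkeeping so the loop terminates.
            tokens.insert(start, MASK)
            budget -= 1
        else:
            end = min(start + span, len(tokens))
            budget -= end - start
            tokens[start:end] = [MASK]  # the whole span collapses to one mask
    return tokens


if __name__ == "__main__":
    doc = ("BART corrupts documents with a noising function. "
           "The decoder then reconstructs the original text. "
           "Training minimizes the reconstruction cross-entropy.")
    shuffled = permute_sentences(doc, random.Random(0))
    corrupted = text_infilling(shuffled.split(), np.random.default_rng(0))
    print(" ".join(corrupted))
```

In the actual model these corruptions are applied to subword tokens: the corrupted document is fed to the bidirectional encoder, and the autoregressive decoder is trained to reproduce the original, uncorrupted text.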