Learning GenAI via SOTA Papers

EP007: How GPT-2 Hallucinated Ovid's Unicorn



The paper "Language Models are Unsupervised Multitask Learners" demonstrates that high-capacity language models can perform various natural language processing (NLP) tasks—such as question answering, machine translation, and summarization—without any explicit supervision. By training a 1.5-billion parameter Transformer model named GPT-2 on a new, diverse dataset of millions of webpages called WebText, the researchers found that the model begins to learn these tasks naturally through unsupervised multitask learning.
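The "unsupervised multitask learning" here works because a task can be specified entirely in text: GPT-2 is only ever trained on next-token prediction, and a task emerges by conditioning on a prompt that naturally precedes the desired output. A minimal sketch of such prompt construction is below; the `TL;DR:` cue for summarization is from the paper, while the translation and QA phrasings are hypothetical illustrations, not verbatim from it.

```python
def make_prompt(task: str, text: str) -> str:
    """Frame an NLP task as plain text for a next-token-prediction model.

    Only the "TL;DR:" summarization cue is taken from the GPT-2 paper;
    the other templates are hypothetical examples of the same idea.
    """
    if task == "summarize":
        # The paper induces summarization by appending "TL;DR:" to an article.
        return text + "\nTL;DR:"
    if task == "translate_en_fr":
        # Hypothetical phrasing: translation as a natural text continuation.
        return f"English: {text}\nFrench:"
    if task == "qa":
        # Reading comprehension: context plus question, answer generated
        # as the continuation of the prompt.
        return text + "\nA:"
    raise ValueError(f"unknown task: {task}")
```

A model then "performs" the task simply by generating the most likely continuation of the prompt, with no task-specific head or fine-tuning.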

In a zero-shot setting, where the model receives no task-specific training or architectural modifications, GPT-2 achieved state-of-the-art results on seven out of eight tested language modeling datasets. Notably, on the CoQA reading comprehension dataset, the model matched or exceeded the performance of three out of four baseline systems without using any of the 127,000+ training examples. The study highlights that model capacity is essential to the success of zero-shot task transfer, with performance improving in a log-linear fashion as the number of parameters increases. Ultimately, the findings suggest a promising path toward building generalist systems that learn to perform tasks directly from naturally occurring demonstrations in text.
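"Log-linear" improvement means task performance grows roughly linearly in the logarithm of parameter count. The sketch below fits such a trend across the four GPT-2 model sizes reported in the paper (117M, 345M, 762M, and 1542M parameters); the accuracy values are hypothetical placeholders for illustration only, not results from the paper.

```python
import numpy as np

# GPT-2 family model sizes in parameters (from the paper).
params = np.array([117e6, 345e6, 762e6, 1542e6])

# Hypothetical task scores, for illustration only (NOT from the paper).
accuracy = np.array([0.40, 0.48, 0.53, 0.57])

# Log-linear fit: accuracy ~ slope * log10(params) + intercept.
slope, intercept = np.polyfit(np.log10(params), accuracy, 1)

def predict(n_params: float) -> float:
    """Predicted score for a model of the given size under the fit."""
    return slope * np.log10(n_params) + intercept
```

Under such a trend, each order-of-magnitude increase in parameters buys a roughly constant increment in score, which is why the authors read continued capacity scaling as a promising direction.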


Learning GenAI via SOTA Papers, by Yun Wu