

The paper presents Llama 3, a new family of foundation language models developed by Meta, featuring models with 8B, 70B, and a flagship 405B parameters. These models natively support multilinguality, coding, reasoning, and tool usage, with the 405B model capable of processing information in a context window of up to 128K tokens.
The development of Llama 3 focuses on optimizing three key levers: data, scale, and complexity management.
Extensive empirical and human evaluations demonstrate that the flagship 405B model performs on par with leading closed-source models like GPT-4 across a wide variety of tasks, while the 8B and 70B models establish best-in-class performance compared to alternative models of similar sizes.
The paper also highlights robust safety measures, including the release of Llama Guard 3 for system-level input and output safety. Finally, the authors detail ongoing, unreleased experiments integrating image, video, and speech capabilities into Llama 3 using a compositional approach, which has shown competitive results against state-of-the-art multimodal models.
By Yun Wu