August 08, 2025

Multi Query Attention: PaLM: Scaling Language Modeling with Pathways

30 minutes

67 authors were involved in this research!

This source is the academic paper "PaLM: Scaling Language Modeling with Pathways" by Aakanksha Chowdhery and numerous collaborators. It details the development and capabilities of PaLM (Pathways Language Model), a 540-billion-parameter Transformer language model trained on 6144 TPU v4 chips using Pathways, a new ML system. The paper highlights PaLM's state-of-the-art few-shot performance across a range of natural language tasks, including multilingual tasks and source code generation. The authors also analyze bias, toxicity, and training data memorization, and discuss ethical considerations related to large language models. The paper is hosted on arXiv, an open-access repository for scholarly articles; the page includes the submission history, full-text links, and citation tools.

https://arxiv.org/abs/2204.02311

By mcgrof