Large Language Model (LLM) Talk

Qwen-2.5



Qwen2.5 is a series of large language models (LLMs) that improves on earlier Qwen releases in efficiency, performance, and long-sequence handling. Architecturally, it uses Grouped Query Attention (GQA) to shrink the key-value cache, Mixture-of-Experts (MoE) layers to add capacity, and Rotary Positional Embeddings (RoPE) to encode token positions in a way that supports long sequences. SwiGLU activation, QKV bias, and RMSNorm support stable training, and the tokenizer is expanded. Long-context capability comes from two-phase pre-training with progressive context length expansion, combined with YaRN, Dual Chunk Attention (DCA), and sparse attention.
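For listeners who want to see the building blocks named above in concrete form, here is a minimal pure-Python sketch of RMSNorm, SwiGLU gating, a RoPE rotation, and the GQA head mapping. It is an illustration under stated assumptions, not Qwen2.5's actual implementation: the function names are hypothetical, the head counts in the usage lines are made up, and the RoPE variant shown rotates adjacent pairs of dimensions (other implementations pair dimensions differently).

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned per-dimension weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: SiLU(gate) multiplied elementwise with up."""
    return [silu(g) * u for g, u in zip(gate, up)]

def rope_rotate(vec, position, base=10000.0):
    """RoPE: rotate each adjacent pair of dimensions by an angle that grows with position.

    Assumes len(vec) is even. The pairing (0,1), (2,3), ... is one common convention.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)  # lower dimensions rotate faster
        c, s = math.cos(theta), math.sin(theta)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out

def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """GQA: each consecutive group of query heads shares one key/value head."""
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Illustrative usage with made-up sizes (not Qwen2.5's real configuration).
normed = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
shared_kv = kv_head_for_query_head(5, num_q_heads=8, num_kv_heads=2)
q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.25]
score_a = dot(rope_rotate(q, 3), rope_rotate(k, 1))
score_b = dot(rope_rotate(q, 5), rope_rotate(k, 3))
```

Two properties are worth noticing. With 8 query heads and 2 key/value heads, only 2 sets of keys and values need to be cached instead of 8, which is where GQA's memory saving comes from. And `score_a` and `score_b` agree up to floating-point error because both pairs of positions are two apart: after RoPE, the attention score depends on the relative offset between tokens rather than their absolute positions.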


Large Language Model (LLM) Talk, by AI-Talk


