February 02, 2025

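[Month-End Special] January's Hottest AI Papers | DeepSeek-R1 Improves LLM Reasoning with Reinforcement Learning; a Breakthrough in Long-Text Processing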

24 minutes

The 10 papers in this episode are:

[00:40] TOP1(🔥281) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

[03:13] TOP2(🔥271) | ⚡ MiniMax-01: Scaling Foundation Models with Lightning Attention

[05:36] TOP3(🔥249) | 🧠 rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

[08:13] TOP4(🔥103) | 🧠 Evolving Deeper LLM Thinking

[10:28] TOP5(🔥99) | 📚 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

[12:51] TOP6(🔥90) | 🚀 REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

[15:15] TOP7(🔥90) | 🧠 Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

[17:14] TOP8(🔥89) | 📊 The Lessons of Developing Process Reward Models in Mathematical Reasoning

[19:33] TOP9(🔥88) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

[21:35] TOP10(🔥87) | 🧠 The GAN is dead; long live the GAN! A Modern GAN Baseline

[Follow Us]

You can also find us on the following platform for more information beyond the podcast itself:

Xiaohongshu: AI速递

HuggingFace 每日AI论文速递

By duan

5

22 ratings
