HuggingFace Daily AI Paper Digest

2025.02.11 | LLMs generate multilingual detoxification data; reinforcement learning improves the efficiency of mathematical reasoning.



The 21 papers in this episode:

[00:25] 🤖 SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

[01:10] 🧠 Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

[01:55] 🤔 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

[02:38] ⚡ Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

[03:19] 🚀 Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation

[03:57] 🤖 Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

[04:38] 🧠 ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

[05:28] 🌐 EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

[06:11] 🧠 LM2: Large Memory Models

[06:57] 🧠 The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

[07:50] 🪆 Matryoshka Quantization

[08:35] 🎥 Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT

[09:22] 🎥 History-Guided Video Diffusion

[10:12] 🎥 CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers

[10:59] ⚡ APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding

[11:38] ⏱ Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile

[12:21] 🤖 MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents

[13:03] 🚀 Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM

[13:47] 🧠 The Curse of Depth in Large Language Models

[14:24] 🎨 DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization

[15:14] 🎨 Dual Caption Preference Optimization for Diffusion Models

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递


HuggingFace Daily AI Paper Digest, by duan