March 02, 2025

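[Month-End Special] The Hottest AI Papers of February | Data-Centric Training of a Small Language Model; A New Framework for Human Animation.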

23 minutes

The 10 papers in this episode are:

[00:39] TOP1(🔥196) | 🤖 SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

[02:32] TOP2(🔥183) | 🎥 OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

[05:02] TOP3(🔥182) | 🦜 The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

[06:41] TOP4(🔥167) | 🧠 MLGym: A New Framework and Benchmark for Advancing AI Research Agents

[09:03] TOP5(🔥152) | 🌐 Qwen2.5-VL Technical Report

[11:48] TOP6(🔥152) | 🔍 LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

[13:41] TOP7(🔥142) | 🚀 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

[16:06] TOP8(🔥140) | 🤔 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

[18:40] TOP9(🔥137) | ⚡ Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

[20:46] TOP10(🔥125) | 💼 Expect the Unexpected: FailSafe Long Context QA for Finance

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast itself.

Xiaohongshu (小红书): AI速递

By duan

22 ratings

More shows like HuggingFace 每日AI论文速递

硅谷101|中国版

56 Listeners

商业就是这样

292 Listeners

声动早咖啡

293 Listeners

思文，败类

156 Listeners

不开玩笑 Jokes Aside

136 Listeners

人民公园说AI

7 Listeners

數創實驗室 - AI時代的學習指南

1 Listener

AI可可AI生活

0 Listeners
