July 10, 2025

2025.07.10 | Breakthrough in zero-shot motion generation; improved 4K image super-resolution.

10 minutes

The 14 papers in this episode are:

[00:22] 🤸 Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

[01:03] 🖼 4KAgent: Agentic Any Image to 4K Super-Resolution

[01:39] 🖼 Perception-Aware Policy Optimization for Multimodal Reasoning

[02:24] 🧪 Rethinking Verification for LLM Code Generation: From Generation to Testing

[03:05] 🤔 A Systematic Analysis of Hybrid Linear Attention

[03:42] 🧠 First Return, Entropy-Eliciting Explore

[04:23] 🤖 AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

[05:05] 🧩 Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

[05:47] 🚗 A Survey on Vision-Language-Action Models for Autonomous Driving

[06:29] 🧪 DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models

[07:09] 🗣 ModelCitizens: Representing Community Voices in Online Safety

[07:50] 🤖 SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning

[08:32] 🔬 Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

[09:21] 🧐 AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness

[Follow us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递

HuggingFace 每日AI论文速递

By duan

5

22 ratings
