July 30, 2025

2025.07.29 | ARPO improves LLM tool-interaction performance; ARC-Hunyuan-Video-7B focuses on short-video understanding.

10 minutes

The 15 papers in this episode are:

[00:23] 🤖 Agentic Reinforced Policy Optimization

[00:55] 🧠 ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

[01:35] 🚀 Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning

[02:03] 🌐 Reconstructing 4D Spatial Intelligence: A Survey

[02:55] 💡 SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment

[03:35] 🚀 A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

[04:17] ⚖ Geometric-Mean Policy Optimization

[04:59] 🎯 Region-based Cluster Discrimination for Visual Representation Learning

[05:38] ✨ GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

[06:18] 🚀 UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities

[06:47] ⚡ Met$^2$Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

[07:18] ✨ ForCenNet: Foreground-Centric Network for Document Image Rectification

[07:52] 🎨 ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

[08:43] 🏆 Music Arena: Live Evaluation for Text-to-Music

[09:13] 🎶 JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

[Follow us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递


By duan

22 ratings

