HuggingFace Daily AI Paper Digest

2025.07.29 | ARPO improves LLM tool-interaction performance; ARC-Hunyuan-Video-7B focuses on short-video understanding.



The 15 papers in this episode are:

[00:23] 🤖 Agentic Reinforced Policy Optimization

[00:55] 🧠 ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

[01:35] 🚀 Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning

[02:03] 🌐 Reconstructing 4D Spatial Intelligence: A Survey

[02:55] 💡 SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment

[03:35] 🚀 A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

[04:17] ⚖ Geometric-Mean Policy Optimization

[04:59] 🎯 Region-based Cluster Discrimination for Visual Representation Learning

[05:38] ✨ GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

[06:18] 🚀 UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities

[06:47] ⚡ Met$^2$Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

[07:18] ✨ ForCenNet: Foreground-Centric Network for Document Image Rectification

[07:52] 🎨 ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

[08:43] 🏆 Music Arena: Live Evaluation for Text-to-Music

[09:13] 🎶 JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递


HuggingFace Daily AI Paper Digest, by duan