HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

2025.09.12 | HuMo: multimodal control for human-centric video; SimpleVLA-RL: reinforcement learning boosts performance


The 15 papers in this episode:

[00:27] 🎭 HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

[01:18] 🤖 SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

[02:02] 🗣 EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs

[02:37] 🎭 Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

[03:11] 🧭 Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

[03:57] 🎨 FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

[04:34] 🤖 VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

[05:14] 🔄 Can Understanding and Generation Truly Benefit Together -- or Just Coexist?

[05:46] 📹 SpatialVID: A Large-Scale Video Dataset with Spatial Annotations

[06:16] 📊 Visual Programmability: A Guide for Code-as-Thought in Chart Understanding

[06:55] 🕵 Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval

[07:35] 🖼 2D Gaussian Splatting with Semantic Alignment for Image Inpainting

[08:10] 📏 LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

[08:45] 🤖 OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning

[09:31] 🎯 The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

[Follow us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递

HuggingFace 每日AI论文速递, by duan
