HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递)

2025.05.30 | Inference-time scaling improves table reasoning; multimodal models' feedback on videos still needs work.



The 15 papers in this episode:

[00:22] 📊 Table-R1: Inference-Time Scaling for Table Reasoning

[01:02] 🤖 VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

[01:45] 🧠 Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

[02:25] 🧠 The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

[03:11] 🤖 ZeroGUI: Automating Online GUI Learning at Zero Human Cost

[03:45] 🤔 VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

[04:39] 🧬 Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering

[05:15] 🤔 Are Reasoning Models More Prone to Hallucination?

[05:51] 🤖 cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

[06:29] 🎨 D-AR: Diffusion via Autoregressive Models

[07:16] 📸 AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

[07:53] 🛠 SWE-bench Goes Live! (SWE-bench-Live: a continuously updated issue-resolution benchmark)

[08:36] 💡 Multi-Domain Explainability of Preferences

[09:16] 🤖 UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

[10:01] 🗣 FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递


By duan