HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

2025.11.19 | Pixel Actors Struggle to Reason; Misleading Visuals Test True Ability


The 11 papers in this episode:

[00:23] 🧠 Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark

[01:03] 🕵 MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs

[01:49] 🎞 REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

[03:02] 🧪 ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

[03:43] 🔍 Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework

[04:16] 🤖 Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

[05:02] 🤖 Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution

[05:32] ⚖ Mitigating Label Length Bias in Large Language Models

[06:14] 🧠 Agent READMEs: An Empirical Study of Context Files for Agentic Coding

[06:49] 🎧 Proactive Hearing Assistants that Isolate Egocentric Conversations

[07:20] 🎯 Error-Driven Scene Editing for 3D Grounding in Large Language Models

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu: AI速递

HuggingFace 每日AI论文速递 by duan
