HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递)

2025.01.22 | Agent-R improves language models' real-time error correction; MMVU evaluates expert-level multi-discipline video understanding.



The 16 papers in this episode:

[00:24] 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

[00:59] 🎥 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

[01:35] ⚖ Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

[02:17] 🤖 UI-TARS: Pioneering Automated GUI Interaction with Native Agents

[02:55] 🤖 Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

[03:31] 🎨 TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

[04:14] 🏆 InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

[04:57] 🎥 Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

[05:39] 🤖 Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

[06:18] 🧠 Reasoning Language Models: A Blueprint

[06:58] 🎨 Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

[07:40] 🧠 Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement

[08:21] 🎥 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation

[08:55] 🎥 Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

[09:32] 🌍 GPS as a Control Signal for Image Generation

[10:11] ⚠ MSTS: A Multimodal Safety Test Suite for Vision-Language Models

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递


HuggingFace Daily AI Paper Digest, by duan