HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递)

2025.10.10 | Agent Learning via Early Experience; interleaved image-text reflective reasoning chains jump to 24.9%



The 14 papers in this episode:

[00:16] 🌱 Agent Learning via Early Experience

[00:50] 🧠 MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

[01:42] 🧪 From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning

[02:19] 🎬 UniVideo: Unified Understanding, Generation, and Editing for Videos

[03:01] 🧠 When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

[03:43] 🧠 Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

[04:25] 🧠 MemMamba: Rethinking Memory Patterns in State Space Model

[05:17] 🛡 The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

[05:53] 🎯 Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

[06:40] 🧪 NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

[07:17] 🪚 DeepPrune: Parallel Scaling without Inter-trace Redundancy

[07:54] 🚀 Training-Free Group Relative Policy Optimization

[08:24] 🪄 ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

[08:55] 🤥 LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

[Follow us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递


By duan
