October 08, 2024

2024.10.08 Daily AI Papers | Differential Transformer refines attention; LLM hallucination study reveals error patterns.

15 minutes

The 21 papers in this episode are:

[00:26] 🔍 Differential Transformer

[01:04] 🧠 LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

[01:50] 📹 VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

[02:28] 📈 FAN: Fourier Analysis Networks

[03:05] 🏥 Named Clinical Entity Recognition Benchmark

[03:37] 🔬 ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

[04:19] 🎶 UniMuMo: Unified Text, Music and Motion Generation

[04:55] 🔍 TLDR: Token-Level Detective Reward Model for Large Vision Language Models

[05:35] 🎵 Presto! Distilling Steps and Layers for Accelerating Music Generation

[06:08] 🖥 Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

[06:49] 🖼 OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

[07:29] 🌀 MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

[08:09] 🧠 LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

[08:50] 📊 MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

[09:39] 📊 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

[10:34] 🤖 Autonomous Character-Scene Interaction Synthesis from Text Instruction

[11:12] 🧩 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles

[12:00] 🤖 Grounding Language in Multi-Perspective Referential Communication

[12:48] 🎯 SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

[13:25] 🧩 What Matters for Model Merging at Scale?

[14:02] 📊 SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast itself:

Xiaohongshu (小红书): AI速递

By duan
