
This episode covers the following 15 papers:
[00:26] 🧠 SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
[01:05] 🎥 Improving Video Generation with Human Feedback
[01:40] ⚡ Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
[02:20] 🖼 Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
[02:55] 🖼 IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
[03:32] 📚 Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos
[04:14] 🎥 DiffuEraser: A Diffusion Model for Video Inpainting
[04:50] 🎥 Temporal Preference Optimization for Long-Form Video Understanding
[05:29] 🎨 One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[06:07] 🎥 EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion
[06:42] 🧠 Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
[07:17] 🧠 Debate Helps Weak-to-Strong Generalization
[07:53] 🤔 Evolution and The Knightian Blindspot of Machine Learning
[08:30] 🧪 Hallucinations Can Improve Large Language Models in Drug Discovery
[09:10] 🌀 GSTAR: Gaussian Surface Tracking and Reconstruction
[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
小红书 (Xiaohongshu): AI速递