
The 10 papers in this episode are:
[00:24] 🤖 OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
[01:02] 🎥 VideoRAG: Retrieval-Augmented Generation over Video Corpus
[01:38] 🎥 OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
[02:26] 🧠 LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
[03:01] 🧠 Enabling Scalable Oversight via Self-Evolving Critic
[03:34] 🎥 ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
[04:09] 🎥 Multi-subject Open-set Personalization in Video Generation
[04:47] 🔍 ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding
[05:23] 🤖 Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
[06:00] 🦠 Infecting Generative AI With Viruses
[Follow us]
You can also find us on the following platforms for more information beyond the podcast content:
Xiaohongshu: AI速递