
The 8 papers in this episode are:
[00:24] 🤖 EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
[00:58] 🤖 VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
[01:33] 🤔 Virgo: A Preliminary Exploration on Reproducing o1-like MLLM
[02:11] 🤖 SDPO: Segment-Level Direct Preference Optimization for Social Agents
[02:51] 🎨 VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[03:31] 🧬 Graph Generative Pre-trained Transformer
[04:04] 🌍 LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
[04:44] 🔬 BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
[Follow Us]
You can also find us on the following platform for more beyond the podcast itself:
Xiaohongshu (小红书): AI速递