
The 15 papers in this episode:
[00:25] 🧠 General Agentic Memory Via Deep Research
[00:52] 🧪 AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning
[01:24] 🤖 Computer-Use Agents as Judges for Generative User Interface
[01:55] 🎨 DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
[02:24] 🎨 UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
[03:10] 🔍 DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
[03:46] 🎬 In-Video Instructions: Visual Signals as Generative Control
[04:24] 📊 Budget-Aware Tool-Use Enables Effective Agent Scaling
[05:12] 🎬 Plan-X: Instruct Video Generation via Semantic Planning
[05:54] 🧪 M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark
[06:25] 🤖 Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
[07:24] 🎬 HunyuanVideo 1.5 Technical Report
[07:56] 🧠 Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
[08:36] 🧠 MIST: Mutual Information Via Supervised Training
[09:07] 🎨 Controllable Layer Decomposition for Reversible Multi-Layer Image Generation
[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
By duan5
