December 04, 2025

2025.12.04 | Qwen3-VL: Multimodal Ultra-Long Context; PretrainZero: Reinforcement Active Pretraining

11 minutes

The 15 papers in this episode are:

[00:24] 🧠 Qwen3-VL Technical Report

[00:57] 🧠 PretrainZero: Reinforcement Active Pretraining

[01:36] 🎬 ViDiC: Video Difference Captioning

[02:24] 🧠 OneThinker: All-in-one Reasoning Model for Image and Video

[03:07] 🔄 Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

[03:59] ⚙ Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach

[04:46] 🤖 SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

[05:22] 🔧 Thinking with Programming Vision: Towards a Unified View for Thinking with Images

[06:01] 🔄 Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment

[06:51] 🎮 RELIC: Interactive Video World Model with Long-Horizon Memory

[07:34] 🍳 CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

[08:26] 🧠 SR-GRPO: Stable Rank as an Intrinsic Geometric Reward for Large Language Model Alignment

[09:01] 📊 AlignBench: Benchmarking Fine-Grained Image-Text Alignment with Synthetic Image-Caption Pairs

[09:38] 🧠 SkillFactory: Self-Distillation For Learning Cognitive Behaviors

[10:20] 📱 UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs

[Follow us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递

HuggingFace 每日AI论文速递

By duan
