October 07, 2025

2025.10.07 | Papers Turned into Talks in Seconds; a Breakthrough in Video-LMM Post-Training

11 minutes

The 15 papers in this episode are:

[00:21] 🎬 Paper2Video: Automatic Video Generation from Scientific Papers

[00:55] 🎬 Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

[01:38] 🎬 VChain: Chain-of-Visual-Thought for Reasoning in Video Generation

[02:14] 👻 Imperceptible Jailbreaking against Large Language Models

[02:56] 🌳 MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

[03:30] 🧬 Hybrid Architectures for Language Models: Systematic Analysis and Design Insights

[04:07] 📊 Factuality Matters: When Image Generation and Editing Meet Structured Visuals

[04:59] 🔄 Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models

[05:55] ⚖ Judging with Confidence: Calibrating Autoraters to Preference Distributions

[06:44] 🎯 Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

[07:27] 📏 Optimal Scaling Needs Optimal Norm

[07:51] 🔬 Code4MeV2: a Research-oriented Code-completion Platform

[08:31] 🪞 Self-Reflective Generation at Test Time

[09:15] 🔄 SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs

[10:00] 👀 Watch and Learn: Learning to Use Computers from Online Videos

[Follow Us]

You can also find us on the following platform for more information beyond the podcast itself:

Xiaohongshu: AI速递

HuggingFace 每日AI论文速递

By duan

Rating: 5 (22 ratings)
