December 04, 2024

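2024.12.04 Daily AI Papers | A multi-shot video generation framework improves narrative coherence; critical-token identification strengthens LLM reasoning.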

11 minutes

The 15 papers in this episode:

[00:24] 🎥 VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation

[01:04] 🧠 Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

[01:45] 🔄 Free Process Rewards without Process Labels

[02:30] 🎧 AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

[03:04] 🤖 MALT: Improving Reasoning with Multi-Agent LLM Training

[03:45] 🎥 OmniCreator: Self-Supervised Unified Generation with Universal Editing

[04:23] 🌴 Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis

[05:08] 📚 OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

[05:51] 📊 Scaling Image Tokenizers with Grouped Spherical Quantization

[06:27] 🌐 LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences

[07:09] ⚙ A dynamic parallel method for performance optimization on hybrid CPUs

[08:00] 🌐 MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

[08:46] 🎥 Motion Prompting: Controlling Video Generation with Motion Trajectories

[09:27] 🎥 VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval

[10:01] 🤖 Generating a Low-code Complete Workflow via Task Decomposition and RAG

[Follow us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递

HuggingFace 每日AI论文速递

By duan

Rating: 5 (22 ratings)
