HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递)

2025.07.01 | Multimodal generation takes the lead; video diffusion gets more efficient



The 15 papers in this episode:

[00:21] 🖼 Ovis-U1 Technical Report

[00:58] 🎬 VMoBA: Mixture-of-Block Attention for Video Diffusion Models

[01:36] ✍ Calligrapher: Freestyle Text Image Customization

[02:21] 🖼 Listener-Rewarded Thinking in VLMs for Image Preferences

[03:04] 🧠 SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

[03:46] 📸 Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention

[04:29] 🧬 Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective

[05:09] 🤔 Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling?

[05:58] 💾 MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation

[06:38] 🚀 SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity

[07:23] 🏙 UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding

[08:01] 🧠 MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning

[08:38] 🧰 Teaching a Language Model to Speak the Language of Tools

[09:16] ✂ VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

[10:01] 🤖 RoboScape: Physics-informed Embodied World Model

[Follow Us]

You can also find us on the following platform for more information beyond the podcast itself:

Xiaohongshu (小红书): AI速递


HuggingFace 每日AI论文速递, by duan
