HuggingFace 每日AI论文速递 (HuggingFace Daily AI Papers Digest)

2025.06.30 | 3D Visual Editing; Video Token Compression



This episode covers the following 14 papers:

[00:26] 🎨 BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing

[00:59] ✂ LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

[01:42] 🖼 XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

[02:24] 🎬 ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

[03:05] 🖼 From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios

[03:44] 🖼 MiCo: Multi-image Contrast for Reinforcement Visual Reasoning

[04:24] 🧮 Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity

[05:06] 🗺 Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

[05:52] 🤖 Ark: An Open-source Python-based Framework for Robot Learning

[06:36] 🎨 Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls

[07:20] 🏎 The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

[08:01] 🧠 Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

[08:45] 🧮 Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning

[09:39] 👁 RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast itself:

Xiaohongshu (小红书): AI速递


HuggingFace 每日AI论文速递, by duan
