HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Express)

2025.02.10 | Video processing performance improves, and video generation becomes significantly faster.



The 21 papers in this episode are:

[00:22] 🎥 VideoRoPE: What Makes for Good Video Rotary Position Embedding?

[01:07] 🎥 Fast Video Generation with Sliding Tile Attention

[01:54] 🎥 Goku: Flow Based Video Generative Foundation Models

[02:35] 🌍 AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting

[03:19] 🔢 QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

[03:57] 🛡 DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

[04:40] 🧠 Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

[05:28] 🎯 Agency Is Frame-Dependent

[06:04] 🎥 FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

[06:46] 📊 Linear Correlation in LM's Compositional Generalization and Hallucination

[07:32] 🧠 Generating Symbolic World Models via Test-time Scaling of Large Language Models

[08:09] 📱 On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

[08:51] ⚡ CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

[09:32] 🧩 Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

[10:20] 🔄 Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

[11:06] 🧠 CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

[11:50] 🧩 No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

[12:39] 🌓 YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

[13:20] 🌐 QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation

[14:02] 🧠 ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning

[14:48] 🤖 MEETING DELEGATE: Benchmarking LLMs on Attending Meetings on Our Behalf

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递


HuggingFace 每日AI论文速递, by duan