HuggingFace 每日AI论文速递

2025.02.25 | 长上下文优化创新,视觉扩散高效通用。


Listen Later

本期的 20 篇论文如下:

[00:27] 📖 Thus Spake Long-Context Large Language Model(长上下文大语言模型如是说)

[01:09] 🌈 DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks(用于视觉感知任务的通用扩散模型)

[01:48] 🚀 Slamming: Training a Speech Language Model on One GPU in a Day(撞击:在一天内使用单个GPU训练语音语言模型)

[02:32] 🎥 VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing(视频粒度:调节时空注意力实现多粒度视频编辑)

[03:11] 🎧 Audio-FLAN: A Preliminary Release(音频FLAN:初步发布)

[03:43] 🧠 CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models(CodeCriticBench:面向大型语言模型的全面代码 critique 基准测试)

[04:28] 🎨 GCC: Generative Color Constancy via Diffusing a Color Checker(GCC:通过扩散色卡生成颜色恒常性)

[05:11] 📊 Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning(数学推理中测试时间扩展的语言通用性)

[05:57] 🚀 Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment(让LoRA再次伟大:通过自适应奇异值和混合专家优化对齐提升LoRA性能)

[06:38] 🧠 Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models(多模态不一致性推理(MMIR):多模态推理模型的新基准)

[07:23] 🎥 RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers(RIFLEx:视频扩散Transformer中长度外推的免费午餐)

[08:01] 📱 Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration(移动代理V:通过视频引导的多代理协作学习移动设备操作)

[08:45] ⏳ Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties(中国朝代间的时间推理与对齐基准测试)

[09:31] 🤖 Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation(反射性规划:视觉语言模型在多阶段长时程机器人操作中的应用)

[10:02] 🔄 Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam(稳定-SPAM:如何在4位精度下比16位Adam更稳定地训练)

[10:43] 📝 Can Community Notes Replace Professional Fact-Checkers?(社区笔记能替代专业事实核查员吗?)

[11:24] 📈 Forecasting Open-Weight AI Model Growth on Hugging Face(预测Hugging Face上开放权重AI模型的增长)

[12:08] 🔑 Beyond Release: Access Considerations for Generative AI Systems(超越发布:生成式人工智能系统的访问考量)

[12:49] 🌐 TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning(TAG:一种用于多智能体分层强化学习的去中心化框架)

[13:30] 💃 X-Dancer: Expressive Music to Human Dance Video Generation(X-Dancer:从音乐生成生动舞蹈视频)

【关注我们】

您还可以在以下平台找到我们,获得播客内容以外更多信息

小红书: AI速递

...more
View all episodesView all episodes
Download on the App Store

HuggingFace 每日AI论文速递By duan

  • 5
  • 5
  • 5
  • 5
  • 5

5

2 ratings


More shows like HuggingFace 每日AI论文速递

View all
硅谷101|中国版 by 泓君Jane

硅谷101|中国版

56 Listeners

商业就是这样 by 商业就是这样

商业就是这样

292 Listeners

声动早咖啡 by 声动活泼

声动早咖啡

293 Listeners

思文,败类 by 思文败类

思文,败类

156 Listeners

不开玩笑 Jokes Aside by 不开玩笑JokesAside

不开玩笑 Jokes Aside

136 Listeners

人民公园说AI by JustSayAI

人民公园说AI

7 Listeners

數創實驗室 - AI時代的學習指南 by Vincent在數創

數創實驗室 - AI時代的學習指南

1 Listeners

AI可可AI生活 by fly51fly

AI可可AI生活

0 Listeners