HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

2025.02.17 | RAS Accelerates Diffusion Transformers; Video Generation Improves in Quality


The 21 papers in this episode are listed below:

[00:22] 🌐 Region-Adaptive Sampling for Diffusion Transformers

[01:05] 🎥 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

[01:48] 🌊 Large Language Diffusion Models

[02:31] 🧠 ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

[03:15] 🌟 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

[03:58] 🖼 Precise Parameter Localization for Textual Generation in Diffusion Models

[04:40] 🧠 Diverse Inference and Verification for Advanced Reasoning

[05:22] 🧬 DarwinLM: Evolutionary Structured Pruning of Large Language Models

[06:02] 📈 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting

[06:40] 🖼 ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation

[07:23] 🤖 We Can't Understand AI Using our Existing Vocabulary

[08:03] 📊 FoNE: Precise Single-Token Number Embeddings via Fourier Features

[08:53] 🌍 Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages

[09:41] 🔓 Jailbreaking to Jailbreak

[10:23] 🤖 STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning

[11:05] 📊 Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding

[11:41] ⚡ MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers

[12:26] 🚗 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models

[13:06] 🎵 CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages

[13:49] 🧩 Cluster and Predict Latent Patches for Improved Masked Image Modeling

[14:31] 🧬 Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast itself.

Xiaohongshu (小红书): AI速递

HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest), by duan
