HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

2024.10.18 Daily AI Papers | Standardizing AI evaluation; movie generation models lead the way.



The 31 papers in this episode are as follows:

[00:23] 📊 MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

[01:02] 🎥 Movie Gen: A Cast of Media Foundation Models

[01:35] 📱 MobA: A Two-Level Agent System for Efficient Mobile Task Automation

[02:18] 🌐 Harnessing Webpage UIs for Text-Rich Visual Understanding

[02:59] 🔄 Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

[03:29] 🩺 MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models

[04:04] 📊 A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

[04:46] 🔄 PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment

[05:23] 🔍 BenTo: Benchmark Task Reduction with In-Context Transferability

[06:03] 🎥 DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

[06:49] 🧠 MoH: Multi-Head Attention as Mixture-of-Head Attention

[07:28] 🎥 VidPanos: Generative Panoramic Videos from Casual Panning Videos

[08:03] 📉 FlatQuant: Flatness Matters for LLM Quantization

[08:44] 🔄 Retrospective Learning from Interactions

[09:22] 🔄 Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation

[10:06] 🖼 Can MLLMs Understand the Deep Implication Behind Chinese Images?

[10:43] 📱 MedMobile: A mobile-sized language model with expert-level clinical capabilities

[11:22] 🌍 WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

[12:04] 🤖 Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant

[12:48] 🔄 LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning

[13:29] 🔒 AERO: Softmax-Only LLMs for Efficient Private Inference

[14:12] 🌐 γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

[14:45] 🌐 Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

[15:24] 🎶 MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization

[16:05] 🔒 Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems

[16:48] 📚 SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation

[17:27] 🗺 Roadmap towards Superhuman Speech Understanding using Large Language Models

[18:05] 🔄 Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

[18:47] 🤖 TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

[19:25] 🔬 Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models

[20:05] 📚 Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递


HuggingFace 每日AI论文速递, by duan
