HuggingFace 每日AI论文速递

2024.12.06 每日AI论文 | 视觉压缩提升效率,代码监控增强机器人可靠性。


Listen Later

本期的 23 篇论文如下:

[00:23] 🔍 VisionZip: Longer is Better but Not Necessary in Vision Language Models(视觉压缩:视觉语言模型中长度并非必要优势)

[01:03] 🤖 Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection(代码即监控:约束感知的视觉编程用于反应性和前瞻性机器人故障检测)

[01:43] 🖥 Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction(Aguvis:统一纯视觉自主GUI交互代理)

[02:27] 🔊 A Noise is Worth Diffusion Guidance(噪声值得扩散引导)

[03:04] 📊 Evaluating Language Models as Synthetic Data Generators(评估语言模型作为合成数据生成器)

[03:48] 🌐 Structured 3D Latents for Scalable and Versatile 3D Generation(结构化3D潜在表示在可扩展和多功能3D生成中的应用)

[04:26] 🌐 MV-Adapter: Multi-view Consistent Image Generation Made Easy(MV-Adapter:多视角一致图像生成变得简单)

[05:05] 🖼 Negative Token Merging: Image-based Adversarial Feature Guidance(负向标记合并:基于图像的对抗特征引导)

[05:41] 🌐 Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion(佛罗伦萨-VL:通过生成视觉编码器和深度-广度融合增强视觉语言模型)

[06:18] 📈 Densing Law of LLMs(大语言模型的密度定律)

[06:59] 🌌 Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis(无限:高分辨率图像合成中的比特位自回归建模)

[07:37] ⚽ Towards Universal Soccer Video Understanding(面向通用足球视频理解)

[08:15] 🎨 HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing(HumanEdit:一个高质量的人类奖励数据集,用于基于指令的图像编辑)

[08:53] 👗 AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models(任意服装虚拟试穿:基于潜在扩散模型的可定制多服装生成)

[09:35] 🌍 Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation(全球MMLU:理解和解决多语言评估中的文化和语言偏见)

[10:11] 🌐 Personalized Multimodal Large Language Models: A Survey(个性化多模态大语言模型:综述)

[10:55] ⚡ ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality(ZipAR:通过空间局部性加速自回归图像生成)

[11:36] 🧠 MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities(MRGen:基于扩散的可控数据引擎用于无标注模态的MRI分割)

[12:14] 🧠 Discriminative Fine-tuning of LVLMs(判别性微调的大视觉语言模型)

[12:48] 🧠 Monet: Mixture of Monosemantic Experts for Transformers(Monet:Transformer的单语义专家混合模型)

[13:24] 🌊 OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows(全流:多模态校正流的任意到任意生成)

[13:59] 🧠 KV Shifting Attention Enhances Language Modeling(KV移位注意力增强语言建模)

[14:40] 🌍 Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement(Marco-LLM:通过大规模多语言训练实现跨语言增强)

【关注我们】

您还可以在以下平台找到我们,获得播客内容以外更多信息

小红书: AI速递

...more
View all episodesView all episodes
Download on the App Store

HuggingFace 每日AI论文速递By duan

  • 5
  • 5
  • 5
  • 5
  • 5

5

2 ratings


More shows like HuggingFace 每日AI论文速递

View all
硅谷101|中国版 by 泓君Jane

硅谷101|中国版

56 Listeners

商业就是这样 by 商业就是这样

商业就是这样

291 Listeners

声动早咖啡 by 声动活泼

声动早咖啡

294 Listeners

思文,败类 by 思文败类

思文,败类

157 Listeners

不开玩笑 Jokes Aside by 不开玩笑JokesAside

不开玩笑 Jokes Aside

136 Listeners

人民公园说AI by JustSayAI

人民公园说AI

7 Listeners

數創實驗室 - AI時代的學習指南 by Vincent在數創

數創實驗室 - AI時代的學習指南

1 Listeners

AI可可AI生活 by fly51fly

AI可可AI生活

0 Listeners