HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

2025.03.26 | Video prediction performance improves; multimodal pre-training delivers notable gains.



The 15 papers in this episode are:

[00:22] 🎬 Long-Context Autoregressive Video Modeling with Next-Frame Prediction

[01:01] 🖼 CoMP: Continual Multimodal Pre-training for Vision Foundation Models

[01:42] 🎬 Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation

[02:28] 📈 Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

[03:14] 🖼 Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation

[03:54] 🖼 Scaling Vision Pre-Training to 4K Resolution

[04:33] 🤔 Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking

[05:15] 🖼 CoLLM: A Large Language Model for Composed Image Retrieval

[05:53] 🤖 MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

[06:35] 🖼 Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

[07:13] 🔍 ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

[07:54] 🛡 LookAhead Tuning: Safer Language Models via Partial Answer Previews

[08:38] 💡 Frequency Dynamic Convolution for Dense Image Prediction

[09:18] 🖼 LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

[09:51] 🧬 Gumbel-Softmax Flow Matching with Straight-Through Guidance for Controllable Biological Sequence Generation

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递


HuggingFace 每日AI论文速递, by duan
