
Sign up to save your podcasts
Or
本期的 10 篇论文如下:
[00:31] TOP1(🔥335) | 🤖 Qwen2.5 Technical Report(Qwen2.5技术报告)
[02:44] TOP2(🔥136) | 🎥 Apollo: An Exploration of Video Understanding in Large Multimodal Models(阿波罗:大型多模态模型中的视频理解探索)
[05:01] TOP3(🔥123) | 🚀 Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling(通过模型、数据和测试时扩展提升开源多模态模型的性能边界)
[07:18] TOP4(🔥121) | 🔄 PaliGemma 2: A Family of Versatile VLMs for Transfer(PaliGemma 2:多功能视觉语言模型的迁移研究)
[09:38] TOP5(🔥116) | 🚀 Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference(更智能、更优、更快、更长:一种现代双向编码器,用于快速、内存高效的长上下文微调和推理)
[12:21] TOP6(🔥108) | 🚀 SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance(SNOOPI:超强一步扩散蒸馏与适当引导)
[14:42] TOP7(🔥105) | 🔍 VisionZip: Longer is Better but Not Necessary in Vision Language Models(视觉压缩:视觉语言模型中长度并非必要优势)
[16:51] TOP8(🔥96) | 🧠 Phi-4 Technical Report(Phi-4 技术报告)
[18:55] TOP9(🔥92) | 🎥 InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions(InternLM-XComposer2.5-OmniLive:一个用于长期流式视频和音频交互的综合多模态系统)
[21:02] TOP10(🔥91) | 🧠 Are Your LLMs Capable of Stable Reasoning?(你的大语言模型能够稳定推理吗?)
【关注我们】
您还可以在以下平台找到我们,获得播客内容以外更多信息
小红书: AI速递
本期的 10 篇论文如下:
[00:31] TOP1(🔥335) | 🤖 Qwen2.5 Technical Report(Qwen2.5技术报告)
[02:44] TOP2(🔥136) | 🎥 Apollo: An Exploration of Video Understanding in Large Multimodal Models(阿波罗:大型多模态模型中的视频理解探索)
[05:01] TOP3(🔥123) | 🚀 Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling(通过模型、数据和测试时扩展提升开源多模态模型的性能边界)
[07:18] TOP4(🔥121) | 🔄 PaliGemma 2: A Family of Versatile VLMs for Transfer(PaliGemma 2:多功能视觉语言模型的迁移研究)
[09:38] TOP5(🔥116) | 🚀 Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference(更智能、更优、更快、更长:一种现代双向编码器,用于快速、内存高效的长上下文微调和推理)
[12:21] TOP6(🔥108) | 🚀 SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance(SNOOPI:超强一步扩散蒸馏与适当引导)
[14:42] TOP7(🔥105) | 🔍 VisionZip: Longer is Better but Not Necessary in Vision Language Models(视觉压缩:视觉语言模型中长度并非必要优势)
[16:51] TOP8(🔥96) | 🧠 Phi-4 Technical Report(Phi-4 技术报告)
[18:55] TOP9(🔥92) | 🎥 InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions(InternLM-XComposer2.5-OmniLive:一个用于长期流式视频和音频交互的综合多模态系统)
[21:02] TOP10(🔥91) | 🧠 Are Your LLMs Capable of Stable Reasoning?(你的大语言模型能够稳定推理吗?)
【关注我们】
您还可以在以下平台找到我们,获得播客内容以外更多信息
小红书: AI速递