February 14, 2025

2025.02.14 | Scaling context to 3 million tokens on a single GPU; a memory-efficient strategy for text encoders.

14 minutes

The 18 papers in this episode:

[00:21] 🚀 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

[01:07] 🖼 Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

[01:49] 🧠 An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

[02:31] 📚 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

[03:14] 🐕 Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights

[03:56] 🌐 Exploring the Potential of Encoder-free Architectures in 3D LMMs

[04:39] 🎭 CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

[05:26] 🌐 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

[06:09] 🤖 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

[07:00] 🌪 Typhoon T1: An Open Thai Reasoning Model

[07:54] 🤖 Logical Reasoning in Large Language Models: A Survey

[08:36] 🧠 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

[09:23] 🧠 CoT-Valve: Length-Compressible Chain-of-Thought Tuning

[10:11] 🤖 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models

[10:52] 🌐 mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data

[11:36] 🦜 The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

[12:18] 🤖 DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References

[13:00] 🔍 3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection

[Follow us]

You can also find us on the following platform for more information beyond the podcast content:

Xiaohongshu: AI速递

By duan