
The 15 papers in this episode are:
[00:23] 🔗 Chain-of-Model Learning for Language Model
[00:58] 🤔 AdaptThink: Reasoning Models Can Learn When to Think
[01:45] 🧠 AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning
[02:21] 🚀 Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
[03:04] 🖥 Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
[03:43] 🤔 Thinkless: LLM Learns When to Think
[04:23] 💡 Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
[05:00] 🧮 MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision
[05:39] ✨ Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation
[06:15] 🛡 FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA
[07:00] 🧩 Model Merging in Pre-training of Large Language Models
[07:53] 🤖 CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models
[08:36] 🎬 Faster Video Diffusion with Trainable Sparse Attention
[09:23] 🧠 Fractured Chain-of-Thought Reasoning
[10:03] 🧠 Neuro-Symbolic Query Compiler
[Follow Us]
You can also find us on the following platforms for more information beyond the podcast content
Xiaohongshu: AI速递