
The 15 papers in this episode are:
[00:24] 🗜 Shifting AI Efficiency From Model-Centric to Data-Centric Compression
[01:05] 🌐 Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model
[02:00] 📊 BizFinBench: A Business-Driven Real-World Financial Benchmark for Evaluating LLMs
[02:40] 🖼 Alchemist: Turning Public Text-to-Image Data into Generative Gold
[03:18] 🧠 Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance
[03:59] 🧠 PATS: Process-Level Adaptive Thinking Mode Switching
[04:52] 🧠 ARM: Adaptive Reasoning Model
[05:37] 🧩 Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
[06:18] 🤖 B-score: Detecting biases in large language models using response history
[06:58] 🧠 Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective
[07:39] 🛡 Lifelong Safety Alignment for Language Models
[08:14] 🧪 MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
[09:00] 🗺 Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
[09:43] 🧮 Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers
[10:28] 🧠 Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models
[Follow Us]
You can also find us on the following platforms for more information beyond the podcast content
Xiaohongshu: AI速递