The 20 papers in this episode:
[00:23] 🌍 BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
[01:08] 📄 TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation
[01:48] 🎥 Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
[02:36] 🎥 CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation
[03:16] 🖥 WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation
[04:06] ⚡ LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid
[04:45] 🧠 TransMLA: Multi-head Latent Attention Is All You Need
[05:31] 💼 Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance
[06:23] 📏 Distillation Scaling Laws
[07:02] 🚀 Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning
[07:52] 🌍 SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation
[08:25] 🧠 LLM Pretraining with Continuous Concepts
[09:09] 🎭 Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
[09:52] 🔍 NoLiMa: Long-Context Evaluation Beyond Literal Matching
[10:39] 🧠 Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing
[11:15] 📚 Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey
[11:58] 🎥 Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
[12:43] 🔄 DPO-Shift: Shifting the Distribution of Direct Preference Optimization
[13:28] 🧠 LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention
[14:15] 🛡 MetaSC: Test-Time Safety Specification Optimization for Language Models
【Follow Us】
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递