## Episode Summary
In this episode, we cover:
- **Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2604.28123)
- **Audio-Visual Intelligence in Large Foundation Models** (arXiv)
  - [Read more](http://arxiv.org/abs/2605.04045v1)
- **X2SAM: Any Segmentation in Images and Videos** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.00891)
- **Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2604.27488)
- **Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.02801)
---
*Sponsored by LimitLess AI*