**What You Need to Know:** Hugging Face just shipped TRL v1.0, turning the messy post-training pipeline (SFT → Reward Modeling → DPO/GRPO) into a stable, production-ready unified API. Liquid AI dropped LFM2.5-350M, a 350M-parameter model trained on 28T tokens with scaled RL that challenges the “bigger is better” scaling narrative. ...
**AI Disclosure:** This podcast is curated by Patrick; the audio is produced with AI-generated voice synthesis (ElevenLabs).