Unzip

SWE-WebDevBench: Evaluating Coding Agent Application Platforms as Virtual Software Agencies



## Episode Summary
In this episode, we cover:
- **SWE-WebDevBench: Evaluating Coding Agent Application Platforms as Virtual Software Agencies** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.04637)
- **CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.02910)
- **MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2604.27393)
- **When to Think, When to Speak: Learning Disclosure Policies for LLM Reasoning** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.03314)
- **ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning** (Hugging Face Daily)
  - [Read more](https://huggingface.co/papers/2605.00380)
---
*Sponsored by LimitLess AI*

Unzip, by Skyler @ LimitLess AI