[Today's Papers]
1. Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?
https://huggingface.co/papers/2602.14111
2. SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
https://huggingface.co/papers/2602.12670
3. GLM-5: from Vibe Coding to Agentic Engineering
https://huggingface.co/papers/2602.15763
4. Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook
https://huggingface.co/papers/2602.14299
5. ResearchGym: Evaluating Language Model Agents on Real-World AI Research
https://huggingface.co/papers/2602.15112