GitHub Daily Trend

GitHub - Danau5tin/terminal-bench-rl: GRPO training code which scales to 32xH100s for long horizo...


https://github.com/Danau5tin/terminal-bench-rl
GRPO training code that scales to 32xH100s for long-horizon terminal/coding tasks. The base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.

GitHub Daily Trend, by VoiceFeed