GitHub Daily Trend

Repo State Loopholes During Agentic Evaluation · Issue #465 · SWE-bench/SWE-bench



https://github.com/SWE-bench/SWE-bench/issues/465
We've identified multiple loopholes in SWE-bench Verified in which agents may look at future repository state (by querying it directly or through a variety of other methods), and cases in which future rep...

GitHub Daily Trend, by VoiceFeed