Top model scores may be skewed by Git history leaks in SWE-bench
(github.com)
458 points by mustaphah 2 days ago | 153 comments