Announcement_8
Our ICML paper and our ICML position paper were both accepted: “SWE-ABS: Adversarial Benchmark Strengthening Exposes Inflated Success Rates on Test-based Benchmark” and “How Should We Build A Benchmark? Revisiting 274 Code-Related Benchmarks For LLMs”.