Publications

2025

  1. ACL
    UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
    Boxi Yu, Yuxuan Zhu, Pinjia He, and Daniel Kang
    2025
  2. How Should We Build A Benchmark? Revisiting 274 Code-Related Benchmarks For LLMs
    Jialun Cao, Yuk-Kit Chan, Zixuan Ling, Wenxuan Wang, Shuqing Li, Mingwei Liu, Ruixi Qiao, Yuting Han, Chaozheng Wang, Boxi Yu, and 5 more authors
    2025
  3. ICSE
    An Empirical Study on Package-Level Deprecation in Python Ecosystem
    Zhiqing Zhong, Shilin He, Haoxuan Wang, Boxi Yu, Haowen Yang, and Pinjia He
    ICSE’25: International Conference on Software Engineering, 2025

2024

  1. ICSE
    Deep Learning or Classical Machine Learning? An Empirical Study on Log-Based Anomaly Detection
    Boxi Yu, Jiayi Yao, Qiuai Fu, Zhiqing Zhong, Haotian Xie, Yaoliang Wu, Yuchi Ma, and Pinjia He
    ICSE’24: International Conference on Software Engineering, 2024
  2. CASW
    DSPy Guardrails: Building Safe LLM Applications via Self-Refining Language Model Pipelines
    Boxi Yu and Pinjia He
    Compound AI Systems Workshop, 2024
  3. ICSE
    Testing Graph Database Systems via Equivalent Query Rewriting
    Qiuyang Mang, Aoyang Fang, Boxi Yu, Hanfei Chen, and Pinjia He
    ICSE’24: International Conference on Software Engineering, 2024

2023

  1. arXiv
    Retromorphic Testing: A New Approach to the Test Oracle Problem
    Boxi Yu, Qiuyang Mang, Qingshuo Guo, and Pinjia He
    arXiv, 2023
  2. ESEC/FSE
    Automated Testing and Improvement of Named Entity Recognition Systems
    Boxi Yu, Yiyan Hu, Qiuyang Mang, Wenhan Hu, and Pinjia He
    ESEC/FSE’23: Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023
  3. ISSTA
    ROME: Testing Image Captioning Systems via Recursive Object Melting
    Boxi Yu, Zhiqing Zhong, Jiaqi Li, Yixing Yang, Shilin He, and Pinjia He
    In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023

2022

  1. ISSTA
    Automated Testing of Image Captioning Systems
    Boxi Yu, Zhiqing Zhong, Xinran Qin, Jiayi Yao, Yuancheng Wang, and Pinjia He
    In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 2022