Finding the best learning to rank algorithms for effort-aware defect prediction

Cited by: 14
Authors
Yu, Xiao [1, 2]
Dai, Heng [3]
Li, Li [4]
Gu, Xiaodong [5]
Keung, Jacky Wai [6]
Bennin, Kwabena Ebo [7]
Li, Fuyang [1]
Liu, Jin [8]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan, Peoples R China
[2] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya, Peoples R China
[3] Wuhan Qingchuan Univ, Sch Mech & Elect Engn, Wuhan, Peoples R China
[4] Beihang Univ, Sch Software, Beijing, Peoples R China
[5] Shanghai Jiao Tong Univ, Sch Software, Shanghai, Peoples R China
[6] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[7] Wageningen Univ & Res, Informat Technol Grp, Wageningen, Netherlands
[8] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software defect prediction; Empirical study; Learning to rank; Ranking instability; REGRESSION; RETRIEVAL; PRONENESS; MODELS; RIDGE;
DOI
10.1016/j.infsof.2023.107165
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Effort-Aware Defect Prediction (EADP) uses learning to rank algorithms to rank software modules or changes by their predicted number of defects (i.e., considering modules or changes as effort) or by their defect density (i.e., considering LOC as effort). Ranking instability refers to the inconsistent conclusions produced by existing empirical studies of EADP. The major reason is poor experimental design, such as comparing few learning to rank algorithms, using a small number of datasets or datasets that do not indicate numbers of defects, and evaluating with inappropriate or few metrics.
Objective: To find a stable ranking of learning to rank algorithms in order to identify the best ones for EADP.
Method: We examine the practical effects of 34 algorithms on 49 datasets for EADP. We measure the performance of these algorithms using 7 module-based and 7 LOC-based metrics and run experiments under cross-release and cross-project settings, respectively. Finally, we obtain the ranking of these algorithms by performing the Scott-Knott ESD test.
Results: (1) When module is used as effort, random forest regression performs the best under the cross-release setting, and linear regression performs the best under the cross-project setting among the learning to rank algorithms; (2) when LOC is used as effort, LTR-linear (Learning-to-Rank with the linear model) performs the best under the cross-release setting, and Ranking SVM performs the best under the cross-project setting.
Conclusion: This comprehensive experimental procedure allows us to discover a stable ranking of the studied algorithms and to select the best ones according to the requirements of software projects.
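The two notions of effort described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the module tuples, the function names, and the 20% LOC inspection budget are all assumptions chosen for illustration, and the predicted defect counts are made-up inputs rather than the output of any of the 34 studied algorithms.

```python
from typing import List, Tuple

# Hypothetical record: (module name, lines of code, predicted number of defects).
Module = Tuple[str, int, float]


def rank_by_module_effort(modules: List[Module]) -> List[Module]:
    """Module as effort: rank by predicted number of defects, highest first."""
    return sorted(modules, key=lambda m: m[2], reverse=True)


def rank_by_loc_effort(modules: List[Module]) -> List[Module]:
    """LOC as effort: rank by predicted defect density (defects per LOC)."""
    def density(m: Module) -> float:
        return m[2] / m[1] if m[1] > 0 else 0.0
    return sorted(modules, key=density, reverse=True)


def inspect_within_budget(ranked: List[Module],
                          budget_fraction: float = 0.2) -> List[Module]:
    """Walk the ranking and stop at the first module that would exceed the
    LOC budget. The 20% default is an assumed, illustrative cutoff."""
    budget = budget_fraction * sum(m[1] for m in ranked)
    spent = 0
    selected: List[Module] = []
    for m in ranked:
        if spent + m[1] > budget:
            break
        spent += m[1]
        selected.append(m)
    return selected


# Made-up example data.
modules = [
    ("a.c", 1000, 5.0),
    ("b.c", 100, 2.0),
    ("c.c", 50, 0.5),
    ("d.c", 400, 1.0),
]

by_count = [m[0] for m in rank_by_module_effort(modules)]
by_density = [m[0] for m in rank_by_loc_effort(modules)]
inspected = [m[0] for m in inspect_within_budget(rank_by_loc_effort(modules))]
```

In this toy data the largest module tops the count-based ranking, while the density-based ranking puts small, defect-dense modules first, which is why the two effort definitions can favor different algorithms.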
Pages: 18