Bug numbers matter: An empirical study of effort-aware defect prediction using class labels versus bug numbers

Authors
Yang, Peixin [1 ,2 ,3 ]
Zeng, Ziyao [4 ]
Zhu, Lin [5 ]
Zhang, Yanjiao [5 ]
Wang, Xin [6 ]
Ma, Chuanxiang [7 ,8 ]
Hu, Wenhua [1 ,3 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan, Peoples R China
[2] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya, Peoples R China
[3] Wuhan Univ Technol, Hubei Key Lab Transportat Internet Things, Wuhan, Peoples R China
[4] RMIT Univ, Sch Econ Finance & Mkt, Melbourne, Vic, Australia
[5] Wuhan Qingchuan Univ, Sch Comp, Wuhan, Peoples R China
[6] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[7] Hubei Univ, Sch Comp Sci & Informat Engn, Wuhan, Peoples R China
[8] Hubei Univ, Hubei Key Lab Big Data Intelligent Anal & Applicat, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
bug numbers; class label; effort-aware; machine learning; software defect prediction; QUANTITATIVE-ANALYSIS; SOFTWARE; REGRESSION; MODELS; FAULTS; RIDGE;
DOI
10.1002/spe.3363
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Previous research has relied on public software defect datasets such as NASA, RELINK, and SOFTLAB, which contain only class label information, and most effort-aware defect prediction (EADP) studies are built around these datasets. However, EADP studies typically rank software modules by predicted bug number (i.e., treating modules as effort) or predicted bug density (i.e., treating lines of code as effort). To explore the impact of bug number information on constructing EADP models, we assess the performance degradation of the best-performing learning-to-rank methods when class labels are used instead of bug numbers for training. The experimental results show that building EADP models with class labels instead of bug numbers results in a decrease in the detected bugs when modules are considered as effort. When effort is measured in lines of code (LOC), using class labels to construct EADP models can lead to a significant increase in initial false alarms and a significant increase in the number of modules that need to be inspected. Therefore, we recommend that both class labels and bug number information be disclosed when publishing software defect datasets, so that more accurate EADP models can be constructed.
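The two effort notions mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the `Module` class, the function names, the toy data, and the budget rule (stop at the first module that would exceed the budget) are all assumptions made for illustration, and `predicted_bugs` stands in for the output of some trained model.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    loc: int               # lines of code
    predicted_bugs: float  # hypothetical model output (assumption)
    actual_bugs: int       # ground truth, used only for evaluation


def rank_by_bug_number(modules):
    """Module as effort: inspect modules with the most predicted bugs first."""
    return sorted(modules, key=lambda m: m.predicted_bugs, reverse=True)


def rank_by_bug_density(modules):
    """LOC as effort: inspect modules with the most predicted bugs per line first."""
    return sorted(modules, key=lambda m: m.predicted_bugs / m.loc, reverse=True)


def bugs_found_within_budget(ranked, budget, effort_of):
    """Sum actual bugs over the ranking until the next module would exceed the budget."""
    spent = 0
    found = 0
    for m in ranked:
        cost = effort_of(m)
        if spent + cost > budget:
            break  # simplification: stop at the first module that does not fit
        spent += cost
        found += m.actual_bugs
    return found


# Toy data (invented for illustration only).
modules = [
    Module("A", loc=1000, predicted_bugs=5.0, actual_bugs=4),
    Module("B", loc=100, predicted_bugs=2.0, actual_bugs=2),
    Module("C", loc=400, predicted_bugs=1.0, actual_bugs=0),
    Module("D", loc=50, predicted_bugs=0.5, actual_bugs=1),
]

# Module as effort: the budget is a number of modules to inspect.
top_by_number = rank_by_bug_number(modules)
found_modules = bugs_found_within_budget(top_by_number, budget=2, effort_of=lambda m: 1)

# LOC as effort: the budget is 20% of the total lines of code.
total_loc = sum(m.loc for m in modules)
top_by_density = rank_by_bug_density(modules)
found_loc = bugs_found_within_budget(
    top_by_density, budget=0.2 * total_loc, effort_of=lambda m: m.loc
)
```

With only class labels available for training, `predicted_bugs` would have to be replaced by something like a predicted defect probability, which carries no information about how many bugs a module holds; that is the degradation the study measures.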
Pages: 49-78
Page count: 30
Related papers
11 in total
  • [1] An Empirical Study of Learning to Rank Techniques for Effort-Aware Defect Prediction
    Yu, Xiao
    Bennin, Kwabena Ebo
    Liu, Jin
    Keung, Jacky Wai
    Yin, Xiaofei
    Xu, Zhou
    2019 IEEE 26TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER), 2019, : 298 - 309
  • [2] Leveraging developer information for efficient effort-aware bug prediction
    Qu, Yu
    Chi, Jianlei
    Yin, Heng
    INFORMATION AND SOFTWARE TECHNOLOGY, 2021, 137
  • [3] Testing and Code Review Based Effort-Aware Bug Prediction Model
    Muthukumaran, K.
    Murthy, N. L. Bhanu
    Reddy, G. Karthik
    Talishetti, Prateek
    SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING, 2016, 653 : 17 - 30
  • [4] Empirical Evaluation of Cross-Release Effort-Aware Defect Prediction Models
    Bennin, Kwabena Ebo
    Toda, Koji
    Kamei, Yasutaka
    Keung, Jacky
    Monden, Akito
    Ubayashi, Naoyasu
    2016 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2016), 2016, : 214 - 221
  • [5] An Empirical Study on Dependence Clusters for Effort-Aware Fault-Proneness Prediction
    Yang, Yibiao
    Harman, Mark
    Krinke, Jens
    Islam, Syed
    Binkley, David
    Zhou, Yuming
    Xu, Baowen
    2016 31ST IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), 2016, : 296 - 307
  • [6] Poster: Bridging Effort-Aware Prediction and Strong Classification - a Just-in-Time Software Defect Prediction Study
    Guo, Yuchen
    Shepperd, Martin
    Li, Ning
    PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING - COMPANION (ICSE-COMPANION), 2018, : 325 - 326
  • [7] Effort-Aware Just-in-Time Bug Prediction for Mobile Apps Via Cross-Triplet Deep Feature Embedding
    Xu, Zhou
    Zhao, Kunsong
    Zhang, Tao
    Fu, Chunlei
    Yan, Meng
    Xie, Zhiwen
    Zhang, Xiaohong
    Catolino, Gemma
    IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (01) : 204 - 220
  • [8] An empirical study of software entropy based bug prediction using machine learning
    Kaur A.
    Kaur K.
    Chopra D.
    International Journal of System Assurance Engineering and Management, 2017, 8 (Suppl 2) : 599 - 616
  • [9] Are Smell-Based Metrics Actually Useful in Effort-Aware Structural Change-Proneness Prediction? An Empirical Study
    Liu, Huihui
    Yu, Yijun
    Li, Bixin
    Yang, Yibiao
    Jia, Ru
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 315 - 324
  • [10] Are Slice-Based Cohesion Metrics Actually Useful in Effort-Aware Post-Release Fault-Proneness Prediction? An Empirical Study
    Yang, Yibiao
    Zhou, Yuming
    Lu, Hongmin
    Chen, Lin
    Chen, Zhenyu
    Xu, Baowen
    Leung, Hareton
    Zhang, Zhenyu
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2015, 41 (04) : 331 - 357