Software Defect Prediction Based on Cost-Sensitive Dictionary Learning

Cited by: 8
Authors
Wan, Hongyan [1]
Wu, Guoqing [1]
Yu, Mali [2]
Yuan, Mengting [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
[2] Jiujiang Univ, Sch Informat Sci & Technol, Jiujiang 332005, Peoples R China
Keywords
Software defect prediction; dictionary learning; cost-sensitive; bilevel optimization; sparse coding; sparse representations; neural networks; quality
DOI
10.1142/S0218194019500384
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Software defect prediction is widely used to improve the quality of software systems. Most real software defect datasets contain far fewer defective modules than defect-free modules, and such highly class-imbalanced data make accurate prediction difficult: a model trained on them tends to classify a defective module as defect-free. Because different software modules are similar to one another, a module can be represented by its sparse representation coefficients over a predefined dictionary built from historical software defect data. In this study, we use dictionary learning to predict software defects. We optimize the classifier parameters and the dictionary atoms iteratively, so that the extracted features (the sparse representation) are optimal for the trained classifier. We prove the optimality condition of the elastic net, which is used to solve for the sparse coding coefficients, and the regularity of the elastic net solution. Because misclassifying a defective module generally incurs a much higher cost than misclassifying a defect-free one, we take the different misclassification costs into account: the dictionary learning procedure imposes a larger penalty on misclassified defective modules, which biases the classifier toward labeling a module as defective. We thus propose a cost-sensitive software defect prediction method using dictionary learning (CSDL). Experimental results on 10 class-imbalanced NASA datasets show that our method is more effective than several typical state-of-the-art defect prediction methods.
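The two ingredients named in the abstract, elastic-net sparse coding and unequal misclassification costs, can be illustrated with a minimal sketch. This is not the authors' CSDL algorithm: the record gives no formulas, so the objective 0.5*||x - D a||^2 + lam1*||a||_1 + 0.5*lam2*||a||^2, the coordinate-descent solver, the function names, and the generic minimum-expected-cost decision rule are all assumptions chosen for illustration. In the paper the costs enter the dictionary learning objective itself, which this sketch does not attempt.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for the L1 term)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_code(x, atoms, lam1, lam2, n_iters=200):
    """Sparse-code x over the dictionary atoms by coordinate descent.

    Assumed objective (hypothetical, not taken from the paper):
        0.5 * ||x - D a||^2 + lam1 * ||a||_1 + 0.5 * lam2 * ||a||^2
    atoms is a list of dictionary columns, each the same length as x.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(x)  # equals x - D a, with a = 0 initially
    for _ in range(n_iters):
        for j, atom in enumerate(atoms):
            # Correlation of atom j with the residual that excludes atom j.
            rho = sum(d * (r + d * coeffs[j]) for d, r in zip(atom, residual))
            norm_sq = sum(d * d for d in atom)
            new_coeff = soft_threshold(rho, lam1) / (norm_sq + lam2)
            delta = new_coeff - coeffs[j]
            if delta != 0.0:
                residual = [r - d * delta for d, r in zip(atom, residual)]
                coeffs[j] = new_coeff
    return coeffs


def predict_defective(p_defective, cost_fn, cost_fp):
    """Generic minimum-expected-cost rule (assumed, not the CSDL rule).

    cost_fn: cost of labeling a defective module as defect-free.
    cost_fp: cost of labeling a defect-free module as defective.
    """
    return p_defective * cost_fn >= (1.0 - p_defective) * cost_fp


# Toy dictionary with two orthonormal atoms (hypothetical data).
code = elastic_net_code([3.0, 0.5], [[1.0, 0.0], [0.0, 1.0]], lam1=1.0, lam2=1.0)
```

With a higher cost on missed defects, the same predicted probability can flip the decision: `predict_defective(0.3, cost_fn=5.0, cost_fp=1.0)` labels the module defective, while equal costs do not.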
Pages: 1219-1243
Page count: 25