Research on Cross-Company Defect Prediction Method to Improve Software Security

Cited by: 1
Authors
Shao, Yanli [1]
Zhao, Jingru [1]
Wang, Xingqi [1]
Wu, Weiwei [1]
Fang, Jinglong [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Peoples R China
Funding
National Science Foundation (USA);
Keywords
DOI
10.1155/2021/5558561
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
As the scale and complexity of software increase, software security has become a major public concern. Software defect prediction (SDP) helps developers find and repair, ahead of time, potential defects that may endanger software security, and thereby improves software security and reliability. Cross-project defect prediction (CPDP) and cross-company defect prediction (CCDP) are widely studied as ways to improve prediction performance, but problems remain, such as inconsistent metrics and large differences in data distribution between source and target projects. This study therefore proposes a new CCDP method based on metric matching and sample weight setting. First, a clustering-based metric matching method is proposed. A multigranularity metric feature vector is extracted to unify the metric dimension while retaining as much of the information contained in the metrics as possible. Metric clustering is then used to eliminate metric redundancy, and representative metrics are extracted through principal component analysis (PCA) to support one-to-one metric matching. This strategy not only addresses metric inconsistency and redundancy but also transforms the cross-company heterogeneous defect prediction problem into a homogeneous one. Second, a sample weight setting method is proposed to transform the source data distribution. Frequency statistics of the source samples are used as an impact factor to increase the weight of source samples that are more similar to the target samples, which improves the similarity between the source and target data distributions and supports a more accurate prediction model. Finally, after these two processing steps, several classical machine learning methods are applied to build the prediction model, and 12 project datasets from NASA and PROMISE are used for performance comparison. Experimental results show that the proposed method achieves better prediction performance than other mainstream CCDP methods.
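The sample weight setting step can be illustrated with a small sketch. The abstract does not give the weighting formula, so the heuristic below is an assumed stand-in rather than the authors' method: each source sample is weighted by the number of its metrics that fall inside the value range observed in the target data. The function name `range_similarity_weights` and the toy data are hypothetical.

```python
from typing import List, Sequence


def range_similarity_weights(
    source: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> List[float]:
    """Weight each source sample by how many of its metrics fall inside the
    per-metric [min, max] range observed in the target data.

    Assumed stand-in heuristic, not the formula from the paper.
    """
    if not source or not target:
        raise ValueError("source and target must be non-empty")
    n_metrics = len(target[0])

    # Per-metric value range of the target project.
    lows = [min(row[j] for row in target) for j in range(n_metrics)]
    highs = [max(row[j] for row in target) for j in range(n_metrics)]

    weights = []
    for row in source:
        # Count the metrics of this source sample that lie in the target range.
        hits = sum(1 for j in range(n_metrics) if lows[j] <= row[j] <= highs[j])
        # Add 1 to numerator and denominator so every weight stays positive.
        weights.append((hits + 1) / (n_metrics + 1))
    return weights


# Toy data: three source samples and two target samples, three metrics each.
source = [[1.0, 10.0, 3.0], [50.0, 200.0, 3.5], [2.0, 12.0, 90.0]]
target = [[0.5, 8.0, 2.0], [3.0, 15.0, 4.0]]
weights = range_similarity_weights(source, target)
print(weights)
```

Weights of this kind could then be passed to any classifier that accepts per-sample weights during training, so that source samples resembling the target project contribute more to the fitted model.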
Pages: 19
Related papers
40 in total
[1]   Using Faults-Slip-Through Metric As A Predictor of Fault-Proneness [J].
Afzal, Wasif .
17TH ASIA PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2010), 2010, :414-422
[2]   Assessing the applicability of fault-proneness models across object-oriented software projects [J].
Briand, LC ;
Melo, WL ;
Wüst, J .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2002, 28 (07) :706-720
[3]   Collective transfer learning for defect prediction [J].
Chen, Jinyin ;
Hu, Keke ;
Yang, Yitao ;
Liu, Yi ;
Xuan, Qi .
NEUROCOMPUTING, 2020, 416 :103-116
[4]   Negative samples reduction in cross-company software defects prediction [J].
Chen, Lin ;
Fang, Bin ;
Shang, Zhaowei ;
Tang, Yuanyan .
INFORMATION AND SOFTWARE TECHNOLOGY, 2015, 62 :67-77
[5]  
Chowdhury I., THESIS CANADA QUEENS
[6]   Software defect prediction using relational association rule mining [J].
Czibula, Gabriela ;
Marian, Zsuzsanna ;
Czibula, Istvan Gergely .
INFORMATION SCIENCES, 2014, 264 :260-278
[7]   Evaluating defect prediction approaches: a benchmark and an extensive comparison [J].
D'Ambros, Marco ;
Lanza, Michele ;
Robbes, Romain .
EMPIRICAL SOFTWARE ENGINEERING, 2012, 17 (4-5) :531-577
[8]  
[何亮 He Liang], 2012, [模式识别与人工智能, Pattern Recognition and Artificial Intelligence], V25, P792
[9]   A Comparative Study to Benchmark Cross-Project Defect Prediction Approaches [J].
Herbold, Steffen ;
Trautsch, Alexander ;
Grabowski, Jens .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2018, 44 (09) :811-833
[10]  
Jiang Y., P 2009 ISSRE 09 20 I