An effective software cross-project fault prediction model for quality improvement

Cited by: 6

Authors:

Khatri, Yogita [1]

Singh, Sandeep Kumar [1]

Affiliations:

[1] Jaypee Inst Informat Technol, Dept Comp Sci Engn & Informat Technol, Noida, India

Source:

SCIENCE OF COMPUTER PROGRAMMING | 2023 / Vol. 226

Keywords:

Software quality; Feature selection; Instance selection; Cross-project fault prediction; Effort-based performance measures; Non-effort-based performance measures; FEATURE-SELECTION METHOD; DEFECT PREDICTION; METRICS; CLASSIFICATION; OPTIMIZATION; VALIDATION;

DOI:

10.1016/j.scico.2022.102918

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

As a quality assurance activity, cross-project fault prediction (CPFP) involves building a model for predicting the faults in a specific software project (aka target project) facing the shortage of within-project training data, leveraging cross-project data. However, the quality of training data decides the success of a CPFP model. Existing CPFP approaches mainly focused on instance selection with minimal attention to feature selection. Further, the validation of their models has been performed through non-effort-based performance measures (NEPMs) only, considering the availability of unlimited inspection resources, which is impractical. Addressing these problems, we propose a Hybrid Training Data Selection (HTDS) approach combining feature selection along with instance selection for building a proficient and pragmatic CPFP model by validating its effectiveness in terms of effort-based performance measures (EPMs) along with the NEPMs to confirm its practical applicability. The empirical results on 62 datasets manifested the potency of the proposed approach over all the compared approaches in terms of NEPMs and EPMs collectively. Thus, our proposed HTDS approach results in a more productive and practical CPFP model that can empower practitioners to produce quality software at a lesser cost. (c) 2022 Elsevier B.V. All rights reserved.


Pages: 21
