Cross-Project Defect Prediction via Landmark Selection-Based Kernelized Discriminant Subspace Alignment

Cited by: 21

Authors:

Li, Zhiqiang [1]

Niu, Jingwen [2]

Jing, Xiao-Yuan [3,4]

Yu, Wangyang [1]

Qi, Chao [1]

Affiliations:

[1] Shaanxi Normal Univ, Sch Comp Sci, Xian 710119, Peoples R China

[2] Xinxiang Univ, Sch Comp & Informat Engn, Xinxiang 453003, Henan, Peoples R China

[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China

[4] Guangdong Univ Petrochem Technol, Sch Comp, Maoming 525000, Peoples R China

Source:

IEEE TRANSACTIONS ON RELIABILITY | 2021 / Vol. 70 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-project defect prediction (CPDP); discriminant subspace alignment; domain adaptation; kernel projection; landmark selection; source label propagation; ADAPTATION; CLASSIFICATION; MODEL; CODE;

DOI:

10.1109/TR.2021.3074660

Chinese Library Classification:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Cross-project defect prediction (CPDP) refers to identifying defect-prone software modules in one project (the target) using historical data collected from other projects (the source), which can help developers find bugs and prioritize their testing efforts. Recently, CPDP has attracted great research interest. However, the source and target data usually exhibit redundancy and nonlinearity. In addition, most CPDP methods do not exploit source label information to uncover the underlying knowledge for label propagation. These factors usually lead to unsatisfactory CPDP performance. To address these limitations, we propose a landmark selection-based kernelized discriminant subspace alignment (LSKDSA) approach for CPDP. LSKDSA not only reduces the discrepancy between the data distributions of the source and target projects, but also characterizes the complex data structures and increases the probability that the data are linearly separable. Moreover, LSKDSA encodes the label information of the source data into the domain adaptation learning process, which gives it good discriminant ability. Extensive experiments on 13 public projects from three benchmark datasets demonstrate that LSKDSA performs better than a range of competing CPDP methods. The improvement is 3.44% to 11.23% in g-measure, 5.75% to 11.76% in AUC, and 9.34% to 33.63% in MCC.
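The abstract reports gains in g-measure and MCC without defining them. As a rough illustration only, the sketch below computes both from confusion-matrix counts. It assumes the g-measure definition commonly used in defect prediction work (the harmonic mean of the probability of detection, pd, and one minus the probability of false alarm, pf); this record does not confirm which definition the paper uses. The function name `confusion_metrics` and the example counts are hypothetical.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (g_measure, mcc) from confusion-matrix counts.

    Assumption: g-measure is the harmonic mean of pd and (1 - pf),
    a definition common in defect prediction studies.
    """
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # probability of detection (recall)
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # probability of false alarm
    specificity = 1.0 - pf

    denom = pd + specificity
    g_measure = 2.0 * pd * specificity / denom if denom else 0.0

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return g_measure, mcc


# Hypothetical counts, not taken from the paper.
g, mcc = confusion_metrics(tp=40, fp=10, tn=40, fn=10)
print(f"g-measure={g:.2f}, MCC={mcc:.2f}")
```

Both metrics take the true negatives into account, which is one reason they are often preferred over plain accuracy on class-imbalanced defect data.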


Pages: 996-1013

Page count: 18

