Does Principal Component Analysis Improve Cluster-Based Analysis?

Cited by: 10
Authors
Farjo, Joan [1]
Abou Assi, Rawad [1]
Masri, Wes [1]
Zaraket, Fadi [1]
Affiliation
[1] Amer Univ Beirut, Dept Elect & Comp Engn, Beirut, Lebanon
Source
IEEE SIXTH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW 2013) | 2013
Keywords
PCA (Principal Component Analysis); cluster analysis; dimensionality reduction; test suite minimization; coincidental correctness
DOI
10.1109/ICSTW.2013.52
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Researchers in the dynamic program analysis field have extensively used cluster analysis to address various problems. Typically, the clustering techniques are applied to execution profiles having high dimensionality (i.e., involving a large number of profiling elements), sometimes on the order of thousands or even hundreds of thousands. Our concern is that the high number of profiling elements might diminish the effectiveness of the clustering process, which led us to explore the use of dimensionality reduction techniques as a preprocessing step to clustering. Specifically, in this work, we used PCA (Principal Component Analysis) as a dimensionality reduction technique and investigated its impact on two cluster-based analysis techniques, one aiming at identifying coincidentally correct tests, and the other at test suite minimization. In other words, we tried to assess whether PCA improves cluster-based analysis. Our experimental results showed that the impact was positive on the first technique, but inconclusive on the second, which calls for further investigation in the future.
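The preprocessing idea described in the abstract (reduce the execution profiles with PCA, then cluster) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the synthetic binary coverage matrix, the 95% variance threshold, the names `pca_reduce` and `kmeans`, and the use of k-means with a farthest-point initialization are all choices made here for the example; the abstract does not say which clustering algorithm or component-selection rule the paper used.

```python
import numpy as np


def pca_reduce(profiles, variance_kept=0.95):
    """Project rows of `profiles` onto the fewest principal components
    whose cumulative explained variance reaches `variance_kept`.
    (The threshold is an assumption made for this sketch.)"""
    centered = profiles - profiles.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values ** 2
    total = explained.sum()
    if total == 0:
        # All profiles are identical; a single zero column carries everything.
        return centered[:, :1]
    ratio = np.cumsum(explained) / total
    k = min(int(np.searchsorted(ratio, variance_kept)) + 1, len(singular_values))
    return centered @ vt[:k].T


def kmeans(points, n_clusters, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[nearest.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = points[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels


# Hypothetical data: 40 test executions over 200 profiling elements.
# Tests 0-19 mostly cover elements 0-99, tests 20-39 mostly cover 100-199.
rng = np.random.default_rng(1)
group_a = np.r_[np.ones(100), np.zeros(100)]
group_b = np.r_[np.zeros(100), np.ones(100)]
profiles = np.vstack([np.tile(group_a, (20, 1)), np.tile(group_b, (20, 1))])
flip = rng.random(profiles.shape) < 0.05  # flip about 5% of entries as noise
profiles = np.where(flip, 1.0 - profiles, profiles)

reduced = pca_reduce(profiles, variance_kept=0.95)
labels = kmeans(reduced, n_clusters=2)
print("original dimensions:", profiles.shape[1])
print("reduced dimensions:", reduced.shape[1])
print("cluster labels:", labels)
```

Clustering the `reduced` matrix instead of `profiles` is the step whose effect the paper evaluates; on real execution profiles the number of retained components and the resulting clusters would depend on the data.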
Pages: 400-403
Page count: 4