Performance analysis of clustering techniques over microarray data: A case study

Cited by: 11
Authors
Dash, Rasmita [1]
Misra, Bijan Bihari [2]
Affiliations
[1] Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Khandagiri Sq, Bhubaneswar 751030, Odisha, India
[2] Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India
Keywords
Microarray data; Feature selection; Cluster analysis; Particle swarm optimization; Statistical test; PARTICLE SWARM OPTIMIZATION; GENE-EXPRESSION DATA; CLASSIFICATION; EXTENSIONS; ALGORITHM; SELECTION; CANCER;
DOI
10.1016/j.physa.2017.10.032
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
Handling big data is one of the major challenges in statistical data analysis, and cluster analysis plays a vital role in dealing with such large-scale data. Many clustering techniques exist, each taking a different approach, but it is difficult to predict which one suits a particular dataset. To address this problem, a grading approach is applied across several clustering techniques to identify a stable one. Because the grading depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted on five microarray datasets with seven validity indices. The grading approach's finding that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test. (C) 2017 Elsevier B.V. All rights reserved.
Pages: 162-176
Page count: 15
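The abstract describes ranking several clustering techniques across datasets and confirming the result with a Nemenyi post-hoc test. The sketch below illustrates that general idea only: it averages per-dataset ranks and computes the Nemenyi critical difference. It is not the authors' two-stage grading approach, whose details the abstract does not give. The score table is hypothetical (not taken from the paper), the index is assumed to be "higher is better", and `Q_ALPHA_5` is the commonly tabulated critical value for five techniques at a 0.05 significance level.

```python
import math

# Hypothetical validity-index scores (higher is better), NOT from the paper.
# Rows are datasets; columns follow the order of TECHNIQUES.
TECHNIQUES = ["HSC", "k-means", "PAM", "VQ", "AGNES"]
SCORES = [
    [0.62, 0.55, 0.58, 0.50, 0.47],
    [0.71, 0.60, 0.66, 0.59, 0.52],
    [0.58, 0.57, 0.49, 0.51, 0.45],
    [0.66, 0.61, 0.63, 0.54, 0.56],
    [0.69, 0.58, 0.64, 0.57, 0.50],
]

# Commonly tabulated Nemenyi critical value for k = 5, alpha = 0.05.
Q_ALPHA_5 = 2.728


def rank_row(row):
    """Rank one dataset's scores: 1 = best (highest); ties share the mean rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def average_ranks(scores):
    """Average each technique's rank over all datasets."""
    ranked = [rank_row(row) for row in scores]
    n = len(ranked)
    return [sum(r[j] for r in ranked) / n for j in range(len(ranked[0]))]


def nemenyi_cd(k, n, q_alpha):
    """Critical difference for k techniques compared over n datasets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


avg = average_ranks(SCORES)
cd = nemenyi_cd(len(TECHNIQUES), len(SCORES), Q_ALPHA_5)
for name, r in zip(TECHNIQUES, avg):
    print(f"{name}: average rank {r:.2f}")
print(f"critical difference: {cd:.3f}")
```

Two techniques are taken to differ significantly when their average ranks differ by more than the critical difference. In a real study the ranking would be repeated for each validity index, and a Friedman test would normally precede the post-hoc comparison.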