Ensemble Clustering Algorithm with Supervised Classification of Clinical Data for Early Diagnosis of Coronary Artery Disease

Cited by: 16
Authors
Kausar, Noreen [1]
Abdullah, Azween [2]
Samir, Brahim Belhaouari [3]
Palaniappan, Sellapan [1]
AlGhamdi, Bandar Saeed [4]
Dey, Nilanjan [5]
Affiliations
[1] Malaysia Univ Sci & Technol, Selangor 1205, Malaysia
[2] Taylors Univ, Selangor 47600, Malaysia
[3] Alfaisal Univ, Riyadh 11564, Saudi Arabia
[4] King Faisal Specialist Hosp & Res Ctr, Riyadh 11564, Saudi Arabia
[5] Bengal Coll Engn & Technol, Durgapur 713122, India
Keywords
Principal Component Analysis (PCA); Support Vector Machines (SVM); Coronary Artery Disease (CAD); Feature Selection; Dimension Reduction; Clustering; COMPONENT ANALYSIS; FEATURE-SELECTION
DOI
10.1166/jmihi.2016.1593
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Improving the detection accuracy of heart anomalies for clinical diagnosis is essential, yet it is complicated by irrelevant patient details and slow processing. This work aims to select relevant clinical features that improve classification performance in distinguishing abnormal from normal patients. To this end, the Principal Component Analysis (PCA) algorithm is applied to reduce the attribute dimension, incorporating class identifiers to extract a minimal set of attributes that accounts for the largest portion of the total variance. The approach combines supervised and unsupervised learning methods, namely Support Vector Machines (SVM) and K-means clustering, for classification by adjusting their associated parameters and measures. K-means clustering groups similar data patterns into clusters, each of which is classified individually; the overall accuracy is computed as the average of the accuracies achieved across all clusters. SVMs generalize well and can classify unseen test data with a model trained at the determined parameter values. Results on the University of California, Irvine (UCI) Cleveland Heart data set outperformed earlier data mining approaches owing to the method's processing time, classification optimized by tuning the associated parameters, and selection of relevant attributes. In future, this approach can be used for multi-class classification of different medical datasets.
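Two steps in the abstract lend themselves to a small illustration: keeping the fewest principal components that cover most of the total variance, and averaging the accuracies obtained in each cluster. The Python sketch below is not the authors' implementation. The function names, the 0.95 variance threshold, and the choice of an unweighted mean over clusters are all assumptions made for illustration; the abstract does not state these details.

```python
from collections import defaultdict


def components_for_variance(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose eigenvalues cover
    at least `threshold` of the total variance (threshold is assumed)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


def mean_cluster_accuracy(cluster_ids, y_true, y_pred):
    """Accuracy inside each cluster, then the unweighted mean over clusters.

    Assumption: every cluster counts equally, whatever its size.
    """
    hits = defaultdict(int)
    sizes = defaultdict(int)
    for cid, truth, pred in zip(cluster_ids, y_true, y_pred):
        sizes[cid] += 1
        hits[cid] += int(truth == pred)
    per_cluster = {cid: hits[cid] / sizes[cid] for cid in sizes}
    overall = sum(per_cluster.values()) / len(per_cluster)
    return overall, per_cluster
```

Note that an unweighted mean over clusters generally differs from the pooled accuracy over all samples when cluster sizes are unequal, so the averaging convention matters when comparing reported figures.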
Pages: 78-87
Page count: 10