K-Means and K-Medoids: Cluster Analysis on Birth Data Collected in City Muzaffarabad, Kashmir

Cited by: 29
Authors
Abbas, Syed Ali [1]
Aslam, Adil [1]
Rehman, Aqeel Ur [2]
Abbasi, Wajid Arshad [1]
Arif, Saeed [3]
Kazmi, Syed Zaki Hassan [1]
Affiliations
[1] Univ Azad Jammu & Kashmir, Dept Comp Sci & Informat Technol, Muzaffarabad 13100, Pakistan
[2] Southwest Univ, Dept Elect & Informat Engn, Chongqing 400715, Peoples R China
[3] Saudi Elect Univ, Dept Comp Sci, Riyadh 11673, Saudi Arabia
Keywords
Pregnancy; Clustering algorithms; Urban areas; Measurement; Data mining; Diabetes; Healthcare; machine learning; cluster analysis; K-medoids; k-means; caesarean section; birth data; PLACENTAL GENE-EXPRESSION; ALGORITHM; WOMEN
DOI
10.1109/ACCESS.2020.3014021
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In medicine, every analysis is consequential because the study bears on the life of the subject under observation. One of the most vital areas in medicine is the healthcare of expectant women in low-income countries. A high mortality rate associated with the increased number of caesarean sections is evident, owing to poor medical infrastructure in the region, misunderstood religious teachings, low education, and a lack of proper decision making at the right time. Root cause analysis of the situations that demand a caesarean section is difficult; however, where historical data are available, one may extract useful information that helps support a medical decision by predicting the outcome. Regional disparities have a large impact on the residents of a region, so a study performed on one region may not be fully applicable to the residents of a distant region. This motivates a local study of data collected from expectant women in the city of Muzaffarabad, Kashmir. The findings are expected to be significant for women who share broadly similar physical, social and maternal traits. With this in mind, the study analyzes two clustering techniques to identify an algorithm that groups the data into relevant clusters robustly. First, we analyzed the capability of the K-means and K-medoids algorithms to cluster the data using different distance metrics. Second, data transformation techniques including scale, range and Yeo-Johnson were applied. Finally, the transformed data were used in the K-means and K-medoids algorithms to compute cluster accuracy. The results produced from transformed data were better than those from raw data. The Yeo-Johnson transformation was found to be best for K-means (Hartigan & Wong), K-medoids (SEV distance function) and Rank K-medoids (SEV distance function), with mean accuracies of 67.58%, 69.58% and 72.64%, respectively.
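The pipeline described in the abstract (transform the data, cluster it, then score the clusters against known outcomes) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy rows, the fixed initial indices, the plain Lloyd-style K-means (rather than Hartigan & Wong), the simple alternating K-medoids, the squared Euclidean distance (rather than SEV), and the definition of cluster accuracy as the best match over cluster-label permutations are all assumptions made for illustration. Only the range (min-max) transformation is shown; Yeo-Johnson is omitted.

```python
from itertools import permutations

def range_scale(rows):
    """Min-max scale each column to [0, 1] (the 'range' transformation)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]  # guard constant columns
    return [[(v - l) / s for v, l, s in zip(r, lo, span)] for r in rows]

def sq_euclidean(a, b):
    """Squared Euclidean distance (assumed metric, not the paper's SEV)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(rows, centers, dist):
    """Label each row with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(r, centers[j]))
            for r in rows]

def k_means(rows, init_idx, max_iter=100):
    """Lloyd-style K-means started from the rows at init_idx."""
    centers = [list(rows[i]) for i in init_idx]
    labels = assign(rows, centers, sq_euclidean)
    for _ in range(max_iter):
        for j in range(len(centers)):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
        new_labels = assign(rows, centers, sq_euclidean)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def k_medoids(rows, init_idx, dist=sq_euclidean, max_iter=100):
    """Alternating K-medoids: centers are always actual data rows."""
    medoids = list(init_idx)
    labels = assign(rows, [rows[m] for m in medoids], dist)
    for _ in range(max_iter):
        for j in range(len(medoids)):
            members = [i for i, l in enumerate(labels) if l == j]
            if members:
                # New medoid: the member with the least total distance to the rest.
                medoids[j] = min(
                    members,
                    key=lambda c: sum(dist(rows[c], rows[i]) for i in members))
        new_labels = assign(rows, [rows[m] for m in medoids], dist)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def cluster_accuracy(labels, truth, k):
    """Fraction of rows matched under the best cluster-to-class relabeling."""
    best = max(sum(perm[l] == t for l, t in zip(labels, truth))
               for perm in permutations(range(k)))
    return best / len(truth)

# Made-up toy data: two features on very different scales, two known classes.
rows = [[1.0, 100.0], [1.2, 300.0], [0.9, 200.0],
        [5.0, 150.0], [5.2, 250.0], [4.9, 120.0]]
truth = [0, 0, 0, 1, 1, 1]
scaled = range_scale(rows)

raw_acc = cluster_accuracy(k_means(rows, [0, 3]), truth, 2)
scaled_kmeans_acc = cluster_accuracy(k_means(scaled, [0, 3]), truth, 2)
scaled_kmedoids_acc = cluster_accuracy(k_medoids(scaled, [0, 3]), truth, 2)
print(raw_acc, scaled_kmeans_acc, scaled_kmedoids_acc)
```

On this toy input the second feature dominates the raw distances, so scaling changes which rows end up together; that is the kind of effect the abstract reports when comparing raw and transformed data, though the toy numbers say nothing about the paper's actual results.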
Pages: 151847-151855
Number of pages: 9