Ensemble method based predictive model for analyzing disease datasets: a predictive analysis approach

Cited by: 10
Authors
Ramesh, Dharavath [1]
Katheria, Yogendra Singh [1]
Affiliations
[1] Indian Inst Technol ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
Keywords
Disease prediction; Ensemble methods; Machine learning; Chronic kidney disease; Classification; Algorithms; Diagnosis; Risk
DOI
10.1007/s12553-019-00299-3
Chinese Library Classification (CLC) number
R-058
Abstract
Medical datasets have attracted the research community as a basis for analysis and prediction that can help people take precautions against future diseases. Data mining techniques have been widely used to develop decision support systems that predict disease from such datasets. This work proposes a new predictive model for disease prediction that applies pre-processing techniques to various disease datasets. The proposed model not only analyses the datasets but also improves performance by using ensemble methods. To process the datasets, pre-processing techniques such as discretization, resampling, principal component analysis, and decision trees are used. To classify the datasets, classification techniques such as Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF) are used. The algorithms are applied with 10-fold cross-validation. A predictive analysis is performed on the following disease datasets: CKD (Chronic Kidney Disease), CVD (Cardiovascular Disease, or heart disease), Diabetes, Hepatitis, Cancer, and ILPD (Indian Liver Patient Dataset). Every dataset shows a significant improvement on various performance measures, and the experimental results show that the proposed predictive model achieves better accuracy.
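The evaluation scheme named in the abstract (an ensemble of classifiers scored with 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' model: the abstract gives no implementation details, so the data below is synthetic, the base learners are simple stand-ins (two nearest-neighbour variants and a nearest-centroid rule rather than SVM, NB, DT, or RF), and every function name is hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority label among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in dists[:k]).most_common(1)[0][0]

def centroid_predict(train_X, train_y, x):
    """Classify x by the closest class mean (nearest-centroid rule)."""
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)

def ensemble_predict(train_X, train_y, x):
    """Majority vote over three base classifiers (stand-ins, see lead-in)."""
    votes = [
        knn_predict(train_X, train_y, x, k=1),
        knn_predict(train_X, train_y, x, k=3),
        centroid_predict(train_X, train_y, x),
    ]
    return Counter(votes).most_common(1)[0][0]

def cross_validated_accuracy(X, y, predict, k=10, seed=0):
    """Hold out each fold once, fit on the rest, and pool the hits."""
    correct = 0
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct += sum(predict(train_X, train_y, X[i]) == y[i] for i in fold)
    return correct / len(X)

# Synthetic two-class data (assumption): two Gaussian blobs, 50 points each.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

accuracy = cross_validated_accuracy(X, y, ensemble_predict, k=10)
print(f"10-fold accuracy of the voting ensemble: {accuracy:.2f}")
```

Every sample is held out exactly once, so the pooled accuracy uses no prediction made on training data. In practice a library such as scikit-learn would likely replace the hand-written learners and fold logic.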
Pages: 533-545
Number of pages: 13