Performance Analysis of Machine Learning Algorithms for Thyroid Disease

Authors
Hafiz Abbad Ur Rehman
Chyi-Yeu Lin
Zohaib Mushtaq
Shun-Feng Su
Affiliations
[1] National Taiwan University of Science and Technology, Department of Mechanical Engineering
[2] National Taiwan University of Science and Technology, Department of Electrical Engineering
Source
Arabian Journal for Science and Engineering | 2021 / Vol. 46
Keywords
Classification; Thyroid disease; KNN; SVM; DT; NB; LR; Feature selection
Abstract
Thyroid disease arises from an anomalous growth of thyroid tissue at the edge of the thyroid gland. Thyroid disorders normally occur when the gland releases abnormal amounts of hormones; hypothyroidism (an underactive thyroid gland) and hyperthyroidism (an overactive thyroid gland) are the two main types. This study proposes machine learning classifiers, assessed by accuracy and other performance evaluation metrics, to detect and diagnose thyroid disease. It presents an extensive analysis of five classifiers, K-nearest neighbor (KNN), Naïve Bayes, support vector machine, decision tree and logistic regression, each implemented with and without feature selection. The thyroid data were taken from DHQ Teaching Hospital, Dera Ghazi Khan, Pakistan. The dataset differs from those used in existing studies because it includes three additional features: pulse rate, body mass index and blood pressure. The experiment consisted of three iterations: the first used no feature selection, while the second and third used L1-based and L2-based feature selection, respectively. The evaluation covered several measures, including accuracy, precision and the receiver operating characteristic (ROC) curve with area under the curve (AUC). The results indicate that classifiers using L1-based feature selection achieved higher overall accuracy (Naïve Bayes 100%, logistic regression 100% and KNN 97.84%) than those using no feature selection or L2-based feature selection.
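The three-iteration design described in the abstract (no feature selection, L1-based selection, L2-based selection) can be illustrated with a minimal sketch. It assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for the hospital dataset, and uses a `LinearSVC` inside `SelectFromModel` as the selector. The selector, the hyperparameters and the helper name `build_pipeline` are illustrative assumptions, not the authors' reported configuration, and the accuracies it prints are unrelated to the figures in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# ASSUMPTION: synthetic data stands in for the thyroid dataset.
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Factories return fresh, unfitted classifiers (illustrative settings only).
classifiers = {
    "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": lambda: GaussianNB(),
    "Logistic regression": lambda: LogisticRegression(max_iter=1000),
}


def build_pipeline(make_clf, penalty):
    """Scale features, optionally select them with a penalized linear SVM, then classify."""
    steps = [StandardScaler()]
    if penalty is not None:
        # ASSUMPTION: a linear SVM with the given penalty drives feature selection.
        svc = LinearSVC(penalty=penalty, C=0.1, dual=False,
                        max_iter=5000, random_state=0)
        steps.append(SelectFromModel(svc))
    steps.append(make_clf())
    return make_pipeline(*steps)


results = {}
n_selected_l1 = None
for setting, penalty in [("none", None), ("L1", "l1"), ("L2", "l2")]:
    for name, make_clf in classifiers.items():
        model = build_pipeline(make_clf, penalty).fit(X_train, y_train)
        results[(name, setting)] = accuracy_score(y_test, model.predict(X_test))
        if penalty == "l1":
            # Count how many features the L1-penalized selector kept.
            support = model.named_steps["selectfrommodel"].get_support()
            n_selected_l1 = int(support.sum())

for (name, setting), acc in sorted(results.items()):
    print(f"{name:20s} selection={setting:4s} accuracy={acc:.3f}")
print("Features kept by L1 selection:", n_selected_l1, "of", X.shape[1])
```

The L1 penalty drives some coefficients to exactly zero, so `SelectFromModel` discards the corresponding features. With an L2 penalty the coefficients only shrink, and selection falls back to a threshold on their mean magnitude.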
Pages: 9437-9449
Page count: 12