Machine learning framework with feature selection approaches for thyroid disease classification and associated risk factors identification

Cited by: 9
Authors
Azrin Sultana
Rakibul Islam
Affiliations
[1] American International University-Bangladesh, Department of Computer Science
Keywords
Thyroid disease prediction; Random forest; Healthcare; Machine learning; Feature selection
DOI
10.1186/s43067-023-00101-5
Abstract
Thyroid disease (TD) develops when the thyroid does not produce an adequate quantity of thyroid hormones, or when a lump or nodule emerges due to aberrant growth of the thyroid gland. Early detection is therefore important for preventing or minimizing the impact of this disease. In this study, different machine learning (ML) algorithms were combined with a scaling method, an oversampling technique, and various feature selection approaches to build an efficient framework for classifying TD. Significant risk factors of TD were also identified by the proposed system. The dataset was collected from the University of California Irvine (UCI) repository. In the preprocessing stage, the Synthetic Minority Oversampling Technique (SMOTE) was used to resolve the class imbalance problem, and a robust scaling technique was used to scale the dataset. The Boruta, Recursive Feature Elimination (RFE), and Least Absolute Shrinkage and Selection Operator (LASSO) approaches were used to select appropriate features. To train the model, we employed six ML classifiers: Support Vector Machine (SVM), AdaBoost (AB), Decision Tree (DT), Gradient Boosting (GB), K-Nearest Neighbors (KNN), and Random Forest (RF). The models were evaluated using 5-fold cross-validation (CV). Different performance metrics were observed to compare the effectiveness of the algorithms. The system achieved the most accurate results using the RF classifier, with 99% accuracy. The proposed system will be beneficial for physicians and patients in classifying TD and in learning about its associated risk factors.
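The abstract names two preprocessing steps, robust scaling and SMOTE, without giving details. The sketch below illustrates the general idea of each using only the Python standard library. It is not the authors' implementation: the function names `robust_scale` and `smote_like`, the neighbour count `k`, and the fixed `seed` are assumptions made for illustration, and the toy data is invented.

```python
import random
import statistics


def robust_scale(column):
    """Scale one feature column as (x - median) / IQR.

    Median and interquartile range are less sensitive to outliers
    than mean and standard deviation.
    """
    q1, med, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # assumption: leave zero-spread columns unscaled
    return [(x - med) / iqr for x in column]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by interpolation.

    Each synthetic row lies on the line segment between a randomly
    chosen minority row and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy data (invented): one column with an outlier, three minority rows.
scaled = robust_scale([1, 2, 3, 4, 100])
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 48.5]

minority_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_rows = smote_like(minority_rows, n_new=5)
print(len(new_rows))  # 5
```

In practice, library implementations of these steps would normally be preferred, and oversampling is usually applied only to the training folds so that synthetic rows do not leak into validation data. The abstract does not say how the authors ordered these steps.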