A Machine Learning-Based Framework for Accurate and Early Diagnosis of Liver Diseases: A Comprehensive Study on Feature Selection, Data Imbalance, and Algorithmic Performance

Cited by: 1
Authors
Rehman, Attique Ur [1, 2, 3]
Butt, Wasi Haider [1]
Ali, Tahir Muhammad [2]
Javaid, Sabeen [3]
Almufareh, Maram Fahaad [4]
Humayun, Mamoona [5]
Rahman, Hameedur [6]
Mir, Azka [3]
Shaheen, Momina [5]
Affiliations
[1] Natl Univ Sci & Technol, Coll Elect & Mech Engn, Dept Comp & Software Engn, Islamabad, Pakistan
[2] Gulf Univ Sci & Technol, Dept Comp Sci, Mubarak Al Abdullah, Kuwait
[3] Univ Sialkot, Dept Software Engn, Sialkot, Pakistan
[4] Jouf Univ, Coll Comp & Informat Sci, Dept Informat Syst, Al Jouf, Saudi Arabia
[5] Univ Roehampton, Sch Arts Humanities & Social Sci, London SW15 5PJ, England
[6] Air Univ, Fac Comp & AI, Dept Comp Games Dev, PAF Complex E9, Islamabad, Pakistan
Keywords
CLASSIFICATION; PREDICTION; HEPATITIS
DOI
10.1155/2024/6111312
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The liver is the largest internal organ of the human body and performs more than 500 vital functions. In recent decades, a large number of patients have been reported with liver diseases such as cirrhosis, fibrosis, or other liver disorders. Individuals suffering from such diseases need to be identified effectively, early, and accurately so that they may recover before the disease progresses and becomes fatal. Applications of machine learning are playing a significant role in this task. Despite these advances, existing systems remain inconsistent in performance because of limited feature selection and data imbalance. In this article, we reviewed 58 articles, published from January 2015 to 2023, that were extracted from 5 electronic repositories. Following a systematic, protocol-based review, we answered 6 research questions about machine learning algorithms for diagnosing liver disease. The review identified effective feature selection techniques, data imbalance management techniques, accurate machine learning algorithms, available datasets with their URLs and characteristics, and feature importance based on usage. These research questions were selected because dimensionality reduction, data imbalance management, the machine learning algorithm and its accuracy, and the data itself are all significant in any machine learning framework. Based on the review, a framework, machine learning-based liver disease diagnosis (MaLLiDD), was proposed and validated using three datasets, on which it classified liver disorders with 99.56%, 76.56%, and 76.11% accuracy. In conclusion, this article addresses the six research questions and proposes a framework for early diagnosis that reached up to 99.56% accuracy.
Pages: 29