Tabular Data Generation to Improve Classification of Liver Disease Diagnosis

Cited by: 6
Authors
Alauthman, Mohammad [1]
Aldweesh, Amjad [2]
Al-qerem, Ahmad [3]
Aburub, Faisal [4]
Al-Smadi, Yazan [3]
Abaker, Awad M. M. [5]
Alzubi, Omar Radhi [6]
Alzubi, Bilal [5]
Affiliations
[1] Univ Petra, Fac Informat Technol, Dept Informat Secur, Amman 11196, Jordan
[2] Shaqra Univ, Coll Comp & Informat Technol, Shaqra 11911, Saudi Arabia
[3] Zarqa Univ, Fac Informat Technol, Comp Sci Dept, Zarqa 13110, Jordan
[4] Univ Petra, Dept Business Intelligence & Data Analyt, Amman 11196, Jordan
[5] Umm Al Qura Univ, Coll Comp Al Qunfudah, Comp Sci Dept, Mecca 24382, Saudi Arabia
[6] Umm Al Qura Univ, Coll Comp Al Qunfudah, Comp Engn Dept, Mecca 24382, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, Issue 04
Keywords
liver diseases; GAN; data augmentation; machine learning; classifications; PREDICTION;
DOI
10.3390/app13042678
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Liver diseases are among the most common diseases worldwide. Because of their high incidence and high mortality rate, diagnosing these diseases is vital. Several factors harm the liver, for instance obesity, undiagnosed hepatitis infection, and alcohol abuse. This damage causes abnormal nerve function, bloody coughing or vomiting, insufficient kidney function, hepatic failure, jaundice, and liver encephalopathy. The diagnosis of this disease is very expensive and complex. Therefore, this work aims to assess how well various machine learning algorithms can decrease the cost of predictive diagnosis of chronic liver disease. In this study, five machine learning algorithms were employed: Logistic Regression, K-Nearest Neighbor, Decision Tree, Support Vector Machine, and Artificial Neural Network (ANN). We also examined how data augmentation with Generative Adversarial Networks (GANs) and the synthetic minority oversampling technique (SMOTE) affects prediction accuracy. GANs are a mechanism for producing artificial data whose distribution is close to that of the real data. This is achieved by training two different networks: the generator, which seeks to produce new, realistic samples, and the discriminator, which classifies the augmented samples using supervised classification. The statistics show that using augmented data slightly improves the performance of the classifier.
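The SMOTE step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote_oversample`, the parameter defaults, and the toy minority-class points are all assumptions made for illustration. The sketch shows only the core idea of SMOTE, which is to create each synthetic sample by interpolating between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric feature vectors from the minority class.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a minority sample to start from.
        base = rng.choice(minority)
        # Find its k nearest neighbours among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment
        # between the base sample and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class data (made-up values, not taken from the paper).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the augmented data stays inside the region the minority class already occupies. A GAN-based generator, by contrast, learns to sample from an approximation of the whole data distribution.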
Pages: 18