Tabular Data Generation to Improve Classification of Liver Disease Diagnosis

Cited by: 6

Authors:

Alauthman, Mohammad [1]

Aldweesh, Amjad [2]

Al-qerem, Ahmad [3]

Aburub, Faisal [4]

Al-Smadi, Yazan [3]

Abaker, Awad M. M. [5]

Alzubi, Omar Radhi [6]

Alzubi, Bilal [5]

Affiliations:

[1] Univ Petra, Fac Informat Technol, Dept Informat Secur, Amman 11196, Jordan

[2] Shaqra Univ, Coll Comp & Informat Technol, Shaqra 11911, Saudi Arabia

[3] Zarqa Univ, Fac Informat Technol, Comp Sci Dept, Zarqa 13110, Jordan

[4] Univ Petra, Dept Business Intelligence & Data Analyt, Amman 11196, Jordan

[5] Umm Al Qura Univ, Coll Comp Al Qunfudah, Comp Sci Dept, Mecca 24382, Saudi Arabia

[6] Umm Al Qura Univ, Coll Comp Al Qunfudah, Comp Engn Dept, Mecca 24382, Saudi Arabia

Source:

APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 04

Keywords:

liver diseases; GAN; data augmentation; machine learning; classifications; PREDICTION;

DOI:

10.3390/app13042678

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Liver diseases are among the most common diseases worldwide. Because of their high incidence and high mortality rate, diagnosing these diseases is vital. Several factors harm the liver, for instance obesity, undiagnosed hepatitis infection, and alcohol abuse. Such damage causes abnormal nerve function, bloody coughing or vomiting, insufficient kidney function, hepatic failure, jaundice, and hepatic encephalopathy. The diagnosis of this disease is very expensive and complex. Therefore, this work aims to assess the performance of various machine learning algorithms at decreasing the cost of predictive diagnoses of chronic liver disease. In this study, five machine learning algorithms were employed: Logistic Regression, K-Nearest Neighbor, Decision Tree, Support Vector Machine, and Artificial Neural Network (ANN). We also examined how augmenting the data with Generative Adversarial Networks (GANs) and the synthetic minority oversampling technique (SMOTE) affects prediction accuracy. GANs are a mechanism for producing artificial data whose distribution is close to the distribution of the real data. This is achieved by training two different networks: the generator, which seeks to produce new, realistic samples, and the discriminator, which classifies the augmented samples using supervised classification. The statistics show that using the augmented data slightly improves the performance of the classifier.
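
As a rough illustration of the SMOTE-style oversampling named in the abstract, the Python sketch below creates synthetic minority-class rows by interpolating between a row and one of its nearest minority-class neighbours. It is not the authors' implementation: the function name `smote_like_oversample`, the neighbour count `k`, and the toy feature values are all assumptions made for this example, and NumPy is used only for brevity.

```python
import numpy as np


def smote_like_oversample(X_minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two minority rows to interpolate")
    k = min(k, n - 1)  # a row cannot have more neighbours than other rows

    # Pairwise Euclidean distances between all minority rows.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)  # row each synthetic sample starts from
    pick = rng.integers(0, k, size=n_new)  # which of its k neighbours to move towards
    gap = rng.random((n_new, 1))           # interpolation factor in [0, 1)
    partner = neighbours[base, pick]
    return X[base] + gap * (X[partner] - X[base])


# Toy minority class with two made-up features (not real patient data).
X_min = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]])
X_syn = smote_like_oversample(X_min, n_new=6, k=2)
print(X_syn.shape)
```

Because every synthetic row lies on a line segment between two real minority rows, the new samples stay inside the range of the observed minority data. A GAN, by contrast, learns a generator network, which this sketch does not attempt to show.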

Pages: 18

References: 43 in total

