A Comprehensive Analysis of Artificial Intelligence Techniques for the Prediction and Prognosis of Genetic Disorders Using Various Gene Disorders

Cited by: 19
Authors
Chaplot, Neelam [1]
Pandey, Dhiraj [2]
Kumar, Yogesh [3]
Sisodia, Pushpendra Singh [4]
Affiliations
[1] Manipal Univ Jaipur, Dept Comp Sci & Engn, Jaipur, Rajasthan, India
[2] JSS Acad Tech Educ, Dept Comp Sci & Engn, Noida, UP, India
[3] Pandit Deendayal Energy Univ, Sch Technol, Dept CSE, Gandhinagar, Gujarat, India
[4] Indus Univ, Dept Comp Engn, IITE, Ahmadabad, Gujarat, India
Keywords
Adaptive boosting - Classification (of information) - Diagnosis - Diseases - Genes - Learning systems - Logistic regression - Mean square error - Random forests - Support vector regression
DOI
10.1007/s11831-023-09904-1
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Diagnosing rare genetic diseases has rapidly become the most expensive and time-consuming component of medical analysis for doctors. By combining predictive methods with growing knowledge of genetic disease, artificial intelligence (AI) has the potential to greatly simplify and accelerate genome interpretation. In this paper, multiple machine-learning models, namely support vector machine, Gaussian Naive Bayes, k-nearest neighbors (KNN), decision tree, gradient boosting, logistic regression, light gradient boosting (LGBM) classifier, random forest, extreme gradient boosting (XGB) classifier, and CatBoost, are applied to the genetic disorder and genetic disorder sub-class datasets. The dataset is first pre-processed to check for NaN values and is graphically represented across categories such as genetic disorder, genetic disorder sub-classes, five samples of symptoms, genes inherited from the mother's and father's sides, and birth defects, in order to study their patterns. The features are then selected using a standardization technique, after which the machine-learning models are applied and evaluated using accuracy, loss, recall, precision, root mean square error, and F1 score. A confusion matrix is also generated to compute the false negative, true positive, false positive, and true negative values for the classes drawn from both datasets. The highest accuracy, 99.9%, was achieved by decision tree, random forest, gradient boosting, LGBM classifier, XGB classifier, and CatBoost for genetic disorder, whereas only random forest, decision tree, LGBM classifier, and CatBoost achieved 99.9% accuracy for genetic disorder sub-classes.
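The evaluation step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the label lists are hypothetical, the function names `per_class_counts` and `evaluate` are invented for this illustration, and computing the root mean square error directly on integer-encoded class labels is an assumption, since the abstract does not say how that metric was applied to classifiers.

```python
import math


def per_class_counts(y_true, y_pred, label):
    """Return one-vs-rest (TP, FP, FN, TN) counts for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, RMSE, and per-class precision, recall, and F1."""
    n = len(y_true)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    # Assumption: RMSE is taken over the integer-encoded class labels.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp, fp, fn, tn = per_class_counts(y_true, y_pred, label)
        # Guard against division by zero when a class is never predicted
        # or never present.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1,
        }
    return accuracy, rmse, report


# Hypothetical integer-encoded labels for three disorder classes.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1]

accuracy, rmse, report = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f} rmse={rmse:.3f}")
for label, stats in report.items():
    print(label, stats)
```

Treating each class one-vs-rest gives the four confusion-matrix counts per class, from which precision, recall, and F1 follow directly. In practice a library such as scikit-learn would normally be used for these metrics.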
Pages: 3301-3323
Number of pages: 23