Comparative analysis of machine learning models for shortlisting SNPs to facilitate detection of marginal epistasis in GWAS

Cited by: 0
Authors
Dasmandal, Tanwy [1, 2]
Sinha, Dipro [4 ]
Rai, Anil [3 ]
Mishra, Dwijesh Chandra [4 ]
Archak, Sunil [5 ]
Affiliations
[1] ICAR Indian Agr Res Inst, Grad Sch, New Delhi, India
[2] ICAR Natl Bur Fish Genet Resources, Lucknow, Uttar Pradesh, India
[3] Indian Council Agr Res, New Delhi, India
[4] ICAR Indian Agr Stat Res Inst, New Delhi, India
[5] ICAR Natl Bur Plant Genet Resources, New Delhi 110012, India
Keywords
Marginal epistasis; Machine learning; GWAS; Feature selection; SNP-SNP interactions; GENOME-WIDE ASSOCIATION; GENETIC ARCHITECTURE; COMPLEX TRAITS; COMMON; SELECTION;
DOI
10.1007/s41060-024-00647-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Epistasis, an essential genetic element causing phenotypic diversity, is frequently characterized as the interaction between two or more genes. Previous models could identify marginal epistatic interactions by mapping variants that have nonzero marginal epistatic effects. However, these models fall short of identifying individual interaction partners. To reduce the computational burden of existing epistasis detection algorithms without compromising the detection of exact epistatic partners, the strengths of various machine learning algorithms were exploited as a filtering strategy. Seven machine learning strategies were compared for shortlisting marginally associated SNPs: AdaBoost, artificial neural network, random forest, stepwise regression, ridge regression, lasso and elastic net. Datasets were simulated for different combinations of heritability and minor allele frequency, and the performance of the algorithms was evaluated using power and precision measures. We found that the ridge regression model outperformed the other models in shortlisting marginal epistasis-related SNPs. Thus, it is expected that epistasis detection tools will benefit from adding a filtering stage using ridge regression for efficient detection of marginal epistasis in large genomic datasets.
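The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the simulation design, the closed-form ridge fit, the shortlist size `k`, the penalty `lam`, and the function names `simulate`, `ridge_shortlist` and `power_precision` are all assumptions made for illustration. Power is taken here as the fraction of truly interacting SNPs that reach the shortlist, and precision as the fraction of the shortlist that is truly interacting; the paper may define both differently.

```python
import numpy as np


def simulate(n=400, p=60, maf=0.3, h2=0.5, pair=(3, 17), seed=0):
    """Simulate 0/1/2 genotypes and a phenotype driven by one SNP-SNP interaction.

    Hypothetical design: a single interacting pair, no additive main effects.
    """
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, maf, size=(n, p)).astype(float)
    signal = X[:, pair[0]] * X[:, pair[1]]           # purely epistatic term
    signal = (signal - signal.mean()) / signal.std()  # unit genetic variance
    # Noise variance chosen so that var(signal) / var(phenotype) is about h2.
    noise = rng.normal(size=n) * np.sqrt((1.0 - h2) / h2)
    return X, signal + noise


def ridge_shortlist(X, y, k=10, lam=1.0):
    """Rank SNPs by absolute ridge coefficient and keep the top k (assumed rule)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no monomorphic SNPs
    yc = y - y.mean()
    p = Xs.shape[1]
    # Closed-form ridge solution: (X'X + lam * I)^-1 X'y
    beta = np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)
    return set(np.argsort(-np.abs(beta))[:k].tolist())


def power_precision(shortlist, causal):
    """Power = recovered causal SNPs / causal SNPs; precision = recovered / shortlist size."""
    hits = len(shortlist & causal)
    return hits / len(causal), hits / len(shortlist)


X, y = simulate()
shortlist = ridge_shortlist(X, y, k=10)
power, precision = power_precision(shortlist, {3, 17})
print(f"shortlist size={len(shortlist)}, power={power:.2f}, precision={precision:.2f}")
```

An exhaustive pairwise epistasis test would then run only on the shortlisted SNPs, so the number of pairs to test drops from p(p-1)/2 to k(k-1)/2. This works as a filter because an interaction between 0/1/2-coded SNPs generally leaves a marginal signal on each partner, which a linear model such as ridge regression can pick up.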
Pages: 10