Genome-wide association studies of ischemic stroke based on interpretable machine learning

Cited by: 0
Authors
Nikolić, Stefan [1]
Ignatov, Dmitry I. [1]
Khvorykh, Gennady V. [2]
Limborska, Svetlana A. [2]
Khrunin, Andrey V. [2]
Affiliations
[1] HSE University, Laboratory for Models and Methods of Computational Pragmatics, Department of Data Analysis and Artificial Intelligence, Moscow, Russia
[2] National Research Centre Kurchatov Institute, Moscow, Russia
Funding
Russian Science Foundation
Keywords
Genome-wide association studies; Interpretable machine learning; Ischemic stroke; Illuminating druggable genome; XGBoost; Interpretable neural network TabNet; SNP ranking; SNP importance; OXIDATIVE STRESS; DISEASE; RISK; GENE; PROTEINS; LOCI
DOI
10.7717/peerj-cs.2454
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic basis of this disease remains largely unexplored. In this research we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and the tabular deep learning model TabNet). To build a consensus among the results obtained by the different techniques, a Pareto-optimal solution was proposed and applied. The methods were applied to real genotypic data from affected and healthy individuals of European ancestry obtained from the Database of Genotypes and Phenotypes (5,581 individuals, 883,749 single nucleotide polymorphisms). In total, 131 genes were identified as candidates for association with the onset of IS. UBQLN1, TRPS1, and MUSK were previously described as associated with the course of IS in model animals. ACOT11, which takes part in fatty acid metabolism, was shown for the first time to be associated with IS. The identified genes were compared with genes from the Illuminating Druggable Genome project. The product of GPR26, a G protein-coupled receptor, can be considered a therapeutic target for stroke prevention. The approaches presented in this research can be used to reprocess GWAS datasets for other diseases.
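The abstract does not say how the Pareto-optimal consensus was constructed, so the following is only a minimal sketch of one common reading: each SNP receives one importance score per method, and the consensus set consists of the SNPs that no other SNP dominates on all scores at once. The SNP identifiers, the score values, the choice of four score columns, and the function names `dominates` and `pareto_front` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every score
    and strictly better on at least one (higher is assumed better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the ids of SNPs whose score vectors are not dominated
    by the score vector of any other SNP."""
    front = []
    for snp, vec in scores.items():
        dominated = any(
            dominates(other, vec)
            for name, other in scores.items()
            if name != snp
        )
        if not dominated:
            front.append(snp)
    return front


# Hypothetical per-SNP scores, one column per method (assumed ordering):
# (-log10 p-value, |logistic regression coefficient|,
#  gradient boosting importance, TabNet importance)
toy_scores = {
    "rs_a": (7.2, 0.40, 0.030, 0.020),
    "rs_b": (5.1, 0.55, 0.010, 0.025),
    "rs_c": (4.0, 0.30, 0.005, 0.010),  # worse than rs_a on every score
    "rs_d": (6.0, 0.35, 0.040, 0.015),
}

consensus = pareto_front(toy_scores)
print(consensus)
```

In this toy input, rs_c is dropped because rs_a beats it under every method, while the other three each lead on at least one score. The pairwise comparison is quadratic in the number of SNPs, so at the scale reported in the abstract (883,749 SNPs) some pre-filtering or a faster skyline algorithm would presumably be needed.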
Pages: 26