Functional Neural Networks for High-Dimensional Genetic Data Analysis

Cited by: 2
Authors
Zhang, Shan [1]
Zhou, Yuan [2]
Geng, Pei [3]
Lu, Qing [2]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[2] Univ Florida, Dept Biostat, Gainesville, FL 32603 USA
[3] Univ New Hampshire, Dept Math & Stat, Durham, NH 03824 USA
Keywords
Genetics; Diseases; Data analysis; Data models; Vectors; Artificial neural networks; Time measurement; Functional data analysis; neural networks; genetic data analysis; ASSOCIATION; PHENOTYPES
DOI
10.1109/TCBB.2024.3364614
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Artificial intelligence (AI) is a thriving research field with many successful applications in areas such as computer vision and speech recognition. Machine learning methods, such as artificial neural networks (ANN), play a central role in modern AI technology. While ANN also holds great promise for human genetic research, high-dimensional genetic data and complex genetic structure pose substantial challenges. The vast majority of genetic variants in the genome have small or no effects on diseases, and fitting an ANN to a large number of variants without considering the underlying genetic structure (e.g., linkage disequilibrium) can lead to serious overfitting. Furthermore, while a classic genetic study often examines a single disease phenotype, researchers in emerging fields (e.g., imaging genetics) need to deal with different types of disease phenotypes. To address these challenges, we propose a functional neural network (FNN) method. FNN uses a series of basis functions to model high-dimensional genetic data and a variety of phenotype data, and further builds a multi-layer functional neural network to capture the complex relationships between genetic variants and disease phenotypes. Through simulations, we demonstrate the advantages of FNN for high-dimensional genetic data analysis in terms of robustness and accuracy. The real data applications also show that FNN attains higher accuracy than the existing methods.
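To make the basis-function idea in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes a Fourier basis over genomic position, a least-squares projection of each sample's genotype vector onto that basis, and an ordinary two-layer network applied to the resulting coefficients. The abstract does not specify the basis family, the fitting procedure, or the layer structure, so all function names, sizes, and the simulated data below are hypothetical.

```python
import numpy as np


def fourier_basis(positions, n_basis):
    """Evaluate an (assumed) Fourier basis at positions rescaled to [0, 1]."""
    t = np.asarray(positions, dtype=float)
    t = (t - t.min()) / (t.max() - t.min())
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape: (n_variants, n_basis)


def project_genotypes(genotypes, positions, n_basis=5):
    """Summarise each sample's genotype vector by its basis coefficients."""
    basis = fourier_basis(positions, n_basis)
    # Least squares: genotypes.T (variants x samples) ~ basis @ coef
    coef, *_ = np.linalg.lstsq(basis, genotypes.T, rcond=None)
    return coef.T  # shape: (n_samples, n_basis)


def forward(coef, w1, b1, w2, b2):
    """Plain two-layer network on the coefficients (a simplification)."""
    hidden = np.tanh(coef @ w1 + b1)
    return hidden @ w2 + b2


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n_samples, n_variants, n_basis, n_hidden = 8, 200, 5, 4
positions = np.sort(rng.uniform(0, 1e6, size=n_variants))
genotypes = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)

coef = project_genotypes(genotypes, positions, n_basis)
w1, b1 = rng.normal(size=(n_basis, n_hidden)), np.zeros(n_hidden)
w2, b2 = rng.normal(size=(n_hidden, 1)), np.zeros(1)
pred = forward(coef, w1, b1, w2, b2)
print(coef.shape, pred.shape)
```

The point of the projection step is dimension reduction: 200 variants per sample are replaced by 5 coefficients, so the network has far fewer input weights to fit, which is one plausible way a basis expansion can limit overfitting when most variants carry little signal.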
Pages: 383-393
Number of pages: 11