A deep auto-encoder model for gene expression prediction

Cited by: 70
Authors
Xie, Rui [1]
Wen, Jia [2]
Quitadamo, Andrew [2]
Cheng, Jianlin [1]
Shi, Xinghua [2]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO USA
[2] Univ N Carolina, Coll Comp & Informat, Dept Bioinformat & Genom, Univ City Blvd, Charlotte, NC 28223 USA
Source
BMC GENOMICS | 2017 / Vol. 18
Funding
National Science Foundation (USA);
Keywords
Predictive model; Stacked denoising auto-encoder; Multilayer perceptron; Deep learning; Gene expression; QUANTITATIVE TRAIT LOCI; RESIDUE CONTACTS; GENOME; TRANSCRIPTOME; NETWORKS; ARCHITECTURES; NUCLEOTIDE; SEQUENCE; MAP;
DOI
10.1186/s12864-017-4226-0
Chinese Library Classification
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. Results: To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. Conclusion: We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand the contribution of genotypes to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
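The training scheme summarized in the abstract (unsupervised denoising auto-encoder pre-training, followed by supervised backpropagation with dropout) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses a single hidden layer rather than a stack, and the synthetic 0/1 genotype data, layer size, corruption rate, learning rates, and the function names `pretrain_denoising_layer`, `fine_tune`, and `predict` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_denoising_layer(X, n_hidden, corruption=0.2, lr=0.1, epochs=50):
    """Train one denoising auto-encoder layer (tied weights, squared error)."""
    W = rng.normal(0.0, 0.1, size=(X.shape[1], n_hidden))
    b_h = np.zeros(n_hidden)      # encoder bias
    b_v = np.zeros(X.shape[1])    # decoder bias
    for _ in range(epochs):
        # Masking noise: randomly zero a fraction of the inputs.
        X_noisy = X * (rng.random(X.shape) > corruption)
        H = sigmoid(X_noisy @ W + b_h)
        X_rec = sigmoid(H @ W.T + b_v)
        # Backpropagate the error of reconstructing the CLEAN input.
        d_rec = (X_rec - X) * X_rec * (1.0 - X_rec)
        d_h = (d_rec @ W) * H * (1.0 - H)
        W -= lr * (X_noisy.T @ d_h + d_rec.T @ H) / len(X)
        b_h -= lr * d_h.mean(axis=0)
        b_v -= lr * d_rec.mean(axis=0)
    return W, b_h

def fine_tune(X, y, W, b, dropout=0.5, lr=0.02, epochs=300):
    """Supervised regression on top of the pre-trained layer, with dropout."""
    W, b = W.copy(), b.copy()
    v, c = np.zeros(W.shape[1]), 0.0   # linear output layer
    for _ in range(epochs):
        H = sigmoid(X @ W + b)
        # Inverted dropout: drop hidden units, rescale the survivors.
        keep = (rng.random(H.shape) > dropout) / (1.0 - dropout)
        H_drop = H * keep
        err = H_drop @ v + c - y
        d_h = np.outer(err, v) * keep * H * (1.0 - H)
        v -= lr * (H_drop.T @ err) / len(X)
        c -= lr * err.mean()
        W -= lr * (X.T @ d_h) / len(X)
        b -= lr * d_h.mean(axis=0)
    return W, b, v, c

def predict(X, W, b, v, c):
    # No dropout at prediction time.
    return sigmoid(X @ W + b) @ v + c

# Synthetic stand-in data (assumption): 200 samples, 30 binary genotypes,
# and one expression trait driven by the first three variants plus noise.
X = rng.integers(0, 2, size=(200, 30)).astype(float)
y = X[:, :3] @ np.array([1.0, -0.5, 0.8]) + rng.normal(0.0, 0.1, size=200)

W0, b0 = pretrain_denoising_layer(X, n_hidden=8)
params = fine_tune(X, y, W0, b0)
pred = predict(X, *params)
mse = float(np.mean((pred - y) ** 2))
print("training MSE:", mse)
```

A stacked version would repeat the pre-training step on the hidden activations of each trained layer before fine-tuning the whole network; the single layer here keeps the sketch short.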
Pages: 11