A deep convolutional neural network approach for predicting phenotypes from genotypes

Cited by: 155
Authors
Ma, Wenlong [1, 2]
Qiu, Zhixu [1, 3]
Song, Jie [1, 2]
Li, Jiajia [1, 3]
Cheng, Qian [1, 3]
Zhai, Jingjing [1, 2]
Ma, Chuang [1, 2]
Affiliations
[1] Northwest A&F Univ, Coll Life Sci, Ctr Bioinformat, State Key Lab Crop Stress Biol Arid Areas, Yangling 712100, Shaanxi, Peoples R China
[2] Northwest A&F Univ, Key Lab Biol & Genet Improvement Maize Arid Area, Minist Agr, Yangling 712100, Shaanxi, Peoples R China
[3] Northwest A&F Univ, Biomass Energy Ctr Arid & Semiarid Lands, Yangling 712100, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Ensemble learning; Genomic selection; High phenotypic values; Machine learning; Genotypic marker; MARKER-ASSISTED SELECTION; GENOMIC SELECTION; TRAITS; WHEAT; REGRESSION; BARLEY;
DOI
10.1007/s00425-018-2976-9
Chinese Library Classification number
Q94 [Botany];
Discipline classification code
071001;
Abstract
Main conclusion: Deep learning is a promising technology for accurately selecting individuals with high phenotypic values based on genotypic data. Genomic selection (GS) is a promising breeding strategy in which the phenotypes of plant individuals are usually predicted from genome-wide genotypic markers. In this study, we present a deep learning method, named DeepGS, to predict phenotypes from genotypes. Built on a deep convolutional neural network, DeepGS uses hidden variables that jointly represent features in genotypes when making predictions; it also employs convolution, sampling and dropout strategies to reduce the complexity of high-dimensional genotypic data. We used a large GS dataset to train DeepGS and compared its performance with that of other methods. The experimental results indicate that DeepGS can be used as a complement to the commonly used RR-BLUP in the prediction of phenotypes from genotypes. This complementarity can be exploited through an ensemble learning approach to select individuals with high phenotypic values more accurately, even in the absence of outlier individuals and with subsets of genotypic markers. The source code of DeepGS and the ensemble learning approach has been packaged into Docker images to facilitate their application in different GS programs.
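The ensemble idea in the abstract (combining two complementary predictors) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the DeepGS source: ridge regression with a fixed penalty `lam` stands in for RR-BLUP, the second predictor is left abstract (any vector of predictions, such as those of a neural network), and the mixing weight is chosen by a simple grid search on a validation set rather than by whatever procedure the authors used. The function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Ridge regression on centered markers (a stand-in for RR-BLUP, with an
    assumed fixed penalty). Returns (marker effects, intercept)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Dual form: solve an n x n system, cheap when markers far outnumber individuals.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(X.shape[0]), y - y_mean)
    beta = Xc.T @ alpha
    return beta, y_mean - x_mean @ beta

def ridge_predict(X, beta, intercept):
    """Predict phenotypes from a marker matrix."""
    return X @ beta + intercept

def pearson(a, b):
    """Pearson correlation between predicted and observed phenotypes."""
    return float(np.corrcoef(a, b)[0, 1])

def fit_ensemble_weight(pred_a, pred_b, y_val, grid=np.linspace(0, 1, 21)):
    """Pick the weight w in [0, 1] that maximizes validation correlation
    of w * pred_a + (1 - w) * pred_b (grid search is an assumption)."""
    scores = [pearson(w * pred_a + (1 - w) * pred_b, y_val) for w in grid]
    return float(grid[int(np.argmax(scores))])

def ensemble_predict(pred_a, pred_b, w):
    """Weighted average of the two predictors' outputs."""
    return w * pred_a + (1 - w) * pred_b
```

In use, `pred_a` and `pred_b` would be the validation-set predictions of the two models; the fitted weight is then applied to their test-set predictions with `ensemble_predict`. Because the grid includes 0 and 1, the ensemble is never worse than either single model on the validation set, although this does not guarantee a gain on new data.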
Pages: 1307-1318
Number of pages: 12