Evolutionarily informed machine learning enhances the power of predictive gene-to-phenotype relationships

Times cited: 62
Authors
Cheng, Chia-Yi [1, 5]
Li, Ying [2, 3]
Varala, Kranthi [2, 3]
Bubert, Jessica [4]
Huang, Ji [1]
Kim, Grace J. [1]
Halim, Justin [1]
Arp, Jennifer [4]
Shih, Hung-Jui S. [1]
Levinson, Grace [1]
Park, Seo Hyun [1]
Cho, Ha Young [1]
Moose, Stephen P. [4]
Coruzzi, Gloria M. [1]
Affiliations
[1] NYU, Ctr Genom & Syst Biol, Dept Biol, New York, NY 10003 USA
[2] Purdue Univ, Dept Hort & Landscape Architecture, W Lafayette, IN 47907 USA
[3] Purdue Univ, Purdue Ctr Plant Biol, W Lafayette, IN 47907 USA
[4] Univ Illinois, Dept Crop Sci, Urbana, IL 61801 USA
[5] Natl Taiwan Univ, Dept Life Sci, Taipei, Taiwan
Funding
National Institute of Food and Agriculture (USA); National Science Foundation (USA)
Keywords
NITROGEN USE EFFICIENCY; TRANSCRIPTION FACTORS; GRAIN-YIELD; FACTOR-Y; ARABIDOPSIS; ROLES; MUTAGENESIS; DIVERSITY; PLATFORM; REVEALS;
DOI
10.1038/s41467-021-25893-w
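Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Discipline classification code
07; 0710; 09
Abstract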
Inferring phenotypic outcomes from genomic features is both a promise and a challenge for systems biology. Using gene expression data to predict phenotypic outcomes and functionally validating the genes with predictive power are the two challenges we address in this study. We applied an evolutionarily informed machine learning approach to predict phenotypes based on transcriptome responses shared both within and across species. Specifically, we exploited the phenotypic diversity in nitrogen use efficiency (NUE) and the evolutionarily conserved transcriptome responses to nitrogen treatments across Arabidopsis accessions and maize varieties. We demonstrate that using evolutionarily conserved nitrogen-responsive genes is a biologically principled approach to reducing feature dimensionality in machine learning, one that ultimately improved the predictive power of our gene-to-trait models. Further, we functionally validated seven candidate transcription factors with predictive power for NUE outcomes in Arabidopsis and one in maize. Moreover, application of our evolutionarily informed pipeline to other species, including rice and mouse models, underscores its potential to uncover genes affecting any physiological or clinical trait of interest across biology, agriculture, or medicine.
Predicting complex phenotypes from genomic information is still a challenge. Here, the authors use an evolutionarily informed machine learning approach within and across species to predict genes affecting nitrogen utilization in crops, and show that their approach is also useful in mammalian systems.
Pages: 15