learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

Cited by: 6
Authors
Westhues, Cathy C. [1 ,2 ]
Simianer, Henner [2 ,3 ]
Beissinger, Timothy M. [1 ,2 ]
Affiliations
[1] Univ Goettingen, Dept Crop Sci, Div Plant Breeding Methodol, Carl Sprengel Weg 1, D-37075 Gottingen, Germany
[2] Univ Goettingen, Ctr Integrated Breeding Res, Carl Sprengel Weg 1, Gottingen, Germany
[3] Univ Goettingen, Dept Anim Sci, Anim Breeding & Genet Grp, Albrecht Thaer Weg 3, D-37075 Gottingen, Germany
Source
G3-GENES GENOMES GENETICS | 2022 / Volume 12 / Issue 11
Keywords
multienvironment trials; machine learning; genotype x environment interaction; genomic prediction; R software; SELECTION; REGRESSION; PLANT
DOI
10.1093/g3journal/jkac226
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
We introduce the R package learnMET, a flexible framework for analyzing multi-environment trial breeding data with machine learning-based models. learnMET allows genomic information to be combined with environmental data such as climate and/or soil characteristics. Notably, the package can incorporate weather data from field weather stations or retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Several machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data. The package is published under an MIT license and is accessible on GitHub.
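The naive aggregation mentioned in the abstract (nonoverlapping 10-day windows) can be illustrated with a minimal sketch. This is not learnMET code and does not use its API: it is plain Python, the function name `aggregate_windows` is hypothetical, and averaging each window and dropping a trailing partial window are assumptions made only for this example.

```python
from statistics import mean


def aggregate_windows(daily_values, window=10):
    """Average a daily series over consecutive, nonoverlapping windows.

    Hypothetical helper for illustration only (not part of learnMET).
    Assumption: days left over after the last full window are dropped.
    """
    n_full = len(daily_values) // window  # number of complete windows
    return [
        mean(daily_values[i * window:(i + 1) * window])
        for i in range(n_full)
    ]


# 25 days of made-up daily values: two full 10-day windows, 5 days dropped.
example_series = list(range(25))
window_means = aggregate_windows(example_series, window=10)
```

Each resulting window mean could then serve as one environmental covariate for a given trial environment, alongside the genomic predictors.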
Pages: 13