Identification of Predictor Genes for Feed Efficiency in Beef Cattle by Applying Machine Learning Methods to Multi-Tissue Transcriptome Data

Cited by: 20
Authors
Chen, Weihao [1, 2]
Alexandre, Pamela A. [2 ]
Ribeiro, Gabriela [3 ]
Fukumasu, Heidge [3 ]
Sun, Wei [1, 4, 5]
Reverter, Antonio [2 ]
Li, Yutao [2 ]
Affiliations
[1] Yangzhou Univ, Coll Anim Sci & Technol, Yangzhou, Jiangsu, Peoples R China
[2] CSIRO Agr & Food, St Lucia, Qld, Australia
[3] Univ Sao Paulo, Sch Anim Sci & Food Engn, Pirassununga, Brazil
[4] Yangzhou Univ, Inst Agr Sci & Technol Dev, Yangzhou, Jiangsu, Peoples R China
[5] Yangzhou Univ, Joint Int Res Lab Agr & Agri Prod Safety, Minist Educ, Yangzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
residual feed intake; Bos indicus; co-expression network; RNA-seq; Random Forest; Extreme Gradient Boosting; supporting vector machine; SINGLE NUCLEOTIDE POLYMORPHISMS; MOLECULAR-BASIS; CLASSIFICATION; GROWTH; PERFORMANCE; MODELS;
DOI
10.3389/fgene.2021.619857
Chinese Library Classification number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
Machine learning (ML) methods have shown promising results in identifying genes when applied to large transcriptome datasets. However, no attempt has been made to compare the performance of combining different ML methods in the prediction of high feed efficiency (HFE) and low feed efficiency (LFE) animals. In this study, using RNA sequencing data from five tissues (adrenal gland, hypothalamus, liver, skeletal muscle, and pituitary) of nine HFE and nine LFE Nellore bulls, we evaluated the prediction accuracies of five analytical methods in classifying FE animals. These included two conventional methods for differential gene expression (DGE) analysis (t-test and edgeR) as benchmarks, and three ML methods: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a combination of RF and XGBoost (RX). The utility of the subset of candidate genes selected by each method for classifying FE animals was assessed with a support vector machine (SVM). Among all methods, the smallest subsets of genes (117) identified by RX outperformed those chosen by t-test, edgeR, RF, or XGBoost in classification accuracy. Gene co-expression network analysis confirmed the interactivity among these genes and showed that their relevance within the network was related to their ML-based prediction ranking. The results demonstrate great potential for applying a combination of ML methods to large transcriptome datasets to identify biologically important genes for accurately classifying FE animals.
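The workflow summarized in the abstract (rank genes with two tree ensembles, combine the rankings, then assess the selected genes with an SVM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, the combination rule (intersection of the two top-`k` lists) is hypothetical because the abstract does not say how RF and XGBoost were combined, and leave-one-out cross-validation is only one plausible way to score 18 animals.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix: 18 animals x 200 genes.
# (Assumption: real data would be normalized RNA-seq counts for one tissue.)
n_per_group, n_genes, n_informative = 9, 200, 10
X = rng.normal(size=(2 * n_per_group, n_genes))
y = np.array([0] * n_per_group + [1] * n_per_group)  # 0 = LFE, 1 = HFE
X[y == 1, :n_informative] += 2.0  # make the first 10 genes informative


def top_genes(model, X, y, k):
    """Fit a tree ensemble and return the indices of its k most important genes."""
    model.fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    return set(order[:k].tolist())


k = 30  # hypothetical cut-off, not taken from the paper
rf_genes = top_genes(RandomForestClassifier(n_estimators=500, random_state=0), X, y, k)
# GradientBoostingClassifier is used here as a stand-in for XGBoost.
gb_genes = top_genes(GradientBoostingClassifier(random_state=0), X, y, k)

# Assumed combination rule: keep genes ranked highly by both models,
# falling back to the union if the two lists do not overlap.
combined = sorted(rf_genes & gb_genes) or sorted(rf_genes | gb_genes)

# Assess the selected genes with a linear SVM under leave-one-out CV.
scores = cross_val_score(SVC(kernel="linear"), X[:, combined], y, cv=LeaveOneOut())
accuracy = scores.mean()

print(f"genes selected by both models: {len(combined)}")
print(f"leave-one-out SVM accuracy: {accuracy:.2f}")
```

Because the genes in this sketch are selected on all 18 samples before cross-validation, the reported accuracy is optimistically biased; an unbiased estimate would repeat the selection step inside each training fold.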
Pages: 12