Sensitivity analysis based on the random forest machine learning algorithm identifies candidate genes for regulation of innate and adaptive immune response of chicken

Cited by: 9
Authors
Polewko-Klim, Aneta [1]
Lesinski, Wojciech [1]
Golinska, Agnieszka Kitlas [1]
Mnich, Krzysztof [2]
Siwek, Maria [3]
Rudnicki, Witold R. [1, 2, 4]
Affiliations
[1] Univ Bialystok, Inst Comp Sci, Bialystok, Poland
[2] Univ Bialystok, Computat Ctr, Bialystok, Poland
[3] Univ Technol & Life Sci, Anim Biotechnol & Genet Dept, Bydgoszcz, Poland
[4] Univ Warsaw, Interdisciplinary Ctr Math & Computat Modelling, Warsaw, Poland
Keywords
immune response; chicken; marker gene; machine learning; KINASE SIGNALING PATHWAYS; FEATURE-SELECTION; CLASSIFICATION; PROTEIN; POPULATIONS; HEMOCYANIN; SYSTEM; BORUTA; CELL; QTL
DOI
10.1016/j.psj.2020.08.059
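Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905
Abstract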
Two categories of immune response, innate and adaptive immunity, both have polygenic backgrounds and a significant environmental component. The goal of the reported study was to identify candidate genes and mutations for the immune traits of interest in chickens using machine learning-based sensitivity analysis of single-nucleotide polymorphisms (SNPs) located in candidate genes defined in quantitative trait loci regions. Here, adaptive immunity is represented by the specific antibody response toward keyhole limpet hemocyanin (KLH), whereas innate immunity is represented by natural antibodies toward lipopolysaccharide (LPS) and lipoteichoic acid (LTA). The analysis consisted of 3 basic steps: identification of candidate SNPs via feature selection, optimisation of the feature set using recursive feature elimination, and finally a gene-level sensitivity analysis for the final selection of models. The predictive model based on 5 genes (MAPK8IP3, CRLF3, UNC13D, ILR9, and PRCKB) explains 14.9% of variance for the KLH adaptive response. The models obtained for LTA and LPS use more genes and have lower predictive power, explaining 7.8% and 4.5% of total variance, respectively. In comparison, the linear models built on genes identified by a standard statistical analysis explain 1.5%, 0.5%, and 0.3% of variance for the KLH, LTA, and LPS responses, respectively. The present study shows that machine learning methods applied to systems with a complex interaction network can discover phenotype-genotype associations with much higher sensitivity than traditional statistical models. It adds to the evidence suggesting a role of MAPK8IP3 in the adaptive immune response and indicates that CRLF3 is involved in this process as well. Both findings need additional verification.
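The gene-level sensitivity step and the "percent of variance explained" figures in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the random forest is replaced by a plain group-mean lookup model so the example runs on the Python standard library alone, the gene and SNP names (`GENE_A`, `a1`, and so on) are hypothetical, the phenotype is a noise-free toy, and scoring is done in-sample for brevity, whereas a real analysis would score on held-out data. It only shows the idea: remove all SNPs of one gene, refit, and record how much the explained variance (R²) drops.

```python
from itertools import product
from statistics import fmean


def r_squared(y_true, y_pred):
    """Fraction of variance explained: 1 - SS_res / SS_tot."""
    mean_y = fmean(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


def group_mean_predict(rows, y, snps):
    """Stand-in model (NOT a random forest): predict the mean phenotype
    of all samples that share the same genotype on the given SNPs."""
    groups = {}
    for row, value in zip(rows, y):
        groups.setdefault(tuple(row[s] for s in snps), []).append(value)
    means = {key: fmean(values) for key, values in groups.items()}
    return [means[tuple(row[s] for s in snps)] for row in rows]


def gene_sensitivity(rows, y, gene_to_snps):
    """Return the full-model R^2 and, per gene, the drop in R^2 when
    every SNP of that gene is removed. In-sample scoring, for brevity."""
    all_snps = [s for snps in gene_to_snps.values() for s in snps]
    full = r_squared(y, group_mean_predict(rows, y, all_snps))
    drops = {}
    for gene, snps in gene_to_snps.items():
        kept = [s for s in all_snps if s not in snps]
        reduced = r_squared(y, group_mean_predict(rows, y, kept))
        drops[gene] = full - reduced
    return full, drops


# Hypothetical toy data: three genes with one SNP each, genotypes coded
# 0/1/2, every genotype combination present exactly once (27 samples).
gene_to_snps = {"GENE_A": ["a1"], "GENE_B": ["b1"], "GENE_C": ["c1"]}
snp_names = ["a1", "b1", "c1"]
rows = [dict(zip(snp_names, g)) for g in product(range(3), repeat=3)]

# Assumed phenotype: strong effect of GENE_A, weak effect of GENE_B,
# no effect of GENE_C, and no noise.
y = [3 * row["a1"] + row["b1"] for row in rows]

full_r2, drops = gene_sensitivity(rows, y, gene_to_snps)
for gene, drop in sorted(drops.items(), key=lambda item: -item[1]):
    print(f"{gene}: drop in R^2 = {drop:.3f}")
```

On this toy data the drop is about 0.9 for `GENE_A`, 0.1 for `GENE_B`, and 0 for `GENE_C`, so the ranking recovers the assumed effect sizes. With real genotypes the lookup model would be swapped for a cross-validated learner such as a random forest, and the drops would be much smaller and noisier.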
Pages: 6341-6354
Number of pages: 14