Predictive Modeling of Microbiome Data Using a Phylogeny-Regularized Generalized Linear Mixed Model

Cited by: 33
Authors
Xiao, Jian [1, 2, 3]
Chen, Li [4]
Johnson, Stephen [1, 2]
Yu, Yue [1, 2]
Zhang, Xianyang [5]
Chen, Jun [1, 2]
Affiliations
[1] Mayo Clin, Div Biomed Stat & Informat, Rochester, MN 55905 USA
[2] Mayo Clin, Ctr Individualized Med, Rochester, MN 55905 USA
[3] Zhongnan Univ Econ & Law, Sch Stat & Math, Wuhan, Hubei, Peoples R China
[4] Auburn Univ, Harrison Sch Pharm, Dept Hlth Outcomes Res & Policy, Auburn, AL 36849 USA
[5] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Source
FRONTIERS IN MICROBIOLOGY | 2018 / Volume 9
Funding
National Natural Science Foundation of China;
Keywords
microbiome; phylogenetic tree; kernel method; generalized mixed model; predictive model; HUMAN GUT MICROBIOME; VARIABLE SELECTION; REGRESSION; CLASSIFICATION; INDIVIDUALS; ASSOCIATION; INFERENCE; UNIFRAC; HEALTH; MATRIX;
DOI
10.3389/fmicb.2018.01391
Chinese Library Classification (CLC) number
Q93 [Microbiology];
Subject classification codes
071005; 100705;
Abstract
Recent human microbiome studies have revealed an essential role of the human microbiome in health and disease, opening up the possibility of building microbiome-based predictive models for individualized medicine. One unique characteristic of microbiome data is the existence of a phylogenetic tree that relates all the microbial species. One or more clusters of bacteria at varying phylogenetic depths are frequently observed to be associated with a clinical or biological outcome because of shared biological function (clustered signal). Moreover, in many cases a community-level change is observed, in which a large number of functionally interdependent species are associated with the outcome (dense signal). We therefore develop glmmTree, a prediction method based on a generalized linear mixed model framework, to capture clustered and dense microbiome signals. glmmTree predicts the outcome from the similarity between microbiomes, which is defined from the microbiome composition and the phylogenetic tree. The effects of other predictive variables (e.g., age, sex) can be incorporated readily in the regression framework. Additional tuning parameters enable a data-adaptive approach that captures signals at different phylogenetic depths and abundance levels. Simulation studies and real data applications demonstrated that glmmTree outperformed existing methods in the dense and clustered signal scenarios.
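The idea of predicting an outcome from a phylogeny-informed similarity between microbiomes can be illustrated with a minimal sketch. It is not the glmmTree implementation: it assumes a Gaussian outcome, a fixed ridge penalty `lam` in place of estimated variance components, and an exponential-decay correlation between taxa, `exp(-2 * rho * D)`, where `D` is a matrix of tree distances between taxa. The function names and the parameters `rho` and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def phylo_kernel(X, D, rho=1.0):
    """Sample-by-sample similarity K = X C X^T.

    X   : (n_samples, n_taxa) abundance matrix, e.g. proportions.
    D   : (n_taxa, n_taxa) distances between taxa on the tree.
    rho : assumed decay rate; larger values make only closely
          related taxa count as similar.
    """
    C = np.exp(-2.0 * rho * D)  # assumed taxon-taxon correlation
    return X @ C @ X.T


def fit_predict(X_train, y_train, X_test, D, rho=1.0, lam=1.0):
    """Kernel-ridge style prediction for a continuous outcome.

    With a Gaussian outcome this has the form of the best linear
    unbiased predictor of a mixed model whose random effect has
    covariance proportional to K; lam plays the role of the
    residual-to-random-effect variance ratio (fixed here, not estimated).
    """
    C = np.exp(-2.0 * rho * D)
    K_train = X_train @ C @ X_train.T   # train-train similarity
    K_cross = X_test @ C @ X_train.T    # test-train similarity
    mu = y_train.mean()                 # intercept-only fixed effect
    n = len(y_train)
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train - mu)
    return mu + K_cross @ alpha
```

In such a sketch, `rho` and `lam` would typically be chosen by cross-validation, and covariates such as age or sex would enter as additional fixed effects rather than through the kernel.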
Pages: 14