MicroHDF: predicting host phenotypes with metagenomic data using a deep forest-based framework

Cited by: 1
Authors
Shi, Kai [1, 2]
Liu, Qiaohui [1 ]
Ji, Qingrong [1 ]
He, Qisheng [1 ]
Zhao, Xing-Ming [3, 4]
Affiliations
[1] Guilin Univ Technol, Coll Comp Sci & Engn, Guilin 541004, Guangxi, Peoples R China
[2] Guilin Univ Technol, Guangxi Key Lab Embedded Technol & Intelligent Sys, Guilin 541004, Guangxi, Peoples R China
[3] Huzhou Univ, Affiliated Cent Hosp, Huzhou Cent Hosp, Huzhou 313000, Zhejiang, Peoples R China
[4] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
disease prediction; gut microbiome; machine learning; metagenomics; phylogenetic tree; HUMAN GUT MICROBIOME; ASSOCIATION; COLITIS;
DOI
10.1093/bib/bbae530
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
The gut microbiota plays a vital role in human health, and significant effort has gone into predicting human phenotypes, especially diseases, by applying machine learning (ML) methods with the microbiota as a promising indicator or predictor. However, prediction accuracy on metagenomic data is affected by many factors, e.g. small sample size, class imbalance, and high-dimensional features. To address these challenges, we propose MicroHDF, an interpretable deep learning framework for predicting host phenotypes, in which cascaded layers of deep forest units are designed to handle sample class imbalance and high-dimensional features. The experimental results show that the performance of MicroHDF is competitive with that of existing state-of-the-art methods on 13 publicly available datasets covering six different diseases. In particular, it performs best on inflammatory bowel disease (IBD) and liver cirrhosis, with areas under the receiver operating characteristic curve of 0.9182 ± 0.0098 and 0.9469 ± 0.0076, respectively. MicroHDF also shows better performance and robustness in cross-study validation. Furthermore, MicroHDF is applied to two high-risk diseases, IBD and autism spectrum disorder, as case studies to identify potential biomarkers. In conclusion, our method provides effective and reliable prediction of the host phenotype and discovers informative features with biological insights.
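The abstract reports performance as an area under the receiver operating characteristic curve (AUC) with a spread, such as 0.9182 ± 0.0098. As a reading aid, here is a minimal sketch of how such a figure can be computed. It is not the authors' evaluation code: the record does not say how the AUC was estimated or what the ± denotes, so the sketch assumes the pairwise-ranking definition of AUC and a mean ± sample standard deviation over repeated runs. The names `roc_auc` and `summarize` and the toy data are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation over repeated runs.
    Assumption: the reported '±' is a standard deviation."""
    return mean(aucs), stdev(aucs)


# Toy example (made-up labels and scores, not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly

# Hypothetical AUCs from three repeated runs, summarized as mean and spread.
avg, spread = summarize([0.90, 0.92, 0.94])
print(f"AUC = {auc:.4f}; repeated runs: {avg:.4f} ± {spread:.4f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for the small cohorts typical of metagenomic studies; a rank-based formulation would scale better on larger data.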
Pages: 13