Deep in the Bowel: Highly Interpretable Neural Encoder-Decoder Networks Predict Gut Metabolites from Gut Microbiome

Cited by: 42
Authors
Le, Vuong [1]
Quinn, Thomas P. [1]
Tran, Truyen [1]
Venkatesh, Svetha [1]
Affiliations
[1] Deakin Univ, Appl AI Inst, Geelong, Vic, Australia
Keywords
Metabolomics; Multi-omics; Machine learning; Deep learning; Interpretability
DOI
10.1186/s12864-020-6652-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: Technological advances in next-generation sequencing (NGS) and chromatographic assays [e.g., liquid chromatography mass spectrometry (LC-MS)] have made it possible to identify thousands of microbe and metabolite species, and to measure their relative abundance. In this paper, we propose a sparse neural encoder-decoder network to predict metabolite abundances from microbe abundances.
Results: Using paired data from a cohort of inflammatory bowel disease (IBD) patients, we show that our neural encoder-decoder model outperforms linear univariate and multivariate methods in terms of accuracy, sparsity, and stability. Importantly, we show that our neural encoder-decoder model is not simply a black box designed to maximize predictive accuracy. Rather, the network's hidden layer (i.e., the latent space, composed only of sparsely weighted microbe counts) captures key microbe-metabolite relationships that are themselves clinically meaningful. Although this hidden layer is learned without any knowledge of the patient's diagnosis, we show that the learned latent features are structured in a way that predicts IBD and treatment status with high accuracy.
Conclusions: By imposing a non-negative weights constraint, the network becomes a directed graph where each downstream node is interpretable as the additive combination of the upstream nodes. Here, the middle layer comprises distinct microbe-metabolite axes that relate key microbial biomarkers with metabolite biomarkers. By pre-processing the microbiome and metabolome data using compositional data analysis methods, we ensure that our proposed multi-omics workflow will generalize to any pair of -omics data. To the best of our knowledge, this work is the first application of neural encoder-decoders for the interpretable integration of multi-omics biological data.
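The two ideas named in the abstract, compositional pre-processing and a non-negative weights constraint, can be illustrated with a minimal sketch. This is not the authors' implementation: the centered log-ratio (CLR) transform is one common compositional data analysis method and is assumed here, the pseudocount of 0.5 is an assumption, the layers are plain linear maps with no activation or sparsity penalty, and all function names and weight values are hypothetical.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's abundance vector.

    The pseudocount (an assumption) keeps zero counts finite under the log.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def project_nonnegative(weights):
    """Clamp negative weights to zero, e.g. after each gradient update."""
    return [[max(0.0, w) for w in row] for row in weights]

def dense(inputs, weights):
    """Linear layer: weights[j][i] connects input i to output node j."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def forward(microbes, enc_w, dec_w):
    """Encode CLR-transformed microbes to a latent layer, then decode."""
    latent = dense(clr(microbes), enc_w)
    metabolites = dense(latent, dec_w)
    return latent, metabolites

# Hypothetical sample: 4 microbe counts, 2 latent nodes, 3 metabolites.
microbes = [120, 30, 0, 850]
enc_w = project_nonnegative([[0.8, -0.1, 0.0, 0.0],
                             [0.0, 0.0, 0.5, 0.3]])
dec_w = project_nonnegative([[1.0, 0.0],
                             [0.2, 0.7],
                             [-0.4, 0.9]])
latent, metabolites = forward(microbes, enc_w, dec_w)
```

Because every weight is zero or positive after projection, each latent node is a weighted sum of a few transformed microbe abundances, and each predicted metabolite is a weighted sum of latent nodes, which is the additive reading of the network that the abstract describes.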
Pages: 15