Graph Embedding Deep Learning Guides Microbial Biomarkers' Identification

Times cited: 24
Authors
Zhu, Qiang [1,2]
Jiang, Xingpeng [2,3]
Zhu, Qing [2,3]
Pan, Min [2,3]
He, Tingting [2,3]
Affiliations
[1] Cent China Normal Univ, Sch Informat Management, Wuhan, Hubei, Peoples R China
[2] Cent China Normal Univ, Sch Comp, Wuhan, Hubei, Peoples R China
[3] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smar, Wuhan, Hubei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
WIDE ASSOCIATION; METAGENOME;
DOI
10.3389/fgene.2019.01182
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Microbiome-wide association studies aim to characterize the relationship between microorganisms and their human hosts, with the goal of discovering biomarkers that can guide disease diagnosis. However, microbiome data are complex, noisy, and high-dimensional. Traditional machine learning methods are limited by their representational capacity and cannot learn complex patterns from such data. Deep learning has recently been applied widely, in fields ranging from text processing to image recognition, owing to its flexibility and high capacity. But deep learning models must be trained on large amounts of data to perform well, which is often impractical. In addition, deep learning models are regarded as black boxes that are hard to interpret. These factors have kept deep learning from being widely used in microbiome-wide association studies. In this work, we construct a sparse microbial interaction network and embed this graph into a deep model to reduce the risk of overfitting and improve performance. Further, we explore a Graph Embedding Deep Feedforward Network (GEDFN) to perform feature selection and guide the identification of meaningful microbial markers. Based on the experimental results, we verify the feasibility of combining a microbial graph model with a deep learning model, and demonstrate the feasibility of applying deep learning and feature selection to microbial data. Our main contributions are twofold. First, we use different methods to construct a variety of microbial interaction networks and incorporate these networks via graph embedding deep learning. Second, we introduce a feature selection method based on graph embedding and validate the biological meaning of the selected microbial markers. The code is available at https://github.com/MicroAVA/GEDFN.git.
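The idea of embedding a sparse interaction graph into a feedforward model can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see their repository for that). It assumes, for illustration only, that the graph is built by thresholding absolute Pearson correlations between taxa, that the first hidden layer has one unit per taxon, and that the layer's weight matrix is masked element-wise by the adjacency matrix. The function names, the threshold, and the importance score are hypothetical.

```python
import numpy as np


def build_sparse_graph(abundance, threshold=0.6):
    """Adjacency matrix from thresholded absolute Pearson correlation.

    abundance: array of shape (n_samples, n_taxa).
    The correlation measure and threshold are illustrative assumptions.
    """
    corr = np.corrcoef(abundance, rowvar=False)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops: each taxon feeds its own unit
    return adj


def graph_embedded_layer(x, weights, bias, adj):
    """ReLU(x @ (weights * adj) + bias).

    The element-wise mask keeps only connections along graph edges,
    which reduces the number of effective parameters.
    """
    return np.maximum(0.0, x @ (weights * adj) + bias)


def graph_importance(weights, adj):
    """Hypothetical score: total absolute masked weight leaving each taxon."""
    return np.abs(weights * adj).sum(axis=1)


rng = np.random.default_rng(0)
n_samples, n_taxa = 40, 6
abundance = rng.random((n_samples, n_taxa))
# Make taxa 0 and 1 co-vary so the graph contains at least one edge.
abundance[:, 1] = abundance[:, 0] + 0.01 * rng.random(n_samples)

adj = build_sparse_graph(abundance)
weights = rng.normal(size=(n_taxa, n_taxa))
bias = np.zeros(n_taxa)

hidden = graph_embedded_layer(abundance, weights, bias, adj)
scores = graph_importance(weights, adj)
print(adj.astype(int))
print(hidden.shape)
print(scores.round(2))
```

In a trained model the masked weights would be learned by gradient descent with the mask held fixed, and the per-taxon scores could then be ranked to shortlist candidate markers. The score used here is a simplification chosen for brevity.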
Pages: 11