Improving Disease Prediction using Shallow Convolutional Neural Networks on Metagenomic Data Visualizations based on Mean-Shift Clustering Algorithm

Cited by: 0
Authors
Hai Thanh Nguyen [1]
Toan Bao Tran [2, 3]
Huong Hoang Luong [4]
Trung Phuoc Le [4]
Nghi C. Tran [5]
Affiliations
[1] Can Tho Univ, Coll Informat & Commun Technol, Can Tho, Vietnam
[2] Duy Tan Univ, Ctr Software Engn, Da Nang 550000, Vietnam
[3] Duy Tan Univ, Inst Res & Dev, Da Nang 550000, Vietnam
[4] FPT Univ, Dept Informat Technol, Can Tho, Vietnam
[5] Natl Cent Univ, Taoyuan, Taiwan
Keywords
Clustering algorithm; metagenomic; visualization; disease prediction; mean-shift; personalized medicine; species abundance; bacterial; BLOOD CULTURES; CONTAMINATION; IMPACT;
DOI
10.14569/IJACSA.2020.0110607
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Metagenomic data are a novel and valuable source for personalized medicine approaches to improving human health. Data visualization is a crucial technique for exploring data and finding patterns in it. Metagenomic data in particular are often very high-dimensional, which makes them difficult for humans to interpret. In this study, we introduce a visualization method based on the Mean-shift algorithm that lets us observe high-dimensional data as images in which features are grouped by the clustering method. The generated synthetic images are then fed into a convolutional neural network for disease prediction tasks. The proposed method shows promising results when evaluated on four metagenomic bacterial species abundance datasets related to four diseases: Liver Cirrhosis, Colorectal Cancer, Obesity, and Type 2 Diabetes.
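The abstract does not spell out how clustered features become images, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each feature (species) is summarized by a single scalar such as its mean abundance, that a flat-kernel mean shift groups those scalars, and that a sample is drawn as a square grid with features from the same cluster placed next to each other. The function names `mean_shift_1d` and `sample_to_image`, the bandwidth, and the toy numbers are all hypothetical.

```python
import math


def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-9):
    """Flat-kernel mean shift on scalars (illustrative assumption).

    Each value is shifted to the mean of all values within `bandwidth`
    until it stops moving; values that converge to nearby modes share
    a cluster label.
    """
    modes = []
    for v in values:
        x = v
        for _ in range(max_iter):
            window = [u for u in values if abs(u - x) <= bandwidth]
            if not window:  # defensive guard against rounding edge cases
                break
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Merge modes that lie within half a bandwidth of an existing center.
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


def sample_to_image(sample, labels):
    """Arrange one sample as a square grid, grouping features by cluster.

    Features are ordered by (cluster label, original index) and the grid
    is zero-padded to the next square size.
    """
    order = sorted(range(len(sample)), key=lambda i: (labels[i], i))
    side = math.ceil(math.sqrt(len(sample)))
    flat = [sample[i] for i in order]
    flat += [0.0] * (side * side - len(flat))
    return [flat[r * side:(r + 1) * side] for r in range(side)]


# Toy data: hypothetical mean abundances of six species across a cohort.
feature_means = [0.01, 0.5, 0.02, 0.9, 0.52, 0.015]
labels, centers = mean_shift_1d(feature_means, bandwidth=0.1)

# One hypothetical sample's abundances, rendered as a 3x3 grid.
image = sample_to_image([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], labels)
for row in image:
    print(row)
```

In a pipeline of the kind the abstract describes, a grid like this would be rendered as an image and passed to a convolutional neural network. How cell values are mapped to colors and how the network is configured are not stated in this record, so both are left out of the sketch.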
Pages: 52-60
Page count: 9