Multidimensional scaling improves distance-based clustering for microbiome data

Cited by: 1
Authors
Chen, Guanhua [1]
Wang, Xinyue [2]
Sun, Qiang [3]
Tang, Zheng-Zheng [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Biostat & Med Informat, 600 Highland Ave, Madison, WI 53726 USA
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[3] Univ Toronto, Dept Stat Sci, Toronto, ON M5S 3G3, Canada
Keywords
VALIDATION; PATTERNS; UNIFRAC; NUMBER;
DOI
10.1093/bioinformatics/btaf042
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Clustering patients into subgroups based on their microbial compositions can greatly enhance our understanding of the role of microbes in human health and disease etiology. Distance-based clustering methods, such as partitioning around medoids (PAM), are popular due to their computational efficiency and absence of distributional assumptions. However, the performance of these methods can be suboptimal when true cluster memberships are driven by differences in the abundance of only a few microbes, a situation known as the sparse signal scenario.

Results: We demonstrate that classical multidimensional scaling (MDS), a widely used dimensionality reduction technique, effectively denoises microbiome data and enhances the clustering performance of distance-based methods. We propose a two-step procedure that first applies MDS to project high-dimensional microbiome data into a low-dimensional space, followed by distance-based clustering using the low-dimensional data. Our extensive simulations demonstrate that our procedure offers superior performance compared to directly conducting distance-based clustering under the sparse signal scenario. The advantage of our procedure is further showcased in several real data applications.

Availability and implementation: The R package MDSMClust is available at https://github.com/wxy929/MDS-project.
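
The two-step procedure described in the abstract (classical MDS, then distance-based clustering on the low-dimensional coordinates) can be illustrated with a minimal Python sketch. This is not the API of the authors' R package MDSMClust. The function names, the Bray-Curtis distance, the number of retained dimensions, and the simplified alternating k-medoids routine (used here in place of a full PAM implementation) are all assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: embed a distance matrix in n_components dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to obtain a Gram-like matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Non-Euclidean distances (e.g. Bray-Curtis) can give negative eigenvalues; clip them.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating k-medoids on a distance matrix (a simplified stand-in for PAM)."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                # The new medoid minimizes total distance to the cluster's members.
                within = dist[np.ix_(members, members)].sum(axis=0)
                new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return labels, medoids


def mds_then_cluster(abundance, k, n_components=2, metric="braycurtis"):
    """Step 1: MDS on sample distances. Step 2: cluster on the low-dimensional coordinates."""
    dist = squareform(pdist(abundance, metric=metric))
    embedding = classical_mds(dist, n_components)
    low_dim_dist = squareform(pdist(embedding, metric="euclidean"))
    labels, _ = k_medoids(low_dim_dist, k)
    return labels, embedding


if __name__ == "__main__":
    # Hypothetical toy data: 40 samples x 50 taxa of relative abundances.
    rng = np.random.default_rng(42)
    abundance = rng.dirichlet(np.ones(50), size=40)
    labels, embedding = mds_then_cluster(abundance, k=2, n_components=3)
    print(labels)
```

In this sketch the clustering step sees only Euclidean distances between the leading MDS coordinates, which is the denoising idea the abstract describes. How many dimensions to retain and which distance to use are left as parameters, since the abstract does not specify them.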
Pages: 9