Evaluation of integrative clustering methods for the analysis of multi-omics data

Cited by: 59
Authors
Chauvel, Cecile [1]
Novoloaca, Alexei [2]
Veyre, Pierre [3]
Reynier, Frederic [4]
Becker, Jeremie [5]
Affiliations
[1] Bioaster, Data Management & Anal Unit, Biostat, Lyon, France
[2] World Hlth Org, Int Agcy Res Canc, Epigenet Grp, Biostat, Lyon, France
[3] Bioaster, Data Management & Anal Unit, Lyon, France
[4] Bioaster, Genom & Transcript, Lyon, France
[5] Bioaster, Genom & Transcript Unit, Biostat, Lyon, France
Keywords
benchmark; clustering; data integration; multi-omics; unsupervised analysis; BREAST; JOINT; CLASSIFICATION; EXPRESSION; CRITERIA; MODULES
DOI
10.1093/bib/bbz015
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect large-scale omics data from the same set of biological samples. The joint analysis of multiple omics offers the opportunity to uncover coordinated cellular processes acting across different omic layers. In this work, we present a thorough comparison of a selection of recent integrative clustering approaches, including Bayesian approaches (BCC and MDI) and matrix factorization approaches (iCluster, moCluster, JIVE and iNMF). Based on simulations, the methods were evaluated on their sensitivity and on their ability to recover both the correct number of clusters and the simulated clustering at the common and data-specific levels. Standard non-integrative approaches were also included to quantify the added value of integrative methods. For most matrix factorization methods and one Bayesian approach (BCC), the shared and specific structures were successfully recovered with high and moderate accuracy, respectively. The opposite behavior was observed for non-integrative approaches, which performed well on specific structures only. Finally, we applied the methods to The Cancer Genome Atlas breast cancer data set to check whether results based on experimental data were consistent with those obtained in the simulations.
Pages: 541-552
Number of pages: 12