Multi-omics integration: a comparison of unsupervised clustering methodologies

Cited by: 94
Authors
Tini, Giulia [1, 2]
Marchetti, Luca [3, 4]
Priami, Corrado [5]
Scott-Boyer, Marie-Pier
Affiliations
[1] Univ Trento, Math, Trento, Italy
[2] COSBI, Trento, Italy
[3] Univ Verona, Verona, Italy
[4] COSBI, Computat Biol Team, Trento, Italy
[5] Univ Trento, Comp Sci, Trento, Italy
Keywords
molecular-level interaction; biological systems; unsupervised classification; data preprocessing; JOINT; DISCOVERY; MODULES; BREAST; ONPLS
DOI
10.1093/bib/bbx167
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
With recent developments in the field of multi-omics integration, interest in factors such as data preprocessing, the choice of integration method and the number of different omics considered has increased. In this work, the impact of these factors on the problem of sample classification is explored by comparing the performance of five unsupervised algorithms: Multiple Canonical Correlation Analysis, Multiple Co-Inertia Analysis, Multiple Factor Analysis, Joint and Individual Variation Explained and Similarity Network Fusion. These methods were applied to three real data sets taken from the literature and to several ad hoc simulated scenarios to discuss classification performance under different conditions of noise and signal strength across the data types. The impact of experimental design, feature selection and parameter training has also been evaluated to unravel important conditions that can affect the accuracy of the result.
Pages: 1269-1279
Number of pages: 11