mixOmics: An R package for 'omics feature selection and multiple data integration

Cited by: 2442
Authors
Rohart, Florian [1, 6]
Gautier, Benoit [1]
Singh, Amrit [2, 3]
Le Cao, Kim-Anh [1, 4, 5]
Affiliations
[1] Univ Queensland, Diamantina Inst, Translat Res Inst, Brisbane, Qld, Australia
[2] Prevent Organ Failure PROOF Ctr Excellence, Vancouver, BC, Canada
[3] Univ British Columbia, Dept Pathol & Lab Med, Vancouver, BC, Canada
[4] Univ Melbourne, Melbourne Integrat Genom, Melbourne, Vic, Australia
[5] Univ Melbourne, Sch Math & Stat, Melbourne, Vic, Australia
[6] Univ Queensland, Inst Mol Biosci, Brisbane, Qld, Australia
Funding
UK Medical Research Council;
Keywords
PARTIAL LEAST-SQUARES; CANONICAL CORRELATION-ANALYSIS; CLASSIFICATION; METABOLOMICS; PROFILES; NETWORKS; CANCER;
DOI
10.1371/journal.pcbi.1005752
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
The advent of high-throughput technologies has led to a wealth of publicly available 'omics data coming from different sources, such as transcriptomics, proteomics and metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have focused on identifying small subsets of molecules (a 'molecular signature') to explain or predict biological conditions, but mainly for a single type of 'omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets, with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous 'omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple 'omics data sets or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analysis of 'omics data available from the package.
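The abstract mentions PLS-based discriminant analysis that also selects a small molecular signature. As a rough illustration of that general idea only, the sketch below computes a single sparse PLS-DA-style component with NumPy. It is not the mixOmics API or the authors' algorithm: the function name `sparse_plsda_component`, the `keep` parameter, and the simulated data are all assumptions made for this example, and sparsity is imposed by simple soft-thresholding of the loading vector.

```python
import numpy as np


def sparse_plsda_component(X, labels, keep):
    """Hypothetical helper: one sparse PLS-DA-style component.

    Returns (loading vector w, sample scores t). Illustrative only.
    """
    classes = np.unique(labels)
    # Dummy-encode the class labels into an indicator matrix Y (n x K).
    Y = (labels[:, None] == classes[None, :]).astype(float)
    # Centre the columns of both blocks.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # The leading left singular vector of X^T Y gives the feature weights
    # that maximise covariance between X scores and the class indicators.
    u, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = u[:, 0]
    # Soft-threshold so that only the `keep` largest weights stay non-zero.
    abs_w = np.abs(w)
    thresh = np.sort(abs_w)[-(keep + 1)] if keep < w.size else 0.0
    w = np.sign(w) * np.maximum(abs_w - thresh, 0.0)
    w = w / np.linalg.norm(w)
    return w, Xc @ w


# Simulated data (an assumption for this sketch): 40 samples, 200 features,
# two classes, with only the first 5 features carrying a class difference.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[labels == 1, :5] += 3.0

w, t = sparse_plsda_component(X, labels, keep=5)
selected = np.flatnonzero(w)
print("selected features:", selected)
print("mean score per class:", t[labels == 0].mean(), t[labels == 1].mean())
```

The non-zero entries of `w` play the role of the 'molecular signature', and the scores `t` give a one-dimensional projection in which the two simulated classes separate. A real analysis would use several components, tune the number of retained features by cross-validation, and rely on the package's own functions.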
Pages: 19