Evaluation of O2PLS in Omics data integration

Cited by: 126
Authors
el Bouhaddani, Said [1]
Houwing-Duistermaat, Jeanine [1]
Salo, Perttu [2]
Perola, Markus [2]
Jongbloed, Geurt [3]
Uh, Hae-Won [1]
Affiliations
[1] LUMC, Dept Med Stat & Bioinformat, NL-2300 RC Leiden, Netherlands
[2] Natl Inst Hlth & Welf THL, FI-00271 Helsinki, Finland
[3] Delft Univ Technol, EEMCS, Dept Stat, NL-2628 CD Delft, Netherlands
Keywords
Integration of Omics data; Dimension reduction; Latent variable regression; O2PLS; SELECTION;
DOI
10.1186/s12859-015-0854-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Background: Rapid computational and technological developments have made large amounts of omics data available at different biological levels. It is becoming clear that simultaneous data analysis methods are needed for better interpretation and understanding of the underlying systems biology. Different methods have been proposed for this task, among them Partial Least Squares (PLS) related methods. To also deal with orthogonal variation, that is, systematic variation in one data set that is unrelated to the other, we consider Two-way Orthogonal PLS (O2PLS): an integrative data analysis method that is capable of modeling such systematic variation while providing more parsimonious models that aid interpretation.
Results: A simulation study to assess the performance of O2PLS showed positive results in both low and higher dimensions. A higher noise level (50% of the data) affected only the systematic part estimates. A data analysis was conducted using metabolomics and transcriptomics data from a large Finnish cohort (DILGOM). A previous sequential study using the same data showed significant correlations between the Lipo-Leukocyte (LL) module and lipoprotein metabolites. The O2PLS results agreed with these findings, identifying almost the same set of co-varying variables. Moreover, our integrative approach identified other associated genes and metabolites, while taking into account systematic variation in the data. Including orthogonal components enhanced overall fit, but the orthogonal variation was difficult to interpret.
Conclusions: Simulations showed that the O2PLS estimates were close to the true parameters in both low and higher dimensions. In the presence of more noise (50%), the orthogonal part estimates could not distinguish well between joint and unique variation. The joint estimates were not systematically affected. Simultaneous analysis with O2PLS of metabolome and transcriptome data showed that the LL module, together with VLDL and HDL metabolites, was important for the relation between the metabolome and the transcriptome. This is in agreement with an earlier study. In addition, more genes and metabolites were identified as important for the joint covariation.
Pages: 16