A hierarchical approach to removal of unwanted variation for large-scale metabolomics data

Cited by: 26
Authors
Kim, Taiyun [1, 2, 3]
Tang, Owen [1, 4, 5, 6]
Vernon, Stephen T. [1, 4, 5, 6]
Kott, Katharine A. [1, 4, 5, 6]
Koay, Yen Chin [1, 6, 7]
Park, John [1, 4, 5, 6]
James, David E. [1, 8, 9]
Grieve, Stuart M. [1, 10, 11]
Speed, Terence P. [12, 13]
Yang, Pengyi [1, 2, 3]
Figtree, Gemma A. [1, 4, 5, 6]
O'Sullivan, John F. [1, 6, 7, 14, 15]
Yang, Jean Yee Hwa [1, 2]
Affiliations
[1] Univ Sydney, Charles Perkins Ctr, Sydney, NSW, Australia
[2] Univ Sydney, Sch Math & Stat, Sydney, NSW, Australia
[3] Childrens Med Res Inst, Computat Syst Biol Grp, Westmead, NSW, Australia
[4] Royal North Shore Hosp, Dept Cardiol, Sydney, NSW, Australia
[5] Univ Sydney, Kolling Inst Med Res, Cardiovasc Discovery Grp, Sydney, NSW, Australia
[6] Univ Sydney, Fac Med & Hlth, Sydney, NSW, Australia
[7] Heart Res Inst, Sydney, NSW, Australia
[8] Univ Sydney, Sch Life & Environm Sci, Sydney, NSW, Australia
[9] Univ Sydney, Sch Med Sci, Sydney, NSW, Australia
[10] Univ Sydney, Charles Perkins Ctr, Imaging & Phenotyping Lab, Sydney, NSW, Australia
[11] Royal Prince Alfred Hosp, Dept Radiol, Camperdown, NSW, Australia
[12] Walter & Eliza Hall Inst, Bioinformat Div, Parkville, Vic, Australia
[13] Univ Melbourne, Sch Math & Stat, Parkville, Vic, Australia
[14] Royal Prince Alfred Hosp, Dept Cardiol, Sydney, NSW, Australia
[15] Tech Univ Dresden, Fac Med, Dresden, Germany
Funding
UK Medical Research Council; Australian Research Council;
Keywords
MASS-SPECTROMETRY; GAS-CHROMATOGRAPHY; METABOLITES; CAMP; TIME; TOOL;
DOI
10.1038/s41467-021-25210-5
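Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code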
07 ; 0710 ; 09 ;
Abstract
Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, for which data acquisition can run for several weeks or even years. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological signals and thus hinder potential biological discoveries. To date, normalisation approaches have struggled to mitigate the variability introduced by technical factors whilst preserving biological variance, especially for protracted acquisitions. Here, we propose a study design framework with an arrangement for embedding biological sample replicates to quantify variance within and between batches and a workflow that uses these replicates to remove unwanted variation in a hierarchical manner (hRUV). We use this design to produce a dataset of more than 1000 human plasma samples run over an extended period of time. We demonstrate significant improvement of hRUV over existing methods in preserving biological signals whilst removing unwanted variation for large-scale metabolomics studies. Our tools not only provide a strategy for large-scale data normalisation, but also provide guidance on the design strategy for large omics studies.
Mass spectrometry-based metabolomics is a powerful method for profiling large clinical cohorts, but batch variations can obscure biologically meaningful differences. Here, the authors develop a computational workflow that removes unwanted data variation while preserving biologically relevant information.
Pages: 10