Large-scale untargeted LC-MS metabolomics data correction using between-batch feature alignment and cluster-based within-batch signal intensity drift correction

Cited by: 129
Authors
Brunius, Carl [1, 3]
Shi, Lin [1, 3]
Landberg, Rikard [1, 2, 3]
Affiliations
[1] Swedish Univ Agr Sci, Uppsala BioCtr, Dept Food Sci, Box 7051, S-75007 Uppsala, Sweden
[2] Karolinska Inst, Inst Environm Med, Unit Nutr Epidemiol, Box 210, S-17177 Stockholm, Sweden
[3] Chalmers Univ Technol, Dept Biol & Biol Engn, S-41296 Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
Metabolomics; LC-MS; Data correction; Batch alignment; Drift correction; NORMALIZATION METHODS; BIOMARKERS; CHROMATOGRAPHY; DISCOVERY; SAMPLES;
DOI
10.1007/s11306-016-1124-4
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification codes
1002; 100201
Abstract
Introduction: Liquid chromatography-mass spectrometry (LC-MS) is a commonly used technique in untargeted metabolomics owing to broad coverage of metabolites, high sensitivity and simple sample preparation. However, data generated from multiple batches are affected by measurement errors arising from alterations in signal intensity and from drift in mass accuracy and retention times between samples, both within and between batches. These measurement errors reduce repeatability and reproducibility and may thus decrease the power to detect biological responses and obscure interpretation.
Objective: Our aim was to develop procedures to address and correct for within- and between-batch variability in processing multiple-batch untargeted LC-MS metabolomics data to increase their quality.
Methods: Algorithms were developed for: (i) alignment and merging of features that are systematically misaligned between batches, through aggregating feature presence/missingness on batch level and combining similar features orthogonally present between batches; and (ii) within-batch drift correction using a cluster-based approach that allows multiple drift patterns within batch. Furthermore, a heuristic criterion was developed for the feature-wise choice of reference-based or population-based between-batch normalisation.
Results: In authentic data, between-batch alignment resulted in picking 15% more features and deconvoluting 15% of features previously erroneously aligned. Within-batch correction provided a decrease in median quality control feature coefficient of variation from 20.5% to 15.1%. Algorithms are open source and available as an R package ('batchCorr').
Conclusions: The developed procedures provide unbiased measures of improved data quality, with implications for improved data analysis. Although developed for LC-MS based metabolomics, these methods are generic and can be applied to other data suffering from similar limitations.
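Illustrative sketch (not part of the published abstract, and not the batchCorr API): the Python below is one possible reading of two ideas named in the Methods and Results, namely batch-level presence flags with an "orthogonal presence" check for merge candidates, and the quality control coefficient of variation. All function names, the missingness threshold and the m/z and retention-time tolerances are hypothetical values chosen for illustration only.

```python
from statistics import mean, stdev


def batch_presence(intensities, batches, max_missing_fraction=0.8):
    """Flag a feature as present in each batch.

    A feature counts as present in a batch when the fraction of missing
    values (None) in that batch is below max_missing_fraction.
    The threshold is a hypothetical choice, not taken from the paper.
    """
    flags = {}
    for batch in sorted(set(batches)):
        values = [x for x, b in zip(intensities, batches) if b == batch]
        missing = sum(v is None for v in values) / len(values)
        flags[batch] = missing < max_missing_fraction
    return flags


def orthogonally_present(flags_a, flags_b):
    """True when the two features are never present in the same batch."""
    return not any(flags_a[b] and flags_b[b] for b in flags_a)


def is_merge_candidate(feat_a, feat_b, mz_tol=0.002, rt_tol=15.0):
    """Similar m/z and retention time, and orthogonal batch presence.

    mz_tol and rt_tol (seconds) are assumed tolerances for illustration.
    """
    close = (abs(feat_a["mz"] - feat_b["mz"]) <= mz_tol
             and abs(feat_a["rt"] - feat_b["rt"]) <= rt_tol)
    return close and orthogonally_present(feat_a["presence"], feat_b["presence"])


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Toy data: one compound picked as two features, each seen in one batch only.
batches = ["B1", "B1", "B2", "B2"]
feature_a = {"mz": 301.1410, "rt": 120.0,
             "presence": batch_presence([100.0, 110.0, None, None], batches)}
feature_b = {"mz": 301.1415, "rt": 126.0,
             "presence": batch_presence([None, None, 95.0, 105.0], batches)}
merge = is_merge_candidate(feature_a, feature_b)
qc_cv = cv_percent([90.0, 100.0, 110.0])
```

In this toy example the two features would be treated as candidates for merging, since they are close in m/z and retention time and never co-occur in a batch. The cluster-based drift correction itself is not sketched here, because the abstract does not specify it in enough detail.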
Pages: 13