Implementation of quality controls is essential to prevent batch effects in breathomics data and allow for cross-study comparisons

Cited by: 11
Authors
Stavropoulos, Georgios [1]
Jonkers, Daisy M. A. E. [2]
Mujagic, Zlatan [2]
Koek, Ger H. [2]
Masclee, Ad A. M. [2]
Pierik, Marieke J. [2]
Dallinga, Jan W. [1]
Van Schooten, Frederik-Jan [1]
Smolinska, Agnieszka [1]
Affiliations
[1] Maastricht Univ, NUTRIM Sch Nutr & Translat Res, Dept Pharmacol & Toxicol, Maastricht, Netherlands
[2] Maastricht Univ, NUTRIM Sch Nutr & Translat Res, Div Gastroenterol & Hepatol, Maastricht, Netherlands
Keywords
exhaled breath; volatile organic compounds; VOCs; data analysis; batch effects; IBD; IBS; liver cirrhosis; VOLATILE ORGANIC-COMPOUNDS; IRRITABLE-BOWEL-SYNDROME; PARTIAL LEAST-SQUARES; GENE-EXPRESSION; MICROARRAY DATA; CLASSIFICATION; PERFORMANCE; PREDICTION; DIAGNOSIS; DISEASE;
DOI
10.1088/1752-7163/ab7b8d
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Exhaled breath analysis has become a promising tool for monitoring various ailments by identifying volatile organic compounds (VOCs) excreted by the human body that serve as indicative biomarkers. Throughout sampling, measurement, and data processing, non-biological variation is introduced into the data, leading to batch effects. Algorithmic approaches have been developed to cope with within-study batch effects. Batch differences, however, may also occur between studies, and to date, ways to correct for cross-study batch effects are lacking; as a result, cross-study comparisons to verify that a VOC profile is unique to a specific disease may be challenging. This study applies within-study batch-effect-correction approaches to cross-study batch effects and makes suggestions that may help prevent the introduction of cross-study variation. Three batch-effect-correction algorithms were investigated: zero-centering, ComBat, and the analysis of covariance (ANCOVA) framework. Breath samples were collected at different periods from patients with inflammatory bowel disease (n = 213), chronic liver disease (n = 189), and irritable bowel syndrome (n = 261), and were analysed via gas chromatography-mass spectrometry. Multivariate statistics were used to visualise and verify the results. Visualisation of the data before any batch-effect correction showed a clear separation among the datasets of the three cohorts, probably due to batch effects. Visualisation after applying each of the three correction techniques showed that the batch effects were still present in the data. Predictions made using partial least squares discriminant analysis and random forest confirmed this observation. The within-study batch-effect-correction approaches therefore failed to correct for the cross-study batch effects present in the data.
The present study proposes a framework for systematically standardising future breathomics data by using internal standards or quality control samples at regular analysis intervals. Further knowledge of the nature of the unwanted variation among cross-study batches is needed to move the field forward.
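Of the three correction algorithms named in the abstract, zero-centering is the simplest to illustrate. The following is a minimal sketch, not the authors' implementation: it assumes zero-centering means subtracting each batch's per-feature mean from every sample in that batch. The function name `zero_center_by_batch` and the toy intensities are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import fmean


def zero_center_by_batch(samples, batches):
    """Subtract each batch's per-feature mean from the samples in that batch.

    samples: list of equal-length lists (one row of VOC intensities per sample)
    batches: list of batch labels, one per sample
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(samples) != len(batches):
        raise ValueError("samples and batches must have the same length")

    # Group row indices by batch label.
    rows_by_batch = defaultdict(list)
    for i, label in enumerate(batches):
        rows_by_batch[label].append(i)

    corrected = [None] * len(samples)
    for rows in rows_by_batch.values():
        n_features = len(samples[rows[0]])
        # Per-feature mean computed within this batch only.
        means = [fmean(samples[i][j] for i in rows) for j in range(n_features)]
        for i in rows:
            corrected[i] = [x - m for x, m in zip(samples[i], means)]
    return corrected


# Toy data (invented): two samples per batch, two VOC features each.
samples = [[1.0, 10.0], [3.0, 14.0], [10.0, 100.0], [14.0, 104.0]]
batches = ["A", "A", "B", "B"]
corrected = zero_center_by_batch(samples, batches)
```

After centering, every feature has mean zero within each batch. When a batch coincides with a disease cohort, as in a cross-study comparison, this step also removes any genuine mean difference between cohorts, so the batch offset and the biological signal cannot be separated by centering alone.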
Pages: 12