Current breathomics - a review on data pre-processing techniques and machine learning in metabolomics breath analysis

Cited by: 154
Authors
Smolinska, A. [1, 2]
Hauschild, A-Ch [3]
Fijten, R. R. R. [1]
Dallinga, J. W. [1]
Baumbach, J. [4]
van Schooten, F. J. [1]
Affiliations
[1] Maastricht Univ, Nutr & Toxicol Res Inst Maastricht NUTRIM, Dept Toxicol, NL-6200 MD Maastricht, Netherlands
[2] Top Inst Food & Nutr, Wageningen, Netherlands
[3] Max Planck Inst Informat, Computat Syst Biol Grp, D-66123 Saarbrucken, Germany
[4] Univ Southern Denmark, Dept Math & Comp Sci, Computat Biol Grp, Odense, Denmark
Keywords
GC-MS; MCC-IMS; exhaled air; multivariate analysis; volatile organic compounds (VOCs); VOLATILE ORGANIC-COMPOUNDS; FLIGHT MASS-SPECTROMETER; EXHALED BREATH; LUNG-CANCER; BIOLOGICAL DATA; VARIABLE SELECTION; RETENTION TIME; BIOMARKERS; PEAK; NOSE
DOI
10.1088/1752-7155/8/2/027105
Chinese Library Classification (CLC)
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
We define breathomics as the metabolomics study of exhaled air. It is a strongly emerging metabolomics research field that mainly focuses on health-related volatile organic compounds (VOCs). Since the amount of these compounds varies with health status, breathomics holds great promise to deliver non-invasive diagnostic tools. Thus, the main aim of breathomics is to find patterns of VOCs related to abnormal (for instance inflammatory) metabolic processes occurring in the human body. Recently, analytical methods for measuring VOCs in exhaled air with high resolution and high throughput have been extensively developed. Yet, the application of machine learning methods for fingerprinting VOC profiles in breathomics is still in its infancy. Therefore, in this paper, we describe the current state of the art in data pre-processing and multivariate analysis of breathomics data. We start with the detailed pre-processing pipelines for breathomics data obtained from gas-chromatography mass spectrometry and an ion-mobility spectrometer coupled to multi-capillary columns. The outcome of data pre-processing is a matrix containing the relative abundances of a set of VOCs for a group of patients under different conditions (e.g. disease stage, treatment). Independently of the utilized analytical method, the most important question, 'which VOCs are discriminatory?', remains the same. Answers can be given by several modern machine learning techniques (multivariate statistics), which are therefore the focus of this paper. We demonstrate the advantages as well as the drawbacks of such techniques. We aim to help the community understand how to profit from a particular method. In parallel, we hope to make the community aware of the existing data fusion methods, as yet unresearched in breathomics.
Pages: 20