Discriminant analysis and feature selection in mass spectrometry imaging using constrained repeated random sampling - Cross validation (CORRS-CV)

Cited by: 13
Authors
Perez-Guaita, David [1]
Quintas, Guillermo [2, 3]
Kuligowski, Julia [4]
Affiliations
[1] FOCAS Res Inst, Dublin, Ireland
[2] LEITAT Technol Ctr, Hlth & Biomed, Barcelona, Spain
[3] Hlth Res Inst Hosp La Fe, Unidad Analit, Valencia, Spain
[4] Hlth Res Inst Hosp La Fe, Neonatal Res Unit, Valencia, Spain
Funding
European Research Council;
Keywords
Mass spectrometry imaging (MSI); Cross validation (CV); Constrained repeated random sampling - Cross validation (CORRS-CV); Partial least squares-discriminant analysis (PLS-DA); Feature selection;
DOI
10.1016/j.aca.2019.10.039
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
The identification of biomarkers through mass spectrometry imaging (MSI) is gaining popularity in the clinical field. However, given the complexity of the spectral and spatial variables involved, data mining of hyperspectral images can be troublesome. The discovery of markers generally depends on building classification models, which should be validated to ensure the statistical significance of the discriminant m/z values detected. Internal validation using resampling methods such as cross validation (CV) is widely used for model selection, estimation of generalization performance, and biomarker discovery when sample sizes are limited and an independent test set is not available. Here, we introduce for the first time the use of Constrained Repeated Random Subsampling CV (CORRS-CV) on multiple images for the validation of classification models in MSI. Although several aspects must be taken into account (e.g. image size, the CORRS-CV a value, the similarity across spatially close pixels, the total computation time), CORRS-CV provides more accurate estimates of model performance than k-fold CV that uses biological replicates to define the data split, when biological replicates are scarce and holding images back for testing would waste valuable information. In addition, the combined use of CORRS-CV and rank products increases the robustness of the selection of discriminant features as candidate biomarkers, which is an important issue given the increased biological, environmental and technical variability when analysing multiple images, especially from human tissues collected in clinical studies. (C) 2019 Elsevier B.V. All rights reserved.
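The abstract mentions combining resampling with rank products to stabilise feature selection. As a rough illustration only, the sketch below computes a rank product (the geometric mean of a feature's rank across runs) from per-run feature importance scores. The scores, the function name `rank_products`, and the choice of "higher score = more discriminant" are assumptions made for this example; the record does not state how the authors scored or ranked features.

```python
import math

def rank_products(scores_per_run):
    """Geometric mean of per-run ranks for each feature (rank 1 = highest score).

    Assumption: each inner list holds one importance score per feature,
    produced by one resampling run. Ties are broken by feature index.
    """
    n_features = len(scores_per_run[0])
    log_sums = [0.0] * n_features
    for scores in scores_per_run:
        # Order feature indices from highest to lowest score in this run.
        order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        for rank, j in enumerate(order, start=1):
            log_sums[j] += math.log(rank)
    n_runs = len(scores_per_run)
    # exp(mean(log(rank))) is the geometric mean of the ranks.
    return [math.exp(s / n_runs) for s in log_sums]

# Hypothetical importance scores: 3 resampling runs x 4 m/z features.
scores_per_run = [
    [0.9, 0.2, 0.5, 0.1],
    [0.8, 0.3, 0.6, 0.2],
    [0.4, 0.1, 0.9, 0.3],
]
rp = rank_products(scores_per_run)
# Smaller rank product = more consistently high-ranked feature.
ranking = sorted(range(len(rp)), key=lambda j: rp[j])
```

A feature that ranks near the top in most runs gets a small rank product even if one run places it lower, which is why the measure is often used to pick candidates that are stable across resampling.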
Pages: 30-36
Page count: 7