Effect of data set size on geochemical quantification accuracy with laser-induced breakdown spectroscopy

Cited by: 30
Authors
Dyar, M. Darby [1, 2]
Ytsma, Cai R. [1]
Affiliations
[1] Mt Holyoke Coll, Dept Astron, 50 Coll St, S Hadley, MA 01075 USA
[2] Planetary Sci Inst, 1700 East Ft Lowell, Suite 106, Tucson, AZ 85719 USA
Keywords
LIBS; PLS; Geostandards; Quantification; Accuracy; GALE CRATER; CHEMCAM INSTRUMENT; EMISSION-LINES; MARS; ROCKS; UNIVARIATE; PREDICTION; CHLORIDES; ELEMENTS
DOI
10.1016/j.sab.2021.106073
Chinese Library Classification (CLC) number
O433 [Spectroscopy]
Subject classification codes
0703; 070302
Abstract
Laser-induced breakdown spectroscopy (LIBS) data acquired from 2959 geochemical standards allow the effects of training set size on LIBS accuracy in geochemical analyses to be evaluated. In addition, LIBS prediction accuracies are quantified for 65 elements based on a typical benchtop instrument. Analyses used two equivalent randomly selected subsets of the full data set to compare prediction accuracies of partial least squares (PLS) models using 75, 50, 25, 10, 5, 2.5, 1, and 0.5% of the total data set for training and the remainder for testing. The number of components, a measure of complexity, in the PLS models was shown to increase with the size of the training set. Based on root mean square errors on unseen test data, our results show that the larger the training set, the lower the prediction error (and thus the better the accuracy) on unseen data. Calibration (training set) size was shown to have a first-order effect on prediction accuracy relative to spectral resolution and detector sensitivity. Different methods of assessing model accuracy using root mean square error (RMSE) are compared, including the error of the calibration (RMSE-C), the error of cross-validation (RMSE-CV), and the error of prediction (RMSE-P). Use of RMSE-C is inappropriate because the samples being predicted are those on which the model was trained. In data sets that are sufficiently large, use of test data (RMSE-P) provides the best measure of prediction accuracy, while RMSE-CV is useful only to provide an estimate of subsequent model performance. Increasing the number of cross-validation folds for our large data set yields surprisingly comparable RMSE-CV values for models with five or more (up to 100) folds, but this result is likely not applicable to smaller data sets and needs further evaluation.
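The three error measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses synthetic data and a one-variable least-squares line as a stand-in for a PLS model, and the names `fit_line`, `rmse`, `rmse_cv`, and `evaluate`, the sample count, and the noise level are all assumptions chosen for illustration. It only shows how RMSE-C (error on the training samples), RMSE-CV (error on held-out folds of the training samples), and RMSE-P (error on unseen test samples) differ in what data they score.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a simple stand-in for a PLS model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(model, xs, ys):
    """Root mean square error of the model's predictions on (xs, ys)."""
    intercept, slope = model
    squared = [(intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_cv(xs, ys, folds):
    """Pool squared errors over folds; each fold is predicted by a model fit without it."""
    n = len(xs)
    total = 0.0
    for f in range(folds):
        held_out = set(range(f, n, folds))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        intercept, slope = fit_line(train_x, train_y)
        total += sum((intercept + slope * xs[i] - ys[i]) ** 2 for i in held_out)
    return math.sqrt(total / n)


def evaluate(train_fraction, n_samples=2000, folds=5, seed=0):
    """Train on a fraction of synthetic samples and report all three RMSE variants."""
    rng = random.Random(seed)
    # Synthetic "composition vs. signal" data (an assumption, not LIBS spectra).
    xs = [rng.uniform(0.0, 10.0) for _ in range(n_samples)]
    ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    # Keep at least one training sample per fold.
    n_train = max(folds, round(train_fraction * n_samples))
    train_x, train_y = xs[:n_train], ys[:n_train]
    test_x, test_y = xs[n_train:], ys[n_train:]
    model = fit_line(train_x, train_y)
    return {
        "rmse_c": rmse(model, train_x, train_y),      # scored on the training samples
        "rmse_cv": rmse_cv(train_x, train_y, folds),  # scored on held-out training folds
        "rmse_p": rmse(model, test_x, test_y),        # scored on unseen test samples
    }


if __name__ == "__main__":
    # Training percentages taken from the abstract.
    for pct in (75, 50, 25, 10, 5, 2.5, 1, 0.5):
        result = evaluate(pct / 100)
        print(pct, {name: round(value, 3) for name, value in result.items()})
```

Because RMSE-C scores the same samples the model was fit to, it cannot exceed RMSE-CV for a least-squares fit, which is one way to see why the abstract treats it as an inappropriate accuracy measure. The numbers printed by this toy model say nothing about the magnitudes reported in the paper.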
Pages: 15