Predicting electronic stopping powers using stacking ensemble machine learning method

Cited by: 13
Authors
Akbari, Fatemeh [1, 2]
Taghizadeh, Somayeh [1 ]
Shvydka, Diana [1 ]
Sperling, Nicholas Niven [1 ]
Parsai, E. Ishmael [1 ]
Affiliations
[1] Univ Toledo, Dept Radiat Oncol, Hlth Sci Campus, Toledo, OH 43614 USA
[2] Mail Stop 1151,3000 Arlington Ave, Toledo, OH 43614 USA
Keywords
Stopping power; Prediction; Machine learning; Stacking ensemble methods; ABSOLUTE ERROR MAE; ACCURACY; IONS; RMSE;
DOI
10.1016/j.nimb.2023.02.023
Chinese Library Classification number
TH7 [Instruments and meters]
Discipline classification codes
0804; 080401; 081102
Abstract
Purpose: Accurate electronic stopping power data are crucial for calculating radiation-induced effects in a wide range of applications, from dosimetry and radiotherapy to particle physics. The data depend on the parameters of both the incident charged particle and the stopping medium. The existing Bethe theory can be used to calculate the stopping power of high-energy ions but fails at lower energies, leaving incomplete and even contradictory experimental data, often extended through extrapolation with fitting formulas, as the only accessible resource. Moreover, most of the experimental data are available for elements only, further limiting the validity of fitting approaches for complex material compositions. Machine learning, a relatively new methodology, has proven effective for exactly these types of problems. In this study, a Stacking Ensemble Machine Learning (EML) algorithm was developed to predict the electronic stopping power for any incident ion and target combination over a wide range of ion energies. For this purpose, five machine learning (ML) models, namely Bagging Regressor (BR), eXtreme Gradient Boosting (XGB), Adaptive Boosting (AdB), Gradient Boosting (GB), and Random Forest (RF), were selected as base and meta learners to construct the final Stacking EML.
Methods: A total of 40,044 experimental measurements, from 1928 to the present, available on the International Atomic Energy Agency (IAEA) website were used to train the ML algorithms. This database covers 593 ion-target combinations across the energy range of 0.037 to 985 MeV. For model training, the eleven most important features were selected. Model evaluation was performed using several error metrics, including R-squared (R²), root-mean-squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), on both the training and test datasets.
Results: Based on the model performance evaluation, a stack of XGB and RF combined via a BR meta-learner had the lowest error. A value of R² = 0.9985 indicated a near-ideal fit to all samples in the training data across the entire range of stopping powers, and R² = 0.9955 on the unseen test data suggested that the model predicted the test data accurately.
Conclusions: The developed model can serve as a universal tool for generating electronic stopping power (eSP) data in a wide range of cases, regardless of the availability of experimental data or reliable theoretical equations. Overall, the results demonstrate the power of machine learning approaches and the suitability of the chosen models for solving practically important physics problems.
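The stacking arrangement described in the abstract (two tree-ensemble base learners combined by a Bagging Regressor meta-learner, scored with R², RMSE, MAE, and MAPE on training and test splits) can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the IAEA database, scikit-learn's `GradientBoostingRegressor` stands in for XGB to avoid an extra dependency, and every hyperparameter, the 80/20 split, and the helper name `report` are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import (
    BaggingRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the IAEA stopping-power measurements).
# The offset of 20 keeps the target positive so MAPE is well defined.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(600, 3))
y = 20.0 + 10.0 * X[:, 0] + 5.0 * np.sin(3.0 * X[:, 1]) + X[:, 2] ** 2
y = y + rng.normal(0.0, 0.1, size=600)

# Assumed 80/20 split; the abstract does not state the split ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Base learners: Random Forest plus a gradient-boosting model
# (assumption: GradientBoostingRegressor used in place of XGB).
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingRegressor(random_state=0)),
]

# Meta-learner: Bagging Regressor fitted on cross-validated predictions
# of the base learners.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=BaggingRegressor(n_estimators=20, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)


def report(y_true, y_pred):
    """Return the four error metrics named in the abstract."""
    return {
        "R2": float(r2_score(y_true, y_pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        "MAPE": float(mean_absolute_percentage_error(y_true, y_pred)),
    }


train_metrics = report(y_train, stack.predict(X_train))
test_metrics = report(y_test, stack.predict(X_test))
print("train:", train_metrics)
print("test: ", test_metrics)
```

With `cv=5`, the meta-learner is trained on out-of-fold predictions from the base learners, which limits leakage of the training targets into the second stage. Comparing the training and test metrics side by side, as the abstract does, gives a quick check for overfitting.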
Pages: 8-16
Page count: 9