Classifying and analyzing small-angle scattering data using weighted k nearest neighbors machine learning techniques

Cited by: 34
Authors
Archibald, Richard K. [1 ]
Doucet, Mathieu [2 ]
Johnston, Travis [1 ]
Young, Steven R. [1 ]
Yang, Erika [1 ]
Heller, William T. [2 ]
Affiliations
[1] Oak Ridge Natl Lab, Comp Sci & Math Div, POB 2009, Oak Ridge, TN 37831 USA
[2] Oak Ridge Natl Lab, Neutron Scattering Div, POB 2009, Oak Ridge, TN 37831 USA
Funding
EU Horizon 2020;
Keywords
small-angle scattering data; machine learning; modeling; SasView; COMPUTER; CALIBRATION;
DOI
10.1107/S1600576720000552
Chinese Library Classification
O6 [Chemistry];
Subject classification code
0703;
Abstract
A consistent challenge for both new and expert practitioners of small-angle scattering (SAS) lies in determining how to analyze the data, given the limited information content of said data and the large number of models that can be employed. Machine learning (ML) methods are powerful tools for classifying data that have found diverse applications in many fields of science. Here, ML methods are applied to the problem of classifying SAS data for the most appropriate model to use for data analysis. The approach employed is built around the method of weighted k nearest neighbors (wKNN), and utilizes a subset of the models implemented in the SasView package (https://www.sasview.org/) for generating a well defined set of training and testing data. The prediction rate of the wKNN method implemented here using a subset of SasView models is reasonably good for many of the models, but has difficulty with others, notably those based on spherical structures. A novel expansion of the wKNN method was also developed, which uses Gaussian processes to produce local surrogate models for the classification, and this significantly improves the classification accuracy. Further, by integrating a stochastic gradient descent method during post-processing, it is possible to leverage the local surrogate model both to classify the SAS data with high accuracy and to predict the structural parameters that best describe the data. The linking of data classification and model fitting has the potential to facilitate the translation of measured data into results for both novice and expert practitioners of SAS.
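The weighted k nearest neighbors idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance metric or the weighting scheme, so Euclidean distance on log-intensity and inverse-distance weights are assumptions here. The two toy curve families (`guinier`, `power_law`) and the function `wknn_predict` are hypothetical stand-ins, not SasView models or APIs.

```python
import numpy as np

def wknn_predict(train_x, train_y, query, k=3, eps=1e-12):
    """Return the label with the largest inverse-distance-weighted vote
    among the k training curves closest to `query` (assumed weighting)."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query curve to every training curve.
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Closer neighbors get larger votes; eps avoids division by zero.
    weights = 1.0 / (dists[nearest] + eps)
    votes = {}
    for label, weight in zip(train_y[nearest], weights):
        votes[str(label)] = votes.get(str(label), 0.0) + weight
    return max(votes, key=votes.get)

# Hypothetical toy curve families on a shared q grid (log-intensity features).
q = np.linspace(0.01, 0.3, 50)

def guinier(rg):
    # log of exp(-(q * rg)^2 / 3)
    return -(q * rg) ** 2 / 3.0

def power_law(p):
    # log of (q / q[0])^(-p)
    return -p * np.log(q / q[0])

train_x = [guinier(rg) for rg in (10, 15, 20, 25)] + \
          [power_law(p) for p in (1, 2, 3, 4)]
train_y = ["guinier"] * 4 + ["power_law"] * 4

label_a = wknn_predict(train_x, train_y, guinier(18), k=3)
label_b = wknn_predict(train_x, train_y, power_law(2.5), k=3)
print(label_a, label_b)
```

In this sketch each training curve plays the role of one simulated model instance; the Gaussian-process surrogate and stochastic gradient descent steps mentioned in the abstract are not reproduced.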
Pages: 326-334
Page count: 9