Extracting structural motifs from pair distribution function data of nanostructures using explainable machine learning

Cited by: 25
Authors
Anker, Andy S. [1, 2]
Kjaer, Emil T. S. [1, 2]
Juelsholt, Mikkel [3]
Christiansen, Troels Lindahl [1, 2]
Skjaervo, Susanne Linn [1, 2]
Jorgensen, Mads Ry Vogel [4, 5, 6]
Kantor, Innokenty [6, 7]
Sorensen, Daniel Risskov [4, 5, 6]
Billinge, Simon J. L. [8, 9]
Selvan, Raghavendra [10, 11]
Jensen, Kirsten M. O. [1, 2]
Affiliations
[1] Univ Copenhagen, Dept Chem, DK-2100 Copenhagen, Denmark
[2] Univ Copenhagen, Nanosci Ctr, DK-2100 Copenhagen, Denmark
[3] Univ Oxford, Dept Mat, Parks Rd, Oxford, England
[4] Aarhus Univ, Dept Chem, DK-8000 Aarhus, Denmark
[5] Aarhus Univ, iNANO, DK-8000 Aarhus, Denmark
[6] Lund Univ, MAX IV Lab, S-22484 Lund, Sweden
[7] Tech Univ Denmark, Dept Phys, DK-2880 Lyngby, Denmark
[8] Columbia Univ, Dept Appl Phys & Appl Math, New York, NY 10027 USA
[9] Brookhaven Natl Lab, Condensed Matter Phys & Mat Sci Dept, Upton, NY 11973 USA
[10] Univ Copenhagen, Dept Comp Sci, DK-2100 Copenhagen, Denmark
[11] Univ Copenhagen, Dept Neurosci, DK-2200 Copenhagen, Denmark
Funding
Swedish Research Council; European Research Council; US National Science Foundation;
Keywords
AB-INITIO DETERMINATION; ATOMIC-STRUCTURE; CRYSTAL; PROGRAM; CRYSTALLOGRAPHY; NANOPARTICLES; COMPLEX;
DOI
10.1038/s41524-022-00896-3
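Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code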
070304 ; 081704 ;
Abstract
Characterization of material structure with X-ray or neutron scattering, using e.g. Pair Distribution Function (PDF) analysis, most often relies on refining a structure model against an experimental dataset. However, identifying a suitable model is often a bottleneck. Recently, automated approaches have made it possible to test thousands of models for each dataset, but these methods are computationally expensive, and analysing the output, i.e. extracting structural information from the resulting fits in a meaningful way, is challenging. Our Machine Learning based Motif Extractor (ML-MotEx) trains an ML algorithm on thousands of fits and uses SHAP (SHapley Additive exPlanation) values to identify which model features are important for the fit quality. We apply the method to four different chemical systems, including disordered nanomaterials and clusters. ML-MotEx opens up a type of modelling in which each feature in a model is assigned an importance value for the fit quality based on explainable ML.
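The idea of assigning each model feature an importance value for the fit quality can be illustrated with a minimal sketch. It is not the ML-MotEx implementation: instead of training an ML algorithm on thousands of fits and computing SHAP values for it, the sketch computes exact Shapley values directly on a hypothetical toy fit-quality function. The function `toy_rwp`, its numbers, and the four "atoms" are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal change in fit quality when feature i is added.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_rwp(kept):
    """Hypothetical fit residual for a model that keeps the atoms in `kept`.

    Assumed for illustration only: atoms 0 and 1 belong to the true motif
    (keeping them lowers the residual), atom 2 is irrelevant, and atom 3 is
    spurious (keeping it raises the residual).
    """
    rwp = 0.80
    if 0 in kept:
        rwp -= 0.20
    if 1 in kept:
        rwp -= 0.10
    if 0 in kept and 1 in kept:
        rwp -= 0.10  # atoms 0 and 1 improve the fit further together
    if 3 in kept:
        rwp += 0.15
    return rwp


phi = shapley_values(4, toy_rwp)
for atom, contribution in enumerate(phi):
    print(f"atom {atom}: Shapley contribution to residual = {contribution:+.3f}")
```

In this toy setting a negative value marks an atom whose presence improves the fit, a positive value marks one that degrades it, and the values sum to the difference between the residual of the full model and that of the empty model. Exact enumeration scales exponentially with the number of features, which is why approximations tailored to the trained model are used in practice.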
Pages: 11
Related papers
45 in total
[1]   Formation and growth mechanism for niobium oxide nanoparticles: atomistic insight from in situ X-ray total scattering [J].
Aalling-Frederiksen, Olivia ;
Juelsholt, Mikkel ;
Anker, Andy S. ;
Jensen, Kirsten M. O. .
NANOSCALE, 2021, 13 (17) :8087-8097
[2]   Anker A. S., 2020, PROC 16 INT WORKSHOP
[3]   Structural Changes during the Growth of Atomically Precise Metal Oxido Nanoclusters from Combined Pair Distribution Function and Small-Angle X-ray Scattering Analysis [J].
Anker, Andy S. ;
Christiansen, Troels Lindahl ;
Weber, Marcus ;
Schmiele, Martin ;
Brok, Erik ;
Kjaer, Emil T. S. ;
Juhas, Pavol ;
Thomas, Rico ;
Mehring, Michael ;
Jensen, Kirsten M. O. .
ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2021, 60 (37) :20407-20416
[4]   Cluster-mining: an approach for determining core structures of metallic nanoparticles from atomic pair distribution function data [J].
Banerjee, Soham ;
Liu, Chia-Hao ;
Jensen, Kirsten M. O. ;
Juhas, Pavol ;
Lee, Jennifer D. ;
Tofanelli, Marcus ;
Ackerson, Christopher J. ;
Murray, Christopher B. ;
Billinge, Simon J. L. .
ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2020, 76 :24-31
[5]   Amorphous Metal-Organic Frameworks [J].
Bennett, Thomas D. ;
Cheetham, Anthony K. .
ACCOUNTS OF CHEMICAL RESEARCH, 2014, 47 (05) :1555-1562
[6]   The problem with determining atomic structure at the nanoscale [J].
Billinge, Simon J. L. ;
Levin, Igor .
SCIENCE, 2007, 316 (5824) :561-565
[7]   Beyond crystallography: the study of disorder, nanocrystallinity and crystallographically challenged materials with pair distribution functions [J].
Billinge, SJL ;
Kanatzidis, MG .
CHEMICAL COMMUNICATIONS, 2004, (07) :749-760
[8]   Structural Relationships among Methyl-, Dimethyl-, and Trimethylammonium Phosphododecatungstates [J].
Busbongthong, Suntharee ;
Ozeki, Tomoji .
BULLETIN OF THE CHEMICAL SOCIETY OF JAPAN, 2009, 82 (11) :1393-1397
[9]   Interpretable, calibrated neural networks for analysis and understanding of inelastic neutron scattering data [J].
Butler, Keith T. ;
Le, Manh Duc ;
Thiyagalingam, Jeyan ;
Perring, Toby G. .
JOURNAL OF PHYSICS-CONDENSED MATTER, 2021, 33 (19)
[10]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794