Evaluation of performance of machine learning methods in mining structure-property data of halide perovskite materials

Cited by: 11
Authors
Zhao, Ruoting [1 ,2 ]
Xing, Bangyu [1 ,2 ]
Mu, Huimin [3 ]
Fu, Yuhao [3 ,4 ]
Zhang, Lijun [1 ,2 ,4 ]
Affiliations
[1] Jilin Univ, State Key Lab Integrated Optoelect, Key Lab Automobile Mat MOE,Electron Microscopy Ct, Jilin Prov Int Cooperat Key Lab High Efficiency C, Changchun 130012, Peoples R China
[2] Jilin Univ, Sch Mat Sci & Engn, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Phys, State Key Lab Superhard Mat, Changchun 130012, Peoples R China
[4] Jilin Univ, Int Ctr Computat Method & Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
machine learning; material informatics; first-principles calculations; halide perovskites; ORGANIC-INORGANIC PEROVSKITES; DESIGN; APPROXIMATION; SINGLE;
DOI
10.1088/1674-1056/ac5d2d
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702 ;
Abstract
With the rapid development of artificial intelligence and machine learning (ML) methods, materials science is entering the era of data-driven materials informatics. ML models are the most crucial component, bridging material structure and material properties. The prediction performance of different ML methods varies considerably across material systems. Herein, we evaluated three categories of models (linear, kernel, and nonlinear methods), covering twelve ML algorithms commonly used in the materials field. Halide perovskites were chosen as an example system to evaluate the fitting performance of the different models. We constructed a dataset of 540 halide perovskites and 72 features, with formation energy and bandgap as target properties. We found that the different categories of ML models show similar trends for the two target properties. The difference between the models is large for the formation energy, with the coefficient of determination (R²) ranging from 0.69 to 0.953. The fitting performance of the models is closer for the bandgap, with R² ranging from 0.941 to 0.997. The nonlinear-ensemble models show the best fitting performance for both the formation energy and the bandgap. This indicates that nonlinear-ensemble models, constructed by combining multiple weak learners, effectively describe the nonlinear relationship between material features and the target property. In addition, the extreme gradient boosting decision tree model gives the best results among all the models and identifies two new descriptors that are crucial for formation energy and bandgap. Our work provides useful guidance for the selection of effective machine learning methods in data-mining studies of specific material systems. The dataset that supports the findings of this study is available in Science Data Bank.
Pages: 8
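The abstract ranks models by the coefficient of determination (R²). As a minimal sketch of how that metric compares two sets of predictions, the snippet below uses the common definition R² = 1 - SS_res / SS_tot. The function name `r2_score`, the model names, and all numbers are hypothetical illustrations and are not taken from the paper or its dataset.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only; not data from the paper.
reference = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [2.0, 2.0, 3.0, 3.0],
}

# Score every candidate model against the same reference values.
scores = {name: r2_score(reference, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
```

A value closer to 1 means the predictions account for more of the variance in the reference values, which is the sense in which the abstract's R² ranges are compared.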