PREDICTION OF FORMATION ENERGY USING TWO-STAGE MACHINE LEARNING BASED ON CLUSTERING

Cited by: 2
Authors
Fan, Xingyue [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
Source
MATERIALI IN TEHNOLOGIJE | 2021 / Vol. 55 / Issue 02
Keywords
ABO3-type perovskites; formation energy; hierarchical clustering; regression model
DOI
10.17222/mit.2020.174
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
The formation energy (Hf) is one of the key properties associated with the thermodynamic stability of ABO3-type perovskites. In this work, a two-stage machine-learning approach that combines hierarchical clustering with regression was designed to improve the prediction of the density-functional theory (DFT) Hf of ABO3-type perovskites. The global dataset was divided into Cluster 1 and Cluster 2 using the Calinski-Harabasz index (CHI). To compare prediction performance for Hf, four regression methods, namely decision tree regression (DTR), gradient boosted regression trees (GBRT), random forest regression (RFR) and extra tree regression (ETR), were each used to build models on Cluster 1, Cluster 2 and the global dataset. All four regression models trained on Cluster 1 had a higher R² and a lower MSE and MAE than the corresponding models trained on the global dataset, while the models trained on Cluster 2 performed worse. The GBRT model of Cluster 1 achieved an R² of 0.917, with an MSE and MAE of 0.033 eV/atom and 0.125 eV/atom, respectively. We further validated and compared the generalization ability of the models by predicting the Hf of ABO3-type perovskites not present in the training set. The two-stage machine-learning models proposed here can provide useful guidance for accelerating the exploration of materials with desired properties.
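The two-stage workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the author's implementation: the data are synthetic stand-ins for perovskite descriptors and Hf values, Ward linkage (the scikit-learn default) is assumed for the hierarchical clustering, and choosing the number of clusters by the highest CHI is an assumption about how the index was applied. The names `make_models`, `evaluate` and `results` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import (ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import (calinski_harabasz_score, mean_absolute_error,
                             mean_squared_error, r2_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 300 "compounds", 5 descriptors each,
# drawn from two separated groups, with an artificial target playing Hf.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (150, 5)),
               rng.normal(4.0, 1.0, (150, 5))])
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, 300)

# Stage 1: hierarchical clustering; pick the cluster count with the highest CHI.
chi = {}
for k in range(2, 6):
    labels_k = AgglomerativeClustering(n_clusters=k).fit_predict(X)
    chi[k] = calinski_harabasz_score(X, labels_k)
best_k = max(chi, key=chi.get)
labels = AgglomerativeClustering(n_clusters=best_k).fit_predict(X)


def make_models():
    """Return fresh instances of the four regressors named in the abstract."""
    return {
        "DTR": DecisionTreeRegressor(random_state=0),
        "GBRT": GradientBoostingRegressor(random_state=0),
        "RFR": RandomForestRegressor(n_estimators=100, random_state=0),
        "ETR": ExtraTreesRegressor(n_estimators=100, random_state=0),
    }


def evaluate(X_sub, y_sub):
    """Fit each regressor on an 80/20 split and report R2, MSE and MAE."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sub, y_sub, test_size=0.2, random_state=0)
    scores = {}
    for name, model in make_models().items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        scores[name] = {
            "R2": r2_score(y_te, pred),
            "MSE": mean_squared_error(y_te, pred),
            "MAE": mean_absolute_error(y_te, pred),
        }
    return scores


# Stage 2: regression on the global dataset and on each cluster separately.
results = {"global": evaluate(X, y)}
for c in range(best_k):
    mask = labels == c
    results[f"cluster_{c + 1}"] = evaluate(X[mask], y[mask])

for subset, scores in results.items():
    for name, s in scores.items():
        print(f"{subset:10s} {name:5s} R2={s['R2']:.3f} "
              f"MSE={s['MSE']:.3f} MAE={s['MAE']:.3f}")
```

Comparing the per-cluster rows against the `global` rows mirrors the comparison reported in the abstract; the numbers printed here come from synthetic data and say nothing about the paper's results.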
Pages: 263-268
Number of pages: 6