Deep transfer learning for predicting frontier orbital energies of organic materials using small data and its application to porphyrin photocatalysts

Cited by: 8
Authors
Su, An [1 ]
Zhang, Xin [1 ]
Zhang, Chengwei [1 ]
Ding, Debo [1 ]
Yang, Yun-Fang [1 ]
Wang, Keke [1 ]
She, Yuan-Bin [1 ]
Affiliations
[1] Zhejiang Univ Technol, Coll Chem Engn, Hangzhou 310014, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
GAUSSIAN-TYPE BASIS; VISIBLE-LIGHT; BASIS-SETS; AQUEOUS-SOLUTIONS; DESIGN; CO2; HYDROXYLATION; POTENTIALS; CONVERSION; REDUCTION;
DOI
10.1039/d3cp00917c
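Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304 ; 081704 ;
Abstract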
Machine learning (ML) models have received increasing attention as a new approach for the virtual screening of organic materials. Although some ML models trained on large databases have achieved high prediction accuracy, the application of ML to certain types of organic materials is limited by the small amount of available data. On the other hand, metalloporphyrins and porphyrins (MpPs) have received increasing attention as potential photocatalysts, and recent studies have found that both HOMO/LUMO energy levels and energy gaps are important factors controlling the MpP photocatalysts. Since the training data of MpPs are insufficient and limited to porphyrin-based dyes, in this study, we proposed a deep transfer learning approach to rapidly predict the HOMO/LUMO energy levels and energy gaps of MpPs. To complement the open-source Porphyrin-based Dyes Database (PBDD), we curated a new database, the Metalloporphyrins and Porphyrins Database (MpPD), in which MpPs were specifically designed as potential photocatalysts and the HOMO/LUMO energies were calculated by advanced DFT functionals. We proposed PorphyBERT, a BERT-based regression model that was pre-trained with PBDD and fine-tuned with MpPD. The model performed satisfactorily in predicting HOMO and LUMO energies and energy gap with RMSEs of 0.0955, 0.0988, and 0.0787 eV and MAEs of 0.0774, 0.0824, and 0.0549 eV. Furthermore, due to its unique unsupervised pre-training phase, the model is not affected by the difference in computational functionals between pre-training and fine-tuning databases. Finally, we recommended 12 MpPs as potential photocatalysts for CO2 reduction with out-of-sample model predictions of energy gaps close to the values calculated by DFT.
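The RMSE and MAE values quoted in the abstract are standard regression error metrics. As a minimal sketch of how such figures are computed, the snippet below compares predicted energy gaps against reference DFT values. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper or its databases.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical energy gaps in eV (illustrative only, not data from the paper).
dft_gap = [2.0, 2.5, 3.0, 1.5]        # reference values, e.g. from DFT
predicted_gap = [2.1, 2.4, 3.2, 1.5]  # values a regression model might output

print(f"RMSE = {rmse(dft_gap, predicted_gap):.4f} eV")
print(f"MAE  = {mae(dft_gap, predicted_gap):.4f} eV")
```

RMSE weights large errors more heavily than MAE, so RMSE is never smaller than MAE on the same data. This matches the ordering of the paired values reported in the abstract.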
Pages: 10536 - 10549
Number of pages: 14
Related papers
25 in total
  • [1] Assessment of Predicting Frontier Orbital Energies for Small Organic Molecules Using Knowledge-Based and Structural Information
    Ye, Zong-Rong
    Hung, Sheng-Hsuan
    Chen, Berlin
    Tsai, Ming-Kang
    ACS ENGINEERING AU, 2022, 2 (04) : 360 - 368
  • [2] Construction frontier molecular orbital prediction model with transfer learning for organic materials
    Peng, Xinyu
    Liang, Jiaojiao
    Wang, Kuo
    Zhao, Xiaojie
    Peng, Zhiyan
    Li, Zhennan
    Zeng, Jinhui
    Lan, Zheng
    Lei, Min
    Huang, Di
    NPJ COMPUTATIONAL MATERIALS, 2024, 10 (01)
  • [3] Deep Learning Total Energies and Orbital Energies of Large Organic Molecules Using Hybridization of Molecular Fingerprints
    Rahaman, Obaidur
    Gagliardi, Alessio
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2020, 60 (12) : 5971 - 5983
  • [4] Predicting Small Molecule Transfer Free Energies by Combining Molecular Dynamics Simulations and Deep Learning
    Bennett, W. F. Drew
    He, Stewart
    Bilodeau, Camille L.
    Jones, Derek
    Sun, Delin
    Kim, Hyojin
    Allen, Jonathan E.
    Lightstone, Felice C.
    Ingolfsson, Helgi I.
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2020, 60 (11) : 5375 - 5381
  • [5] Predicting Materials Properties with Little Data Using Shotgun Transfer Learning
    Yamada, Hironao
    Liu, Chang
    Wu, Stephen
    Koyama, Yukinori
    Ju, Shenghong
    Shiomi, Junichiro
    Morikawa, Junko
    Yoshida, Ryo
    ACS CENTRAL SCIENCE, 2019, 5 (10) : 1717 - 1730
  • [6] Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies
    Zhang, Zhihui
    Zhang, Chengwei
    Zhang, Yutao
    Deng, Shengwei
    Yang, Yun-Fang
    Su, An
    She, Yuan-Bin
    RSC ADVANCES, 2023, 13 (25) : 16952 - 16962
  • [7] A general deep transfer learning framework for predicting the flow field of airfoils with small data
    Wang, Zhipeng
    Liu, Xuejun
    Yu, Jian
    Wu, Haizhou
    Lyu, Hongqiang
    COMPUTERS & FLUIDS, 2023, 251
  • [8] Predicting socioeconomic indicators using transfer learning on imagery data: an application in Brazil
    Castro, Diego A.
    Álvarez, Mauricio A.
    GEOJOURNAL, 2023, 88 (01) : 1081 - 1102
  • [9] Predicting the UV-Vis spectra of tetraarylcyclopentadienones: Using DFT molecular orbital energies to model electronic transitions of organic materials
    Potter, Robert G.
    Hughes, Thomas S.
    JOURNAL OF ORGANIC CHEMISTRY, 2008, 73 (08) : 2995 - 3004