Generalizing property prediction of ionic liquids from limited labeled data: a one-stop framework empowered by transfer learning

Cited by: 15
Authors
Chen, Guzhong [1, 2]
Song, Zhen [1]
Qi, Zhiwen [1]
Sundmacher, Kai [2, 3]
Affiliations
[1] East China Univ Sci & Technol, Sch Chem Engn, State Key Lab Chem Engn, 130 Meilong Rd, Shanghai 200237, Peoples R China
[2] Max Planck Inst Dynam Complex Tech Syst, Proc Syst Engn, Sandtorstr 1, D-39106 Magdeburg, Germany
[3] Otto von Guericke Univ, Proc Syst Engn, Univ Pl 2, D-39106 Magdeburg, Germany
Source
DIGITAL DISCOVERY | 2023 / Vol. 2 / Issue 03
Funding
National Natural Science Foundation of China
Keywords
GROUP-CONTRIBUTION QSPRS; EXTENSIVE DATABASES; SOLVENTS; DESIGN; SMILES; STATE;
DOI
10.1039/d3dd00040k
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Ionic liquids (ILs) could find use in almost every chemical process owing to their wide spectrum of unique properties. The crux is whether task-specific IL selection from an enormous chemical space can be achieved through property prediction, for which limited labeled data is a major obstacle. Here, we propose ILTransR (IL transfer learning of representations), a one-stop framework that uses large-scale unlabeled data to generalize IL property prediction from limited labeled data. A transformer model is first pre-trained on ~10 million IL-like molecules, and IL representations are derived from its encoder state. Using these pre-trained representations, convolutional neural network (CNN) models for IL property prediction are trained and tested on eleven datasets covering different IL properties. ILTransR outperforms state-of-the-art models on all benchmarks. Its application is exemplified by an extensive screening of CO2 absorbents from a database of 8 333 096 synthetically feasible ILs. We introduce ILTransR, a transfer-learning-based one-stop framework for predicting ionic liquid (IL) properties. High accuracy can be achieved by pre-training the model on millions of unlabeled molecules and fine-tuning on limited labeled data.
Pages: 591-601
Page count: 12