Insight to the prediction of CO2 solubility in ionic liquids based on the interpretable machine learning model

Cited by: 21
Authors
Yang, Ao [1]
Sun, Shirui [2]
Su, Yang [3]
Kong, Zong Yang [4]
Ren, Jingzheng [5]
Shen, Weifeng [6]
Affiliations
[1] Chongqing Univ Sci & Technol, Coll Safety Engn, Chongqing 401331, Peoples R China
[2] Yangtze Normal Univ, Coll Chem & Chem Engn, Chongqing 408100, Peoples R China
[3] Chongqing Univ Sci & Technol, Sch Intelligent Technol & Engn, Chongqing 401331, Peoples R China
[4] Sunway Univ, Sch Engn & Technol, Dept Engn, Bandar Sunway 47500, Selangor, Malaysia
[5] Hong Kong Polytech Univ, Res Inst Adv Mfg, Dept Ind & Syst Engn, Hong Kong, Peoples R China
[6] Chongqing Univ, Sch Chem & Chem Engn, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Machine learning; CO2 capture; Interpretation model; QSPR; Ionic liquids; MOLECULAR DESIGN; LIGHTGBM; SYSTEMS; TOOLS;
DOI
10.1016/j.ces.2024.120266
Chinese Library Classification number
TQ [Chemical Industry];
Discipline classification code
0817;
Abstract
In this work, we investigated three machine learning (ML) models, i.e., Gaussian process regression (GPR), LightGBM, and CatBoost, for predicting the solubility of CO2 in various ionic liquids (ILs). Three types of molecular descriptors, i.e., group contribution (GC), molecular structure descriptors (MSD), and hybrid GC-MSD, were used with each of the three models. The performance of the developed models was evaluated using mean absolute error (MAE), coefficient of determination (R2), and mean relative error (MRE) (i.e., relative deviation in percentage), with each model subjected to multiple tests employing different random state parameters. The dataset was partitioned into training and testing sets at an 80:20 ratio, with additional splits at various ratios to assess the sensitivity of prediction performance. Overall, all models predicted CO2 solubility in ILs well, with performance varying by descriptor type. Notably, the hybrid GC-MSD descriptor consistently outperformed the others, which is attributed to its incorporating a broader array of molecular feature information. In particular, the CatBoost-GC-MSD model excelled, achieving an R2 of 0.9925, an MAE of 0.0122, and an MRE of 11.1550%. Comparison with previous studies revealed the superior performance of CatBoost-GC-MSD across all descriptor types. Furthermore, our model interpretation, employing Shapley additive explanation (SHAP) analysis, identified pressure, temperature, Chi0, Kappa2, and EState_VSA10 as the top five influential input features. These findings provide valuable insights into the underlying molecular features affecting CO2 solubility in ILs and lay the foundation for future research in this field.
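The evaluation protocol described in the abstract (an 80:20 train/test split repeated under different random states, scored by MAE, R2, and MRE) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the synthetic data, the least-squares stand-in model, and the reading of MRE as the mean of |y - y_pred| / |y| expressed in percent are all assumptions made for the example.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mre_percent(y_true, y_pred):
    """Mean relative error in percent (assumed definition; needs y_true != 0)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))


def split_indices(n_samples, test_fraction=0.2, random_state=0):
    """Shuffle sample indices with a seed and return (train_idx, test_idx)."""
    rng = np.random.default_rng(random_state)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


if __name__ == "__main__":
    # Synthetic stand-in data: three hypothetical input features and a
    # strictly positive target, so the relative error is well defined.
    rng = np.random.default_rng(42)
    X = rng.uniform(0.1, 1.0, size=(200, 3))
    y = 0.5 * X[:, 0] + 0.2 * X[:, 1] + 0.05 + rng.normal(0.0, 0.01, 200)

    # Repeat the 80:20 split under several random states.
    for seed in range(3):
        train, test = split_indices(len(y), test_fraction=0.2, random_state=seed)
        # Ordinary least squares as a placeholder for GPR/LightGBM/CatBoost.
        A_train = np.c_[X[train], np.ones(len(train))]
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        pred = np.c_[X[test], np.ones(len(test))] @ coef
        print(f"seed={seed} MAE={mae(y[test], pred):.4f} "
              f"R2={r2(y[test], pred):.4f} MRE={mre_percent(y[test], pred):.2f}%")
```

Changing `test_fraction` in `split_indices` gives the additional split ratios mentioned in the abstract. Because MRE divides by the true value, small solubility values can produce a large MRE even when MAE is small.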
Pages: 10