Advancing hydrogen storage predictions in metal-organic frameworks: A comparative study of LightGBM and random forest models with data enhancement

Cited by: 28
Authors
Seyyedattar, Masoud [1]
Zendehboudi, Sohrab [1]
Ghamartale, Ali [2]
Afshar, Majid [3]
Affiliations
[1] Mem Univ, Fac Engn & Appl Sci, St John, NF, Canada
[2] Univ Alberta, Dept Mech Engn, Edmonton, AB, Canada
[3] Univ Windsor, Sch Comp Sci, Windsor, ON, Canada
Keywords
Hydrogen storage; Metal-Organic Frameworks (MOFs); Ensemble learning; Feature engineering; LightGBM; Random forest; ENERGY; STRATEGIES; ABSORPTION; ADSORPTION; SYSTEMS; CO2;
DOI
10.1016/j.ijhydene.2024.04.230
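Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification code
070304; 081704
Abstract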
The escalating consumption of fossil fuels has given rise to a substantial upsurge in greenhouse gas concentrations and global temperatures, which, in turn, has triggered severe climate-related consequences. The critical imperative to reduce CO2 emissions and combat global warming has spurred extensive investigations into clean energy alternatives, with hydrogen emerging as a compelling zero-emission energy source. As a pivotal component of clean energy strategies, hydrogen requires designing compact, lightweight, and efficient storage systems. This study focuses on the development and evaluation of machine learning models for predicting the efficiency of Metal-Organic Frameworks (MOFs) in hydrogen storage, a key aspect of advancing clean energy technologies. MOFs, a class of nanoporous materials, show remarkable potential for hydrogen storage due to their high surface area and porosity. However, selecting the most suitable MOF for this application from a vast array of possible structures is a daunting task. In this context, machine learning algorithms offer an efficient alternative for predicting MOF suitability by considering their structural and chemical properties. We used ensemble learning methods, specifically Light Gradient Boosting Machine (LightGBM) and Random Forest (RF), to predict hydrogen uptake of MOFs based on a dataset of 219 experimentally tested samples. Two modeling scenarios were considered: one using the entire dataset, and the other involving strategic data pre-processing, including outlier removal and feature engineering. The results demonstrate that the measures taken to refine the dataset significantly enhance the predictive performance of the developed models, reducing prediction errors and improving overall goodness of fit. 
Specifically, the Mean Absolute Error (MAE) fell from 0.48 (LightGBM) and 0.94 (random forest) to 0.16 for both models, and the coefficient of determination (R²) rose from 0.84 and 0.72, respectively, to 0.95 for both. Moreover, feature importance analysis revealed that pressure-related features contribute most to the construction of the tree ensembles during model training. A parametric sensitivity analysis showed that H2 uptake in MOFs is most sensitive to changes in adsorption enthalpy, followed by surface area and temperature, and less sensitive to variations in pressure, consistent with established literature. These results underscore the pivotal role of data enhancement methods in refining machine learning models and can be instrumental in accelerating the development and optimization of MOF materials for clean energy applications.
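The two metrics quoted in the abstract, MAE and R², and the idea of removing outliers before training can be illustrated with a minimal sketch. It uses only the Python standard library. The function names, the toy numbers, and the interquartile-range (IQR) rule are assumptions made for illustration; the abstract does not state which outlier criterion the authors applied, and none of the values below come from the paper's dataset.

```python
from statistics import mean, quantiles


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule is one common outlier criterion; it is not
    claimed to be the method used in the paper.
    """
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


# Toy values (hypothetical, not from the paper's 219-sample dataset).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)

# A toy uptake list with one extreme value that the IQR rule discards.
cleaned = iqr_filter([1, 2, 3, 4, 5, 100])
```

A lower MAE and an R² closer to 1 after cleaning correspond to the kind of improvement the abstract reports between its two modeling scenarios.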
Pages: 158-172
Number of pages: 15