Machine learning and deep learning enabled fuel sooting tendency prediction from molecular structure

Cited by: 10
Authors
Li, Runzhao [1]
Herreros, Jose Martin [1]
Tsolakis, Athanasios [1]
Yang, Wenzhao [2]
Affiliations
[1] Univ Birmingham, Coll Engn & Phys Sci, Sch Engn, Dept Mech Engn, Birmingham B15 2TT, W Midlands, England
[2] Shenzhen Gas Corp Ltd, 268 Meiao 1st Rd, Shenzhen 518049, Peoples R China
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK; Innovate UK
Keywords
YSI prediction; Molecular structure; Machine learning; Quantitative structure-property relationship; Deep learning; Convolution neural network; HYDROCARBON FUELS; NEURAL-NETWORKS; FORMULATION; INDEX;
DOI
10.1016/j.jmgm.2021.108083
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Soot formation models are becoming increasingly important in formulating advanced renewable fuels for soot reduction. This work evaluates the performance of machine learning (ML) and deep learning (DL) in predicting yield sooting index (YSI) from chemical structure, and proposes a tailor-made convolutional neural network (CNN), SDSeries38, for regression problems. In ML, a novel quantitative structure-property relationship (QSPR) is developed for feature extraction, and the relationship between molecular structure and YSI is built by an ML algorithm. In DL, SDSeries38 contains 9 feature learning modules and 1 regression module for automated feature learning and regression. It adopts a standard series network architecture and a modular structure; each feature learning module is a stack of convolution, batch normalization, activation, and pooling layers. The ML-QSPR model outperforms SDSeries38 in accuracy (RMSE = 7.563 vs 19.58) and computational speed, and it also applies to fuel mixtures. In DL, the SDSeries38 network outperforms 10 classical CNNs and provides a generic architecture that can be transferred to other regression problems. The application of DL to regression is still in its infancy, and there is no complete guide on how to develop specific CNN architectures for regression. Some gaps need to be filled: (1) specially developed CNN architectures for regression are required; (2) the performance of classical CNN architectures transferred directly from classification to regression is modest, and a modular structure with typical function modules may provide an ideal solution; (3) going deeper into the sequence of convolution layers improves predictive accuracy, but the number of layers must be kept below the threshold at which gradients vanish.
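The modular series structure described in the abstract can be illustrated with a minimal Python sketch that only lays out the layer order; it does not build or train a network. The four layers per feature learning module and the count of 9 modules come from the abstract. The contents of the regression module are not given there, so the two layer names in `REGRESSION_MODULE` are assumptions, and `build_series_layout` is a hypothetical helper, not the authors' code.

```python
# Layer order inside one feature learning module, as listed in the abstract.
FEATURE_MODULE = ("convolution", "batch_normalization", "activation", "pooling")

# Assumption: the abstract does not list the regression module's layers;
# these two names are placeholders for illustration only.
REGRESSION_MODULE = ("fully_connected", "regression_output")


def build_series_layout(n_feature_modules=9):
    """Return a flat, ordered list of layer names for a series (non-branching) network."""
    layout = []
    # Stack identical feature learning modules one after another.
    for index in range(1, n_feature_modules + 1):
        for layer in FEATURE_MODULE:
            layout.append(f"module{index}_{layer}")
    # Close the stack with a single regression module.
    for layer in REGRESSION_MODULE:
        layout.append(f"regression_{layer}")
    return layout


layout = build_series_layout()
print(layout[:4])   # layers of the first feature learning module
print(layout[-2:])  # layers of the (assumed) regression module
```

Because every feature learning module shares the same template, changing `n_feature_modules` is the only step needed to explore shallower or deeper variants, which is the kind of depth trade-off the abstract raises in point (3).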
Pages: 18