Variable selection and data fusion for diesel cetane number prediction

Cited by: 2
Authors
Buendia-Garcia, J. [1, 3]
Lacoue-Negre, M. [1, 3]
Gornay, J. [1]
Mas-Garcia, S. [2, 3]
Bendoula, R. [2, 3]
Roger, J. M. [2, 3]
Affiliations
[1] IFP Energies Nouvelles, Solaize, France
[2] Univ Montpellier, Inst Agro, ITAP, INRAE, Montpellier, France
[3] ChemHouse Res Grp, Montpellier, France
Keywords
Variable selection; Near-Infrared (NIR); Process variables; Data fusion; Hydrocracking; Diesel fuel; Cetane number; Multivariate calibration; Spectroscopy; Algorithm; Model
DOI
10.1016/j.fuel.2022.126297
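Chinese Library Classification
TE [Petroleum and natural gas industry]; TK [Energy and power engineering]
Discipline classification codes
0807; 0820
Abstract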
This study evaluates whether variable selection can improve data fusion models that estimate diesel cetane number from two sources: near-infrared (NIR) spectra acquired on total effluent samples from the hydrocracking process, and the corresponding process operating variables. The evaluation was divided into four steps. First, predictive models were developed using each data block separately. Next, seven variable selection methods were applied to the NIR block and eleven to the process variable block. Then, single-block prediction models were built from each data set produced by the variable selection analysis and compared with those developed in the first step. Finally, once the best variable selection method had been identified for each data block, data fusion was performed. Two data fusion models were generated: the first using all the variables in the two blocks and the second using only the previously selected variables. In addition, the potential of the sequential and orthogonalized covariance selection (SO-CovSel) method was analyzed. The results showed that data fusion modelling using all variables from each data block improves the estimation of the diesel cetane number compared to single-block models (about 20% reduction of the RMSEP). However, applying variable selection before data fusion significantly improves the estimation of this property and leads to greater model stability in terms of the RMSE and r values (about 47% of the RMSEP). Covariance Selection (CovSel) was the most efficient method for the NIR data block, while sequential backward floating feature selection (SBFFS) gave the best performance for the process variable data block.
The benefits of variable selection were not only a more accurate prediction of the property but also a better analysis and understanding of the process, because it identifies the variables that significantly affect the property studied.
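The workflow summarized in the abstract (select variables per block, then fuse the selected variables) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic, `covsel` is a simplified greedy covariance-based selection with deflation in the spirit of CovSel, the fusion is a plain column concatenation, and the regression is ordinary least squares rather than whatever calibration method the authors used. The names `covsel`, `rmsep`, `nir`, and `proc` are hypothetical.

```python
import numpy as np

def covsel(X, y, n_select):
    """Greedy CovSel-style selection (illustrative sketch): pick the column with
    the largest squared covariance with y, then deflate X and y against it."""
    Xd = X - X.mean(axis=0)
    yd = np.asarray(y, dtype=float) - np.mean(y)
    selected = []
    for _ in range(n_select):
        score = (Xd.T @ yd) ** 2
        if selected:
            score[selected] = -np.inf  # never pick the same column twice
        j = int(np.argmax(score))
        x = Xd[:, j].copy()
        norm = x @ x
        if norm < 1e-12:  # nothing left to explain
            break
        selected.append(j)
        # Project X and y onto the orthogonal complement of the chosen column.
        Xd = Xd - np.outer(x, x @ Xd) / norm
        yd = yd - x * (x @ yd) / norm
    return selected

def rmsep(X_train, y_train, X_test, y_test):
    """Root mean square error of prediction for an OLS model with intercept."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.sqrt(np.mean((A_test @ coef - y_test) ** 2)))

# Synthetic stand-ins for the two data blocks (assumption, not the paper's data).
rng = np.random.default_rng(0)
n = 120
nir = rng.normal(size=(n, 30))   # "NIR" block
proc = rng.normal(size=(n, 8))   # "process variable" block
y = 2.0 * nir[:, 3] + 1.5 * proc[:, 1] + 0.1 * rng.normal(size=n)

train, test = slice(0, 80), slice(80, n)
nir_sel = covsel(nir[train], y[train], 2)
proc_sel = covsel(proc[train], y[train], 2)

# Fusion of the selected variables by simple concatenation.
fused = np.hstack([nir[:, nir_sel], proc[:, proc_sel]])
rmsep_nir = rmsep(nir[train][:, nir_sel], y[train], nir[test][:, nir_sel], y[test])
rmsep_fused = rmsep(fused[train], y[train], fused[test], y[test])

# Relative RMSEP reduction, the kind of figure quoted in the abstract.
reduction = 100.0 * (1.0 - rmsep_fused / rmsep_nir)
print(nir_sel, proc_sel, round(rmsep_nir, 3), round(rmsep_fused, 3), round(reduction, 1))
```

In this toy setting the response depends on one variable from each block, so the fused model is expected to beat the single-block model; the size of the gain says nothing about the values reported in the study.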
Pages: 12