Multisource Latent Feature Selective Ensemble Modeling Approach for Small-Sample High-Dimensional Process Data in Applications

Cited by: 3

Authors:

Tang, Jian ^{[1,2]}

Zhang, Jian ^{[3]}

Yu, Gang ^{[4]}

Zhang, Wenping ^{[5]}

Yu, Wen ^{[6]}

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

[2] Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China

[3] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China

[4] State Beijing Key Lab Proc Automat Min & Met, Beijing 102600, Peoples R China

[5] Shandong Gold Min Technol Co Ltd, Met Lab Branch, Jinan 250014, Peoples R China

[6] CINVESTAV IPN Natl Polytech Inst, Dept Control Automat, Mexico City 07360, DF, Mexico

Source:

IEEE ACCESS | 2020 / Vol. 8

Funding:

U.S. National Science Foundation; Beijing Natural Science Foundation;

Keywords:

Feature extraction; Data models; Adaptation models; Pollution measurement; Training; Data mining; Analytical models; Multisource feature extraction; multi-layered feature selection; selective ensemble modeling; hyperparameter selection; high dimensional process data; EXTREME LEARNING-MACHINE; OPTIMIZATION; PARAMETERS; VIBRATION; SIZE;

DOI:

10.1109/ACCESS.2020.3015875

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Several difficult-to-measure production quality or environmental pollution indices of industrial processes must be measured using offline laboratory instruments. Soft measurement methods are often used to predict such parameters online. Owing to the limitations of the measurement devices and the complex characteristics of the process, only small-sample modeling data with high-dimensional input features can be obtained. Therefore, a new multisource latent feature selective ensemble (SEN) modeling approach is proposed in this study. First, the input features are divided into subgroups according to the characteristics of the modeling data. Second, multisource latent features are obtained for each subgroup through multi-layered selection algorithms, which are governed in turn by the feature reduction ratio, the feature contribution ratio, and the mutual information value. Third, to construct candidate sub-models on the reduced features, an adaptive hyperparameter selection algorithm based on multi-step grid search is employed. Finally, the optimized ensemble sub-models and their weighting strategy are adaptively determined to build the final SEN model. The proposed method is verified using benchmark near-infrared data, high-dimensional mechanical frequency spectrum data, and industrial dioxin emission concentration data.
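The abstract mentions hyperparameter selection by multi-step grid search without giving details. The following is a minimal illustrative sketch of one common reading of that idea (a coarse-to-fine search that repeatedly narrows the grid around the best point found so far); it is not the authors' algorithm. The function names, the shrink factor, the number of steps, and the toy loss are all assumptions introduced here for illustration. In practice the loss would be a validation error of a candidate sub-model.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def linspace(lo: float, hi: float, n: int) -> List[float]:
    """Return n evenly spaced values covering [lo, hi]."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def multi_step_grid_search(
    loss: Callable[[Tuple[float, ...]], float],
    bounds: Bounds,
    points_per_axis: int = 5,   # assumed grid resolution per step
    steps: int = 3,             # assumed number of refinement steps
    shrink: float = 0.5,        # assumed factor by which the box narrows
) -> Tuple[Tuple[float, ...], float]:
    """Coarse-to-fine grid search over a box of hyperparameters."""
    best_point, best_loss = None, float("inf")
    current = list(bounds)
    for _ in range(steps):
        # Evaluate the loss on a full grid inside the current search box.
        axes = [linspace(lo, hi, points_per_axis) for lo, hi in current]
        for candidate in product(*axes):
            value = loss(candidate)
            if value < best_loss:
                best_point, best_loss = candidate, value
        # Narrow the box around the best point, clipped to the original bounds.
        narrowed = []
        for (lo, hi), (orig_lo, orig_hi), centre in zip(current, bounds, best_point):
            half = (hi - lo) * shrink / 2
            narrowed.append((max(orig_lo, centre - half), min(orig_hi, centre + half)))
        current = narrowed
    return best_point, best_loss


def toy_loss(params: Tuple[float, ...]) -> float:
    """Hypothetical stand-in for a validation error; minimum at (1.3, -0.7)."""
    x, y = params
    return (x - 1.3) ** 2 + (y + 0.7) ** 2


best_params, best_value = multi_step_grid_search(toy_loss, [(-4.0, 4.0), (-4.0, 4.0)])
print(best_params, best_value)
```

Each step costs the same number of evaluations as a single coarse grid, so the resolution improves geometrically while the total cost grows only linearly with the number of steps. The trade-off is that the search can settle on a local minimum if the first coarse grid misses the right region.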


Pages: 148475-148488

Number of pages: 14
