A Monte Carlo and Kernel Density Estimation based virtual sample generation method for small data modeling problem

Times cited: 0
Authors
Zhu, Qun-Xiong [1]
Wang, Zhi-Hui [1]
He, Yan-Lin [1]
Xu, Yuan [1]
Affiliations
[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing, Peoples R China
Source
2020 CHINESE AUTOMATION CONGRESS (CAC 2020) | 2020
Funding
National Natural Science Foundation of China;
Keywords
kernel density estimation; Monte Carlo; neural network; bat algorithm; virtual sample generation; energy prediction; PREDICTION; ACCURACY;
DOI
10.1109/cac51589.2020.9326486
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In the early stages of industrial production, resources are limited, so enterprises must analyze production status and product quality from limited data in order to reduce wasted resources and funds. This requires a highly accurate model, but models built on small samples tend to have low accuracy. Virtual sample generation is often used to address this: it fills the information gaps between sample data points, effectively expanding the amount of sample data. A novel virtual sample generation method is proposed that uses kernel density estimation based on the distribution of the sample output variables. Monte Carlo sampling is used to fill the gaps in the sample distribution so that the samples become uniformly distributed. Combined with a Bagging-RBF neural network and the bat algorithm (BA), effective virtual samples are generated. Two experiments, on MLCC and PTA, show that the virtual samples are more effective.
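The abstract gives no algorithmic detail, so the following is only a minimal sketch of one way the output-filling step could work, not the authors' method. It assumes a one-dimensional output variable, a Gaussian kernel with a normal-reference bandwidth, and rejection-style Monte Carlo sampling whose acceptance probability falls as the estimated density rises. The names `gaussian_kde_1d` and `fill_sparse_outputs` are hypothetical. The Bagging-RBF neural network and bat algorithm stages are not covered.

```python
import numpy as np


def gaussian_kde_1d(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE built from 1-D samples."""
    samples = np.asarray(samples, dtype=float)

    def density(points):
        points = np.atleast_1d(np.asarray(points, dtype=float))
        # Pairwise standardized distances, shape (n_points, n_samples).
        z = (points[:, None] - samples[None, :]) / bandwidth
        kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
        return kernel.mean(axis=1) / bandwidth

    return density


def fill_sparse_outputs(y, n_virtual, bandwidth=None, seed=0):
    """Draw virtual output values that favor low-density regions of y.

    Assumption (not from the paper): candidates are drawn uniformly over
    [min(y), max(y)] and accepted with probability 1 - density / max density,
    so sparse regions between the real samples are filled preferentially.
    """
    y = np.asarray(y, dtype=float)
    if y.size < 2 or y.min() == y.max():
        raise ValueError("need at least two distinct output values")
    if bandwidth is None:
        # Normal-reference rule of thumb for a Gaussian kernel.
        bandwidth = 1.06 * y.std(ddof=1) * y.size ** (-1.0 / 5.0)

    rng = np.random.default_rng(seed)
    density = gaussian_kde_1d(y, bandwidth)
    lo, hi = y.min(), y.max()
    # Approximate the peak density on a grid to normalize acceptance.
    d_max = density(np.linspace(lo, hi, 512)).max()

    accepted = []
    while len(accepted) < n_virtual:
        candidates = rng.uniform(lo, hi, size=4 * n_virtual)
        accept_prob = np.clip(1.0 - density(candidates) / d_max, 0.0, 1.0)
        keep = candidates[rng.uniform(size=candidates.size) < accept_prob]
        accepted.extend(keep.tolist())
    return np.array(accepted[:n_virtual])


# Usage: two tight clusters of real outputs with an empty gap between them.
y_real = np.concatenate([np.linspace(-0.5, 0.5, 20), np.linspace(9.5, 10.5, 20)])
y_virtual = fill_sparse_outputs(y_real, n_virtual=200)
```

In this sketch most of the virtual outputs land in the gap between the two clusters, which pushes the combined set of real and virtual outputs toward a more even spread. Matching input values for each virtual output would still have to be found by a separate step, which the abstract assigns to the neural network and the bat algorithm.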
Pages: 1123-1128
Page count: 6