A novel virtual sample generation method based on Gaussian distribution

Cited by: 124

Authors:

Yang, Jing [1]

Yu, Xu [1]

Xie, Zhi-Qiang [1,2]

Zhang, Jian-Pei [1]

Affiliations:

[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin, Peoples R China

[2] Harbin Univ Sci & Technol, Coll Comp Sci & Technol, Harbin, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2011 / Vol. 24 / No. 06

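Funding:

China Postdoctoral Science Foundation; Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China;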

Keywords:

Virtual sample; Regularization theory; Cost-sensitive learning; Gaussian distribution; Prior knowledge;

DOI:

10.1016/j.knosys.2010.12.010

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Traditional machine learning algorithms do not generalize satisfactorily on noisy, imbalanced, or small-sample training sets. This work proposes a novel virtual sample generation (VSG) method based on the Gaussian distribution. First, the method determines the mean and the standard error of the Gaussian distribution. Virtual samples are then generated from this distribution. Finally, a new training set is constructed by adding the virtual samples to the original training set. The work shows that training on the new training set is equivalent to a form of regularization for small-sample problems, and to cost-sensitive learning for imbalanced-sample problems. Experiments show that, given a suitable number of virtual sample replicates, classifiers trained on the new training sets can generalize better than those trained on the original training sets. (C) 2011 Published by Elsevier B.V.
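The three steps in the abstract (choose a Gaussian, draw virtual samples, append them to the training set) can be illustrated with a minimal sketch. The abstract does not say how the mean and spread are determined, so the sketch assumes each original sample serves as the mean and that a single user-chosen `sigma` is shared by all features. The function name `generate_virtual_samples` and the parameters `n_replicates`, `sigma`, and `seed` are hypothetical and are not taken from the paper.

```python
import random


def generate_virtual_samples(X, y, n_replicates=3, sigma=0.1, seed=0):
    """Return the original samples plus Gaussian-perturbed virtual copies.

    Assumptions (not from the paper): the Gaussian mean is the original
    sample itself, and one `sigma` is used for every feature.
    """
    rng = random.Random(seed)

    # Start the new training set with the untouched original samples.
    X_new = [list(x) for x in X]
    y_new = list(y)

    for x, label in zip(X, y):
        for _ in range(n_replicates):
            # Draw each feature from N(original value, sigma^2).
            X_new.append([rng.gauss(value, sigma) for value in x])
            # A virtual sample inherits the label of its source sample.
            y_new.append(label)

    return X_new, y_new


X = [[0.0, 0.0], [1.0, 1.0]]
y = [0, 1]
X_aug, y_aug = generate_virtual_samples(X, y, n_replicates=3, sigma=0.1)
print(len(X_aug), y_aug)
```

For an imbalanced training set, one plausible variation is to call the function with a larger `n_replicates` for the minority class only. This changes the relative class weights, which is in the spirit of the cost-sensitive interpretation stated in the abstract.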


Pages: 740-748

Number of pages: 9
