Particle swarm optimization-based feature selection in sentiment classification

Cited by: 0
Authors
Lin Shang
Zhe Zhou
Xing Liu
Affiliation
[1] Nanjing University, Department of Computer Science and Technology, State Key Laboratory of Novel Software Technology
Source
Soft Computing | 2016 / Vol. 20
Keywords
Sentiment classification; Feature selection; Particle swarm optimization; Binary particle swarm optimization
DOI
Not available
Abstract
Sentiment classification, an important task in text mining, classifies documents according to the opinion or sentiment they express. Documents can be represented as feature vectors, which machine learning algorithms use to perform classification. Feature selection is a necessary step for these feature vectors. In this paper, we propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection. F-BPSO is a modification of BPSO that overcomes problems of traditional BPSO, including an unreasonable velocity update formula and the lack of evaluation of every single feature. We then make some detailed changes to the original F-BPSO, including using the fitness sum instead of the average fitness in the fitness proportionate selection step; the modified method is called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). Further modifications make FS-BPSO more suitable for sentiment classification-oriented feature selection; the resulting method is named SCO-FS-BPSO, where SCO stands for "sentiment classification-oriented". Experimental results show that on benchmark datasets the original F-BPSO is superior to traditional BPSO in feature selection performance, and FS-BPSO outperforms the original F-BPSO. In the sentiment classification domain, SCO-FS-BPSO, which is modified specifically for sentiment classification, is superior to traditional feature selection methods on subjective consumer review datasets.
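For orientation, the following is a minimal sketch of the traditional BPSO baseline that the abstract contrasts against, applied to choosing a binary feature mask. It is not the authors' F-BPSO, FS-BPSO, or SCO-FS-BPSO: the abstract does not give their update rules, so the sketch uses the commonly described sigmoid-based binary update. The names `bpso_select` and `roulette_pick`, the parameter defaults, and the toy fitness function are all assumptions made for illustration. `roulette_pick` only illustrates what "fitness proportionate selection" means in general; it is deliberately not wired into the loop, since the record does not say how the paper uses it.

```python
import math
import random


def sigmoid(x):
    # Maps a velocity to the probability that the corresponding bit is 1.
    return 1.0 / (1.0 + math.exp(-x))


def roulette_pick(weights, rng):
    # Generic fitness proportionate (roulette wheel) selection:
    # index i is chosen with probability weights[i] / sum(weights).
    total = sum(weights)
    if total <= 0:
        return rng.randrange(len(weights))
    threshold = rng.random() * total
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if threshold < running:
            return i
    return len(weights) - 1


def bpso_select(fitness, n_features, n_particles=20, n_iters=50,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    # Traditional BPSO sketch (assumed parameter values, not from the paper).
    # `fitness` takes a 0/1 mask (list of ints) and returns a score to maximize.
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-v_max, min(v_max, v))
                # Resample the bit: 1 with probability sigmoid(velocity).
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_fitness(mask):
    # Hypothetical stand-in for classifier accuracy: the first three
    # features are "useful", every other selected feature is penalized.
    return sum(mask[:3]) - 0.5 * sum(mask[3:])


if __name__ == "__main__":
    best_mask, best_score = bpso_select(toy_fitness, n_features=6, seed=1)
    print(best_mask, best_score)
```

In a real sentiment classification setting, `toy_fitness` would presumably be replaced by a wrapper that trains a classifier on the selected features and returns a validation score, which makes each fitness evaluation far more expensive than in this toy example.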
Pages: 3821-3834
Number of pages: 13