Particle swarm optimization-based feature selection in sentiment classification

Cited by: 67
Authors
Shang, Lin [1]
Zhou, Zhe [1]
Liu, Xing [1]
Affiliations
[1] Nanjing Univ, Dept Comp Sci & Technol, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sentiment classification; Feature selection; Particle swarm optimization; Binary particle swarm optimization
DOI
10.1007/s00500-016-2093-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment classification, an important task in text mining, classifies documents according to the opinion or sentiment they express. Documents can be represented as feature vectors, which machine learning algorithms use to perform classification, and these feature vectors require feature selection. In this paper, we propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection. F-BPSO is a modification of BPSO that overcomes problems of traditional BPSO, including an unreasonable velocity update formula and the lack of evaluation of every single feature. We then refine the original F-BPSO, notably by using the fitness sum instead of the average fitness in the fitness proportionate selection step; the resulting method is called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). FS-BPSO is further modified to suit feature selection for sentiment classification, and the resulting method is named SCO-FS-BPSO, where SCO stands for "sentiment classification-oriented". Experimental results show that, on benchmark datasets, the original F-BPSO outperforms traditional BPSO in feature selection and FS-BPSO outperforms the original F-BPSO. In the sentiment classification domain, SCO-FS-BPSO, which is modified specifically for sentiment classification, outperforms traditional feature selection methods on subjective consumer review datasets.
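
The abstract names BPSO as the baseline but does not state its update rule. As a rough illustration only, the sketch below shows the textbook sigmoid form of binary PSO applied to a 0/1 feature mask. It is not the paper's F-BPSO, FS-BPSO, or SCO-FS-BPSO, whose details are not given here. The names `bpso_select` and `toy_fitness`, the parameter values, and the toy fitness function (a stand-in for classifier accuracy) are all assumptions made for this example.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=10, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Textbook binary PSO (assumed form); returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    # Each position is a 0/1 mask: 1 keeps the feature, 0 drops it.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global bests.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # The sigmoid maps velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask, relevant=(0, 2, 5)):
    """Hypothetical fitness: reward relevant features, penalize extra ones."""
    hits = sum(mask[j] for j in relevant)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


if __name__ == "__main__":
    mask, fit = bpso_select(toy_fitness, n_features=8)
    print(mask, fit)
```

In a real feature selection setting, `fitness` would typically train and score a classifier on the features selected by the mask, which is far more expensive than this toy function.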
Pages: 3821-3834
Number of pages: 14