A feature selection model based on genetic rank aggregation for text sentiment classification

Cited by: 312
Authors
Onan, Aytug [1]
Korukoglu, Serdar [2]
Affiliations
[1] Celal Bayar Univ, Manisa, Turkey
[2] Ege Univ, Izmir, Turkey
Keywords
Feature selection; rank aggregation; sentiment classification; traveling salesman problem; algorithms
DOI
10.1177/0165551515613226
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment analysis is an important research direction of natural language processing, text mining and web mining which aims to extract subjective information in source materials. The main challenge encountered in machine learning method-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the built model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely utilized owing to their simplicity and relatively high performance. This paper presents an ensemble approach for feature selection, which aggregates the several individual feature lists obtained by the different feature selection methods so that a more robust and efficient feature subset can be obtained. In order to aggregate the individual feature lists, a genetic algorithm has been utilized. Experimental evaluations indicated that the proposed aggregation model is an efficient method and it outperforms individual filter-based feature selection methods on sentiment classification.
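The abstract does not state how candidate rankings are encoded, which fitness function is used, or which genetic operators are applied, so the following is only a minimal sketch of genetic rank aggregation under assumptions rather than the authors' method. It assumes every filter method ranks the same feature set, scores a candidate by its summed Spearman footrule distance to the input lists, and uses an order-preserving crossover with swap mutation. All function names and parameter values here are hypothetical.

```python
import random


def footrule_distance(candidate, ranked_list):
    """Spearman footrule: sum of absolute position differences per feature."""
    pos = {feature: i for i, feature in enumerate(ranked_list)}
    return sum(abs(i - pos[feature]) for i, feature in enumerate(candidate))


def fitness(candidate, ranked_lists):
    """Assumed objective: total distance to all input rankings (lower is better)."""
    return sum(footrule_distance(candidate, r) for r in ranked_lists)


def order_crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a in place; fill the rest in parent_b's order."""
    n = len(parent_a)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    middle = parent_a[lo:hi]
    kept = set(middle)
    rest = [f for f in parent_b if f not in kept]
    return rest[:lo] + middle + rest[lo:]


def swap_mutation(candidate, rng, rate=0.2):
    """With probability `rate`, swap two positions in a copy of the candidate."""
    child = list(candidate)
    if rng.random() < rate:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def aggregate_ranks(ranked_lists, pop_size=40, generations=100, seed=0):
    """Search for a consensus ranking; pop_size should be at least 4."""
    rng = random.Random(seed)
    features = list(ranked_lists[0])
    # Seed the population with the input rankings, then pad with random ones.
    population = [list(r) for r in ranked_lists]
    while len(population) < pop_size:
        population.append(rng.sample(features, len(features)))
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, ranked_lists))
        survivors = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(swap_mutation(order_crossover(a, b, rng), rng))
        population = survivors + children
    return min(population, key=lambda c: fitness(c, ranked_lists))


if __name__ == "__main__":
    # Hypothetical rankings, e.g. from three different filter methods.
    lists = [
        ["good", "bad", "great", "poor", "fine"],
        ["bad", "good", "great", "fine", "poor"],
        ["good", "great", "bad", "poor", "fine"],
    ]
    consensus = aggregate_ranks(lists, pop_size=20, generations=30)
    print(consensus[:3])  # top-k features would feed the classifier
```

Because the best half of the population is always carried forward and the input rankings seed the initial population, the returned ranking is never worse, under this assumed objective, than the best individual input list.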
Pages: 25-38
Page count: 14