Text Classification Using Ensemble Features Selection and Data Mining Techniques

Cited by: 1
Authors
Shravankumar, B. [1, 2]
Ravi, Vadlamani [1]
Affiliations
[1] Inst Dev & Res Banking Technol, Ctr Excellence CRM & Analyt, Hyderabad 500057, Andhra Pradesh, India
[2] Univ Hyderabad, SCIS, Hyderabad 500046, Andhra Pradesh, India
Source
SWARM, EVOLUTIONARY, AND MEMETIC COMPUTING, SEMCCO 2014 | 2015 / Vol. 8947
Keywords
Text mining; Document classification; Feature selection; Classification models
DOI
10.1007/978-3-319-20294-5_16
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text categorization is a text mining/analytics task that involves extracting useful information from unstructured resources and then categorizing the documents. In this paper, we classify the TechTC dataset collected from various Web directories. We employed feature selection methods such as the Gini index, chi-square, t-statistic, and correlation, which drastically reduced the model building time. Various neural network models, such as the probabilistic neural network, group method of data handling, and multilayer perceptron, yielded higher accuracies than other techniques applied in the literature.
Pages: 176-186
Number of pages: 11