Text Classification Using Ensemble Features Selection and Data Mining Techniques

Cited by: 1

Authors:

Shravankumar, B. [1,2]

Ravi, Vadlamani [1]

Affiliations:

[1] Inst Dev & Res Banking Technol, Ctr Excellence CRM & Analyt, Hyderabad 500057, Andhra Pradesh, India

[2] Univ Hyderabad, SCIS, Hyderabad 500046, Andhra Pradesh, India

Source:

SWARM, EVOLUTIONARY, AND MEMETIC COMPUTING, SEMCCO 2014 | 2015 / Vol. 8947

Keywords:

Text mining; Document classification; Feature selection; Classification models

DOI:

10.1007/978-3-319-20294-5_16

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Text categorization is a text mining/analytics task that involves extracting useful information from unstructured resources and then categorizing the documents. In this paper, we classify the TechTC dataset, which was collected from various Web directories. We employed feature selection methods such as the Gini index, chi-square, t-statistic, and correlation, which drastically reduced the model building time. Various neural network models, such as the probabilistic neural network, group method of data handling, and multilayer perceptron, yielded higher accuracies than other techniques applied in the literature.

Pages: 176-186

Page count: 11
