Semantic feature selection using WordNet

Cited by: 14
Authors
Chua, S
Kulathuramaiyer, N
Source
IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2004), PROCEEDINGS | 2004
DOI
10.1109/WI.2004.10115
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The web has caused an explosion of documents, creating the need for automated text categorization systems. This paper explores the notion of semantic feature selection by employing WordNet [1], a lexical database. The proposed semantic approach employs noun synonyms and word senses for feature selection to select terms that are semantically representative of a category of documents. The categorical sense disambiguation extends the use of WordNet, which has typically been used for text retrieval and word sense disambiguation [2]. Our experiments on the Reuters-21578 dataset show that automated semantic feature selection performs better than the well-known statistical feature selection methods Information Gain and Chi-Square.
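For orientation, the Chi-Square baseline named in the abstract can be sketched as follows. This is a minimal illustration of the standard term-category chi-square statistic, not the authors' code; the function names `chi_square` and `term_scores` and the toy documents are assumptions made for the example.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Term or category is absent (or universal); no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def term_scores(docs, category):
    """Score every term against `category`.

    docs: list of (set_of_terms, label) pairs (hypothetical toy format).
    """
    vocabulary = set().union(*(terms for terms, _ in docs))
    scores = {}
    for term in vocabulary:
        a = sum(1 for terms, label in docs if label == category and term in terms)
        b = sum(1 for terms, label in docs if label != category and term in terms)
        c = sum(1 for terms, label in docs if label == category and term not in terms)
        d = sum(1 for terms, label in docs if label != category and term not in terms)
        scores[term] = chi_square(a, b, c, d)
    return scores


# Toy corpus, invented for illustration only.
toy_docs = [
    ({"wheat", "export"}, "grain"),
    ({"wheat", "harvest"}, "grain"),
    ({"oil", "export"}, "crude"),
    ({"oil", "barrel"}, "crude"),
]
grain_scores = term_scores(toy_docs, "grain")
```

A statistical selector of this kind keeps the highest-scoring terms per category. Note that the statistic is symmetric, so a term that reliably signals absence from a category (here `oil` for `grain`) scores as high as one that signals membership.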
Pages: 166-172
Page count: 7
Related papers
11 entries in total
[1]  
AAS K, 1999, 941 NORW COMP CENT
[2]  
[Anonymous], 1997, Proceedings of the fourteenth international conference on machine learning, DOI 10.1016/J.ESWA.2008.05.026
[3]   AUTOMATED LEARNING OF DECISION RULES FOR TEXT CATEGORIZATION [J].
APTE, C ;
DAMERAU, F ;
WEISS, SM .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1994, 12 (03) :233-251
[4]  
Gonzalo J., 1998, Usage of WordNet in Natural Language Processing Systems, P38
[5]  
Masuyama T, 2002, 13TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, P241
[6]  
MIHALCEA R, 2000, P ACL WORKSH IR NLP
[7]  
Miller George A, 1990, International Journal of Lexicography, V3, P235, DOI 10.1093/ijl/3.4.235
[8]  
Sebastiani F., 1999, Proceedings of 1st Argentinean Symposium on Artificial Intelligence, P7
[9]   Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques 2nd editionSan Francisco: Morgan Kaufmann Publishers; 2005:560. ISBN 0-12-088407-0, £34.99 [J].
Francisco Azuaje .
BioMedical Engineering OnLine, 5 (1)
[10]  
Xiaobin Li, 1995, IJCAI-95. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, P1368