Using an integrated ontology database to categorize web pages

Citations: 0
Authors
Bai, Rujiang [1]
Wang, Xiaoyue [1]
Affiliation
[1] Shandong Univ Technol, Zibo 255049, Peoples R China
Keywords
text classification; ontology; RDF; SVM
DOI
10.1080/02533839.2012.679031
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Current classification methods are mostly based on the vector space model, which accounts only for term frequency in the documents and ignores important semantic relationships between key terms. We propose a system that uses integrated ontologies and natural language processing techniques to index texts. The traditional word matrix is replaced by a concept-based matrix. For this purpose, we have developed fully automated methods for mapping keywords to their corresponding ontology concepts. A support vector machine (SVM), a successful machine learning technique, is used for classification. Experimental results show that the proposed method improves text classification performance significantly.
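The replacement of a word matrix by a concept matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the `KEYWORD_TO_CONCEPT` dictionary is a hypothetical stand-in for the integrated ontology database, the function names are invented for illustration, and tokenization is reduced to whitespace splitting. The resulting rows are the kind of feature vectors an SVM classifier could then be trained on.

```python
from collections import Counter

# Hypothetical toy "ontology": surface keyword -> concept label.
# A real system would query an ontology database instead.
KEYWORD_TO_CONCEPT = {
    "car": "Vehicle",
    "automobile": "Vehicle",
    "truck": "Vehicle",
    "apple": "Fruit",
    "banana": "Fruit",
}


def concept_counts(text):
    """Count concept occurrences in one document (unmapped tokens are ignored)."""
    tokens = text.lower().split()
    return Counter(
        KEYWORD_TO_CONCEPT[t] for t in tokens if t in KEYWORD_TO_CONCEPT
    )


def concept_matrix(docs):
    """Return (sorted concept names, document-by-concept count matrix)."""
    counts = [concept_counts(d) for d in docs]
    concepts = sorted({c for row in counts for c in row})
    matrix = [[row[c] for c in concepts] for row in counts]
    return concepts, matrix


docs = ["the car passed the automobile", "apple and banana and truck"]
concepts, matrix = concept_matrix(docs)
print(concepts)
print(matrix)
```

In this sketch "car" and "automobile" fall into the same column, which is the semantic relationship a purely term-frequency representation would miss.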
Pages: 509-514
Page count: 6