Domain ontology graph model and its application in Chinese text classification

Cited by: 8
Authors
Liu, James N. K. [1]
He, Yu-lin [2]
Lim, Edward H. Y. [1]
Wang, Xi-zhao [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Hebei Univ, Coll Math & Comp Sci, Baoding 071002, Peoples R China
Keywords
Domain ontology graph; Knowledge representation; Text classification; Ontology; SEMANTIC WEB; FRAMEWORK; UNCERTAINTY; EXTRACTION;
DOI
10.1007/s00521-012-1272-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an ontology learning method that generates a graphical ontology structure called an ontology graph. The ontology graph defines the ontology and knowledge conceptualization model, while the ontology learning process defines a semiautomatic method for learning and generating ontology graphs from Chinese texts of different domains; the result for each domain is a so-called domain ontology graph (DOG). We also define two further ontological operations that can be carried out with the generated DOGs: document ontology graph generation and ontology graph-based text classification. The research focuses on Chinese text data, and we conduct two experiments, DOG generation and ontology graph-based text classification, with Chinese texts as the experimental data. The first experiment generates ten DOGs as ontology graph instances representing ten different knowledge domains. The generated DOGs are then used in the second experiment for performance evaluation. The ontology graph-based approach achieves higher text classification accuracy (92.3% F-measure) than other text classification approaches (for example, 86.8% F-measure for the tf-idf approach). This better performance in the comparative experiments indicates that the proposed ontology graph knowledge model, the ontology learning and generation process, and the ontological operations are feasible and effective.
Pages: 779-798
Page count: 20