Research on Classification Algorithm of News Pages Based on Domain Ontology

Cited by: 0
Authors
Xie, Caiyun [1]
Hu, Xiaorong [1]
Affiliations
[1] Nanchang Teachers Coll, Dept Informat Sci, Nanchang, Peoples R China
Source
INDUSTRIAL INSTRUMENTATION AND CONTROL SYSTEMS II, PTS 1-3 | 2013 / Vol. 336-338
Keywords
Classification Algorithm; Ontology; Integrated Correlation Degree
DOI
10.4028/www.scientific.net/AMM.336-338.2217
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper proposes a classification algorithm for news pages based on a domain ontology. Current classification algorithms consider only content similarity; to address this shortcoming, the paper presents a semantic classification method that considers both content similarity and structural correlation. First, the method parses the ontology to obtain an ontology category vector, extracts keywords from the text of each news page, and reduces the semantic dimension. It then collects the vocabulary shared by the page text and the ontology category vector to form the text expectation vector, and computes the content similarity between the ontology category vector and the text expectation vector using cosine similarity. Second, the shared vocabulary is mapped onto the ontology hierarchy graph, which is a directed acyclic graph, and the structural relevancy is obtained by computing weighted paths in that graph. Finally, the method combines the two measures into a correlation degree between the news page and the ontology, and assigns the page to a category by comparing this result with an initial threshold value.
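The abstract gives no formulas, so the following Python sketch only illustrates the overall pipeline under stated assumptions. The cosine similarity step follows the abstract. Everything else is hypothetical: the term-weight dictionaries, the choice of the largest product of edge weights as the "weighted path", averaging over shared terms, the linear mix controlled by `alpha`, and all function names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def content_similarity(ontology_weights, page_keywords):
    """Compare the ontology category vector with the text expectation vector.

    Assumption: both inputs map term -> weight, and the expectation vector
    keeps the page's weight for each ontology term (0.0 if the term is absent).
    """
    terms = sorted(ontology_weights)
    category_vec = [ontology_weights[t] for t in terms]
    expectation_vec = [page_keywords.get(t, 0.0) for t in terms]
    return cosine_similarity(category_vec, expectation_vec)


def best_path_weight(dag, root, target):
    """Largest product of edge weights over root -> target paths in a DAG.

    Assumption: dag maps parent -> {child: edge weight}. Returns 0.0 when
    the target cannot be reached from the root.
    """
    memo = {}

    def visit(node):
        if node == target:
            return 1.0
        if node not in memo:
            children = dag.get(node, {})
            memo[node] = max(
                (w * visit(child) for child, w in children.items()),
                default=0.0,
            )
        return memo[node]

    return visit(root)


def structural_relevance(dag, root, shared_terms):
    """Assumed aggregation: mean best-path weight over the shared terms."""
    if not shared_terms:
        return 0.0
    total = sum(best_path_weight(dag, root, t) for t in shared_terms)
    return total / len(shared_terms)


def classify(ontology_weights, dag, root, page_keywords,
             threshold=0.5, alpha=0.5):
    """Return (belongs_to_category, score) for one page and one ontology.

    Assumption: the integrated correlation degree is a linear mix of the
    content and structural scores, weighted by alpha.
    """
    shared = [t for t in page_keywords if t in ontology_weights]
    content = content_similarity(ontology_weights, page_keywords)
    structure = structural_relevance(dag, root, shared)
    score = alpha * content + (1 - alpha) * structure
    return score >= threshold, score


# Toy usage with made-up weights.
ontology = {"sports": 1.0, "football": 1.0, "goal": 1.0}
hierarchy = {"sports": {"football": 0.8}, "football": {"goal": 0.5}}
page = {"football": 2.0, "goal": 1.0, "weather": 1.0}
is_match, score = classify(ontology, hierarchy, "sports", page)
```

In this sketch a page that shares many terms with the ontology but only at deep, weakly weighted nodes scores lower than one whose shared terms sit close to the root, which is one plausible reading of how a structural term could complement content similarity.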
Pages: 2217-2220
Page count: 4
Related papers
9 in total
  • [1] [Anonymous], P 3 INT C WEB INF SY
  • [2] Cheng Gong, WWW2008, P1101
  • [3] Gao MX, 2005, Third International Conference on Information Technology and Applications, Vol 1, Proceedings, P256
  • [4] Huang Guan-wei, 2007, COMPUTER ENG DESIGN, V28
  • [5] Jiang Hua, 2009, APPL SOFTWARE, V26
  • [6] Muhopadhyay Debajyoti, 2008, INT C INF TECHN
  • [7] Muhopadhyay Debajyoti, 2007, INT C INF TECHN
  • [8] Song MH, 2005, P 12 AS PAC SOFTW EN
  • [9] Wu Guoyang, 2007, COMPUTER SCI, V34