Named entity recognition using acyclic weighted digraphs: A semi-supervised statistical method

Cited by: 0
Authors
Kim, Kono [2 ]
Yoon, Yeohoon [3 ]
Kim, Harksoo [1 ]
Seo, Jungyun [4 ]
Affiliations
[1] Kangwon Natl Univ, Coll Informat Technol, Program Comp & Commun Engn, 192 1, Hyoja 2 i dong, Chunchon 200701, South Korea
[2] Sogang Univ, Dept Comp Sci, Natl Language Proc Lab, 1 Sinsu-Dong,Mapo-Gu, Seoul 121742, South Korea
[3] NHN Corp, Seongname City 463 844, Gyeonggi Do, South Korea
[4] Sogang Univ, Dept Comp Sci, Interdisciplinary Program Integrated Biotechnol, Seoul 121742, South Korea
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS | 2007 / Vol. 4426
Keywords
named entity recognition; semi-supervised statistical method; acyclic weighted digraph;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose an NE (Named Entity) recognition system that uses a semi-supervised statistical method. At training time, the system builds error-prone training data using only a conventional POS (Part-Of-Speech) tagger and a semi-automatically constructed NE dictionary. It then generates a co-occurrence similarity matrix from this error-prone training corpus. At run time, the system constructs AWDs (Acyclic Weighted Digraphs) based on the co-occurrence similarity matrix, then detects NE candidates and assigns categories to them using Viterbi search on the AWDs. In preliminary experiments on PLO (Person, Location and Organization) recognition, the proposed system achieved an average F1-measure of 81.32%.
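The run-time step described in the abstract, Viterbi search over an acyclic weighted digraph, can be illustrated with a minimal sketch. The abstract does not say how the AWDs are built or how edge weights are derived from the co-occurrence similarity matrix, so everything here is an assumption made for illustration: the graph is taken to be layered (one layer per token, one node per label), scores are treated as additive, and the names `LABELS`, `viterbi_on_awd`, `node_scores` and `edge_scores` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical label set: the PLO categories from the abstract plus "O" (no entity).
LABELS = ["PER", "LOC", "ORG", "O"]


def viterbi_on_awd(
    node_scores: List[Dict[str, float]],
    edge_scores: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring label path through a layered DAG.

    node_scores[i][label] is an assumed score for giving token i that label;
    edge_scores[(prev, cur)] is an assumed weight on the edge prev -> cur.
    Missing entries default to 0.0. Scores are summed along the path.
    """
    if not node_scores:
        return []

    # best[i][label]: score of the best path ending at token i with that label.
    best = [{lab: node_scores[0].get(lab, 0.0) for lab in LABELS}]
    back: List[Dict[str, str]] = []  # back[i-1][label]: best predecessor label

    for i in range(1, len(node_scores)):
        layer: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in LABELS:
            # Pick the predecessor that maximises accumulated score plus edge weight.
            prev = max(
                LABELS,
                key=lambda p: best[-1][p] + edge_scores.get((p, cur), 0.0),
            )
            layer[cur] = (
                best[-1][prev]
                + edge_scores.get((prev, cur), 0.0)
                + node_scores[i].get(cur, 0.0)
            )
            pointers[cur] = prev
        best.append(layer)
        back.append(pointers)

    # Trace back from the best final label to recover the full path.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With made-up scores for the three tokens "Kim visited Seoul", for example `node_scores = [{"PER": 0.9, "O": 0.1}, {"O": 0.8}, {"LOC": 0.7, "ORG": 0.2}]` and `edge_scores = {("PER", "O"): 0.5, ("O", "LOC"): 0.4}`, the function returns `["PER", "O", "LOC"]`. Because the graph is acyclic, a single left-to-right pass is enough to find the best path.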
Pages: 571 / +
Page count: 2