Named Entity Recognition for Tibetan Texts Using Case-auxiliary Grammars

Cited by: 0
Authors
Yu, Hongzhi [1]
Jiang, Tao [1]
Ma, Ning [1]
Affiliations
[1] NW Univ Nationalities, Key Lab Natl Languages Informat Technol, Lanzhou 730030, Gansu, Peoples R China
Source
INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS (IMECS 2010), VOLS I-III | 2010
Keywords
named entity recognition; case-auxiliary words; Tibetan texts; name lexicon
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Tibetan named entity recognition, an important part of multilingual processing, is one of the challenging tasks in text mining. In this paper, we present a system for recognizing Tibetan person names using a rule-based model built on case-auxiliary words and a name lexicon. We also use a list of boundary information derived statistically from a large corpus to improve recognition. Our experiments show that recall and precision are 90.13% and 94.02%, respectively, on the newspaper corpus, and 85.67% and 88.20% on the website text.
Pages: 601-604
Page count: 4