Named Entity Recognition in Vietnamese Text Using Label Propagation

Cited by: 0
Authors
Huong Thanh Le [1]
Rathany Chan Sam [1]
Hoan Cong Nguyen [1]
Thuy Thanh Nguyen [2]
Affiliations
[1] Hanoi Univ Sci & Technol, Hanoi, Vietnam
[2] Univ Engn & Technol, Hanoi, Vietnam
Source
2013 INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR) | 2013
Keywords
Named entity recognition; labeled propagation; semisupervised learning; words similarity;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents our named entity recognition system for Vietnamese text, which uses label propagation. We propose: (i) a method for choosing noun phrases as named entity candidates; (ii) a method for measuring word similarity; and (iii) a method for decreasing the effect of high-frequency labels in labeled documents. Experimental results show that our label propagation method achieves higher accuracy than the earlier one [12]. In addition, when the amount of labeled data is small, its accuracy is higher than that of conditional random fields.
Pages: 366-370
Page count: 5
Related papers
14 in total
[1] Latent Dirichlet allocation [J]. Blei, DM; Ng, AY; Jordan, MI. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5): 993-1022
[2] Bontcheva K., 2002, P TALN 2002 WORKSH N
[3] Borthwick A., 1999, THESIS
[4] Cao T.H., J NEW GENERATION COM, V25, P277
[5] Chen JX, 2006, COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE, P129
[6] Ghahramani Z., CMUCALD02107
[7] Unsupervised learning by probabilistic latent semantic analysis [J]. Hofmann, T. MACHINE LEARNING, 2001, 42 (1-2): 177-196
[8] Liao W., 2009, P NAACL HLT WORKSH S, P28
[9] McCallum A., 2003, Proceedings of CoNLL, P188
[10] Mohit B., 2005, Proceedings of the ACL 2005 on Interactive poster and demonstration sessions, P57