Research on Chinese named entity recognition using combined boundary-PoS feature

Cited by: 0
Authors
Qiang, Bao-Hua [1]
Huang, Jun [1]
Wang, Yu-Feng [2]
Wang, Sai [1]
Wang, Yong [1]
Affiliations
[1] Guilin Univ Elect Technol, Coll Comp Sci Engn, Guilin, Guangxi, Peoples R China
[2] China Elect Technol Grp Corp, Res Inst 54, Shijiazhuang 050000, Peoples R China
Source
DESIGN, MANUFACTURING AND MECHATRONICS (ICDMM 2015) | 2016
Keywords
Named entity recognition; conditional random field; Part of Speech (PoS); word boundary; boundary-PoS feature; feature templates
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Named entity recognition (NER) for complex organization names (ORG), location names (LOC), and person names (PER) is a difficult problem in natural language processing. We propose a novel six-tag method that combines word-boundary and part-of-speech (PoS) information into a single feature (6-tag-boundary-PoS) to improve the recognition rate of NER based on conditional random fields. Large-scale open tests on a real corpus indicate that the 6-tag-boundary-PoS feature effectively shortens training time. The F1 value, precision (P), and recall (R) are 98.25%, 97.82%, and 98.68% for ORG and 91.81%, 89.76%, and 93.97% for LOC, respectively.
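The reported scores can be sanity-checked with the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the paper: the function name `f1_score` and the `reported` dictionary are hypothetical, and the assignment of each percentage to F1, P, or R assumes the order given in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract, in percent. The F1/P/R ordering is an
# assumption based on the order in which the abstract lists them.
reported = {
    "ORG": {"P": 97.82, "R": 98.68, "F1": 98.25},
    "LOC": {"P": 89.76, "R": 93.97, "F1": 91.81},
}

for entity, scores in reported.items():
    computed = f1_score(scores["P"], scores["R"])
    print(f"{entity}: computed F1 = {computed:.2f}, reported F1 = {scores['F1']:.2f}")
```

Under that ordering, the recomputed F1 values should agree with the reported ones up to rounding of the published precision and recall figures.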
Pages: 839-848
Number of pages: 10