Using Non-Local Features to Improve Named Entity Recognition Recall

Cited by: 0
Authors
Mao, Xinnian [1]
Xu, Wei [1]
Dong, Yuan [1, 2]
He, Saike [2]
Wang, Haila [1]
Affiliations
[1] France Telecom R&D Ctr Beijing, Beijing 100080, Peoples R China
[2] Univ Posts & Telecommun, Beijing 100876, Peoples R China
Source
PACLIC 21: THE 21ST PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION, PROCEEDINGS | 2007
Keywords
Named Entity Recognition; Non-local Feature; Conditional Random Field;
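DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract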
Named Entity Recognition (NER) is limited by its lower recall, which results from an asymmetric data distribution in which the NONE class dominates the entity classes. This paper presents an approach that exploits non-local information to improve NER recall. Several kinds of non-local features encoding entity token occurrence, entity boundary and entity class are explored under the Conditional Random Fields (CRFs) framework. Experiments on the SIGHAN 2006 MSRA (CityU) corpus indicate that non-local features can effectively enhance the recall of state-of-the-art NER systems. When the non-local features are incorporated into NER systems that use local features alone, our best system achieves a 23.56% (25.26%) relative error reduction on recall and a 17.10% (11.36%) relative error reduction on the F1 score; the improved F1 score of 89.38% (90.09%) is significantly superior to that of the best NER system that participated in the closed track, which had an F1 of 86.51% (89.03%).
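The "relative error reduction" figures above can be read with a short calculation. The sketch below assumes the common definition, in which the error is 1 minus the score and the reduction is the fraction of the baseline error that was removed; the abstract does not spell out its definition, so this is an assumption. The function names are hypothetical, and the baseline F1 it prints is only implied by that assumption, not reported in the abstract.

```python
def relative_error_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error (1 - score) removed by the improved system.

    Assumed definition; the abstract does not state the formula it uses.
    """
    baseline_error = 1.0 - baseline
    improved_error = 1.0 - improved
    return (baseline_error - improved_error) / baseline_error


def implied_baseline(improved: float, reduction: float) -> float:
    """Invert the definition: the baseline score that gives `improved` after `reduction`."""
    improved_error = 1.0 - improved
    return 1.0 - improved_error / (1.0 - reduction)


# MSRA figures quoted in the abstract: improved F1 = 89.38%,
# relative error reduction on F1 = 17.10%.
msra_baseline_f1 = implied_baseline(0.8938, 0.1710)
print(f"Implied MSRA local-feature baseline F1: {msra_baseline_f1:.4f}")

# Round trip: feeding the implied baseline back in recovers the stated reduction.
print(f"{relative_error_reduction(msra_baseline_f1, 0.8938):.4f}")
```

Under this assumed definition, the MSRA numbers imply a local-feature baseline F1 of roughly 87.2%; the same inversion can be applied to the CityU figures in parentheses.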
Pages: 303 / +
Page count: 2