Domain Adaptation for Named Entity Recognition Using CRFs

Cited by: 0
Authors
Tian, Tian [1, 2]
Dinarelli, Marco [1 ]
Tellier, Isabelle [1 ]
Cardoso, Pedro Dias [2 ]
Affiliations
[1] Univ Sorbonne Nouvelle Paris 3, USPC, PSL Res Univ, LaTTiCe, UMR 8094, CNRS, ENS Paris, 1 Maurice Arnoux, F-92120 Montrouge, France
[2] Synthesio, 8-10 Rue Villedo, F-75001 Paris, France
Source
LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2016
Keywords
Domain Adaptation; Social Media; CRFs; Machine Learning;
DOI
Not available
Chinese Library Classification number
H [Language, Writing]
Discipline classification code
05
Abstract
In this paper we explain how we created a labelled English corpus for a Named Entity Recognition (NER) task from multi-source, multi-domain data for an industrial partner. We illustrate the specificities of this corpus with examples and describe some baseline experiments. We present domain adaptation results on this corpus using a labelled Twitter corpus (Ritter et al., 2011). We tested a semi-supervised method from Garcia-Fernandez et al. (2014) combined with a supervised domain adaptation approach proposed by Raymond and Fayolle (2010) in machine learning experiments with CRFs (Conditional Random Fields). We use the same technique to improve the NER results on the Twitter corpus (Ritter et al., 2011). Our contributions thus consist of the creation of an industrial corpus and improvements in NER performance.
Pages: 561-565
Page count: 5
Related papers
17 in total
  • [1] [Anonymous], 2010, P ACL
  • [2] [Anonymous], 2007, ACL
  • [3] [Anonymous], 2011, P 2011 C EMPIRICAL M
  • [4] [Anonymous], 2011, P ACL
  • [5] Arnold A., 2008, Proceedings of ACL-08: HLT, P245
  • [6] Blitzer J., 2006, P EMNLP
  • [7] Foster J., 2011, Proceedings of the Workshop on Analyzing Microtext (AAAI 2011), P20
  • [8] Freitag D., 2004, C EMPIRICAL METHODS, P262
  • [9] Garcia-Fernandez A., 2014, P 9 INT C LANG RES E
  • [10] Guo H., 2009, P HUM LANG TECHN 200, P281