CycleNER: An Unsupervised Training Approach for Named Entity Recognition

Cited by: 19
Authors
Iovine, Andrea [1]
Fang, Anjie [2]
Fetahu, Besnik [2]
Rokhlenko, Oleg [2]
Malmasi, Shervin [2]
Affiliations
[1] Univ Bari Aldo Moro, Bari, Italy
[2] Amazon.com Inc, Bellevue, WA USA
Source
PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22) | 2022
Keywords
natural language processing; named entity recognition; cycle-consistency training; unsupervised training
DOI
10.1145/3485447.3512012
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Named Entity Recognition (NER) is a crucial natural language understanding task for many downstream tasks such as question answering and retrieval. Despite significant progress in developing NER models for multiple languages and domains, scaling to emerging and/or low-resource domains remains challenging, due to the costly nature of acquiring training data. We propose CycleNER, an unsupervised approach based on cycle-consistency training that uses two functions: (i) sentence-to-entity (S2E) and (ii) entity-to-sentence (E2S), to carry out the NER task. CycleNER does not require annotations, only a set of sentences with no entity labels and another independent set of entity examples. Through cycle-consistency training, the output from one function is used as input for the other (e.g. S2E → E2S) to align the representation spaces of both functions and therefore enable unsupervised training. Evaluation on several domains comparing CycleNER against supervised and unsupervised competitors shows that CycleNER achieves highly competitive performance with only a few thousand input sentences. We demonstrate competitive performance against supervised models, achieving 73% of supervised performance without any annotations on CoNLL03, while significantly outperforming unsupervised approaches.
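The data flow of the two cycles described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_s2e`, `toy_e2s`, and `token_overlap_loss` are hypothetical stand-ins (a capitalization heuristic, a fixed template, and a Jaccard-based score) chosen only so that the example runs without any trained model. In the approach the abstract describes, S2E and E2S are learned functions and the reconstruction signal is used for training.

```python
from typing import Callable, List

Sentence = str
Entities = List[str]


def toy_s2e(sentence: Sentence) -> Entities:
    """Hypothetical stand-in for S2E: treat capitalized tokens as entities."""
    return [tok for tok in sentence.split() if tok[:1].isupper()]


def toy_e2s(entities: Entities) -> Sentence:
    """Hypothetical stand-in for E2S: join entities with a fixed template."""
    return " and ".join(entities)


def token_overlap_loss(original: str, reconstructed: str) -> float:
    """Return 1 - Jaccard overlap of token sets (0.0 means identical sets)."""
    a, b = set(original.split()), set(reconstructed.split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def sentence_cycle_loss(
    sentence: Sentence,
    s2e: Callable[[Sentence], Entities],
    e2s: Callable[[Entities], Sentence],
) -> float:
    """Sentence cycle: sentence -> S2E -> entities -> E2S -> sentence'."""
    return token_overlap_loss(sentence, e2s(s2e(sentence)))


def entity_cycle_loss(
    entities: Entities,
    s2e: Callable[[Sentence], Entities],
    e2s: Callable[[Entities], Sentence],
) -> float:
    """Entity cycle: entities -> E2S -> sentence -> S2E -> entities'."""
    return token_overlap_loss(" ".join(entities), " ".join(s2e(e2s(entities))))


if __name__ == "__main__":
    # Unlabeled sentence and an independent entity example, as in the abstract.
    print(sentence_cycle_loss("Anna visited Rome", toy_s2e, toy_e2s))
    print(entity_cycle_loss(["Paris", "Amazon"], toy_s2e, toy_e2s))
```

Each cycle needs only one kind of input (raw sentences or entity examples), which is why no annotated sentence-entity pairs are required; a low reconstruction loss in both directions indicates that the two functions agree with each other.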
Pages: 2916-2924
Page count: 9
Related papers
50 in total
  • [21] Named Entity Recognition for Sinhala Language
    Dahanayaka, J. K.
    Weerasinghe, A. R.
    14TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER) 2014, 2014, : 215 - 220
  • [22] Named Entity Recognition Based on Reinforcement Learning and Adversarial Training
    Peng, Shi
    Zhang, Yong
    Yu, Yuanfang
    Zuo, Haoyang
    Zhang, Kai
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2021, 12815 : 191 - 202
  • [23] Adversarial training for named entity recognition of rail fault text
    Qu, J.
    Su, S.
    Li, R.
    Wang, G.
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 1353 - 1358
  • [24] On the Use of Parsing for Named Entity Recognition
    Alonso, Miguel A.
    Gomez-Rodriguez, Carlos
    Vilares, Jesus
    APPLIED SCIENCES-BASEL, 2021, 11 (03): : 1 - 24
  • [25] Hungarian named entity recognition with a maximum entropy approach
    Varga, Daniel
    Simon, Eszter
    ACTA CYBERNETICA, 2007, 18 (02): : 293 - 301
  • [26] Fighting Against the Repetitive Training and Sample Dependency Problem in Few-Shot Named Entity Recognition
    Tian, Chang
    Yin, Wenpeng
    Li, Dan
    Moens, Marie-Francine
    IEEE ACCESS, 2024, 12 : 37600 - 37614
  • [27] Nested Named Entity Recognition: A Survey
    Wang, Yu
    Tong, Hanghang
    Zhu, Ziye
    Li, Yun
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (06)
  • [28] Self-Training With Double Selectors for Low-Resource Named Entity Recognition
    Fu, Yingwen
    Lin, Nankai
    Yu, Xiaohui
    Jiang, Shengyi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1265 - 1275
  • [29] NAMED ENTITY RECOGNITION FOR POLISH
    Marcinczuk, Michal
    Wawer, Aleksander
    POZNAN STUDIES IN CONTEMPORARY LINGUISTICS, 2019, 55 (02): : 239 - 269
  • [30] A Hybrid Method for Persian Named Entity Recognition
    Ahmadi, Farid
    Moradi, Hamed
    2015 7th Conference on Information and Knowledge Technology (IKT), 2015,