Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model

Cited by: 0
Authors
Kulkarni, Mayank [2]
Preotiuc-Pietro, Daniel [1]
Radhakrishnan, Karthik [1]
Winata, Genta Indra [1]
Wu, Shijie [1]
Xie, Lingjue [1]
Yang, Shaohua [1]
Affiliations
[1] Bloomberg, New York, NY 10022 USA
[2] Amazon Alexa AI, Boston, MA USA
Source
17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 | 2023
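Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract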
Named Entity Recognition is a key Natural Language Processing task whose performance is sensitive to choice of genre and language. A unified NER model across multiple genres and languages is more practical and efficient through leveraging commonalities across genres or languages. In this paper, we propose a novel setup for NER which includes multi-domain and multilingual training and evaluation across 13 domains and 4 languages. We explore a range of approaches to building a unified model using domain and language adaptation techniques. Our experiments highlight multiple nuances to consider while building a unified model, including that naive data pooling fails to obtain good performance, that domain-specific adaptations are more important than language-specific ones and that including domain-specific adaptations in a unified model can reach performance close to training multiple dedicated monolingual models at a fraction of their parameter count.
Pages: 2210-2219
Page count: 10