Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model

Citations: 0
Authors
Kulkarni, Mayank [2]
Preotiuc-Pietro, Daniel [1]
Radhakrishnan, Karthik [1]
Winata, Genta Indra [1]
Wu, Shijie [1]
Xie, Lingjue [1]
Yang, Shaohua [1]
Affiliations
[1] Bloomberg, New York, NY 10022 USA
[2] Amazon Alexa AI, Boston, MA USA
Source
17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023
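Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract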
Named Entity Recognition is a key Natural Language Processing task whose performance is sensitive to choice of genre and language. A unified NER model across multiple genres and languages is more practical and efficient through leveraging commonalities across genres or languages. In this paper, we propose a novel setup for NER which includes multi-domain and multilingual training and evaluation across 13 domains and 4 languages. We explore a range of approaches to building a unified model using domain and language adaptation techniques. Our experiments highlight multiple nuances to consider while building a unified model, including that naive data pooling fails to obtain good performance, that domain-specific adaptations are more important than language-specific ones and that including domain-specific adaptations in a unified model can reach performance close to training multiple dedicated monolingual models at a fraction of their parameter count.
Pages: 2210-2219
Page count: 10