TMD-NER: Turkish multi-domain named entity recognition for informal texts

Cited by: 1
Authors
Yilmaz, Selim F. [1]
Mutlu, Furkan B. [2]
Balaban, Ismail [3]
Kozat, Suleyman S. [2]
Affiliations
[1] Imperial Coll London, Dept Elect & Elect Engn, London, England
[2] Bilkent Univ, Dept Elect & Elect Engn, Ankara, Turkiye
[3] Middle East Tech Univ, Dept Stat, Ankara, Turkiye
Keywords
Named entity recognition; Turkish language; Bidirectional long short-term memory; Conditional random fields
DOI
10.1007/s11760-023-02898-0
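Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract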
We examine named entity recognition (NER), an essential and commonly used first step in many natural language processing tasks, including chatbots and language translation. We focus on the application of NER to texts that have a lot of noise, such as tweets, which is difficult due to the casual and unstructured language often used in these mediums. In this study, we make use of the largest available labeled data sets for Turkish NER, specifically targeting three informal platforms, namely Twitter, Facebook and Donanimhaber. We choose Turkish as a representative agglutinative language, which has a significantly different structure than other well-known languages such as English, French, and German. We emphasize that the methodologies and insights gained from this study can be extended to other agglutinative languages, like Finnish, Hungarian, Japanese, and Korean. We apply NER to these datasets using 16 different named entity tags through a framework that employs bidirectional long short-term memory (BiLSTM) networks followed by conditional random fields (CRF), known together as the BiLSTM-CRF model. Our experiments show an F1 score of 84% on a combined dataset, which indicates that deep learning models can also be effectively used for business applications in informal settings in agglutinative languages such as Turkish.
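As an illustration of the decoding step in a BiLSTM-CRF tagger of the kind the abstract describes, the following is a minimal sketch of Viterbi decoding over per-token tag scores and tag-to-tag transition scores. It is not the authors' implementation: the three-tag set, the hand-written score values, and the function name `viterbi_decode` are all assumptions made for the example, and in a real model the emission scores would come from the BiLSTM and the transition scores would be learned by the CRF layer.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][k]   : score of tag k at token t (toy numbers here; a
                        BiLSTM would normally produce them).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(transitions)
    # best[k] = best score of any path that ends in tag k at the current token
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # Previous tag that maximises the path score into tag j.
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(range(num_tags), key=lambda k: best[k])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Hypothetical 3-tag subset; the paper itself uses 16 entity tags.
TAGS = ["O", "B-PER", "I-PER"]
FORBIDDEN = -1e4  # large negative score standing in for "not allowed"
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O: an I-PER may not follow O
    [0.0, 0.0, 0.0],        # from B-PER
    [0.0, 0.0, 0.0],        # from I-PER
]
emissions = [
    [2.0, 0.5, 0.0],  # token 0: O scores highest
    [0.2, 1.0, 1.5],  # token 1: I-PER scores highest on its own
    [0.1, 0.3, 2.0],  # token 2: I-PER scores highest
]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)
```

With these toy scores, picking the best tag per token independently would give O, I-PER, I-PER, which starts an entity with an inside tag. The transition scores steer the decoder to O, B-PER, I-PER instead, which is the kind of sequence-level consistency a CRF layer is meant to add on top of per-token scores.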
Pages: 2255 - 2263
Number of pages: 9
Related papers
50 in total
  • [1] TMD-NER: Turkish multi-domain named entity recognition for informal texts
    Selim F. Yilmaz
    Furkan B. Mutlu
    Ismail Balaban
    Suleyman S. Kozat
    Signal, Image and Video Processing, 2024, 18 : 2255 - 2263
  • [2] Named Entity Recognition Experiments on Turkish Texts
    Kuecuek, Dilek
    Yazici, Adnan
    FLEXIBLE QUERY ANSWERING SYSTEMS: 8TH INTERNATIONAL CONFERENCE, FQAS 2009, 2009, 5822 : 524 - 535
  • [3] Multi-domain evaluation framework for named entity recognition tools
    Abdallah, Zahraa S.
    Carman, Mark
    Haffari, Gholamreza
    COMPUTER SPEECH AND LANGUAGE, 2017, 43 : 34 - 55
  • [4] Towards a Unified Multi-Domain Multilingual Named Entity Recognition Model
    Kulkarni, Mayank
    Preotiuc-Pietro, Daniel
    Radhakrishnan, Karthik
    Winata, Genta Indra
    Wu, Shijie
    Xie, Lingjue
    Yang, Shaohua
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 2210 - 2219
  • [5] Named-entity recognition in Turkish legal texts
    Cetindag, Can
    Yazicioglu, Berkay
    Koc, Aykut
    NATURAL LANGUAGE ENGINEERING, 2023, 29 (03) : 615 - 642
  • [6] KIND: an Italian Multi-Domain Dataset for Named-Entity Recognition
    Paccosi, Teresa
    Aprosio, Alessio Palmero
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 501 - 507
  • [7] TeluguNER: Leveraging Multi-Domain Named Entity Recognition with Deep Transformers
    Duggenpudi, Suma Reddy
    Oota, Subba Reddy
    Marreddy, Mounika
    Mamidi, Radhika
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): STUDENT RESEARCH WORKSHOP, 2022, : 262 - 272
  • [8] Multi-domain adaptation for named entity recognition with multi-aspect relevance learning
    Li, Jiarui
    Liu, Jian
    Chen, Yufeng
    Xu, Jinan
    LANGUAGE RESOURCES AND EVALUATION, 2023, 57 (02) : 803 - 818
  • [9] Named Entity Recognition (NER) for Nepali
    Maharjan, Gopal
    Bal, Bal Krishna
    Regmi, Santosh
    CREATIVITY IN INTELLIGENT TECHNOLOGIES AND DATA SCIENCE, PT II, 2019, 1084 : 71 - 80