A Benchmark Dataset and a Framework for Urdu Multimodal Named Entity Recognition

Times cited: 0
Authors
Ahmad, Hussain [1]
Zeng, Qingyang [1]
Wan, Jing [1]
Affiliations
[1] Beijing University of Chemical Technology, College of Information Science and Technology, Beijing 100029, People's Republic of China
Keywords
Visualization; Named entity recognition; Benchmark testing; Translation; Annotations; Adaptation models; Social networking (online); Image resolution; Data models; Data collection; Multimodal named entity recognition; Urdu; social media; Urdu multimodal NER dataset; cross-modal attention
DOI
10.1109/ACCESS.2025.3576784
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of multimodal content, particularly text and images on social media, has positioned Multimodal Named Entity Recognition (MNER) as an increasingly important area of research within Natural Language Processing. Despite progress in high-resource languages such as English, MNER remains underexplored for low-resource languages like Urdu. The primary challenges include the scarcity of annotated multimodal datasets and the lack of standardized baselines. To address these challenges, we introduce the U-MNER framework and release the Twitter2015-Urdu dataset, a pioneering resource for Urdu MNER. Adapted from the widely used Twitter2015 dataset, it is annotated with Urdu-specific grammar rules. We establish benchmark baselines by evaluating both text-based and multimodal models on this dataset, providing comparative analyses to support future research on Urdu MNER. The U-MNER framework integrates textual and visual context using Urdu-BERT for text embeddings and ResNet for visual feature extraction, with a Cross-Modal Fusion Module to align and fuse information. Our model achieves state-of-the-art performance on the Twitter2015-Urdu dataset, laying the groundwork for further MNER research in low-resource languages.
Pages: 100904-100919
Number of pages: 16
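
The abstract describes fusing Urdu-BERT text embeddings with ResNet visual features through a Cross-Modal Fusion Module, but it does not specify that module's internals. As a rough illustration only, the sketch below assumes scaled dot-product cross-attention in which each text token attends over image regions, followed by simple concatenation. The function names, the toy vectors (standing in for Urdu-BERT and ResNet outputs), and the shared dimensionality are all assumptions and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(text_vecs, image_vecs):
    """Hypothetical fusion step: text tokens are queries, image regions are keys and values.

    Returns one fused vector per token: the token embedding concatenated
    with its attention-weighted image context.
    """
    dim = len(text_vecs[0])
    scale = math.sqrt(dim)
    fused = []
    for query in text_vecs:
        # Similarity between this token and every image region.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in image_vecs]
        weights = softmax(scores)
        # Weighted average of the image-region features.
        context = [
            sum(w * region[i] for w, region in zip(weights, image_vecs))
            for i in range(dim)
        ]
        fused.append(query + context)  # concatenation (assumed fusion choice)
    return fused


# Toy inputs: 2 token embeddings and 3 image-region features, all of dimension 4.
text_vecs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
image_vecs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
fused = cross_modal_attention(text_vecs, image_vecs)
print(len(fused), len(fused[0]))  # 2 8
```

In a full MNER pipeline the fused token vectors would typically feed a sequence-labeling layer; that stage is omitted here because the record does not describe it.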