ParsBERT: Transformer-based Model for Persian Language Understanding

Cited by: 74
Authors
Farahani, Mehrdad [1 ]
Gharachorloo, Mohammad [2 ]
Farahani, Marzieh [3 ]
Manthouri, Mohammad [4 ]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, North Tehran Branch, Tehran, Iran
[2] Queensland Univ Technol, Sch Elect Engn & Robot, Brisbane, Qld, Australia
[3] Umea Univ, Dept Comp Sci, Umea, Sweden
[4] Shahed Univ, Dept Elect & Elect Engn, Tehran, Iran
Keywords
Persian; Transformers; BERT; Language Models; NLP; NLU;
DOI
10.1007/s11063-021-10528-4
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The surge of pre-trained language models has ushered in a new era in the field of Natural Language Processing (NLP) by enabling the construction of powerful language models. Among these, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, such models usually focus on English, leaving other languages to multilingual models trained on limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which achieves state-of-the-art performance compared to other architectures and to multilingual models. In addition, since the amount of data available for NLP tasks in Persian is very limited, a massive dataset is composed both for several downstream NLP tasks and for pre-training the model. ParsBERT obtains higher scores on all datasets, both existing and newly gathered ones, and improves the state of the art by outperforming multilingual BERT as well as prior works on Sentiment Analysis, Text Classification, and Named Entity Recognition tasks.
Pages: 3831-3847
Page count: 17