ParsBERT: Transformer-based Model for Persian Language Understanding

Cited by: 72
Authors
Farahani, Mehrdad [1 ]
Gharachorloo, Mohammad [2 ]
Farahani, Marzieh [3 ]
Manthouri, Mohammad [4 ]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, North Tehran Branch, Tehran, Iran
[2] Queensland Univ Technol, Sch Elect Engn & Robot, Brisbane, Qld, Australia
[3] Umea Univ, Dept Comp Sci, Umea, Sweden
[4] Shahed Univ, Dept Elect & Elect Engn, Tehran, Iran
Keywords
Persian; Transformers; BERT; Language Models; NLP; NLU;
DOI
10.1007/s11063-021-10528-4
CLC (Chinese Library Classification) Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The surge of pre-trained language models has ushered in a new era in Natural Language Processing (NLP) by allowing us to build powerful language models. Among these, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models usually focus on English, leaving other languages to multilingual models trained on limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which achieves state-of-the-art performance compared with other architectures and with multilingual models. In addition, since the data available for Persian NLP tasks are very limited, a large corpus is compiled both for pre-training the model and for several downstream NLP tasks. ParsBERT obtains higher scores on all datasets, both existing and newly gathered ones, and advances the state of the art by outperforming multilingual BERT and prior work on Sentiment Analysis, Text Classification, and Named Entity Recognition.
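As a quick illustration of how a monolingual checkpoint like ParsBERT is typically consumed, the minimal sketch below loads it for masked-token prediction through the HuggingFace transformers library. The checkpoint ID HooshvareLab/bert-base-parsbert-uncased is an assumption based on the public release of the model, not something stated in this record; substitute whichever ParsBERT checkpoint you actually use.

# Minimal sketch: masked-token prediction with ParsBERT via HuggingFace
# transformers. The model ID below is assumed from the public release.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="HooshvareLab/bert-base-parsbert-uncased",  # assumed checkpoint ID
)

# Persian sentence with one masked token
# ("I am interested in the Persian [MASK].").
for prediction in fill_mask("من به [MASK] فارسی علاقه دارم."):
    # Each prediction is a dict with the filled token and its probability.
    print(prediction["token_str"], round(prediction["score"], 3))

The same checkpoint can be fine-tuned for the downstream tasks the abstract mentions (sentiment analysis, text classification, NER) by swapping in the corresponding task head.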
Pages: 3831-3847
Page count: 17
Related Papers
50 records in total
[31]   WavFace: A Multimodal Transformer-Based Model for Depression Screening [J].
Flores, Ricardo ;
Tlachac, M. L. ;
Shrestha, Avantika ;
Rundensteiner, Elke A. .
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2025, 29 (05) :3632-3641
[32]   TRANSQL: A Transformer-based Model for Classifying SQL Queries [J].
Tahmasebi, Shirin ;
Payberah, Amir H. ;
Soylu, Ahmet ;
Roman, Dumitru ;
Matskin, Mihhail .
2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, :788-793
[33]   The Adaptability of a Transformer-Based OCR Model for Historical Documents [J].
Strobel, Phillip Benjamin ;
Hodel, Tobias ;
Boente, Walter ;
Volk, Martin .
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2023 WORKSHOPS, PT I, 2023, 14193 :34-48
[34]   Enriching Transformer-Based Embeddings for Emotion Identification in an Agglutinative Language: Turkish [J].
Uymaz, Hande Aka ;
Metin, Senem Kumova .
IT PROFESSIONAL, 2023, 25 (04) :67-73
[35]   A review on the applications of Transformer-based language models for nucleotide sequence analysis [J].
Ghosh, Nimisha ;
Santoni, Daniele ;
Saha, Indrajit ;
Felici, Giovanni .
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2025, 27 :1244-1254
[36]   Pre-training and Evaluating Transformer-based Language Models for Icelandic [J].
Daðason, Jón Friðrik ;
Loftsson, Hrafn .
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, :7386-7391
[37]   Enhancing Address Data Integrity using Transformer-Based Language Models [J].
Kurklu, Omer Faruk ;
Akagündüz, Erdem .
32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
[38]   CTRAN: CNN-Transformer-based network for natural language understanding [J].
Rafiepour, Mehrdad ;
Sartakhti, Javad Salimi .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
[39]   Testing Stimulus Equivalence in Transformer-Based Agents [J].
Carrillo, Alexis ;
Betancort, Moises .
FUTURE INTERNET, 2024, 16 (08)
[40]   Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling [J].
Salutari, Flavia ;
Ramos, Jerome ;
Rahmani, Hossein A. ;
Linguaglossa, Leonardo ;
Lipani, Aldo .
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT I, 2023, 13935 :532-543