ParsBERT: Transformer-based Model for Persian Language Understanding

Cited by: 64
Authors
Farahani, Mehrdad [1 ]
Gharachorloo, Mohammad [2 ]
Farahani, Marzieh [3 ]
Manthouri, Mohammad [4 ]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, North Tehran Branch, Tehran, Iran
[2] Queensland Univ Technol, Sch Elect Engn & Robot, Brisbane, Qld, Australia
[3] Umea Univ, Dept Comp Sci, Umea, Sweden
[4] Shahed Univ, Dept Elect & Elect Engn, Tehran, Iran
Keywords
Persian; Transformers; BERT; Language Models; NLP; NLU
DOI
10.1007/s11063-021-10528-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The surge of pre-trained language models has ushered in a new era in Natural Language Processing (NLP) by enabling powerful language models. Among these, Transformer-based models such as BERT have become increasingly popular owing to their state-of-the-art performance. However, such models are usually developed for English, leaving other languages to multilingual models trained on limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which achieves state-of-the-art performance compared to other architectures and to multilingual models. Moreover, since the data available for NLP tasks in Persian are very limited, a large dataset is composed for several NLP tasks as well as for pre-training the model. ParsBERT obtains higher scores on all datasets, both pre-existing and newly composed ones, and advances the state of the art by outperforming multilingual BERT and prior works on Sentiment Analysis, Text Classification, and Named Entity Recognition tasks.
Pages: 3831-3847
Page count: 17
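
For readers who want to try the model the abstract describes, a minimal sketch of masked-token prediction with ParsBERT via the Hugging Face transformers library follows. The hub identifier HooshvareLab/bert-base-parsbert-uncased is an assumption (it is not stated in this record) and should be verified before use.

from transformers import pipeline

# Load ParsBERT for masked-token prediction.
# NOTE: the model ID below is an assumption, not taken from this record.
fill_mask = pipeline("fill-mask", model="HooshvareLab/bert-base-parsbert-uncased")

# Persian for "Tehran is the capital of [MASK]."; the model should rank
# the token for "Iran" among its top predictions.
for prediction in fill_mask("تهران پایتخت [MASK] است."):
    print(prediction["token_str"], prediction["score"])
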
Related Papers
50 records in total
  • [21] SignNet II: A Transformer-Based Two-Way Sign Language Translation Model
    Chaudhary, Lipisha
    Ananthanarayana, Tejaswini
    Hoq, Enjamamul
    Nwogu, Ifeoma
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 12896 - 12907
  • [22] Transformer-based Approaches for Personality Detection using the MBTI Model
    Lazo Vasquez, Ricardo
    Ochoa-Luna, Jose
    2021 XLVII LATIN AMERICAN COMPUTING CONFERENCE (CLEI 2021), 2021,
  • [23] Bringing order into the realm of Transformer-based language models for artificial intelligence and law
    Greco, Candida M.
    Tagarelli, Andrea
    ARTIFICIAL INTELLIGENCE AND LAW, 2024, 32 (04) : 863 - 1010
  • [24] AMMU: A survey of transformer-based biomedical pretrained language models
    Kalyan, Katikapalli Subramanyam
    Rajasekharan, Ajit
    Sangeetha, Sivanesan
    JOURNAL OF BIOMEDICAL INFORMATICS, 2022, 126
  • [25] Pre-trained transformer-based language models for Sundanese
    Wongso, Wilson
    Lucky, Henry
    Suhartono, Derwin
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [27] Transformer-Based Composite Language Models for Text Evaluation and Classification
    Skoric, Mihailo
    Utvic, Milos
    Stankovic, Ranka
    MATHEMATICS, 2023, 11 (22)
  • [28] TMD-BERT: A Transformer-Based Model for Transportation Mode Detection
    Drosouli, Ifigenia
    Voulodimos, Athanasios
    Mastorocostas, Paris
    Miaoulis, Georgios
    Ghazanfarpour, Djamchid
    ELECTRONICS, 2023, 12 (03)
  • [29] TRANSQL: A Transformer-based Model for Classifying SQL Queries
    Tahmasebi, Shirin
    Payberah, Amir H.
    Soylu, Ahmet
    Roman, Dumitru
    Matskin, Mihhail
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 788 - 793
  • [30] Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures
    Wongso, Wilson
    Setiawan, David Samuel
    Suhartono, Derwin
    13TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS 2021), 2021, : 29 - 35