Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited: 17
Authors
Kowsher, M. [1]
Sami, Abdullah A. S. [2]
Prottasha, Nusrat Jahan [3]
Arefin, Mohammad Shamsul [3,4]
Dhar, Pranab Kumar [4]
Koshiba, Takeshi [5]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022, Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer
DOI
10.1109/ACCESS.2022.3197662
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification
0812
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling us to create powerful language models. Among these, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, these models rely heavily on high-resource languages, relegating other languages to multilingual models such as mBERT. Two fundamental shortcomings of mBERT become significantly more serious for a resource-constrained language like Bangla: it was trained on a limited, curated dataset, and its weights are shared across all covered languages. Moreover, research on other languages suggests that a language-specific BERT model will outperform a multilingual one. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on BanglaLM, the largest Bangla language-modeling dataset, which we constructed from 40 GB of text data. Bangla-BERT achieves the highest results on all datasets and substantially improves the state of the art in binary linguistic classification, multilabel extraction, and named entity recognition (NER), outperforming multilingual BERT and prior work. The pre-trained model is also compared against non-contextual models such as Bangla fastText and word2vec on the downstream tasks. Finally, the model is evaluated via transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF for NER, where Bangla-BERT again outperforms state-of-the-art methods. Assessed on the benchmark datasets BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, Bangla-BERT surpasses the prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
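The downstream evaluation the abstract describes follows the standard BERT fine-tuning recipe: load the pre-trained encoder, attach a task-specific classification head, and train on labelled Bangla text. The sketch below illustrates that recipe with Hugging Face Transformers; it is a minimal sketch, not the authors' released code, and the checkpoint path, example sentences, and hyperparameters (sequence length 128, learning rate 2e-5) are illustrative assumptions rather than details taken from the paper.

```python
# Minimal fine-tuning sketch for a monolingual Bangla BERT checkpoint
# on a binary classification task (e.g., fake-news detection).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "path/to/bangla-bert"  # hypothetical identifier; substitute the released weights

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy batch: two Bangla sentences (elided) with binary labels,
# standing in for a dataset such as BanFakeNews.
texts = ["<bangla sentence 1>", "<bangla sentence 2>"]
labels = torch.tensor([0, 1])

# Standard BERT-style preprocessing: subword tokenization with
# padding/truncation to a fixed maximum length.
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # passing labels makes the model return a loss
outputs.loss.backward()                  # one gradient step of fine-tuning
optimizer.step()
```

For the NER experiments, the same encoder would instead feed token-level representations into a hybrid head such as a BiLSTM-CRF; that wiring is task-specific and omitted here.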
Pages: 91855-91870
Page count: 16
Related Papers
50 items in total
  • [31] A Transformer-Based Thermal Surrogate Model for Cooling Control in Data Centers
    Zhou, Hanchen
    Mu, Ni
    Jia, Qing-Shan
IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (01) : 644 - 651
  • [32] Transformer-Based Music Language Modelling and Transcription
    Zonios, Christos
    Pavlopoulos, John
    Likas, Aristidis
PROCEEDINGS OF THE 12TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE, SETN 2022, 2022
  • [33] BERT, XLNet or RoBERTa: The Best Transfer Learning Model to Detect Clickbaits
    Rajapaksha, Praboda
    Farahbakhsh, Reza
    Crespi, Noel
    IEEE ACCESS, 2021, 9 : 154704 - 154716
  • [34] PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition
    Wang, Yuxin
    Xie, Hongtao
    Fang, Shancheng
    Xing, Mengting
    Wang, Jing
    Zhu, Shenggao
    Zhang, Yongdong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5585 - 5598
  • [35] Automatic summarization of cooking videos using transfer learning and transformer-based models
    P. M. Alen Sadique
    R. V. Aswiga
DISCOVER ARTIFICIAL INTELLIGENCE, 5 (1)
  • [36] Transfer Learning in Transformer-Based Demand Forecasting For Home Energy Management System
    Gokhale, Gargya
    Van Gompel, Jonas
    Claessens, Bert
    Develder, Chris
    PROCEEDINGS OF THE 10TH ACM INTERNATIONAL CONFERENCE ON SYSTEMS FOR ENERGY-EFFICIENT BUILDINGS, CITIES, AND TRANSPORTATION, BUILDSYS 2023, 2023, : 458 - 462
  • [37] Convolutional Transformer-Based Cross Subject Model for SSVEP-Based BCI Classification
    Liu, Jiawei
    Wang, Ruimin
    Yang, Yuankui
    Zong, Yuan
    Leng, Yue
    Zheng, Wenming
    Ge, Sheng
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (11) : 6581 - 6593
  • [38] Transformer-Based Language-Person Search With Multiple Region Slicing
    Li, Hui
    Xiao, Jimin
    Sun, Mingjie
    Lim, Eng Gee
    Zhao, Yao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1624 - 1633
  • [39] Enhancing performance of transformer-based models in natural language understanding through word importance embedding
    Hong, Seung-Kyu
    Jang, Jae-Seok
    Kwon, Hyuk-Yoon
    KNOWLEDGE-BASED SYSTEMS, 2024, 304
  • [40] TemproNet: A transformer-based deep learning model for seawater temperature prediction
    Chen, Qiaochuan
    Cai, Candong
    Chen, Yaoran
    Zhou, Xi
    Zhang, Dan
    Peng, Yan
    OCEAN ENGINEERING, 2024, 293