Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited by: 17
Authors
Kowsher, M. [1 ]
Sami, Abdullah A. S. [2 ]
Prottasha, Nusrat Jahan [3 ]
Arefin, Mohammad Shamsul [3 ,4 ]
Dhar, Pranab Kumar [4 ]
Koshiba, Takeshi [5 ]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer;
DOI
10.1109/ACCESS.2022.3197662
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Subject Classification Code
0812;
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling the creation of powerful language models. Among these, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, these models are trained predominantly on resource-rich languages, forcing other languages to rely on multilingual models (mBERT). Two fundamental shortcomings of mBERT become significantly more acute for a low-resource language like Bangla: it was trained on a limited, curated dataset, and its weights are shared with all the other covered languages. Moreover, research on other languages suggests that a language-specific BERT model will outperform multilingual ones. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on the largest Bangla language-model dataset, BanglaLM, which we constructed from 40 GB of text data. Bangla-BERT achieves the best results on all datasets and substantially improves the state of the art in binary linguistic classification, multilabel extraction, and named entity recognition, outperforming multilingual BERT and prior work. The pre-trained model is also assessed against several non-contextual models, such as Bangla fastText and word2vec, on the downstream tasks. Finally, the model is evaluated by transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF on NER, where Bangla-BERT again outperforms state-of-the-art methods. The proposed Bangla-BERT model is assessed on benchmark datasets, including BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, surpassing all prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
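For readers who want a concrete picture of the downstream transfer-learning setup the abstract describes, the sketch below fine-tunes a pre-trained Bangla BERT checkpoint for binary classification (a BanFakeNews-style task) with the Hugging Face transformers library. This is a minimal illustration, not the paper's code: the Hub id "Kowsher/bangla-bert", the placeholder inputs, and the label convention are assumptions; substitute the actual released checkpoint and dataset.

```python
# Minimal fine-tuning sketch for a binary classification downstream task,
# assuming the pre-trained model is available on the Hugging Face Hub.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "Kowsher/bangla-bert"  # assumed Hub id, for illustration only

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # e.g., real vs. fake news
)

# Toy batch: Bangla news texts would replace these placeholders.
texts = ["...", "..."]
labels = torch.tensor([0, 1])  # assumed convention: 0 = real, 1 = fake

batch = tokenizer(
    texts, padding=True, truncation=True, max_length=128, return_tensors="pt"
)

# One optimization step; a real run would loop over a DataLoader for epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
```

The same pattern extends to the paper's other downstream tasks: for NER, AutoModelForTokenClassification would replace the sequence-classification head, and a CRF or BiLSTM layer could be stacked on the encoder outputs as in the hybrid models the abstract mentions.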
Pages: 91855-91870
Number of pages: 16