Leveraging the meta-embedding for text classification in a resource-constrained language

Cited by: 14
Authors
Hossain, Md. Rajib [1]
Hoque, Mohammed Moshiul [1]
Siddique, Nazmul [2]
Affiliations
[1] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chittagong 4349, Bangladesh
[2] Ulster Univ, Sch Comp Engn & Intelligent Syst, Belfast, Northern Ireland
Keywords
Natural language processing; Text classification; Text corpora; Semantic feature extraction; Meta-embedding; Deep learning;
DOI
10.1016/j.engappai.2023.106586
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes an intelligent text classification framework for a resource-constrained language such as Bengali. Text classification in such a language is challenging because of the lack of standard corpora, appropriate hyperparameter tuning methods, and pre-trained language-specific embeddings. The proposed framework, called AVG-M+CNN, comprises an average meta-embedding feature fusion module and a convolutional neural network module. This work also proposes an algorithm for automatic hyperparameter tuning and selection to enhance the performance of the AVG-M+CNN technique. All meta-embedding models are evaluated using intrinsic evaluators (semantic, syntactic, and relatedness word similarity, and analogy tasks) and extrinsic evaluators. The intrinsic evaluator evaluates 200 Bengali semantic, syntactic and relatedness word pairs. Spearman (ρ), Pearson (r) and cosine similarity correlations are used to evaluate 18 individual embedding and 9 meta-embedding models. The 3COSADD and 3COSMUL evaluators evaluate the 300 analogy tasks. The extrinsic evaluator evaluates a total of 156 classification models on four corpora: BARD, IndicNLP, Prothom-Alo and BTCC 11 (a newly developed corpus having eleven distinct categories). Among these, the AVG-M+CNN model achieves the highest accuracy on all four Bengali corpora: 95.92 ± 0.001% for BARD, 93.10 ± 0.001% for Prothom-Alo, 90.07 ± 0.001% for BTCC 11 and 87.44 ± 0.001% for IndicNLP.
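The abstract names two building blocks that are easy to illustrate: averaging several source embeddings into one meta-embedding, and scoring analogies with 3COSADD and 3COSMUL. The sketch below is not the authors' implementation. It assumes the commonly used recipe of L2-normalizing each source vector, zero-padding to a common length, and averaging, and it uses the standard 3COSADD and 3COSMUL scoring rules with cosines shifted to [0, 1] for the multiplicative form. The function names, the toy English vocabulary, and the `eps` value are all hypothetical.

```python
import numpy as np

def unit(v):
    """L2-normalize a vector; a zero vector is returned unchanged."""
    v = np.asarray(v, dtype=float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def average_meta_embedding(vectors):
    """Normalize each source vector, zero-pad to the longest, then average."""
    dim = max(len(v) for v in vectors)
    padded = [np.pad(unit(v), (0, dim - len(v))) for v in vectors]
    return np.mean(padded, axis=0)

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(unit(u), unit(v)))

def solve_analogy(vocab, a, a_star, b, method="3cosadd", eps=1e-3):
    """Return the word b* that best completes a : a* :: b : b*."""
    best_word, best_score = None, -np.inf
    for word, vec in vocab.items():
        if word in (a, a_star, b):
            continue  # the three query words are excluded as candidates
        if method == "3cosadd":
            score = cos(vec, vocab[b]) - cos(vec, vocab[a]) + cos(vec, vocab[a_star])
        else:
            # 3COSMUL: shift cosines to [0, 1] so the ratio stays well defined
            def shifted(x, y):
                return (cos(x, y) + 1.0) / 2.0
            score = (shifted(vec, vocab[b]) * shifted(vec, vocab[a_star])
                     / (shifted(vec, vocab[a]) + eps))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Toy data (assumption): two hypothetical source embeddings per word.
source_a = {
    "man": [0, 1, 0], "woman": [0, 0, 1],
    "king": [1, 1, 0], "queen": [1, 0, 1], "apple": [1, 1, 1],
}
source_b = {w: [2 * x for x in v] for w, v in source_a.items()}

vocab = {w: average_meta_embedding([source_a[w], source_b[w]]) for w in source_a}

print(solve_analogy(vocab, "man", "woman", "king", method="3cosadd"))
print(solve_analogy(vocab, "man", "woman", "king", method="3cosmul"))
```

Averaging keeps the meta-embedding at the size of the largest source, whereas concatenation would grow it with every added source; that smaller footprint is one reason averaging is often chosen when resources are limited.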
Pages: 18