Leveraging the meta-embedding for text classification in a resource-constrained language

Cited by: 14

Authors:

Hossain, Md. Rajib [1]

Hoque, Mohammed Moshiul [1]

Siddique, Nazmul [2]

Affiliations:

[1] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chittagong 4349, Bangladesh

[2] Ulster Univ, Sch Comp Engn & Intelligent Syst, Belfast, North Ireland

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Vol. 124

Keywords:

Natural language processing; Text classification; Text corpora; Semantic feature extraction; Meta-embedding; Deep learning;

DOI:

10.1016/j.engappai.2023.106586

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper proposes an intelligent text classification framework for a resource-constrained language such as Bengali, a challenging task because of the lack of standard corpora, appropriate hyperparameter tuning methods, and pre-trained language-specific embeddings. The proposed framework, called AVG-M+CNN, comprises an average meta-embedding feature fusion module and a convolutional neural network module. This work also proposes an algorithm for automatic hyperparameter tuning and selection to enhance the performance of the AVG-M+CNN technique. All meta-embedding models are evaluated using intrinsic evaluators (semantic, syntactic and relatedness word similarity, and analogy tasks) and extrinsic evaluators. The intrinsic evaluator evaluates 200 Bengali semantic, syntactic and relatedness word pairs. Spearman (ρ), Pearson (r) and cosine similarity correlations are used to evaluate 18 individual embedding and 9 meta-embedding models. The 3COSADD and 3COSMUL evaluators evaluate the 300 analogy tasks. The extrinsic evaluator evaluates a total of 156 classification models on four corpora: BARD, IndicNLP, Prothom-Alo and BTCC11 (a newly developed corpus having eleven distinct categories). Among these, the AVG-M+CNN model achieves the highest accuracy on all four Bengali corpora: 95.92 ± 0.001% for BARD, 93.10 ± 0.001% for Prothom-Alo, 90.07 ± 0.001% for BTCC11 and 87.44 ± 0.001% for IndicNLP.
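
The abstract names two ideas that are easy to show on toy data: averaging several source embeddings into one meta-embedding, and scoring word analogies with 3COSADD and 3COSMUL. The sketch below illustrates those general techniques only and is not the authors' implementation. The zero-padding of shorter vectors, the L2 normalization before averaging, the shifting of cosines into [0, 1], the `eps` value, and all function names are assumptions made for the example.

```python
import numpy as np


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def average_meta_embedding(vectors):
    """Average several source embeddings of the same word.

    Assumption: each vector is L2-normalized and shorter vectors are
    zero-padded to the largest dimension before averaging.
    """
    dim = max(len(v) for v in vectors)
    padded = []
    for v in vectors:
        v = l2_normalize(np.asarray(v, dtype=float))
        padded.append(np.pad(v, (0, dim - len(v)), mode="constant"))
    return np.mean(padded, axis=0)


def cos(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def cos_add(a, a_star, b, cand):
    """3COSADD score of `cand` as the answer to a : a_star :: b : ?"""
    return cos(cand, b) - cos(cand, a) + cos(cand, a_star)


def cos_mul(a, a_star, b, cand, eps=1e-3):
    """3COSMUL score, with cosines shifted into [0, 1] (assumed convention)."""
    def shifted(u, v):
        return (cos(u, v) + 1.0) / 2.0
    return shifted(cand, b) * shifted(cand, a_star) / (shifted(cand, a) + eps)
```

For an analogy question, the candidate word with the highest score under either function is taken as the predicted answer, usually after excluding the three query words from the candidate set.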


Pages: 18
