A Hybrid Attention-Based Transformer Model for Arabic News Classification Using Text Embedding and Deep Learning

Cited by: 0
Authors
Hossain, Md. Mithun [1 ]
Hossain, Md. Shakil [1 ]
Safran, Mejdl [2 ]
Alfarhood, Sultan [3 ]
Alfarhood, Meshal [3 ]
Mridha, M. F. [4 ]
Affiliations
[1] Bangladesh Univ Business & Technol, Dept Comp Sci & Engn, Dhaka 1216, Bangladesh
[2] King Saud Univ, Coll Comp & Informat Sci, Res Chair Online Dialogue & Cultural Commun, Dept Comp Sci, Riyadh 11543, Saudi Arabia
[3] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11543, Saudi Arabia
[4] Amer Int Univ Bangladesh, Dept Comp Sci, Dhaka 1229, Bangladesh
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Deep learning; Accuracy; Analytical models; Text categorization; Transformers; Sentiment analysis; Data models; Tokenization; Predictive models; Syntactics; hybrid transformer; Arabic text classifications; Arabic news classifications; SENTIMENT ANALYSIS;
DOI
10.1109/ACCESS.2024.3522061
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Efficient classification of Arabic news items has become increasingly important for effective information management and analysis due to the rapid growth of online news material. This paper proposes a hybrid Attention-Based Transformer Model (ABTM) for Arabic news categorization that combines deep learning with classical text representations to improve classification accuracy and interpretability. Given the growing volume of Arabic news content, robust categorization systems are essential for properly managing and analyzing this information. To handle the complexities of the Arabic language and enrich the dataset, we applied a thorough preprocessing pipeline that includes text cleaning, tokenization, lemmatization, and data augmentation. We combined a bespoke attention embedder with classic TF-IDF and Bag-of-Words features to produce a comprehensive feature set that captures both the contextual and statistical aspects of the text. We benchmarked our technique against state-of-the-art Arabic language models, including AraBERTv1-base and asafaya/bert-base-arabic. We use a LIME (Local Interpretable Model-Agnostic Explanations) text explainer to offer insight into model predictions, improving the interpretability of our findings. Our results show that the ABTM strategy considerably enhances classification performance, achieving high accuracy and reasonable explanations for model decisions. The classification covers a wide range of news categories, including politics, sports, culture, the economy, and various other themes, reflecting the diversity of Arabic news. This study contributes to the field of Arabic natural language processing by offering a novel method that combines deep learning with traditional techniques, thereby advancing the state of Arabic news classification. Enhanced classification accuracy and interpretability facilitate better management and understanding of the rich and growing body of Arabic news content, supporting informed decision-making and knowledge discovery.
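A minimal sketch, not the authors' ABTM implementation, of two ideas the abstract describes: combining TF-IDF and Bag-of-Words features for Arabic news classification and explaining a single prediction with a LIME text explainer. The toy corpus, the labels, and the linear classifier standing in for the hybrid transformer head are illustrative assumptions; the scikit-learn and lime packages are assumed to be available.

# Sketch only: statistical feature branch (TF-IDF + Bag-of-Words) plus a LIME
# text explanation. The real ABTM additionally uses a custom attention embedder.
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.linear_model import LogisticRegression
from lime.lime_text import LimeTextExplainer

# Hypothetical toy corpus; a real setup would load a labeled Arabic news dataset.
train_texts = [
    "خبر رياضي عن مباراة كرة القدم",
    "تقرير اقتصادي عن أسواق المال",
    "أخبار سياسية عن الانتخابات المحلية",
]
train_labels = ["sports", "economy", "politics"]

# Side-by-side statistical features: TF-IDF n-grams and raw term counts.
features = FeatureUnion([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("bow", CountVectorizer()),
])

# A linear classifier stands in for the hybrid transformer head in this sketch.
clf = Pipeline([
    ("features", features),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(train_texts, train_labels)

# LIME explanation of one prediction, mirroring the interpretability step.
explainer = LimeTextExplainer(class_names=list(clf.classes_))
explanation = explainer.explain_instance(
    "انتهت مباراة كرة القدم بفوز الفريق",  # hypothetical test headline
    clf.predict_proba,
    num_features=5,
    top_labels=1,
)
top_label = explanation.available_labels()[0]
print(clf.classes_[top_label], explanation.as_list(label=top_label))

The weights printed by as_list indicate which tokens pushed the prediction toward or away from the predicted category, which is the kind of per-decision explanation the paper reports for its model.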
Pages: 198046-198066
Number of pages: 21
Related papers
50 records in total
  • [21] A BERT-Based Hybrid Short Text Classification Model Incorporating CNN and Attention-Based BiGRU
    Bao, Tong
    Ren, Ni
    Luo, Rui
    Wang, Baojia
    Shen, Gengyu
    Guo, Ting
    JOURNAL OF ORGANIZATIONAL AND END USER COMPUTING, 2021, 33 (06)
  • [22] A Proposed Deep Learning based Framework for Arabic Text Classification
    Sayed, Mostafa
    Abdelkader, Hatem
    Khedr, Ayman E.
    Salem, Rashed
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 305 - 313
  • [23] Arabic News Classification Based on the Country of Origin Using Machine Learning and Deep Learning Techniques
    Zamzami, Nuha
    Himdi, Hanen
    Sabbeh, Sahar F.
    APPLIED SCIENCES-BASEL, 2023, 13 (12):
  • [24] An attention-based deep learning for acute lymphoblastic leukemia classification
    Jawahar, Malathy
    Anbarasi, L. Jani
    Narayanan, Sathiya
    Gandomi, Amir H.
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [26] Multimodal attention-based deep learning for automatic modulation classification
    Han, Jia
    Yu, Zhiyong
    Yang, Jian
    FRONTIERS IN ENERGY RESEARCH, 2023, 10
  • [27] Federated deep active learning for attention-based transaction classification
    Ahmed, Usman
    Lin, Jerry Chun-Wei
    Fournier-Viger, Philippe
    APPLIED INTELLIGENCE, 2023, 53 (08) : 8631 - 8643
  • [28] Word embedding and text classification based on deep learning methods
    Li, Saihan
    Gong, Bing
    2020 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE COMMUNICATION AND NETWORK SECURITY (CSCNS2020), 2021, 336
  • [29] Mobile traffic prediction with attention-based hybrid deep learning
    Wang, Li
    Che, Linxiao
    Lam, Kwok-Yan
    Liu, Wenqiang
    Li, Feng
    PHYSICAL COMMUNICATION, 2024, 66
  • [30] A Deep Learning Approach for Arabic Text Classification
    Sundus, Katrina
    Al-Haj, Fatima
    Hammo, Bassam
    2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 258 - 264