Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)

Cited: 6
Authors
Eang, Chanthol [1]
Lee, Seungjae [1]
Affiliations
[1] Sun Moon Univ, Intelligent Robot Res Inst, Dept Comp Sci & Engn, Asan 31460, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, No. 18
Keywords
RNN_Bert_based; KNN_Bert_based; Bert; RNN; SST-2 dataset; sentiment analysis
DOI
10.3390/app14188388
Abstract
This paper proposes a new robust model for text classification on the Stanford Sentiment Treebank v2 (SST-2) dataset. We developed a Recurrent Neural Network Bert-based (RNN_Bert_based) model designed to improve classification accuracy on SST-2, a binary classification task whose movie-review sentences are each labeled with positive or negative sentiment. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. Bert excels at text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora, which allows it to capture nuanced meanings and relationships within the text. Combining the two is therefore attractive: Bert's bidirectional context and rich embeddings provide a deep understanding of the text, while RNNs capture sequential patterns and long-range dependencies, so the combined model leverages the strengths of both architectures on complex classification tasks. We also developed an integration of the Bert model and a K-Nearest Neighbor-based (KNN_Bert_based) method as a comparative scheme for our proposed work. Experimental results show that our proposed model outperforms both traditional text classification models and existing models in terms of accuracy.
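The record does not include the authors' implementation. As a rough illustration of the general Bert-then-RNN idea described in the abstract (contextual token embeddings fed to a recurrent head that is pooled for binary sentiment classification), here is a minimal PyTorch sketch. A small, randomly initialized Transformer encoder stands in for a pretrained Bert model, and all class, layer, and parameter names are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RnnOverEncoder(nn.Module):
    """Sketch of a BERT->RNN classifier: contextual token states from an
    encoder are fed to a bidirectional LSTM, whose final hidden states are
    concatenated and projected to two classes (SST-2 is binary)."""

    def __init__(self, vocab_size=1000, hidden=64, num_classes=2):
        super().__init__()
        # Stand-in for a pretrained BERT: an embedding layer plus a small
        # Transformer encoder, randomly initialized here for illustration.
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Recurrent head over the contextual embeddings.
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True,
                           bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):
        ctx = self.encoder(self.embed(token_ids))  # (B, T, H) contextual states
        _, (h_n, _) = self.rnn(ctx)                # final hidden state per direction
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.classifier(pooled)             # (B, num_classes) logits

model = RnnOverEncoder()
logits = model(torch.randint(0, 1000, (4, 16)))  # 4 sequences of length 16
```

In practice the encoder would be replaced by a pretrained Bert model (e.g. via the Hugging Face `transformers` library), with the LSTM and classifier trained, or fine-tuned jointly, on the SST-2 labels.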
Pages: 26