Research on Long Text Classification Model Based on Multi-Feature Weighted Fusion

Times Cited: 2
Authors
Yue, Xi [1 ,2 ]
Zhou, Tao [1 ]
He, Lei [1 ,2 ]
Li, Yuxia [3 ]
Affiliations
[1] Chengdu Univ Informat Technol, Sch Software Engn, Chengdu 610225, Peoples R China
[2] Sichuan Prov Engn Technol Res Ctr Support Softwar, Chengdu 610225, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Software Engn, Chengdu 611731, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, No. 13
Keywords
text classification; multi-feature fusion; pretraining model; neural networks;
DOI
10.3390/app12136556
Chinese Library Classification
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
Text classification of long documents has become challenging due to the rapid growth of text data across the Internet, the increasing complexity of that data, and the difficulty of extracting features from long texts. To address the problems of contextual semantic relations, long-distance global dependencies, and polysemous words in long-text classification tasks, a long text classification model based on multi-feature weighted fusion is proposed. A BERT model is used to obtain feature representations that capture the global semantic and contextual information of the text; convolutional neural networks extract features at different levels, and an attention mechanism produces weighted local features; the global contextual features are then fused with the weighted local features, and classification results are obtained through equal-length convolution and pooling. Experimental results show that, on the same data sets, the proposed model outperforms traditional deep learning classification models in accuracy, precision, recall, F1 score, and related metrics, and that its advantages are more pronounced on long texts.
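As a rough illustration of the architecture described in the abstract, the PyTorch sketch below combines a BERT encoder (global contextual features), multi-scale convolutions with an attention weighting step (weighted local features), and a fused representation fed to a classifier. This is a minimal sketch based only on the abstract: the pretrained model name, layer sizes, kernel sizes, and the exact fusion and equal-length convolution-pooling details are assumptions, not taken from the paper.

import torch
import torch.nn as nn
from transformers import BertModel

class MultiFeatureFusionClassifier(nn.Module):
    def __init__(self, num_classes, bert_name="bert-base-chinese",
                 kernel_sizes=(2, 3, 4), num_filters=128):
        super().__init__()
        # BERT provides contextual token features and a pooled global feature.
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Convolutions at several window sizes extract local n-gram features;
        # padding="same" keeps the sequence length (an "equal-length" convolution).
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, num_filters, k, padding="same") for k in kernel_sizes]
        )
        total_local = num_filters * len(kernel_sizes)
        # Attention scores used to weight local features before pooling them.
        self.attn = nn.Linear(total_local, 1)
        self.classifier = nn.Linear(hidden + total_local, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state                  # (B, T, H) contextual token features
        global_feat = out.pooler_output                 # (B, H) global semantic feature
        x = tokens.transpose(1, 2)                      # (B, H, T) for Conv1d
        local = torch.cat([torch.relu(c(x)) for c in self.convs], dim=1)  # (B, F, T)
        local = local.transpose(1, 2)                   # (B, T, F)
        # Softmax attention over positions; padding positions are not masked
        # here for simplicity.
        weights = torch.softmax(self.attn(local).squeeze(-1), dim=1)      # (B, T)
        weighted_local = (local * weights.unsqueeze(-1)).sum(dim=1)       # (B, F)
        fused = torch.cat([global_feat, weighted_local], dim=-1)          # fuse global + local
        return self.classifier(fused)

In use, input_ids and attention_mask would come from the matching BertTokenizer; a faithful reproduction of the paper's method would additionally need its specific fusion weighting and equal-length convolution-pooling classifier head.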
Pages: 15