Attention-based BiLSTM fused CNN with gating mechanism model for Chinese long text classification

Cited by: 104
Authors
Deng, Jianfeng [1]
Cheng, Lianglun [1,2]
Wang, Zhuowei [2]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] Guangdong Univ Technol, Sch Comp, Guangzhou 510006, Peoples R China
Keywords
Attention mechanism; BiLSTM; CNN; Gating mechanism; Text classification;
DOI
10.1016/j.csl.2020.101182
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Neural networks have been widely used for text classification and have achieved good results on various Chinese datasets. Long text classification, however, remains challenging: long texts contain a great deal of redundant information, some of which may involve other topics. To address this problem, this paper proposes a new text classification model, attention-based BiLSTM fused CNN with gating mechanism (ABLG-CNN). In ABLG-CNN, word2vec is used to train the word vector representations. An attention mechanism computes a context vector for each word to derive keyword information. A bidirectional long short-term memory network (BiLSTM) captures context features, and on top of these, a convolutional neural network (CNN) captures topic-salient features. Because long texts may contain sentences that involve other topics, a gating mechanism is introduced to assign weights to the BiLSTM and CNN output features, yielding fused text features that are favorable for classification. ABLG-CNN thus captures both contextual text semantics and local phrase features. Experiments on two long-text news datasets show that ABLG-CNN's classification performance is better than that of other recent text classification methods. (c) 2021 Elsevier Ltd. All rights reserved.
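
The pipeline the abstract describes (word2vec embeddings, word-level attention, BiLSTM, CNN, gated fusion of the two feature streams) can be made concrete with a short sketch. The code below is a minimal PyTorch illustration, not the authors' implementation: all layer sizes, the additive-style attention, the pooling choices, and the sigmoid-gate fusion formula are assumptions inferred from the abstract.

# Minimal PyTorch sketch of the ABLG-CNN pipeline as described in the
# abstract. Layer sizes, attention form, pooling, and the gating formula
# are assumptions; the paper's exact equations are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ABLGCNNSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=128, n_classes=10):
        super().__init__()
        # The paper pretrains embeddings with word2vec; a randomly
        # initialized embedding stands in for it in this sketch.
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Additive-style attention scorer over word embeddings (assumed).
        self.att = nn.Linear(emb_dim, 1)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        # CNN over BiLSTM outputs to pick up topic-salient local phrases.
        self.conv = nn.Conv1d(2 * hidden, 2 * hidden, kernel_size=3,
                              padding=1)
        # Gate weighting the BiLSTM vs. CNN streams (assumed sigmoid gate).
        self.gate = nn.Linear(4 * hidden, 2 * hidden)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                          # x: (batch, seq_len)
        e = self.emb(x)                            # (B, T, E)
        a = torch.softmax(self.att(e), dim=1)      # attention weights (B, T, 1)
        e = e * a                                  # emphasize keyword positions
        h, _ = self.bilstm(e)                      # context features (B, T, 2H)
        c = F.relu(self.conv(h.transpose(1, 2)))   # local features (B, 2H, T)
        h_feat = h.mean(dim=1)                     # pooled BiLSTM features
        c_feat = c.max(dim=2).values               # pooled CNN features
        g = torch.sigmoid(self.gate(torch.cat([h_feat, c_feat], dim=-1)))
        fused = g * h_feat + (1 - g) * c_feat      # gated feature fusion
        return self.fc(fused)

model = ABLGCNNSketch(vocab_size=5000)
logits = model(torch.randint(0, 5000, (4, 200)))   # 4 docs, 200 tokens each
print(logits.shape)                                # torch.Size([4, 10])

In this reading, the gate g decides per dimension how much of the fused representation comes from the BiLSTM's contextual features versus the CNN's local phrase features, which matches the abstract's motivation of down-weighting off-topic sentences in long documents.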
Pages: 12