Two-stage three-way enhanced technique for ensemble learning in inclusive policy text classification

Cited by: 43
Authors
Liang, Decui [1]
Yi, Bochun [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu 610054, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Three-way decisions; Decision support; Ensemble learning; Convolutional neural networks; Inclusive policy text classification; CONVOLUTIONAL NEURAL-NETWORK; DECISION; REPRESENTATIONS; MODELS;
DOI
10.1016/j.ins.2020.08.051
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the development of the social economy, small and medium-sized enterprises (SMEs) play a vital role in promoting economic development. Multiple local governments in China are developing policy recommendation platforms to help SMEs better understand inclusive policies. However, these online platforms extract the key information from inclusive policy texts manually, which is time-consuming and inefficient. A policy text is composed of paragraphs, each of which corresponds to a topic. Classifying the paragraphs into topics carries a decision risk of text misclassification. Therefore, we design a two-stage three-way enhanced technique to automatically classify these text paragraphs into predefined categories. In the first stage, we use ensemble learning algorithms to construct an ensemble convolutional neural network (CNN) model that ensures the generalization ability and stability of the text classification results. We also develop a new weight determination method that integrates the prediction results of all base classifiers according to their accuracy and classification confidence. With the help of three-way decisions (3WD), we assign samples with poor resolution to the boundary region for secondary classification, which reduces the decision risk. In the second stage, to classify the boundary-region samples and improve the overall classification results, we use a traditional machine learning method as the secondary classifier. Finally, we conduct comparison experiments to verify the proposed method. The experimental results show that the two-stage three-way enhanced classification framework is valid and achieves better performance in these comparisons. The proposed method can effectively support the design of policy recommendation platforms and serve SMEs. (C) 2020 Elsevier Inc. All rights reserved.
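The two-stage flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the weighting rule (validation accuracy times per-sample confidence), the single acceptance threshold `alpha`, and all function names below are assumptions chosen for illustration. The base classifiers are represented only by their output probability vectors, and the secondary classifier is any callable.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def fuse_predictions(probas: Sequence[Sequence[float]],
                     accuracies: Sequence[float]) -> List[float]:
    """Weighted soft vote over base classifiers.

    Assumption: a classifier's weight is its validation accuracy times
    its confidence (largest class probability) on this sample,
    normalized so the weights sum to 1.
    """
    raw = [acc * max(p) for p, acc in zip(probas, accuracies)]
    total = sum(raw)
    weights = [w / total for w in raw]
    n_classes = len(probas[0])
    return [sum(w * p[k] for w, p in zip(weights, probas))
            for k in range(n_classes)]


def three_way_decide(fused: Sequence[float], alpha: float) -> Optional[int]:
    """Return the top class if its fused probability reaches alpha.

    Returns None when the sample falls into the boundary region,
    meaning the decision is deferred to the second stage.
    """
    best = max(range(len(fused)), key=lambda k: fused[k])
    return best if fused[best] >= alpha else None


def classify(sample: object,
             probas: Sequence[Sequence[float]],
             accuracies: Sequence[float],
             secondary: Callable[[object], int],
             alpha: float = 0.7) -> Tuple[int, str]:
    """Stage 1: ensemble plus three-way decision. Stage 2: secondary classifier."""
    label = three_way_decide(fuse_predictions(probas, accuracies), alpha)
    if label is None:
        return secondary(sample), "boundary"
    return label, "accepted"


if __name__ == "__main__":
    accuracies = [0.9, 0.8]           # hypothetical validation accuracies
    fallback = lambda text: 0         # stand-in for a traditional classifier
    # Both base classifiers agree with high confidence: accepted in stage 1.
    print(classify("paragraph A", [[0.9, 0.1], [0.8, 0.2]], accuracies, fallback))
    # The base classifiers disagree: deferred to the secondary classifier.
    print(classify("paragraph B", [[0.55, 0.45], [0.4, 0.6]], accuracies, fallback))
```

A single threshold is used here because the sketch targets multi-class topic labels; a classical three-way decision for a binary problem would use a pair of thresholds to separate the acceptance, rejection, and boundary regions.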
Pages: 271-288
Number of pages: 18