An Improved Bees Algorithm for Training Deep Recurrent Networks for Sentiment Classification

Times Cited: 13
Authors
Zeybek, Sultan [1 ]
Pham, Duc Truong [2 ]
Koc, Ebubekir [3 ]
Secer, Aydin [4 ]
Affiliations
[1] Fatih Sultan Mehmet Vakif Univ, Dept Comp Engn, TR-34445 Istanbul, Turkey
[2] Univ Birmingham, Dept Mech Engn, Birmingham B15 2TT, W Midlands, England
[3] Fatih Sultan Mehmet Vakif Univ, Dept Biomed Engn, TR-34445 Istanbul, Turkey
[4] Yildiz Tech Univ, Dept Math Engn, TR-34220 Istanbul, Turkey
Source
SYMMETRY-BASEL | 2021, Vol. 13, Issue 08
Keywords
bees algorithm; training deep neural networks; metaheuristics; opinion mining; recurrent neural networks; sentiment classification; natural language processing; NEURAL-NETWORKS; ENERGY; OPTIMIZATION;
DOI
10.3390/sym13081347
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biosciences]; N [General Natural Sciences];
Subject Classification Codes
07 ; 0710 ; 09 ;
Abstract
Recurrent neural networks (RNNs) are powerful tools for learning information from temporal sequences. Designing an optimum deep RNN is difficult due to configuration and training issues, such as vanishing and exploding gradients. In this paper, a novel metaheuristic optimisation approach is proposed for training deep RNNs for the sentiment classification task. The approach employs an enhanced Ternary Bees Algorithm (BA-3+), which handles large classification datasets by considering only three candidate solutions in each iteration. BA-3+ combines the collaborative search of three bees to find the optimal set of trainable parameters of the proposed deep recurrent learning architecture: local learning with exploitative search uses a greedy selection strategy; stochastic gradient descent (SGD) learning handles vanishing and exploding gradients of the decision parameters through a singular value decomposition (SVD) stabilisation strategy; and global learning with explorative search achieves faster convergence without becoming trapped in local optima. BA-3+ was tested on the sentiment classification task, classifying datasets with symmetric and asymmetric class distributions from different domains, including Twitter, product reviews, and movie reviews. Comparative results were obtained against advanced deep language models and the Differential Evolution (DE) and Particle Swarm Optimization (PSO) algorithms. BA-3+ converged to the global minimum faster than DE and PSO, and it outperformed the SGD, DE, and PSO algorithms on the Turkish and English datasets. Accuracy and F1 scores improved by at least 30-40% over the standard SGD algorithm on all classification datasets. Accuracy rates of the RNN model trained with BA-3+ ranged from 80% to 90%, while the RNN trained with SGD achieved only between 50% and 60% on most datasets. The RNN model trained with BA-3+ performed as well as Tree-LSTM and Recursive Neural Tensor Network (RNTN) language models, which achieved accuracies of up to 90% on some datasets. The improved accuracy and convergence results show that BA-3+ is an efficient, stable algorithm for this complex classification task and can handle the vanishing and exploding gradients problem of deep RNNs.
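The abstract describes BA-3+ as evaluating exactly three candidate solutions per iteration: an exploitative bee with greedy selection, an SGD step stabilised by SVD of the recurrent weight matrix, and an explorative scout. The sketch below illustrates that three-bee control flow on a toy numpy RNN. It is a minimal illustration rather than the authors' implementation: all names (init_params, svd_stabilise, ba3_train), the hyperparameters, the finite-difference gradient used as a stand-in for backpropagation through time, and the toy data are assumptions.

```python
# A minimal sketch of a BA-3+-style training loop on a toy numpy RNN.
# All names and hyperparameters are illustrative assumptions, and the
# finite-difference gradient stands in for backpropagation through time.
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hid):
    """Random initialisation of a single-layer tanh RNN with scalar output."""
    return {
        "Wx": rng.normal(0.0, 0.1, (n_hid, n_in)),   # input-to-hidden
        "Wh": rng.normal(0.0, 0.1, (n_hid, n_hid)),  # recurrent
        "w":  rng.normal(0.0, 0.1, n_hid),           # hidden-to-output
    }

def forward(params, seq):
    """Run the RNN over one sequence; return sigmoid probability of class 1."""
    h = np.zeros(params["Wh"].shape[0])
    for x in seq:
        h = np.tanh(params["Wx"] @ x + params["Wh"] @ h)
    return 1.0 / (1.0 + np.exp(-params["w"] @ h))

def loss(params, data):
    """Mean binary cross-entropy over (sequence, label) pairs."""
    eps = 1e-9
    return -np.mean([t * np.log(forward(params, s) + eps)
                     + (1 - t) * np.log(1 - forward(params, s) + eps)
                     for s, t in data])

def svd_stabilise(W, max_sv=1.0):
    """Clamp the singular values of the recurrent matrix so its spectral
    norm stays bounded (the SVD stabilisation idea named in the abstract)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.minimum(s, max_sv)) @ Vt

def numerical_grad(params, data, eps=1e-4):
    """Central-difference gradients; slow, but keeps the sketch self-contained."""
    grads = {}
    for k, v in params.items():
        g = np.zeros_like(v)
        it = np.nditer(v, flags=["multi_index"])
        while not it.finished:
            i = it.multi_index
            old = v[i]
            v[i] = old + eps; hi = loss(params, data)
            v[i] = old - eps; lo = loss(params, data)
            v[i] = old
            g[i] = (hi - lo) / (2 * eps)
            it.iternext()
        grads[k] = g
    return grads

def ba3_train(data, n_in=4, n_hid=8, iters=30, lr=0.5, step=0.05):
    """Each iteration evaluates exactly three candidate solutions (bees)."""
    best = init_params(n_in, n_hid)
    best_loss = loss(best, data)
    for _ in range(iters):
        # Bee 1: exploitative local search around the current best solution.
        local = {k: v + rng.normal(0.0, step, v.shape) for k, v in best.items()}
        # Bee 2: SGD step with SVD stabilisation of the recurrent weights.
        g = numerical_grad(best, data)
        sgd = {k: v - lr * g[k] for k, v in best.items()}
        sgd["Wh"] = svd_stabilise(sgd["Wh"])
        # Bee 3: explorative scout, a fresh random solution against local optima.
        scout = init_params(n_in, n_hid)
        # Greedy selection: keep whichever bee improves on the incumbent.
        for cand in (local, sgd, scout):
            l = loss(cand, data)
            if l < best_loss:
                best, best_loss = cand, l
    return best, best_loss

# Toy usage: label a short random sequence by the sign of its mean.
data = [(s, float(s.mean() > 0))
        for s in (rng.normal(0.0, 1.0, (5, 4)) for _ in range(20))]
model, final_loss = ba3_train(data)
print(f"final training loss: {final_loss:.3f}")
```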
Pages: 26