Ensemble hybrid model for Hindi COVID-19 text classification with metaheuristic optimization algorithm

Cited by: 10
Authors
Jain, Vipin [1]
Kashyap, Kanchan Lata [1]
Affiliation
[1] VIT Univ Bhopal, SCSE, Bhopal 466114, Madhya Pradesh, India
Keywords
COVID-19; Sentiment; Grey wolf; Optimization; Deep learning; Ensemble learning
DOI
10.1007/s11042-022-13937-2
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The SARS-CoV-2 virus has spread around the globe since March 2020, and millions of people worldwide have been infected with the coronavirus. People from every country have expressed their sentiments about the coronavirus on social media. The aim of this work is to determine the general public opinion of Indian Twitter users about the coronavirus. Hindi tweets posted about COVID-19 are used as input data for sentiment analysis. Natural language processing is applied to the input data for feature extraction. The optimal features are then selected from the pre-processed data using the metaheuristic Grey Wolf Optimization technique. Finally, a hybrid pair of a convolutional neural network (CNN) and a long short-term memory (LSTM) model is employed to categorize the sentiments as positive, negative, or neutral. The outcome of the proposed model is compared with other machine learning techniques, namely Random Forest, Decision Tree, K-Nearest Neighbor, Naive Bayes, Support Vector Machine (SVM), CNN, LSTM, LSTM-CNN, and CNN-LSTM, which obtain highest accuracies of 87.75%, 88.41%, 87.89%, 85.54%, 89.11%, 91.46%, 88.72%, 91.54%, and 92.34%, respectively. The proposed ensemble hybrid model achieves the highest classification accuracy, precision, recall, and F-score of 95.54%, 91.44%, 89.63%, and 90.87%, respectively.
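The feature-selection step mentioned in the abstract can be illustrated with a minimal sketch of Grey Wolf Optimization used as a wrapper around a fitness function. This is not the authors' implementation: the 0.5 threshold used to turn continuous wolf positions into a feature mask, the nearest-centroid fitness with a small per-feature penalty, the synthetic data, and all function and parameter names (`gwo_feature_selection`, `centroid_accuracy`, `n_wolves`, `n_iter`) are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np


def centroid_accuracy(X_sub, y):
    """Toy fitness (an assumption): nearest-centroid training accuracy
    minus a small penalty per selected feature."""
    classes = np.unique(y)
    centroids = np.stack([X_sub[y == c].mean(axis=0) for c in classes])
    dists = ((X_sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return (pred == y).mean() - 0.01 * X_sub.shape[1]


def gwo_feature_selection(X, y, fitness, n_wolves=8, n_iter=20, seed=0):
    """Return (boolean feature mask, best fitness) found by a basic GWO loop."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Continuous positions in [0, 1]; a feature counts as selected above 0.5.
    positions = rng.random((n_wolves, n_features))

    def score(pos):
        mask = pos > 0.5
        if not mask.any():          # an empty feature set is never acceptable
            return -np.inf
        return fitness(X[:, mask], y)

    scores = np.array([score(p) for p in positions])
    best_idx = int(scores.argmax())
    best_mask, best_score = positions[best_idx] > 0.5, scores[best_idx]

    for t in range(n_iter):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        order = np.argsort(scores)[::-1]
        alpha, beta, delta = positions[order[:3]].copy()
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 toward 0
        new_positions = np.empty_like(positions)
        for i in range(n_wolves):
            candidates = []
            for leader in (alpha, beta, delta):
                A = 2 * a * rng.random(n_features) - a
                C = 2 * rng.random(n_features)
                D = np.abs(C * leader - positions[i])
                candidates.append(leader - A * D)
            # Each wolf moves to the average of the three leader-guided points.
            new_positions[i] = np.clip(np.mean(candidates, axis=0), 0.0, 1.0)
        positions = new_positions
        scores = np.array([score(p) for p in positions])
        if scores.max() > best_score:   # keep the best mask seen so far
            best_idx = int(scores.argmax())
            best_mask, best_score = positions[best_idx] > 0.5, scores[best_idx]

    return best_mask, best_score


# Synthetic stand-in for extracted text features: 2 informative + 4 noise columns.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=200)
informative = y[:, None] * 2.0 + rng.normal(size=(200, 2))
noise = rng.normal(size=(200, 4))
X = np.hstack([informative, noise])

mask, best = gwo_feature_selection(X, y, centroid_accuracy)
print("selected features:", np.flatnonzero(mask), "fitness:", round(float(best), 3))
```

In a pipeline like the one the abstract describes, the columns kept by `mask` would be passed on to the downstream classifier; the fitness function here would be replaced by whatever validation metric that classifier is tuned for.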
Pages: 16839-16859
Number of pages: 21