Deep learning based bi-polar sentiment classification of movie reviews in Hindi

Times cited: 2
Authors
Sharma, Ankita [1]
Ghose, Udayan [1]
Affiliations
[1] Guru Gobind Singh Indraprastha Univ, Univ Sch Informat Commun & Technol, Sect 16C, New Delhi 110078, India
Keywords
Deep learning; Ensemble; Hindi; Sentiment classification; Word embeddings; Indian languages
DOI
10.47974/JSMS-1030
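Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract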
With evolving technology, Hindi web content is growing considerably and gaining popularity, as a larger audience feels more connected and heard when using its native language. Online movie reviews (MRs) in Hindi have grown explosively of late, and analyzing them manually is impractical. Automatic organization and classification of Hindi reviews is therefore a natural research problem, since it can help viewers decide whether a movie is worth watching. This work focuses on developing a deep learning-based system for bi-polar sentiment classification of MRs in Hindi, a resource-deficient language. To this end, a primary Hindi movie review (MR) corpus is built and manually annotated with binary polarity class labels: positive or negative. The corpus is preprocessed, and random word embeddings (WEs) are used for feature extraction. This paper proposes an ensemble, CNN_BiGRU, which integrates a 1D CNN with a BiGRU for the bi-polar classification of Hindi MRs. To assess the ensemble's efficacy, other widely used mainstream deep learning models (DLMs), namely the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) based models Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), are also applied and compared with the proposed ensemble using average classification accuracy. Empirical results show that the proposed ensemble achieves an average accuracy of 89.366%. The results indicate that the proposed CNN_BiGRU compares favorably with the other DLMs applied and hence offers an effective solution for sentence-level bi-polar classification of MRs in a resource-deficient scenario.
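The feature-extraction step named in the abstract (random word embeddings over a preprocessed, tokenized corpus) and the evaluation metric (average classification accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the reserved `<pad>` and `<unk>` tokens, the uniform initialization range, the embedding dimension, the fixed sequence length, and all function names are assumptions chosen for illustration, and the two toy reviews are made up. In a full model the embedding table would presumably be passed to the network and updated during training, which is not shown here.

```python
import random

PAD = "<pad>"  # hypothetical padding token (assumption, not from the paper)
UNK = "<unk>"  # hypothetical out-of-vocabulary token (assumption)


def build_vocab(tokenized_reviews):
    """Map each distinct token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in tokenized_reviews:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def random_embeddings(vocab_size, dim, seed=0):
    """Create a vocab_size x dim table of small random values.

    The uniform range [-0.05, 0.05] is an assumed initialization.
    """
    rng = random.Random(seed)
    table = [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
             for _ in range(vocab_size)]
    table[0] = [0.0] * dim  # keep the padding row at zero
    return table


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def embed(ids, table):
    """Look up one embedding row per token id."""
    return [table[i] for i in ids]


def average_accuracy(accuracies):
    """Mean of per-run (or per-fold) accuracies."""
    return sum(accuracies) / len(accuracies)


# Toy, made-up tokenized reviews used only to exercise the functions.
reviews = [["फिल्म", "बहुत", "अच्छी", "है"], ["कहानी", "बेकार", "है"]]
vocab = build_vocab(reviews)
table = random_embeddings(len(vocab), dim=8)
matrix = embed(encode(reviews[1], vocab, max_len=5), table)
```

Here `matrix` is a 5 x 8 list of lists, the kind of fixed-size input a 1D convolution followed by a bidirectional recurrent layer would consume.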
Pages: 59-86
Number of pages: 28