SSSA: low data sentiment analysis using boosting semi-supervised approach and deep feature learning network

Cited by: 1
Authors
Rashidi, Shima [1]
Tanha, Jafar [2]
Sharifi, Arash [1]
Hosseinzadeh, Mehdi [3, 4]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Univ Tabriz, Fac Elect & Comp Engn, Tabriz, Iran
[3] Duy Tan Univ, Sch Comp Sci, Da Nang, Vietnam
[4] Jadara Univ, Res Ctr, Irbid, Jordan
Keywords
Sentiment analysis; Low data sentiment analysis; Semi-supervised; Embedding update mechanism; Pseudo labeling approach;
DOI
10.1007/s10489-024-06071-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Sentiment analysis is the process of determining the sentiment orientation expressed in user reviews. Sentiment analysis has recently attracted growing attention, but the low-data setting has received far less. Existing work addresses this setting mainly by augmenting the training samples. In this study, we propose a semi-supervised approach to low-data sentiment analysis. We use a pre-trained XLNet as the feature extractor network to initialize a feature vector for each tweet. These initial representations are fed into an embedding update module, which maps the features into a new space by optimizing a contrastive loss. A semi-supervised boosting method then assigns pseudo labels to the unlabeled data. The semi-supervised module and the embedding update module alternate until convergence; during these iterations, the embedding update module propagates error-correcting signals to the semi-supervised module. We evaluate the proposed approach on the SemEval-2017 dataset (Task 4), Sentiment 140, and IMDB Movie Reviews, using several experimental settings designed to validate its individual modules. On the SemEval-2017 dataset (Task 4), the approach achieves 75.9% AvgRec and 77.1% $F_1^{PN}$. When only 10% of the training samples are used as labeled samples, it achieves 71.8% AvgRec and 73.6% $F_1^{PN}$. These results are a significant improvement over comparable methods. On IMDB Movie Reviews and Sentiment 140, the proposed approach also outperforms comparable methods.
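The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the usual SemEval-2017 Task 4 definitions for three-class message polarity: AvgRec as recall macro-averaged over the positive, negative, and neutral classes, and $F_1^{PN}$ as the mean of the F1 scores of the positive and negative classes only. The abstract does not spell these definitions out, and the function names and toy labels below are hypothetical, not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed three-class label set (not stated in the abstract).
LABELS = ("positive", "negative", "neutral")


def _counts(gold: Sequence[str], pred: Sequence[str], label: str) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    return tp, fp, fn


def recall(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    tp, _, fn = _counts(gold, pred, label)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    tp, fp, fn = _counts(gold, pred, label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def avg_rec(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Recall macro-averaged over all three classes."""
    return sum(recall(gold, pred, label) for label in LABELS) / len(LABELS)


def f1_pn(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Mean of the F1 scores of the positive and negative classes only."""
    return (f1(gold, pred, "positive") + f1(gold, pred, "negative")) / 2


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    gold = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
    pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
    print(f"AvgRec = {avg_rec(gold, pred):.3f}")
    print(f"F1PN   = {f1_pn(gold, pred):.3f}")
```

Under these definitions the neutral class still affects $F_1^{PN}$ indirectly, because neutral items misclassified as positive or negative count as false positives for those classes.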
Pages: 13