A classified feature representation three-way decision model for sentiment analysis

Cited by: 23
Authors
Chen, Jie [1, 2]
Chen, Yue [1, 2]
He, Yechen [3]
Xu, Yang [1, 2]
Zhao, Shu [1, 2]
Zhang, Yanping [1, 2]
Affiliations
[1] Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing 10045, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sentiment analysis; Feature selection; A classified feature representation; Three-way decision; REVIEWS; SETS;
DOI
10.1007/s10489-021-02809-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Binary sentiment analysis uses sentiment dictionaries, TF-IDF, word2vec, and BERT to convert text documents such as product and movie reviews into vectors. Dimensionality reduction by feature selection can effectively reduce the complexity of sentiment analysis. Existing feature selection methods pool all samples together and ignore the differences in feature representation between categories. In binary sentiment analysis, some reviews have uncertain sentiment polarity. Three-way decision divides samples into a positive (POS) region, a negative (NEG) region, and an uncertain (UNC) region. A model based on three-way decision helps process the UNC region and improves the performance of binary sentiment analysis. However, obtaining the optimal feature representation for each of the certain regions (POS and NEG), so that the uncertain samples can be processed, remains a challenge. In this paper, a classified feature representation three-way decision model is proposed to obtain the optimal feature representations of the positive and negative domains for sentiment analysis. In the positive domain and the negative domain, m-layer and n-layer feature representations are obtained, respectively. In each domain, the layer with the best performance is selected as the optimal feature representation. The POS and NEG regions of the testing set are processed with the optimal feature representations, while the UNC region is processed with the original feature representation. Experiments on IMDB and Amazon show that the classification accuracy of the proposed method in sentiment analysis is significantly higher than that of the chi-square, principal component analysis, and mutual information methods.
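The three-region split described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the regions are determined, so this assumes the common three-way decision rule with a pair of thresholds on the estimated probability of the positive class: accept at or above `alpha`, reject at or below `beta`, and defer otherwise. The function name `three_way_partition` and the threshold values 0.7 and 0.3 are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def three_way_partition(
    probs: Sequence[float], alpha: float = 0.7, beta: float = 0.3
) -> Tuple[List[int], List[int], List[int]]:
    """Split sample indices into POS, NEG, and UNC regions.

    probs holds an estimated probability of the positive class per sample.
    alpha and beta are assumed thresholds, not values from the paper.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")

    pos: List[int] = []
    neg: List[int] = []
    unc: List[int] = []
    for index, p in enumerate(probs):
        if p >= alpha:
            pos.append(index)   # confident enough to accept as positive
        elif p <= beta:
            neg.append(index)   # confident enough to reject as negative
        else:
            unc.append(index)   # deferred: sentiment polarity is uncertain
    return pos, neg, unc


if __name__ == "__main__":
    scores = [0.95, 0.10, 0.55, 0.72, 0.30]
    pos_region, neg_region, unc_region = three_way_partition(scores)
    print("POS:", pos_region)
    print("NEG:", neg_region)
    print("UNC:", unc_region)
```

In the model the abstract describes, the samples in the POS and NEG regions would then be represented with the per-domain optimal feature layer, while the UNC samples keep the original feature representation. That step is not shown here because the abstract gives no detail on how the layers are constructed.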
Pages: 7995-8007
Number of pages: 13