A classified feature representation three-way decision model for sentiment analysis

Cited by: 23
Authors
Chen, Jie [1, 2]
Chen, Yue [1, 2]
He, Yechen [3]
Xu, Yang [1, 2]
Zhao, Shu [1, 2]
Zhang, Yanping [1, 2]
Affiliations
[1] Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing 10045, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sentiment analysis; Feature selection; A classified feature representation; Three-way decision; REVIEWS; SETS;
DOI
10.1007/s10489-021-02809-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Binary sentiment analysis uses sentiment dictionaries, TF-IDF, word2vec, and BERT to convert text documents such as product and movie reviews into vectors. Dimensionality reduction through feature selection can effectively reduce the complexity of sentiment analysis. Existing feature selection methods pool all samples together and ignore the differences in feature representation between categories. In binary sentiment analysis, some reviews have uncertain sentiment polarity; three-way decision divides samples into a positive (POS) region, a negative (NEG) region, and an uncertain (UNC) region. A model based on three-way decision helps process the UNC region and improves the performance of binary sentiment analysis. However, obtaining the optimal feature representation for each certain region separately, in order to process the uncertain samples, remains a challenge. In this paper, a classified feature representation three-way decision model is proposed to obtain the optimal feature representation of the positive and negative domains for sentiment analysis. In the positive domain and the negative domain, m-layer and n-layer feature representations are obtained, respectively. The layer with the best performance is selected as the optimal feature representation. The POS and NEG regions of the testing set are processed with the optimal feature representation, while the UNC region is processed with the original feature representation. Experiments on IMDB and Amazon show that the proposed method achieves significantly higher classification accuracy in sentiment analysis than the chi-square, principal component analysis, and mutual information methods.
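The three-region split described in the abstract can be illustrated with a minimal sketch. It assumes the common threshold form of three-way decision: a sample is accepted as POS when its estimated positive probability is at least an upper threshold, rejected as NEG when it is at most a lower threshold, and deferred to UNC otherwise. The function name `three_way_partition` and the threshold values `alpha = 0.7` and `beta = 0.3` are hypothetical choices for illustration; the record does not state how the paper sets its thresholds or computes the probabilities.

```python
from typing import List


def three_way_partition(
    pos_probs: List[float], alpha: float = 0.7, beta: float = 0.3
) -> List[str]:
    """Assign each sample to the POS, NEG, or UNC region.

    pos_probs holds an estimated probability of positive sentiment per
    sample. alpha and beta are illustrative thresholds (assumptions),
    not values taken from the paper.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")

    regions = []
    for p in pos_probs:
        if p >= alpha:
            regions.append("POS")  # confident positive
        elif p <= beta:
            regions.append("NEG")  # confident negative
        else:
            regions.append("UNC")  # uncertain, decision deferred
    return regions


if __name__ == "__main__":
    probs = [0.92, 0.55, 0.10, 0.70, 0.31]
    print(three_way_partition(probs))
```

Under the abstract's description, samples labelled POS or NEG would then be classified with the selected feature representation, while samples labelled UNC would fall back to the original feature representation.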
Pages: 7995-8007
Number of pages: 13