Weak tagging and imbalanced networks for online review sentiment classification

Citations: 0
Authors
Wei Zhenlin [1]
Wang Chuantao [2,3]
Yang Xuexin [2]
Affiliations
[1] Beijing Jiaotong Univ, Sch Traff & Transportat, Beijing, Peoples R China
[2] Beijing Univ Civil Engn & Architecture, Sch Mech Elect & Vehicle Engn, Beijing 100044, Peoples R China
[3] Beijing Engn Res Ctr Monitoring Construct Safety, Beijing, Peoples R China
Keywords
Sentiment classification; imbalanced classification; weak tagging; deep learning; INFORMATION; SMOTE
DOI
10.3233/JIFS-221565
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment classification aims to judge the sentiment tendency of text automatically. For online reviews, traditional deep learning models require a large number of manually annotated samples for supervised training, and manual tagging is hardly feasible given the massive volume of online review data. In addition, traditional deep learning models ignore the imbalanced distribution of samples across classes, which degrades classification performance in practical applications. Because online review data contain weak tagging information such as scores and labels, and their class distribution is imbalanced, a weak tagging and imbalanced network for online review sentiment classification is constructed. Experimental results show that the model significantly outperforms traditional deep learning models on the sentiment classification of hotel review data.
Pages: 185-194
Page count: 10
Related papers
21 in total
  • [1] SMOTE: Synthetic minority over-sampling technique
    Chawla, Nitesh V.
    Bowyer, Kevin W.
    Hall, Lawrence O.
    Kegelmeyer, W. Philip
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 : 321 - 357
  • [2] On the use of MapReduce for imbalanced big data using Random Forest
    del Rio, Sara
    Lopez, Victoria
    Manuel Benitez, Jose
    Herrera, Francisco
    [J]. INFORMATION SCIENCES, 2014, 285 : 112 - 137
  • [3] Ding X., 2008, P 2008 INT C WEB SEA, P231, DOI 10.1145/1341531.1341561
  • [4] Deep Learning Structure for Cross-Domain Sentiment Classification Based on Improved Cross Entropy and Weight
    Fei, Rong
    Yao, Quanzhu
    Zhu, Yuanbo
    Xu, Qingzheng
    Li, Aimin
    Wu, Haozheng
    Hu, Bo
    [J]. SCIENTIFIC PROGRAMMING, 2020, 2020
  • [5] Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning
    Han, H
    Wang, WY
    Mao, BH
    [J]. ADVANCES IN INTELLIGENT COMPUTING, PT 1, PROCEEDINGS, 2005, 3644 : 878 - 887
  • [6] Hu MQ, 2004, PROCEEDING OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE SIXTEENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE, P755
  • [7] Sentiment Classification from Multi-class Imbalanced Twitter Data Using Binarization
    Krawczyk, Bartosz
    McInnes, Bridget T.
    Cano, Alberto
    [J]. HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2017, 2017, 10334 : 26 - 37
  • [8] Imbalanced text sentiment classification using universal and domain-specific knowledge
    Li, Yijing
    Guo, Haixiang
    Zhang, Qingpeng
    Gu, Mingyun
    Yang, Jianying
    [J]. KNOWLEDGE-BASED SYSTEMS, 2018, 160 : 1 - 15
  • [9] Ling C. X., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P73
  • [10] Maas A. L., 2011, P 49 ANN M ASS COMP, P142, DOI 10.5555/2002472.2002491