Dual Consistency-Enhanced Semi-Supervised Sentiment Analysis Towards COVID-19 Tweets

Cited by: 4

Authors:

Sun, Teng [1]

Jing, Liqiang [1]

Wei, Yinwei [2]

Song, Xuemeng [1]

Cheng, Zhiyong [3]

Nie, Liqiang [4]

Affiliations:

[1] Shandong Univ, Dept Comp Sci & Technol, Qingdao 266237, Shandong, Peoples R China

[2] Natl Univ Singapore, Sch Comp, Singapore 119077, Singapore

[3] Qilu Univ Technol, Shandong Artificial Intelligence Inst, Shandong Acad Sci, Jinan 250316, Shandong, Peoples R China

[4] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China

Source:

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2023 / Vol. 35 / Issue 12

Funding:

National Natural Science Foundation of China;

Keywords:

Semi-supervised text classification; sentiment analysis; social media dataset on COVID-19;

DOI:

10.1109/TKDE.2023.3270940

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the context of COVID-19, numerous people express their opinions through social networks. It is therefore highly desirable to conduct sentiment analysis on COVID-19 tweets to learn the public's attitudes and to help the government formulate proper guidelines for avoiding social unrest. Although many efforts have studied text-based sentiment classification in various domains (e.g., delivery and shopping reviews), it is hard to directly use these classifiers for sentiment analysis of COVID-19 tweets due to the domain gap. In fact, developing a sentiment classifier for COVID-19 tweets is mainly challenged by the limited annotated training data, as well as the diverse and informal expressions of user-generated posts. To address these challenges, we construct a large-scale COVID-19 dataset from Weibo and propose a dual COnsistency-enhanced semi-superVIseD network for Sentiment Analysis (COVID-SA). In particular, we first introduce a knowledge-based augmentation method to augment data and enhance the model's robustness. We then employ BERT as the text encoder backbone for the labeled, unlabeled, and augmented data. Moreover, we propose a dual consistency (i.e., label-oriented consistency and instance-oriented consistency) regularization to promote the model performance. Extensive experiments on our self-constructed dataset and three public datasets show the superiority of COVID-SA over state-of-the-art baselines on various applications.

Citation

Pages: 12605 - 12617

Page count: 13

53 references in total

[1] Abuduweili, Abulikemu; Li, Xingjian; Shi, Humphrey; Xu, Cheng-Zhong; Dou, Dejing. Adaptive Consistency Regularization for Semi-Supervised Transfer Learning. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 6919 - 6928
[2] Ahmed H.M., 2021, Ilkogretim Online, V20, P827
[3] Chang M.W., 2008, AAAI, V2, P830
[4] Chen JA, 2020, 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), P2147
[5] Chen Jiaao, 2020, P 3 WORKSHOP AFFECTI, P151
[6] Chen, Jindong; Wang, Ao; Chen, Jiangjie; Xiao, Yanghua; Chu, Zhendong; Liu, Jingping; Liang, Jiaqing; Wang, Wei. CN-Probase: A Data-driven Approach for Large-scale Chinese Taxonomy Construction. 2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019: 1706 - 1709
[7] Chen MD, 2018, 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), P215
[8] Chen, Qibin; Lin, Junyang; Zhang, Yichang; Yang, Hongxia; Zhou, Jingren; Tang, Jie. Towards Knowledge-Based Personalized Product Description Generation in E-commerce. KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019: 3040 - 3050
[9] Chen Ting, 2019, 25 AMERICAS C INFORM
[10] Dai N, 2019, 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), P5997
