Enhancing aspect-category sentiment analysis via syntactic data augmentation and knowledge enhancement

Cited by: 14
Authors
Liu, Bin [1, 2]
Lin, Tao [1 ]
Li, Ming [1 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Sichuan, Peoples R China
[2] Technol Innovat Ctr, State Owned factory 783, Mianyang 621000, Sichuan, Peoples R China
Keywords
Aspect-category sentiment analysis; Data augmentation; Natural language inference; Pretrained language model
DOI
10.1016/j.knosys.2023.110339
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of aspect-category sentiment analysis (ACSA) is to predict the sentiment polarity toward a specific aspect category from the opinions a reviewer expresses in a sentence. With the boom of pretrained language models, various related methods have achieved significant improvements in ACSA. However, two major issues remain to be solved. First, most of these studies follow the canonical method of fine-tuning on limited labeled data, neglecting to leverage external knowledge to further enhance ACSA performance. Second, aspect categories are usually abstract concepts that are mentioned explicitly or implicitly, and the corresponding sentiment polarities are not easy to recognize accurately. To address these issues, we first transform the ACSA task into a sentence-pair classification task with natural language inference, constructing synthetic sentences as hypotheses based on the predefined aspect categories and the prompt-generation sentence template. Then the model applies a passivization transformation to the synthetic sentences and generates more syntactic data to augment the limited training data. Furthermore, we enhance ACSA with curated knowledge from a commonsense knowledge graph. Finally, different representations are synergistically fused with a gating mechanism to output richer sentiment features and enable context-, syntax-, and knowledge-aware predictions. Experimental results on three challenging benchmark datasets show that the proposed model outperforms some competitive baselines. © 2023 Elsevier B.V. All rights reserved.
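As a rough illustration of the sentence-pair formulation described in the abstract, the following minimal sketch pairs a review sentence (the premise) with template-generated hypotheses, and adds a passive-voice variant of each hypothesis as augmented data. The template wording, the `NLIPair` type, and the `build_pairs` helper are hypothetical and are not taken from the paper; the actual templates and the passivization procedure used by the authors may differ.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class NLIPair:
    """One premise/hypothesis pair for sentence-pair classification."""
    premise: str      # the original review sentence
    hypothesis: str   # synthetic sentence built from a template
    polarity: str     # the sentiment polarity the hypothesis asserts


# Hypothetical templates (assumptions, not the paper's wording).
ACTIVE_TEMPLATE = "The reviewer expresses a {polarity} opinion about the {category}."
# Hand-written passive counterpart of the active template.
PASSIVE_TEMPLATE = "A {polarity} opinion about the {category} is expressed by the reviewer."


def build_pairs(
    sentence: str,
    category: str,
    polarities: Sequence[str] = ("positive", "negative", "neutral"),
    augment: bool = True,
) -> List[NLIPair]:
    """Build one hypothesis per polarity, plus a passive variant if augment is True."""
    pairs: List[NLIPair] = []
    for polarity in polarities:
        active = ACTIVE_TEMPLATE.format(polarity=polarity, category=category)
        pairs.append(NLIPair(sentence, active, polarity))
        if augment:
            passive = PASSIVE_TEMPLATE.format(polarity=polarity, category=category)
            pairs.append(NLIPair(sentence, passive, polarity))
    return pairs


if __name__ == "__main__":
    for pair in build_pairs("The pasta was great but the waiter was rude.", "service"):
        print(pair.hypothesis)
```

In a setup like this, each pair would be fed to a sentence-pair classifier that decides whether the premise entails the hypothesis; the passive variants roughly double the number of training pairs without changing the asserted polarity.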
Pages: 10