Enhancing aspect-based sentiment analysis using data augmentation based on back-translation

Cited by: 5

Authors:

Taheri, Alireza [1]

Zamanifar, Azadeh [1]

Farhadi, Amirfarhad [1]

Affiliations:

[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Sattari Hwy, Tehran 1477893855, Iran

Source:

INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS | 2025 / Vol. 19 / Issue 03

Keywords:

Aspect-based sentiment analysis; Data augmentation; Back-translation; Special character utilization

DOI:

10.1007/s41060-024-00622-w

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Aspect-based sentiment analysis (ABSA) identifies the aspects mentioned in a sentence and predicts the sentiment associated with each. With the rapid growth of users' online activities, ABSA has become increasingly important as a means of automatically interpreting text. Aspect terms and opinions differ across domains; hence, more labeled in-domain data may be required to achieve better performance. Labeling massive amounts of data is expensive and time-consuming, but data augmentation can improve performance without collecting new labeled data. In this paper, we present a hybrid data augmentation method that extends the original data and increases ABSA performance. Back-translation has proved helpful in other areas of NLP as a paraphrasing method for augmenting text, but because of the data structure of ABSA, it has not yet been used to its full potential in this field. We use Special Character Insertion (SCI) in back-translation to address this compatibility issue and generate synthetic augmented sentences. As a result, the output sentences preserve the meaning and the sentiments toward the aspect terms, and the location of each aspect term in the sentence can be restored. Using Random Entity Replacement (RER), we make the back-translated sentences even more diverse so that the model generalizes better on limited data. RER replaces the named entities that the back-translator is least likely to change. We evaluate our approach with different ABSA models on two benchmark datasets, SemEval 2014 in the Restaurant and Laptop domains, and investigate five different languages as the middle language for back-translation. Results show that, with German as the middle language, our approach increases accuracy and F1 score by 0.78 and 1.75 percent on average, respectively, compared to the original dataset.
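
The idea behind Special Character Insertion can be illustrated with a minimal sketch. It is not the authors' implementation: the marker strings `[[` and `]]`, the function names, and the toy `to_pivot` / `from_pivot` translators are all assumptions made for illustration, and a real pipeline would call an actual machine translation system in their place. The sketch wraps the aspect term in markers before the round trip, then strips the markers afterwards to recover the term and its new character offset.

```python
import re
from typing import Callable, Tuple

# Hypothetical marker strings; the abstract does not say which characters are used.
MARK_OPEN, MARK_CLOSE = "[[", "]]"


def insert_markers(sentence: str, aspect: str) -> str:
    """Wrap the first occurrence of the aspect term in marker strings."""
    start = sentence.find(aspect)
    if start < 0:
        raise ValueError("aspect term not found in sentence")
    end = start + len(aspect)
    return sentence[:start] + MARK_OPEN + sentence[start:end] + MARK_CLOSE + sentence[end:]


def extract_aspect(marked: str) -> Tuple[str, str, int]:
    """Remove the markers and return (clean sentence, aspect term, start offset)."""
    pattern = re.escape(MARK_OPEN) + r"(.*?)" + re.escape(MARK_CLOSE)
    match = re.search(pattern, marked)
    if match is None:
        raise ValueError("markers were lost during translation")
    clean = marked[:match.start()] + match.group(1) + marked[match.end():]
    return clean, match.group(1), match.start()


def back_translate_with_markers(
    sentence: str,
    aspect: str,
    to_pivot: Callable[[str], str],
    from_pivot: Callable[[str], str],
) -> Tuple[str, str, int]:
    """Round-trip a marked sentence through a pivot language and restore the aspect."""
    marked = insert_markers(sentence, aspect)
    round_trip = from_pivot(to_pivot(marked))
    return extract_aspect(round_trip)


# Toy stand-ins for a translation system (assumptions, not real translators):
# the "pivot" step is the identity and the return step paraphrases one word.
def to_pivot(text: str) -> str:
    return text


def from_pivot(text: str) -> str:
    return text.replace("great", "excellent")


augmented, term, offset = back_translate_with_markers(
    "The pizza was great but the service was slow", "service", to_pivot, from_pivot
)
print(augmented, "|", term, "|", offset)
```

Because the offset is read from the marker position after the round trip, the aspect label stays aligned even when the paraphrase changes the length of the text before it.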


Pages: 491-516

Page count: 26

References: 57 in total
