IAN-BERT: Combining Post-trained BERT with Interactive Attention Network for Aspect-Based Sentiment Analysis

Cited by: 5
Authors
Verma S. [1 ]
Kumar A. [1 ]
Sharan A. [1 ]
Affiliations
[1] School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India
Keywords
Aspect-based sentiment analysis; Attention mechanism; BERT; Post-trained BERT
DOI
10.1007/s42979-023-02229-7
Abstract
Aspect-based sentiment analysis (ABSA), a subtask of sentiment analysis, predicts the sentiment polarity of specific aspects mentioned in an input sentence. Recent research has demonstrated the effectiveness of Bidirectional Encoder Representations from Transformers (BERT) and its variants in improving the performance of various Natural Language Processing (NLP) tasks, including sentiment analysis. However, BERT, trained on the Wikipedia and BookCorpus datasets, lacks domain-specific knowledge. Furthermore, for the ABSA task, the attention mechanism leverages aspect information to determine the sentiment orientation of an aspect within the given sentence. Based on these observations, this paper proposes a novel approach called IAN-BERT. The IAN-BERT model applies an interactive attention mechanism to representations from a BERT model post-trained on Amazon and Yelp datasets. The objective is to capture domain-specific knowledge through the post-trained BERT representations and to identify the significance of context words with respect to aspect terms, and vice versa. By incorporating attention mechanisms, IAN-BERT aims to extract more relevant and informative features from the input text, ultimately leading to better predictions. Experimental evaluations conducted on the SemEval-14 Restaurant and Laptop datasets and the MAMS dataset demonstrate the effectiveness and superiority of the IAN-BERT model in aspect-based sentiment analysis. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
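As a rough illustration of the architecture the abstract describes, below is a minimal PyTorch sketch of interactive attention over BERT token representations: context and aspect are encoded separately, each side's average-pooled vector serves as the attention query over the other side, and the two attended representations are concatenated for classification. Everything concrete here is an assumption for illustration, not the authors' code: the class name IANBert, the bilinear tanh scoring function, the average-pool queries, and the bert-base-uncased checkpoint standing in for the Amazon/Yelp post-trained BERT used in the paper.

```python
# Minimal sketch of interactive attention over BERT features (hypothetical;
# layer shapes, names, and scoring follow the classic IAN design, which may
# differ from the paper's exact implementation).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class IANBert(nn.Module):
    def __init__(self, bert_name="bert-base-uncased", num_classes=3):
        super().__init__()
        # The paper uses a BERT checkpoint post-trained on Amazon/Yelp
        # reviews; the base checkpoint stands in here.
        self.bert = AutoModel.from_pretrained(bert_name)
        h = self.bert.config.hidden_size
        self.w_c = nn.Parameter(torch.randn(h, h) * 0.02)  # context-attention weights
        self.w_a = nn.Parameter(torch.randn(h, h) * 0.02)  # aspect-attention weights
        self.fc = nn.Linear(2 * h, num_classes)

    @staticmethod
    def _attend(keys, query, w, mask):
        # score_i = tanh(k_i . W . q); masked softmax over tokens
        scores = torch.tanh(torch.einsum("bth,hk,bk->bt", keys, w, query))
        scores = scores.masked_fill(mask == 0, -1e9)
        alpha = torch.softmax(scores, dim=-1)
        return torch.einsum("bt,bth->bh", alpha, keys)

    def forward(self, ctx_ids, ctx_mask, asp_ids, asp_mask):
        ctx = self.bert(input_ids=ctx_ids, attention_mask=ctx_mask).last_hidden_state
        asp = self.bert(input_ids=asp_ids, attention_mask=asp_mask).last_hidden_state
        # Average-pool each side to form the query for the other side.
        ctx_pool = (ctx * ctx_mask.unsqueeze(-1)).sum(1) / ctx_mask.sum(1, keepdim=True)
        asp_pool = (asp * asp_mask.unsqueeze(-1)).sum(1) / asp_mask.sum(1, keepdim=True)
        ctx_rep = self._attend(ctx, asp_pool, self.w_c, ctx_mask)  # context attended by aspect
        asp_rep = self._attend(asp, ctx_pool, self.w_a, asp_mask)  # aspect attended by context
        return self.fc(torch.cat([ctx_rep, asp_rep], dim=-1))

if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = IANBert()
    ctx = tok(["the pasta was great but the service was slow"], return_tensors="pt")
    asp = tok(["service"], return_tensors="pt")
    logits = model(ctx.input_ids, ctx.attention_mask, asp.input_ids, asp.attention_mask)
    print(logits.shape)  # torch.Size([1, 3]) -> negative/neutral/positive
```

Note that the context and aspect are tokenized as separate sequences so each can attend over the other; three output classes correspond to the usual negative/neutral/positive polarity labels in SemEval-14 and MAMS.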