TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations

Cited by: 24
Authors
Azzouza, Noureddine [1]
Akli-Astouati, Karima [1]
Ibrahim, Roliana [2]
Affiliations
[1] Univ Sci & Technol Houari Boumediene, FEI Dept Comp Sci, RIIMA Lab, Algiers, Algeria
[2] Univ Teknol Malaysia UTM, Fac Engn, Sch Comp, Johor Baharu 81310, Johor, Malaysia
Source
EMERGING TRENDS IN INTELLIGENT COMPUTING AND INFORMATICS: DATA SCIENCE, INTELLIGENT INFORMATION SYSTEMS AND SMART COMPUTING | 2020 / Vol. 1073
Keywords
Twitter Sentiment Analysis; Word embedding; CNN; LSTM; BERT
DOI
10.1007/978-3-030-33582-3_41
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis has long been a topic of discussion in language understanding research, yet the neural networks deployed for it remain deficient to some extent. Currently, the majority of studies identify sentiment by focusing on vocabulary and syntax. The task is well recognised in Natural Language Processing (NLP), where Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have been employed to obtain notable results. In this study, we propose a four-phase framework for Twitter Sentiment Analysis. The framework uses the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder for generating sentence representations. To make more effective use of this model, we deploy various classification models. Additionally, we concatenate pre-trained word embedding representations with the BERT representation to enhance sentiment classification. Experimental results show that the framework outperforms the baseline framework on all datasets. For example, our best model attains an F1-score of 71.82% on the SemEval 2017 dataset. A comparative analysis of the experimental results offers some recommendations on choosing pretraining steps to obtain improved results. The outcomes of the experiments confirm the effectiveness of our system.
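The concatenation step mentioned in the abstract can be illustrated with a minimal sketch. This record does not say how the authors combine the two representations, so everything below is an assumption: the helper names `mean_word_embedding` and `concat_features` are hypothetical, the word-level vector is taken to be a simple average of token embeddings, and the toy numbers stand in for real BERT and word-embedding outputs.

```python
from typing import Dict, List


def mean_word_embedding(
    tokens: List[str], table: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the embeddings of tokens found in the table (assumed pooling choice).

    Tokens missing from the table are skipped; if none are found,
    a zero vector of length `dim` is returned.
    """
    found = [table[t] for t in tokens if t in table]
    if not found:
        return [0.0] * dim
    return [sum(column) / len(found) for column in zip(*found)]


def concat_features(sentence_vec: List[float], word_vec: List[float]) -> List[float]:
    """Concatenate a sentence-level vector with a word-embedding-based vector."""
    return list(sentence_vec) + list(word_vec)


# Toy values, purely illustrative (not taken from the paper).
toy_table = {"good": [1.0, 0.0], "day": [0.0, 1.0]}
toy_sentence_vec = [0.5, -0.5, 0.25]  # stand-in for a BERT sentence representation

word_vec = mean_word_embedding(["good", "day", "!"], toy_table, dim=2)
features = concat_features(toy_sentence_vec, word_vec)
print(len(features))  # length is the sum of both vector sizes
```

The combined vector would then be passed to whichever classifier is used; the abstract mentions several classification models without detailing them here.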
Pages: 428-437
Page count: 10