Sentiment analysis of Chinese stock reviews based on BERT model

Cited by: 60
Authors
Li, Mingzheng [1]
Chen, Lei [1]
Zhao, Jing [1]
Li, Qiang [1]
Affiliation
[1] Huainan Normal Univ, Coll Comp Sci, Huainan 232038, Peoples R China
Keywords
Chinese stock reviews; Sentiment analysis; BERT; Fine-tuning
DOI
10.1007/s10489-020-02101-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A large number of stock reviews are available on the Internet, and sentiment analysis of these reviews is of considerable value to financial market research. Because large amounts of labeled data are lacking, it is difficult to improve the accuracy of Chinese stock sentiment classification using traditional methods. To address this challenge, this paper proposes a novel sentiment analysis model for Chinese stock reviews based on BERT. The model relies on a pre-trained model to improve classification accuracy: it uses a BERT pre-trained language model to represent stock reviews at the sentence level, and then feeds the resulting feature vector into a classifier layer for classification. In the experiments, we demonstrate that our method achieves higher precision, recall, and F1 than TextCNN, TextRNN, Att-BLSTM, and TextCRNN. Our model obtains the best results among the compared methods, which indicates that it is effective for sentiment analysis of Chinese stock reviews. Moreover, our model has strong generalization capacity and can perform sentiment analysis in many fields.
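The pipeline in the abstract (a sentence-level feature vector passed to a classifier layer, then scored by precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' implementation. The tiny hand-written vector stands in for a BERT sentence representation, and the linear-plus-softmax head, the weights, the binary label set, and the function names `classify` and `precision_recall_f1` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence_vector, weights, bias):
    """Hypothetical classifier layer: one linear map followed by softmax.

    sentence_vector stands in for the sentence-level feature vector that
    a pre-trained encoder such as BERT would produce (assumption).
    """
    logits = [
        sum(w * x for w, x in zip(row, sentence_vector)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy inputs (assumed): a 2-dimensional "sentence vector" and a
# 2-class head where class 0 = negative and class 1 = positive.
toy_vector = [1.0, 0.0]
toy_weights = [[-1.0, 0.0], [1.0, 0.0]]
toy_bias = [0.0, 0.0]
label, probs = classify(toy_vector, toy_weights, toy_bias)

scores = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

In a fine-tuning setup the encoder and the classifier weights would be trained jointly on labeled reviews; here the weights are fixed by hand only so that the data flow is visible.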
Pages: 5016-5024
Number of pages: 9