ParlVote: A Corpus for Sentiment Analysis of Political Debates

Cited by: 0
Authors
Abercrombie, Gavin [1]
Batista-Navarro, Riza [1]
Affiliations
[1] Univ Manchester, Dept Comp Sci, Kilburn Bldg, Manchester M13 9PL, Lancs, England
Source
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020) | 2020
Keywords
sentiment analysis; parliamentary debates; Hansard
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Debate transcripts from the UK Parliament contain information about the positions taken by politicians towards important topics, but are difficult for people to process manually. While sentiment analysis of debate speeches could facilitate understanding of the speakers' stated opinions, datasets currently available for this task are small when compared to the benchmark corpora in other domains. We present ParlVote, a new, larger corpus of parliamentary debate speeches for use in the evaluation of sentiment analysis systems for the political domain. We also perform a number of initial experiments on this dataset, testing a variety of approaches to the classification of sentiment polarity in debate speeches. These include a linear classifier as well as a neural network trained using a transformer word embedding model (BERT), and fine-tuned on the parliamentary speeches. We find that in many scenarios, a linear classifier trained on a bag-of-words text representation achieves the best results. However, with the largest dataset, the transformer-based model combined with a neural classifier provides the best performance. We suggest that further experimentation with classification models and observations of the debate content and structure are required, and that there remains much room for improvement in parliamentary sentiment analysis.
Pages: 5073-5078
Page count: 6