Using EmotiBlog to annotate and analyse subjectivity in the new textual genres

Cited by: 17
Authors
Boldrini, Ester [1]
Balahur, Alexandra [1]
Martinez-Barco, Patricio [1]
Montoyo, Andres [1]
Affiliations
[1] Univ Alicante, GPLSI, E-03080 Alicante, Spain
Keywords
Sentiment analysis; Annotation model; Feature selection; Opinion mining; New textual genres; Agreement
DOI
10.1007/s10618-012-0259-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Thanks to the increasing amount of subjective data on the Web 2.0, tools to manage and exploit such data have become essential. Our research focuses on the creation of EmotiBlog, a fine-grained annotation scheme for labelling subjectivity in non-traditional textual genres. We also present the EmotiBlog corpus, a collection of blog posts comprising 270,000 tokens about 3 topics in 3 languages: Spanish, English and Italian. Additionally, we carry out a series of experiments to check the robustness of the model and its applicability to Natural Language Processing tasks across the 3 languages. The experiments on inter-annotator agreement, as well as on feature selection, provided satisfactory results, which have given us an impetus to continue working with the model and to extend the annotated corpus. To check its applicability, we tested different Machine Learning models, created using the annotation in EmotiBlog, on other corpora to see whether the obtained annotation is domain and genre independent, obtaining positive results. Finally, we also applied EmotiBlog to Opinion Mining, showing that our resource improves the performance of systems built for this task.
Pages: 603-634
Page count: 32