SFU ReviewSP-NEG: a Spanish corpus annotated with negation for sentiment analysis. A typology of negation patterns

Cited by: 0
Authors
Salud María Jiménez-Zafra
Mariona Taulé
M. Teresa Martín-Valdivia
L. Alfonso Ureña-López
M. Antónia Martí
Affiliations
[1] Universidad de Jaén, Department of Computer Science
[2] University of Barcelona, CLiC, Centre de Llenguatge i Computació, Department of Linguistics
Source
Language Resources and Evaluation | 2018 / Vol. 52
Keywords
Annotation of negation; Scope of negation; Polarity annotation; Sentiment analysis
DOI
Not available
Abstract
In this paper, we present SFU ReviewSP-NEG, the first freely available, wide-coverage Spanish corpus annotated with negation. We describe the methodology applied in annotating the corpus, including the tagset, the linguistic criteria and the inter-annotator agreement tests. We also include a complete typology of negation patterns in Spanish. This typology has the advantage that it is easy to express in terms of a tagset for corpus annotation: the types are clearly defined, which avoids ambiguity in the annotation process, and they provide wide coverage (i.e. they resolved all the cases occurring in the corpus). We use the SFU ReviewSP corpus as the base for the annotations. The corpus consists of 400 reviews, 221,866 words and 9,455 sentences, of which 3,022 contain at least one negation structure.
Pages: 533-569
Page count: 36